008 revised notes intro to datapresentation


Upload: dhanush-pillai

Post on 08-Jul-2016


Page 1: 008 Revised Notes Intro to DataPresentation

Biological Weapons Proliferation Prevention Program
Biological Threat Reduction Program

Introduction to Data Presentation

TRNEPI-00152

Learning objectives

Define different types of variables
Create and interpret one- and two-variable tables
Create and interpret a line graph
Create and interpret one- and two-variable bar charts
Describe when to use each type of table, graph, and chart

Why organize data?

Many records
Look for trends and relationships
Get familiar with data before analysis
Catch errors
Communicate findings to others

How to organize data

Identify what type of data you have
Determine what you need to communicate with the data
Summarize using tables, graphs, and/or charts

Variable: definition

Any characteristic that is observed or measured and that differs from person to person

Examples:
  age
  height
  hair color
  smoking

Types of Variables

Variable
  Quantitative (measurement)
    Continuous (real-valued), e.g. height
    Discrete (count data), e.g. number of admissions
  Qualitative (categorical)
    Ordinal (ordered), e.g. response to treatment
    Nominal (not ordered), e.g. ethnic group

Types of Variables

Categorical

Sex        Nationality   Status
(nominal)  (nominal)     (ordinal)
M          Yemen         Mild
M          Jordan        Moderate
F          Yemen         Severe
M          Jordan        Mild
F          Sudan         Moderate
F          Yemen         Mild
M          Sudan         Moderate
M          Iran          Severe
F          Jordan        Severe
M          Iran          Mild
F          Yemen         Moderate
F          Sudan         Moderate
M          Iran          Mild
M          Yemen         Severe

Quantitative

Children    Weight
(discrete)  (continuous)
1           56.4
1           47.8
2           59.9
3           13.1
1           25.7
1           23.0
2           30.0
3           13.7
2           15.4
2           52.5
1           26.6
1           38.2
1           59.0
2           57.9

Why Does it Matter?

Categorical and quantitative variables are statistically summarized and presented in different ways

Variable Type    Data Presentation
Quantitative     Graphs, Tables
Categorical      Charts, Tables


Tables

Tables: Characteristics

Data is arranged in rows and columns
Presentation is simple and self-explanatory
  Title
  Label each row and column
  Show totals for rows and columns
  Include units of measure (yrs, mg/dl)
  Explain codes in footnote

Simple Frequency Distribution

Primary and secondary syphilis morbidity by age, United States, 1989

Age group (years)    Number of Cases
≤14                      230
15-19                  4,378
20-24                 10,405
25-29                  9,610
30-34                  8,648
35-44                  6,901
45-54                  2,631
≥55                    1,278
Total                 44,081
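As a minimal sketch of how a one-variable table like this can be checked and extended, the Python below recomputes the total from the counts in the table and adds a percent column. The percent column is an addition for illustration and does not appear on the slide; the open-ended groups are written as `<=14` and `>=55` so that the intervals leave no gaps.

```python
# Counts copied from the syphilis morbidity table (United States, 1989).
age_groups = ["<=14", "15-19", "20-24", "25-29", "30-34", "35-44", "45-54", ">=55"]
cases = [230, 4378, 10405, 9610, 8648, 6901, 2631, 1278]

# The table total is the sum of the class counts.
total = sum(cases)

# Percent of all cases in each age group (illustrative extra column).
percents = [100 * n / total for n in cases]

print(f"{'Age group (years)':<20}{'Cases':>8}{'Percent':>9}")
for group, n, pct in zip(age_groups, cases, percents):
    print(f"{group:<20}{n:>8}{pct:>9.1f}")
print(f"{'Total':<20}{total:>8}{100.0:>9.1f}")
```

Recomputing the total this way is also a quick check for transcription errors in the counts.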

Determining Class Intervals

The intervals must be mutually exclusive and encompass all data.

For preliminary analysis, use a relatively large number of intervals (4-8). These intervals can then be consolidated.

Use standard or frequently applied intervals (for instance, up to the age of 19, 20-24 years, 25-29 years, etc.).

A category must be provided to accommodate unknown values (for instance, "age unknown").
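The rules above can be sketched as a small function that assigns each age to exactly one interval and keeps a separate category for unknown values. The function name `age_interval`, the upper cutoff at 55, and the sample ages are assumptions made for illustration only.

```python
from collections import Counter


def age_interval(age):
    """Return the class-interval label for one age in years.

    Every age falls into exactly one interval (mutually exclusive),
    and missing ages get their own "Age unknown" category.
    """
    if age is None:
        return "Age unknown"
    if age <= 19:
        return "<=19"
    if age >= 55:  # assumed upper cutoff, for illustration
        return ">=55"
    # 5-year groups between 20 and 54: 20-24, 25-29, ...
    lower = 20 + 5 * ((int(age) - 20) // 5)
    return f"{lower}-{lower + 4}"


# Hypothetical ages; None stands for a record with no age recorded.
ages = [17, 22, 24, 25, None, 61, 33]
labels = [age_interval(a) for a in ages]
frequency = Counter(labels)

for label, count in frequency.items():
    print(f"{label:<12}{count}")
```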


Two Variable Table

Format for 2 X 2 Table

              Ill      Well     Total
Exposed       a        b        a + b
Unexposed     c        d        c + d
Total         a + c    b + d    a + b + c + d

Format for 2 X 2 Table

Follow-up status among diabetic and nondiabetic white men, NHANES, 1982-1984

                Dead    Alive    Total
Diabetic         100       89      189
Non-diabetic     811    2,340    3,151
Total            911    2,429    3,340
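A short sketch of how the margins of this 2 X 2 table are obtained from its four cells, using the counts shown above. The row percentage at the end is an illustrative way of reading the table and is not taken from the slide.

```python
# The four cells of the 2 x 2 table (rows: diabetes status, columns: follow-up status).
table = {
    "Diabetic": {"Dead": 100, "Alive": 89},
    "Non-diabetic": {"Dead": 811, "Alive": 2340},
}

# Row totals: sum across the columns of each row.
row_totals = {row: sum(cells.values()) for row, cells in table.items()}

# Column totals: sum each column over both rows.
column_totals = {
    col: sum(cells[col] for cells in table.values()) for col in ("Dead", "Alive")
}

grand_total = sum(row_totals.values())

# Percent of each row that died (illustrative interpretation).
percent_dead = {row: 100 * table[row]["Dead"] / row_totals[row] for row in table}

for row in table:
    print(f"{row:<14}{row_totals[row]:>6}{percent_dead[row]:>8.1f}% dead")
print(f"{'Total':<14}{grand_total:>6}")
```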


Graphs and Charts

Charts and Graphs: Advantages

Easier to understand and interpret
Get a good feel for the data before formal analysis
Reveal patterns in data
Used to generate hypotheses

Graphs: Types

Arithmetic-scale line graphs
Inset graphs
Histograms
Frequency polygons
Cumulative frequency curve
Scatter diagram

Graphs

[Figure: generic graph layout with a title, the independent variable on the horizontal axis, and the dependent variable on the vertical axis]

Types of Variables

Dependent
  Describes the outcome of interest
  Examples: Dead, cancer, ill

Independent
  May cause or contribute to variation of the dependent variable
  Not influenced by the dependent variable
  Examples: Time, age, packs of cigarettes, cholesterol levels

Arithmetic-Scale Line Graph

[Figure: Incidence of Hepatitis A, United States, 1952-1993. x-axis: Year (1950-1990); y-axis: Rate /100,000 (0-40). Source: CDC, National Notifiable Diseases Surveillance System]

Arithmetic-Scale Line Graph: Characteristics

Method of choice for plotting rates over time
Set distance on graph represents same quantity anywhere on the axis
Horizontal graph x:y ratio is 5:3
Y-axis should start with 0
Determine largest value of Y needed to plot
Round off that number and divide into intervals
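The last three points (start the y-axis at 0, find the largest value to plot, round it off and divide into intervals) can be sketched as follows. The function name `y_axis_ticks`, the rounding rule (round up to the next multiple of the leading power of ten), the default of four intervals, and the sample value 37 are all assumptions for illustration; other rounding rules are equally reasonable.

```python
import math


def y_axis_ticks(largest_value, n_intervals=4):
    """Return evenly spaced y-axis tick values from 0 up to a rounded maximum."""
    if largest_value <= 0:
        raise ValueError("largest_value must be positive")
    # Round the largest value up to the next multiple of its power of ten,
    # e.g. 37 -> 40, 260 -> 300.
    magnitude = 10 ** math.floor(math.log10(largest_value))
    top = math.ceil(largest_value / magnitude) * magnitude
    # Divide the axis from 0 to the rounded maximum into equal intervals.
    step = top / n_intervals
    return [i * step for i in range(n_intervals + 1)]


# Hypothetical largest rate of 37 per 100,000.
ticks = y_axis_ticks(37)
print(ticks)
```

Because every interval has the same width, a set distance on the axis represents the same quantity anywhere on it, which is the defining property of an arithmetic scale.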

Arithmetic-Scale Line Graph

[Figure: Registered Death Rates by Age and Year, 1996-2000. x-axis: Age Categories (Years), from <1 to 65+; y-axis: Rate per 1000 population (0.0-50.0); one line per year, 1996 to 2000]

Inset Graph

[Figure: Registered Death Rates by Age and Year, 1996-2000. Host graph: x-axis Age Categories (Years), from <1 to 65+; y-axis Rate per 1000 population (0.0-50.0); one line per year, 1996 to 2000. Inset graph: age categories 1-4 to 45-49, with the y-axis magnified to 0.0-3.5]

Inset Graph: Characteristics

A magnified portion of the larger, or host, graph
Can see data in better detail
Smaller graph is "inset" into the larger graph
Variables remain the same
Independent data points do not change (e.g. age categories will remain in 5-year segments)


Histograms

Frequency of measles by week of onset Dec 6, 2000 to May 16, 2001


Histograms: characteristics

Graph of the frequency distribution of a continuous variable

Columns are adjoining

Area of each column is proportional to number of observations in that interval
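A minimal sketch of the counting step behind a histogram, using the 14 weights from the quantitative example in the Types of Variables table. The interval width of 10 is an assumption chosen for illustration, and the slide does not state the unit of weight. With equal-width intervals, column height alone is proportional to the number of observations.

```python
from collections import Counter

# Weights from the quantitative example table (unit not stated on the slide).
weights = [56.4, 47.8, 59.9, 13.1, 25.7, 23.0, 30.0,
           13.7, 15.4, 52.5, 26.6, 38.2, 59.0, 57.9]

WIDTH = 10  # assumed class-interval width

# Map each weight to the lower bound of its interval: 13.1 -> 10, 30.0 -> 30.
counts = Counter(int(w // WIDTH) * WIDTH for w in weights)

# Adjoining intervals with no gaps between them, drawn as text columns.
for lower in range(10, 60, WIDTH):
    print(f"{lower}-<{lower + WIDTH}: {'#' * counts[lower]} ({counts[lower]})")
```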

[Figure: histogram of cases by time of onset in 3-hour periods, March 13 to March 14]


Histograms using continuous data

Frequency Polygon

[Figure: Example of a Frequency Polygon. x-axis: Week (1-16); y-axis: Cases (0-80); series: Cases, Cases-FP]

Frequency Polygon: Characteristics

Graph of entire frequency distribution of a continuous variable
Number of events in interval plotted at midpoint of interval
Straight line connects points
Useful to compare two or more distributions on the same axis
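The construction described above can be sketched as a function that turns class intervals and their counts into the points of a frequency polygon. The counts used here come from grouping the example weights in the Types of Variables table into intervals of width 10, which is an assumed grouping. Closing the polygon at zero one interval before the first and after the last class is a common convention, assumed here rather than stated on the slide.

```python
def polygon_points(intervals, counts):
    """Return (x, y) points for a frequency polygon.

    Each count is plotted at the midpoint of its interval; the polygon is
    closed with a zero count one interval before and after the data.
    """
    midpoints = [(low + high) / 2 for low, high in intervals]
    width = intervals[0][1] - intervals[0][0]  # assumes equal-width intervals
    points = [(midpoints[0] - width, 0)]
    points += list(zip(midpoints, counts))
    points.append((midpoints[-1] + width, 0))
    return points


# Assumed grouping of the example weights into width-10 intervals.
intervals = [(10, 20), (20, 30), (30, 40), (40, 50), (50, 60)]
counts = [3, 3, 2, 1, 5]

points = polygon_points(intervals, counts)
for x, y in points:
    print(f"midpoint {x:>5.1f}: {y}")
```

Straight lines drawn between consecutive points give the polygon. Two distributions can be compared by computing a second list of points and drawing it on the same axes.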


Frequency Polygon

Relative frequency of serum cholesterol level by age

Cumulative Frequency

[Figure: Cumulative incidence of hepatitis B virus infection by duration of high-risk behavior. x-axis: Years at Risk (0-12); y-axis: Percent infected with HBV (0-100); series: IV Drug Users, Homosexual Men, Heterosexuals - multiple partners]
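As a rough illustration of how the values behind a cumulative frequency curve are computed, the sketch below accumulates the age-group counts from the syphilis frequency table earlier in these notes. That table is a different data set from the hepatitis B figure and is used only because its counts are given in full.

```python
from itertools import accumulate

# Counts by age group from the syphilis morbidity table (United States, 1989).
age_groups = ["<=14", "15-19", "20-24", "25-29", "30-34", "35-44", "45-54", ">=55"]
cases = [230, 4378, 10405, 9610, 8648, 6901, 2631, 1278]

# Running total: each entry is the number of cases up to and including that group.
cumulative = list(accumulate(cases))

# Express each running total as a percent of all cases; the last value is 100.
cumulative_percent = [100 * c / cumulative[-1] for c in cumulative]

for group, c, pct in zip(age_groups, cumulative, cumulative_percent):
    print(f"{group:<8}{c:>8}{pct:>8.1f}%")
```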


Scatter Diagram

Relationship between age in years and heavy metal X exposure

Charts: Types

Appropriate for categorical data
Bar charts
  Simple
  Grouped
  Stacked
Pie charts

Simple Bar Chart

[Figure: Annual Death Rates by Governorate, 1996-2000. Horizontal bars for AQABA, ZARQA, AMMAN, JARAS, MAFRQ, MADAB, TAFEL, IRBID, BALQA, KARAK, AJLON, MAANN; x-axis: Rate per 100,000 population (0-400); y-axis: Governorate]

Bar Charts: Characteristics

Display data from a one-variable table
Each category of the variable is represented by a bar
Bar length is proportional to the number of events
Can be presented vertically or horizontally
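A small sketch of the step from raw records to a simple bar chart, using the 14 clinical status values from the categorical example in the Types of Variables table. The text bars are a stand-in for a drawn chart, not a recommended output format.

```python
from collections import Counter

# Clinical status of the 14 example cases, in the order listed in the table.
status = ["Mild", "Moderate", "Severe", "Mild", "Moderate", "Mild", "Moderate",
          "Severe", "Severe", "Mild", "Moderate", "Moderate", "Mild", "Severe"]

# One-variable table: number of cases in each category.
counts = Counter(status)

# One bar per category, kept in ordinal order; bar length equals the count.
for category in ("Mild", "Moderate", "Severe"):
    print(f"{category:<9}{'#' * counts[category]} {counts[category]}")
```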

Vertical Bar Chart: Qualitative Ordinal Variable

[Figure: Distribution of Cases by Clinical Status. x-axis: Clinical Status (Mild, Moderate, Severe); y-axis: Cases (0-15)]

Grouped Bar Chart

[Figure: Treatment completion and cure of disease X in various racial groups, 1994-2000. x-axis: Race (Race A, Race B, Race C, Race D); y-axis: Frequency (0-1600); bars within each group: Cases, Completion, Cure]

Grouped Bar Chart: Characteristics

Illustrate data from two-variable or three-variable tables
Bars within groups are usually adjoining
Bars between groups have a space
Limit number of bars within a group to fewer than four
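To illustrate how a two-variable table feeds a grouped bar chart, the sketch below cross-tabulates sex by clinical status for the 14 example records in the Types of Variables table. Grouping by status with one bar per sex is a layout chosen for illustration; it keeps two bars per group, within the limit above.

```python
from collections import Counter

# (sex, clinical status) for the 14 example records, in table order.
records = [("M", "Mild"), ("M", "Moderate"), ("F", "Severe"), ("M", "Mild"),
           ("F", "Moderate"), ("F", "Mild"), ("M", "Moderate"), ("M", "Severe"),
           ("F", "Severe"), ("M", "Mild"), ("F", "Moderate"), ("F", "Moderate"),
           ("M", "Mild"), ("M", "Severe")]

# Two-variable table: count of records for each (sex, status) pair.
counts = Counter(records)

# One group per status; bars within a group are adjoining,
# and a blank line separates the groups.
for status in ("Mild", "Moderate", "Severe"):
    print(status)
    for sex in ("M", "F"):
        n = counts[(sex, status)]
        print(f"  {sex} {'#' * n} {n}")
    print()
```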

Stacked Bar Chart

[Figure: Cases of malaria in a region, 1992-1996. x-axis: Time (1992-1996); y-axis: Cases (0-600); stacked segments: Falciparum, Others]


Pie Chart

[Figure: Geographic Distribution of Hepatitis A Virus Infection. Map legend, Anti-HAV Prevalence: High, Intermediate, Low, Very Low]

Selecting the Right Presentation Method (1)

Type of Graph or Diagram    Application
Arithmetic Scale Graph      Data or indicator trends over time.
Inset Graph                 View a larger image of a portion of the host graph.
Histogram                   1. Frequency distribution for a continuous variable.
                            2. Number of cases during an epidemic (epidemic curve) or over time.

Selecting the Right Presentation Method (2)

Type of Graph or Diagram      Application
Frequency Polygons            Frequency distribution of a continuous variable, for displaying components.
Cumulative Frequency Curve    Display cumulative frequency of a quantitative variable.
Scatter Plot                  Plot the relationship between 2 variables, looking for any correlation.
Simple Bar Charts             Compare the size or frequency of different categories of the same variable.

Selecting the Right Presentation Method (3)

Type of Graph or Diagram    Application
Grouped Bar Chart           Compare the sizes or frequencies of different categories across 2-4 data sets.
Stacked Bar Chart           Compare totals and display component parts for several data groups.
Pie Chart                   Display parts of a whole.

Selecting the Right Presentation Method (4)

Type of Graph or Diagram    Application
Spot Map                    Display locations of cases or occurrences.
Area Map                    Display occurrences or indicators as they correspond to geographic divisions.


Summary

Tables, charts, and graphs are effective tools for organizing, summarizing, and communicating data

In order to effectively communicate data, the correct presentation method must be selected