EAS 293 Data Library, Rutherford North 1st Floor Chuck Humphrey Data Library October 14, 2008

Post on 22-Dec-2015


Page 1:

EAS 293

Data Library, Rutherford North 1st Floor

Chuck Humphrey

Data Library

October 14, 2008

Page 2:

Outline

Statistics and data
• Distinction between statistics and data
• Statistics are derived from data
• Statistics are about definitions

Census characteristics

E-STAT access
• Online demonstration of access to CANSIM and the 2006 Census of the Population

Page 3:

Numeric Information

Statistics
• numeric facts/figures
• created from data, i.e., already processed
• presentation-ready

Data
• numeric files
• created and organized for analysis/processing
• requires processing
• not display-ready

Page 4:

Numeric Information

Six dimensions or variables in this table. The cells in the table are the number of estimated smokers.

Geography: Region

Time: Periods

Unit of Observation Attributes: Smokers, Education, Age, Sex

Page 5:

Statistics are about definitions!

Statistics are dependent on definitions. You may think of statistics as numbers, but the numbers represent measurements or observations based on specific definitions.

Tables are structured around geography, time and content based on attributes of the unit of observation. These properties all need definitions.

Page 6:

Statistics involve classifications!

Classifications

Sex: Total, Male, Female

Periods: 1994-1995, 1996-1997

Page 7:

Statistics involve classifications!

Some classifications are based on standards while others are based on convention or practice.

For example, Standard Geography classifications

Page 8:

WHERE ARE THE DATA!

Page 9:

Microdata

Page 10:

Stories are told through statistics

The National Population Health Survey had over 80,000 respondents in its 1996-97 sample, and the Canadian Community Health Survey had over 130,000 respondents in 2005. How do we tell the stories about these people?

We use statistics to create summaries of these life experiences.

Data enable us to construct the tables or analyses to tell these summarized stories.
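As a minimal sketch of how data become a table of statistics, the Python below counts smokers by period and sex from a handful of respondent records. The records, field names and the `smokers_table` function are all hypothetical and invented for illustration. Published estimates would also apply survey weights, which this sketch leaves out.

```python
from collections import Counter

# Hypothetical microdata: one record per survey respondent (values invented).
respondents = [
    {"sex": "Male", "period": "1994-1995", "smoker": True},
    {"sex": "Female", "period": "1994-1995", "smoker": False},
    {"sex": "Female", "period": "1994-1995", "smoker": True},
    {"sex": "Male", "period": "1996-1997", "smoker": True},
    {"sex": "Male", "period": "1996-1997", "smoker": False},
    {"sex": "Female", "period": "1996-1997", "smoker": True},
]


def smokers_table(records):
    """Count smokers in each (period, sex) cell, plus a Total for each period."""
    cells = Counter()
    for record in records:
        if record["smoker"]:
            cells[(record["period"], record["sex"])] += 1
            cells[(record["period"], "Total")] += 1
    return cells


table = smokers_table(respondents)
for (period, sex), count in sorted(table.items()):
    print(period, sex, count)
```

Each printed line is one cell of the table: the individual records disappear and only the summary by period and sex remains.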

Page 11:

Methods producing data

Observational Methods
• Focus is on developing observational instruments to collect data
• Correlation
• Replicate the analysis (same data or similar)
• Statistics summarize observations

Experimental Methods
• Focus is on manipulating causal agents to measure change in a response agent
• Causation
• Replicate the experiment
• Statistics summarize experiment results

Computational Methods
• Focus is on modeling phenomena through mathematical equations
• Prediction
• Replicate the simulation
• Statistics summarize simulation results

Page 12:

Summary

Statistics are derived from observational, experimental or simulated data.

A table is a format for displaying statistics and presents a summary or one view of the data.

Tables are structured around geography, time and attributes of the unit of observation.

Statistics are dependent on definitions and classifications.

Statistics summarize individual stories into common or general stories.

Page 13:

The Census

The Census is one of the most important sources of statistical information about Canada. It is the largest survey conducted in Canada and, consequently, is the primary source for small area statistics.

To use data from the Census, you must know:
• The aggregate characteristics from the Census available for the various spatial units;
• The variety of spatial units used to disseminate Census results; and
• The codes used to represent the various Census spatial units.

Page 14:

Census of Population

Two forms are used to collect the Census: 2A, which goes to 80% of the households, and 2B, which goes to the other 20%.

In 2006, the 2A form contained 8 questions while the 2B form had these 8 and 53 additional questions.

Long history of specific questions (see the Census Handbook).

You need to understand the content of the Census to know what statistics are possible from the Census.

Page 16:

Microdata and aggregate data

Microdata
• from observational methods
• created from the respondents in a survey

Aggregate Data
• statistics organized in a data file structure
• derived from microdata sources
• used in GIS & time series analysis

Page 17:

Spatial Unit

Geo-code

Page 18:

Geo-referenced data

The unit of analysis makes up the rows in the data file and is the object being described by the other variables in the file. The values for this variable are geo-codes for Census tracts.

Page 19:

Geo-referenced data

This case in the data file represents Census Tract 0023.00, which was shown in the image two slides earlier.
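A geo-referenced data file of this kind can be sketched as a list of rows, one per Census tract, with the geo-code identifying each case. In the Python below, the field names, the neighbouring tract codes, all attribute values and the `index_by_geocode` helper are hypothetical. Only the tract name 0023.00 comes from the slide.

```python
# Hypothetical geo-referenced data file: one row (case) per Census tract.
# All attribute values are invented; only tract 0023.00 is named in the slides.
rows = [
    {"ct": "0022.00", "population": 4100, "dwellings": 1800},
    {"ct": "0023.00", "population": 3650, "dwellings": 1500},
    {"ct": "0024.00", "population": 5200, "dwellings": 2100},
]


def index_by_geocode(records, key="ct"):
    """Build a lookup from geo-code to row, rejecting duplicate geo-codes."""
    index = {}
    for row in records:
        code = row[key]
        if code in index:
            raise ValueError(f"duplicate geo-code: {code}")
        index[code] = row
    return index


tracts = index_by_geocode(rows)
print(tracts["0023.00"])
```

Geo-codes are kept as strings so that leading zeros and the decimal suffix survive. The same key is what would link a row to its polygon in a GIS boundary file.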

Page 20:

The variety of spatial units

Statistics Canada groups the variety of spatial units associated with the Census into two groups: administrative areas and statistical areas.

Source for the graphics: Illustrated Glossary, 2006 Census Geography, Statistics Canada

Page 21:

Administrative areas

Source: Illustrated Glossary, 2006 Census Geography, Statistics Canada

Page 22:

Statistical areas

Page 23:
Page 24:

Census geo-codes

Statistics Canada has two categories of geo-code systems:
• Standard Geographic Classification (SGC)
• Other geographic entities

Source for the graphic: Illustrated Glossary, 2006 Census Geography, Statistics Canada

Page 25:

Standard geographic classification

Source: Illustrated Glossary, 2006 Census Geography, Statistics Canada

Page 26:
Page 27:

Standard geographic classification, 2006

The link to Definitions, data sources and methods on the main page of the Statistics Canada website provides a link to Standard Classifications, which includes Geography.
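The slides do not spell out the layout of an SGC code. As a hedged sketch, the Python below assumes the seven-digit SGC structure: a two-digit province or territory code, a two-digit census division code and a three-digit census subdivision code. The `SgcCode` type and `parse_sgc` function are hypothetical names, and the sample code 4811061 is assumed here to be the city of Edmonton. Check both against the Standard Classifications pages before relying on them.

```python
from typing import NamedTuple


class SgcCode(NamedTuple):
    province: str            # 2 digits (assumed layout)
    census_division: str     # 2 digits (assumed layout)
    census_subdivision: str  # 3 digits (assumed layout)


def parse_sgc(code: str) -> SgcCode:
    """Split a seven-digit SGC code into its three hierarchical parts."""
    if len(code) != 7 or not (code.isascii() and code.isdigit()):
        raise ValueError(f"expected seven digits, got {code!r}")
    return SgcCode(code[:2], code[2:4], code[4:])


# Sample code assumed to identify the city of Edmonton.
print(parse_sgc("4811061"))
```

Because the code is hierarchical, the leading digits alone identify the province and the census division, which helps when grouping small areas into larger ones.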

Page 28:

Other geographic entities

Census Metropolitan Areas

Source for the graphic: Illustrated Glossary, 2006 Census Geography, Statistics Canada

Metropolitan Areas 2006 Map of Edmonton CMA

Page 29:

CANSIM

CANSIM is a very large database containing socio-economic statistics for Canada. There are currently over 38 million time series organized in approximately 2,800 tables.

The statistics in CANSIM come from surveys (e.g., the Labour Force Survey), administrative data (e.g., crime and justice) and simulations or models (e.g., population projections).

Geography, content and time are basic to retrieving time series from CANSIM.
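To illustrate why geography, content and time are the three keys for a retrieval, the sketch below stores a few time series in a Python dictionary and selects one by all three. This is not the CANSIM or E-STAT interface. The `retrieve` function, the series labels and every value are hypothetical placeholders.

```python
# Hypothetical store: each time series is identified by (geography, content),
# and holds one placeholder value per year. No value here is real data.
series = {
    ("Alberta", "Example indicator A"): {2004: 10.0, 2005: 11.0, 2006: 12.0},
    ("Alberta", "Example indicator B"): {2004: 20.0, 2005: 21.0, 2006: 22.0},
    ("Canada", "Example indicator A"): {2004: 30.0, 2005: 31.0, 2006: 32.0},
}


def retrieve(geography, content, start_year, end_year):
    """Return the observations of one series within an inclusive year range."""
    try:
        observations = series[(geography, content)]
    except KeyError:
        raise KeyError(f"no series for {geography!r} / {content!r}") from None
    return {
        year: value
        for year, value in sorted(observations.items())
        if start_year <= year <= end_year
    }


print(retrieve("Alberta", "Example indicator A", 2005, 2006))
```

Leaving any of the three keys unspecified would match many series at once, which is why a retrieval normally starts by fixing the geography and the content and then narrowing the time period.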

Page 30:

E-STAT

E-STAT is a free portal for retrieving Census results and the holdings of CANSIM, Statistics Canada’s large time series database.

You can access more Census results from the Statistics Canada website, but E-STAT provides a wider variety of output formats for Census data.

You can also access CANSIM from the Statistics Canada website, but you must pay $3.00 per time series.

Page 31:

E-STAT

E-STAT is available from the Library’s homepage: http://www.library.ualberta.ca

Go to the list of Databases for access.

Page 32:

Data Library

If you need assistance, the Data Library is located in Rutherford North on the first floor next to the main staircase.

Hours: 9:00 to noon and 1:00 to 4:30, M-F

Phone: 492-5212