Statistics for Social Sciences I (E563)
Prof. Sudip Ranjan Basu, Ph.D
25 September 2008
Lecture 2-Sudip R. Basu 2
« A statistical tie »
Think about these bar diagrams…
Measurement in Statistics
• Concepts of measurement:
• Measurement: a very specific process of assigning a number to a variable
– Assignment by category (categorical/qualitative attributes)
– Assignment by amount
» assignment of a person to a particular category of a variable
– Validity:
• to describe the objective and accurately reflect the concept
• to measure by a particular scale or index
• likelihood that the scale is actually measuring what it is supposed to measure
– Face validity/Content validity/Criterion validity/Construct validity
– Reliability:
• to have consistency of the data collected
• free of measurement errors
– Split-half reliability/test-retest reliability
Forms of ‘variable’
• Variables: Concepts that vary, or change, from one observation to another in a sample or population
• Measurement scale differs
• Different statistical methods apply to quantitative and qualitative variables
Variable
– Quantitative: the measurement scale has numerical values that imply amounts (e.g., annual income)
– Categorical/Qualitative: the measurement scale is a set of categories that do not imply amounts (e.g., marital status)
Scales of measurement
• Quantitative variable: interval scale
– Annual income (the distance between CHF 50 and CHF 30 is CHF 20)
• Qualitative variable: unordered/nominal scale
– Primary mode of transportation (bus, tram, bicycle, walk)
• Qualitative variable: ordered/ordinal scale
– Involves a rank order or other ordering
– Political philosophy (liberal, moderate, conservative)
Quantitative aspects of ordinal data
• Interval scale:
– Class interval: an interval that indicates the space between two end points
– Quantitative
• values vary in magnitude
• Nominal scale:
– Qualitative
• categories vary in quality, not in quantity
• Ordinal scale:
– Between quantitative and qualitative
• categories vary in quality, not in quantity
– Each level has a greater or smaller magnitude than another level
– Can be given a numerical scale by assigning numerical scores to categories
– Often treated as interval rather than nominal
– Sensitivity analysis: check whether conclusions change with the choice of scores
Discrete and Continuous
• Discrete: a set of values that form separate numbers, such as 0, 1, 2, …
• Unit of measurement cannot be subdivided
» Number of siblings
» Number of visits to a physician last year
• Categorical variables: nominal or ordinal
• Quantitative variables: discrete (number of siblings) or continuous (age)
• Continuous: an infinite continuum of possible real number values
• Any real number is possible between two values
» Height
» Weight
Summarize types of variables
Describing data
• Categorical data:
– Frequency: headcounts or tallies indicating the number of cases in a particular category, or the total number of cases measured (the number of observations)
– Scores: numbers that are used to represent amounts or rankings
– Relative frequency
• The proportion (number of observations in a category divided by the total number of observations) or percentage (proportion multiplied by 100) of the observations that fall in that category
• The proportions sum to 1.00
– Frequency distribution
• A tabulation that lists the possible values of a variable, together with the number of observations at each level
– Relative frequency distribution
• A listing of possible values together with their proportions or percentages
• Quantitative data:
– Frequency distribution
• Intervals of values in frequency distributions are usually of equal width
• Mutually exclusive intervals
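As a worked illustration of the relative frequency definition, suppose a sample of n = 40 students falls into three categories with frequencies 12, 18, and 10. These counts are hypothetical and are not taken from the lecture data. The proportions then follow directly:

```latex
\[
p_k = \frac{n_k}{n}, \qquad
p_1 = \frac{12}{40} = 0.30, \quad
p_2 = \frac{18}{40} = 0.45, \quad
p_3 = \frac{10}{40} = 0.25, \qquad
\sum_k p_k = 0.30 + 0.45 + 0.25 = 1.00
\]
```

As percentages these are 30%, 45%, and 25%.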
Bar graphs
[Figure: Bar diagram of native language speakers, E563, by languages. Vertical axis: Native Language Speakers (0 to 20). Categories: Asian, EU-other, English, French, German. Bar labels in extracted order: 11, 18, 15, 20, 6. Source: Statistics Class 1, SRBasu]
Comparing groups
• Compare: Same variable and different groups
• Relative frequency distributions
• Histograms
• Stem-and-leaf plots
Population and sample distribution
• Sample distribution is a ‘blurry’ picture of the population distribution
– As the sample size increases, the sample proportion in any interval gets closer to the true population proportion
• Sample distribution → population distribution as the sample size grows
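The ‘blurry picture’ idea can be tried out with a small simulation. The sketch below is an illustration only and is not part of the lecture material: the population proportion of 0.30, the seed, and the sample sizes are all assumptions chosen for the example. It draws 10,000 simulated 0/1 observations and compares the sample proportion in the first 100 draws with the proportion in all 10,000.

```stata
* Sketch only: the population proportion 0.30 and the seed are assumptions
clear
set seed 563
set obs 10000
* x equals 1 with probability 0.30 and 0 otherwise
generate byte x = runiform() < 0.30
* The mean of x is the sample proportion: first with n = 100, then with n = 10000
summarize x in 1/100
summarize x
```

The second proportion is typically closer to 0.30 than the first, although any single run can differ.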
Shape of a distribution
• Shapes of distributions differ
– Symmetric
– Skewed
SESSION 2 of Lecture 2
Working with STATA
stata@stata.com
http://www.stata.com
Getting started with STATA
• Four windows open automatically after you click the STATA icon:
• The most visible window is the Results Window, which shows the results of commands you have typed in the Command Window.
• The Command Window, below the Results Window, is where you type all your commands.
• The Review Window lists all commands that have been entered in the Command Window. When you click on a command in the Review Window, it is pasted into the Command Window.
• The Variables Window lists all variables in the working file. When you click on a variable, its name appears in the Command Window.
STATA window
Simple commands
• The Data Editor allows you to enter, view, or edit your working data file. Caution: this window must be closed before you can run commands in STATA.
• The do-file editor allows you to write, edit, and save STATA commands, and to run them directly from the editor. These files are called do-files because they have the file extension .do.
• Note: STATA treats lines that begin with an asterisk (*) and any text between /* and */ as comments.
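The two comment styles can be seen together in a short do-file. This is a minimal sketch: the auto example dataset that ships with STATA is used purely for illustration and is not one of the course datasets.

```stata
* A line that begins with an asterisk is a comment
/* Text between these delimiters
   is also a comment, and may span several lines */
sysuse auto, clear    /* load the auto example dataset that ships with STATA */
describe              /* list the variables in the dataset */
```

Saved under a name such as example.do (a hypothetical file name), the file can be run from the do-file editor.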
Save-Close files
• Open, save, and close data files using the icons at the top of the screen, the “File” menu, or commands in the Command Window.
• A STATA dataset is saved in the .dta format.
• You can use a separate program called Stat/Transfer to translate a dataset from its current format into STATA format.
• Researchers prefer this program for large datasets. It retains any variable or value labels from the original file.
Help-Search
• Memory allows you to handle large datasets. For example, you can set a memory size of 20m with the following command in the Command Window (Stata 12 and later manage memory automatically, so the command is not needed there):
.set memory 20m
• The Help/Search facilities in STATA allow you to look up any command. You can type help followed by a command name in the Command Window, or use the drop-down Help menu, which opens a separate window. For example, if you want help with describe, type:
.help describe
• If you do not know the STATA command name, use the Search facility from the drop-down Help menu, or type findit followed by a keyword for more information.
• STATA uses a simple language syntax. Almost all commands follow the structure:
.command variable (variable variable…), options
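To make the command structure concrete, here is a minimal sketch using summarize on the auto example dataset that ships with STATA. The dataset and variables are chosen only for illustration and do not come from the course files.

```stata
* Structure: command  variable(s), options
sysuse auto, clear
* One variable, no options
summarize price
* Two variables plus the detail option, which adds percentiles, skewness, and kurtosis
summarize price mpg, detail
```

Typing help summarize lists the other options this command accepts.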
Creating a new dataset
• The easiest way to create a dataset is to type the values for each variable in the Data Editor, in columns that STATA automatically calls var1, var2, etc. Thus, var1 contains the names of students, var2 their statistics competency, and so forth.
• Rename and label a variable:
.rename var1 students
.label variable students "Students in Statistics, 2008-2009"
• After typing in the information, close the window and save the data, say as stat2.dta:
.save stat2
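The same steps can also be written as commands instead of typing into the Data Editor. The sketch below assumes three made-up students and scores, and the name stat for the second variable is likewise an assumption for illustration.

```stata
* Sketch only: the three students and their scores are made up for illustration
clear
input str10 var1 byte var2
"Anna"   3
"Ben"    2
"Chloe"  1
end
rename var1 students
rename var2 stat
label variable students "Students in Statistics, 2008-2009"
label variable stat "Statistics competency"
* Writes stat2.dta to the current working directory, overwriting any earlier copy
save stat2, replace
```

The replace option is added so that the sketch can be rerun without an error when stat2.dta already exists.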
Working with Sample
• Specifying subsets of the data: you can restrict a command to a subset of the data by adding an in or if qualifier. For example, to use only the 1st through 25th observations, type
.list in 1/25
.sort origin
.list origin programme in 1/25
• The if qualifier also has broad applications, but it selects observations based on specific variable values, such as
.summarize if stat==1
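The difference between the two qualifiers can be sketched on the auto example dataset that ships with STATA, used here only for illustration in place of the course data:

```stata
sysuse auto, clear
* "in" selects by position: observations 1 through 5 in the current sort order
list make price in 1/5
* Sorting first changes which observations "in 1/5" refers to
sort price
list make price in 1/5
* "if" selects by value: summarize price for foreign cars only (foreign equals 1)
summarize price if foreign == 1
```

Because in depends on the current order of the data, sorting before using it, as the lecture example does with origin, makes the selection predictable.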
Describing data
• Frequency tables and two-way cross-tabulations: you can tabulate categorical variables. Use the dataset stat to tabulate the categorical variable programme:
.tabulate programme
• For a cross-tabulation of programme by stat, type
.tabulate programme stat
• To get column percentages, type
.tabulate programme stat, column
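A runnable version of these three commands, sketched on the auto example dataset that ships with STATA. The variables rep78 and foreign stand in for programme and stat and are not from the course data.

```stata
sysuse auto, clear
* One-way frequency table: counts, percentages, and cumulative percentages
tabulate rep78
* Two-way cross-tabulation of repair record by car origin
tabulate rep78 foreign
* Same table with column percentages added below each count
tabulate rep78 foreign, column
```

Observations with a missing value on either variable are left out of the table unless the missing option is added.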
Data tabulation
• Multiple tables and multi-way cross-tabulations: to get a one-way table for each of several variables, type
.tab1 origin programme stat
.tab1 programme-education
• To get multiple two-way tables, that is, cross-tabulations of every two-way combination of the listed variables, type
.tab2 origin programme stat
• To produce multi-way tables, if you do not need percentages or statistical tests, type
.table programme, contents(freq)
• To produce a two-way frequency table or cross-tabulation, type
.table origin programme, contents(freq)
• To produce a more complicated table, type
.table origin programme, contents(freq) by(stat)
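The commands above can be tried on the auto example dataset that ships with STATA, used here only as a stand-in for the course data. One assumption to flag: the contents() option of table is the syntax of older STATA releases, and newer releases changed the table command, so the sketch runs that line under version control.

```stata
sysuse auto, clear
* tab1: a separate one-way table for each listed variable
tab1 rep78 foreign
* tab2: a two-way table for every pair of listed variables
tab2 rep78 foreign
* table with contents(), run with the older syntax via the version prefix
version 10: table rep78 foreign, contents(freq)
```

If the last line fails on a given installation, help table shows the syntax that release expects.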
GRAPHS with STATA
• To draw bar charts, type:
.graph bar stat, over(programme) blabel(bar) bar(1, bcolor(gs10))
.graph bar stat, over(programme) legend(label(1 "Frequency")) ytitle("Native Language Speakers") title("Bar diagram of native language speakers, E563") subtitle("by languages") note("Source: Statistics Class 1, SRBasu")
.graph bar stat word, over(programme) blabel(bar) bar(1, bcolor(gs10)) bar(2, bcolor(gs7))
• To draw horizontal bar charts, type:
.graph hbar stat, over(programme) blabel(bar) bar(1, bcolor(gs10))
.graph hbar stat word, over(programme) blabel(bar)
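A short sketch of the same chart types on the auto example dataset that ships with STATA; the variables and titles are illustrative assumptions, not the lecture's data. By default graph bar plots the mean of the listed variable within each over() group, so the last command shows one way to plot frequencies instead.

```stata
sysuse auto, clear
* Vertical bars: mean mpg for each category of foreign, with the value printed on each bar
graph bar (mean) mpg, over(foreign) blabel(bar) ///
    ytitle("Mean mileage (mpg)") title("Mean mileage by car origin")
* Horizontal version of the same chart
graph hbar (mean) mpg, over(foreign) blabel(bar)
* Bar heights as frequencies: count the nonmissing values of mpg in each category
graph bar (count) mpg, over(foreign) blabel(bar) ytitle("Number of cars")
```

The /// at the end of a line continues the command on the next line and works in do-files, not in the Command Window.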
Working with datasets
See Week 2 web-course material
1) Assignment_1 Datasets:
2) Week2_Students Profile
3) Week2_World Socio-economic data
Week 3 (2 October)
• Descriptive Statistics
» Measures of Central Tendency and Dispersion, Moments, Skewness, and Kurtosis
• Readings:
» AF-Chapter 3 (p.39-60)
» MS-Chapter 4, MS-Chapter 5
• Assignment: Assignment 2
» Each student should turn in his/her own paper in hard copy to the teaching assistant at Rigot Office No. 31 or in class on Thursday 9 October (Week 4).
Note