SJTU CMGPD 2012 Methodological Lecture Day 3 Position and Status Variables

Upload: whitney-blankenship

Post on 03-Jan-2016


Page 1: SJTU CMGPD 2012 Methodological Lecture Day 3 Position and Status Variables

SJTU CMGPD 2012 Methodological Lecture

Day 3

Position and Status Variables


Variables for position

• The basic and analytic files include a variety of indicator variables for whether a male holds a position

• These are based on the statuses recorded in the registers
  – File with hanyu pinyin for raw occupations has been released
    • DS 6
  – Occupations with original Chinese characters are released as PDF
    • Turned out to be difficult to include Chinese characters in the released data


Variables for position

• In the original data, entries included the official positions held by males.

• Coders assigned a numeric code to each new position, and entered the code into the dataset.
  – Codes started again for each new dataset

• Transcribed the original Chinese into a codebook

• Can use DATASET and POSITION_CODE to look up original Chinese in the appendix to the Analytic release codebook

• DS 6 allows merging of hanyu pinyin for code, if you want to create your own position variables from the originals.
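A minimal sketch of the merge described above, assuming the lookup is keyed on DATASET and POSITION_CODE as the slide states. The slides do not give the DS 6 file name, so a temporary file stands in for it here, and all rows and pinyin values are made up for illustration.

```stata
* Toy stand-in for the DS 6 lookup file (made-up rows, not real CMGPD values)
clear
input DATASET POSITION_CODE str20 POSITION_PINYIN
1 1 "mu jiang"
1 2 "bi tie shi"
2 1 "zhi shi ren"
end
tempfile lookup
save `lookup'

* Toy stand-in for the main file: one row per person-register observation
clear
input PERSON_ID DATASET POSITION_CODE
101 1 1
102 1 2
103 2 1
104 2 1
end

* Many observations can share one lookup row, hence m:1.
* Codes restart in each dataset, so DATASET must be part of the key.
merge m:1 DATASET POSITION_CODE using `lookup', keep(master match) nogenerate
list
```

Because codes start again for each dataset, code 1 resolves to a different position in dataset 1 and dataset 2 in this toy example. Merging on POSITION_CODE alone would mix them up.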


Position variables

• We have provided a variety of flag variables identifying different kinds of position

• We have a separate file that, for each combination of dataset and numeric position code, specifies the hanyu pinyin and Chinese characters.

• This file provides flag and other variables describing characteristics of positions.

• These flags are merged back into the main file to provide variables for analysis.


Created Position Variables

• HAS_POSITION
  – Any salaried official position or purchased title
  – Doesn’t include miding, piding, etc. Those were statuses, not salaried official positions

• ESTIMATED_INCOME
  – Imputed income based on stipends associated with the position(s) held by an individual

• RANK
  – Bureaucratic rank, based on specification of pin in the position


Position variables

• BI_TIE_SHI, ZHI_SHI_REN, and flags for specific positions

• JUAN, DING_DAI etc. for presence of modifiers

• EXAMINATION for any examination-related title

• NO_STATUS indicates that no status at all was recorded for a male, even though we would have expected one.


Name variables

• HAS_SURNAME

• DIMINUTIVE_NAME

• RUSTIC_NAME

• NON_HAN_NAME

• NUMBER_NAME


Creating New Variables

• DS-6 contains pinyin for positions

• DATASET and POSITION_CODE are the basis of a merge back to the data files

• POSITION_PINYIN is the ‘raw’ position, as transcribed by the coders

• POSITION_CORE is a stripped down version that includes modifiers

• Chinese characters are in an appendix to the Analytic File codebook


Creating new variables

• Stata lets you search strings for particular values, and return an indicator if a string is found.

• Can use this for occupations of special interest

• For example,
  – generate artisan = index(POSITION_PINYIN,"jiang") > 0
  – generate juanna = index(POSITION_PINYIN,"juan na") > 0

• Can code positions manually using Chinese characters in the appendix of the Analytic File codebook
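The string search above can be tried on a small toy dataset. This is a sketch under stated assumptions: the four pinyin strings are made up for illustration and are not taken from the CMGPD files. It uses strpos(), the name under which recent Stata releases document the function the slide calls index().

```stata
* Made-up pinyin strings, for illustration only
clear
input str30 POSITION_PINYIN
"mu jiang"
"jiang jun"
"juan na jian sheng"
"bi tie shi"
end

* strpos() returns the starting position of the substring, or 0 if absent
generate artisan = strpos(POSITION_PINYIN, "jiang") > 0
generate juanna  = strpos(POSITION_PINYIN, "juan na") > 0
list
```

In this toy data "jiang jun" is also flagged as artisan, because the search matches any occurrence of the letters and pinyin does not distinguish between characters that share a romanization. Flags built this way are worth checking against the Chinese characters in the codebook appendix.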


Studying attainment

• We have mainly used event-history
  – Determinants of chances of attaining position by next register
  – Allows for consideration of time-varying characteristics
    • Characteristics of kin

• An alternative would be to look at determinants of attaining a position by a specific age, with one observation per person
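The one-observation-per-person alternative could be set up roughly as follows. This is only a sketch: the cutoff of 30 sui, the variable name position_by_30, and the toy records are all assumptions made for illustration. Anyone not observed holding a position by the cutoff is coded 0, including people who drop out of observation earlier, which a real analysis would need to handle.

```stata
* Toy person-register records (made-up values)
clear
input PERSON_ID AGE_IN_SUI HAS_POSITION
1 25 0
1 28 1
1 31 1
2 27 0
2 30 0
2 33 1
end

* 1 if the person is ever observed with a position at or before the cutoff age
bysort PERSON_ID: egen position_by_30 = max(HAS_POSITION == 1 & AGE_IN_SUI <= 30)

* Reduce to one record per person (the earliest one)
bysort PERSON_ID (AGE_IN_SUI): keep if _n == 1
list PERSON_ID position_by_30
```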


Creating variables to identify attainment of position by next register

* At risk: selected on SEX, PRESENT and NEXT_3, and not currently holding a position
generate at_risk_position = SEX == 2 & PRESENT & NEXT_3 & HAS_POSITION == 0

* Holds a position in the person's next record
bysort PERSON_ID (YEAR): generate next_position = at_risk_position & HAS_POSITION[_n+1]

* Totals and proportion by age
bysort AGE_IN_SUI: egen total_at_risk_position = total(at_risk_position)
bysort AGE_IN_SUI: egen total_next_position = total(next_position)
generate p_next_position = total_next_position/total_at_risk_position

* Plot one point per age
bysort AGE_IN_SUI: generate first_in_age = _n == 1
twoway line p_next_position AGE_IN_SUI if AGE_IN_SUI >= 1 & AGE_IN_SUI <= 80 & first_in_age, ///
    ytitle("Proportion attaining position by next register") scheme(s1mono)
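One detail of the [_n+1] lookup is worth seeing on toy data. On a person's last record, HAS_POSITION[_n+1] is missing, and Stata treats missing as true in a logical expression. In the code above the NEXT_3 condition presumably ensures a next record exists, so this may never arise. The sketch below uses made-up records and hypothetical variable names (next_as_written, next_strict) only to show what an explicit == 1 comparison changes.

```stata
* Made-up records; person 2 has no following record
clear
input PERSON_ID YEAR HAS_POSITION at_risk_position
1 1800 0 1
1 1803 1 0
2 1800 0 1
end

* Same form as the lecture code: a missing next value counts as true
bysort PERSON_ID (YEAR): generate next_as_written = at_risk_position & HAS_POSITION[_n+1]

* Explicit comparison: a missing next value counts as false
bysort PERSON_ID (YEAR): generate next_strict = at_risk_position & (HAS_POSITION[_n+1] == 1)
list
```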


bysort

• bysort groups the records in the dataset according to the values of the specified variables.

• Each set of records defined by a unique value of the specified variables is treated as a distinct block of records when the command is executed.

• If a variable is in parentheses, the data is sorted on that variable, but not divided according to the unique values of that variable.

• [ ] allows access to values from other observations in the same block. [1] says to draw the value of a variable from the first record in the block, [_N] from the last record, [_n+1] from the next record, and so forth

• _n refers to the location of the current record within the block


• Create a variable with the record number within x:
  – bysort x (y): generate a = _n

• Create a flag identifying the first record within x:
  – bysort x (y): generate b = _n == 1

• Create a flag identifying the last record within x:
  – bysort x (y): generate c = _N == _n

• Create a variable with the total number of records with that unique value of x:
  – bysort x (y): generate d = _N

• Create a variable with the y from the next record within x:
  – bysort x (y): generate e = y[_n+1]

x    y
1    3
1    7
1    8
1   12
2   15
2   21
2   22
2   -5
3  -10
3   10
4    8
4    2


Results

x    y   a   b   c   d    e
1    3   1   1   0   4    7
1    7   2   0   0   4    8
1    8   3   0   0   4   12
1   12   4   0   1   4    .
2   -5   1   1   0   4   15
2   15   2   0   0   4   21
2   21   3   0   0   4   22
2   22   4   0   1   4    .
3  -10   1   1   0   2   10
3   10   2   0   1   2    .
4    2   1   1   0   2    8
4    8   2   0   1   2    .
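The results can be reproduced by typing in the twelve x, y pairs from the example and running the five bysort commands. The data and commands are those of the lecture. Only the list options at the end are added here for display.

```stata
* The example data, in the original (unsorted) order
clear
input x y
1 3
1 7
1 8
1 12
2 15
2 21
2 22
2 -5
3 -10
3 10
4 8
4 2
end

bysort x (y): generate a = _n          // record number within x
bysort x (y): generate b = _n == 1     // first record within x
bysort x (y): generate c = _N == _n    // last record within x
bysort x (y): generate d = _N          // number of records within x
bysort x (y): generate e = y[_n+1]     // y from the next record within x

list, sepby(x) noobs
```

On the last record of each block, e is missing, which Stata displays as a period.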


[Figure: line plot of the proportion attaining position by next register (y-axis, 0 to .08) against Age in Sui (x-axis, 0 to 80)]