Creating Dummy Variables in SPSS 20 · 2015. 2. 4.


Formerly SPSS Ireland

Creating Dummy Variables in SPSS 20

Conor McCarthy

Services Consultant


What are Dummy Variables

Also known as indicator variables

Used in techniques such as regression, which assume that the predictors' measurement level is scale

Dummy coding gets around this assumption

Each dummy variable takes a value of 0 or 1 to indicate the absence (0) or presence (1) of some categorical effect

k - 1 dummy variables are required for a variable with k categories


An Example

Suppose you have a nominal variable with more than two categories that you want to use as a predictor in a linear regression analysis, e.g. Job Category, which has three categories

You will then need to create 2 dummy variables (i.e. the number of categories – 1) and include these new dummy variables in your regression model


Considerations

Number of dummy variables – straightforward: k - 1, where k is the number of categories

Choose a reference category – this is the category against which all the other categories are compared

Often the reference category will be the first or last category


Doing this in SPSS 20

Dummy coding is built into the Logistic Regression procedures, but the dummy variables need to be created manually for Linear Regression and Discriminant Analysis

No single function is available for this

Best done using syntax


Approach 1

Using “Employee Data.sav”, located in C:\Program Files\IBM\SPSS\Statistics\20\Samples\English

For variable jobcat, create two dummy variables: jobcat1 and jobcat2

Initially set each variable to 0, then specify that jobcat1 takes a value of 1 for job category 1 and jobcat2 takes a value of 1 for job category 2

In this way category 3 is set to be the reference category
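
A minimal sketch of these steps in SPSS syntax, assuming jobcat is coded 1, 2 and 3 as in the sample file. The REGRESSION call at the end, with salary as the outcome, is an illustrative assumption and not part of the steps described above.

```spss
* Approach 1 (sketch) - start both dummy variables at 0.
COMPUTE jobcat1 = 0.
COMPUTE jobcat2 = 0.
* Switch each dummy variable to 1 for its own category.
IF (jobcat = 1) jobcat1 = 1.
IF (jobcat = 2) jobcat2 = 1.
EXECUTE.

* Illustrative use only - salary as the outcome is an assumption.
REGRESSION
  /DEPENDENT salary
  /METHOD=ENTER jobcat1 jobcat2.
```

Cases in category 3 keep a value of 0 on both variables, which is what makes category 3 the reference category.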


Approach 2

Using the VECTOR and LOOP – END LOOP commands

Use the VECTOR command to create the required number of dummy variables, i.e. 2 in this case

Use the LOOP – END LOOP structure to loop through each of the dummy variables created by the VECTOR command
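
A minimal sketch of this approach, assuming the vector is named jobcat (the name used in the COMPUTE statement quoted later in the deck), so that VECTOR creates the new variables jobcat1 and jobcat2. Leaving the format at its default is also an assumption.

```spss
* Approach 2 (sketch) - VECTOR creates jobcat1 and jobcat2 as new variables.
VECTOR jobcat(2).
* Each pass stores the result of a logical test, 1 if true and 0 if false.
LOOP #i = 1 TO 2.
  COMPUTE jobcat(#i) = (jobcat = #i).
END LOOP.
EXECUTE.
```

This form of VECTOR only creates variables that do not already exist, so the sketch should be run on a copy of the data that does not yet contain jobcat1 and jobcat2.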


This approach makes the last category the reference category, as we are only looping through categories 1 and 2 in COMPUTE jobcat(#i) = (jobcat = #i).

To make the first category the reference category, you could modify the COMPUTE statement in the syntax as follows:

COMPUTE jobcat(#i) = (jobcat = #i + 1).
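
Put together, the modified loop might look like the sketch below. It assumes a fresh copy of the data in which jobcat1 and jobcat2 do not yet exist. The VARIABLE LABELS step is an added suggestion, included because with the shifted index jobcat1 flags category 2 and jobcat2 flags category 3, so the variable names no longer match the category numbers.

```spss
* First category as reference (sketch) - the category index is shifted by one.
VECTOR jobcat(2).
LOOP #i = 1 TO 2.
  COMPUTE jobcat(#i) = (jobcat = #i + 1).
END LOOP.
EXECUTE.
* Optional labels so the shifted coding is visible in output.
VARIABLE LABELS jobcat1 'jobcat = 2'
  /jobcat2 'jobcat = 3'.
```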


Dealing with missing values

Replace the COMPUTE statements in Approach 1 with:

• IF (NOT MISSING(jobcat)) jobcat1=0.

• IF (NOT MISSING(jobcat)) jobcat2=0.

This ensures that cases with a missing jobcat value remain missing in the dummy variables

Approach 2 deals with missing values implicitly, because comparing a missing value with a category code returns a missing result


Approach 1 modified to account for missing values
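
A sketch of what the modified syntax might look like, combining the two IF statements above with the category assignments from Approach 1. This is an assumed reconstruction, and the FREQUENCIES check at the end is an added illustration.

```spss
* New numeric variables start as system-missing.
* Only cases with a valid jobcat are initialised to 0.
IF (NOT MISSING(jobcat)) jobcat1 = 0.
IF (NOT MISSING(jobcat)) jobcat2 = 0.
* Switch each dummy variable to 1 for its own category.
IF (jobcat = 1) jobcat1 = 1.
IF (jobcat = 2) jobcat2 = 1.
EXECUTE.
* Illustrative check - missing counts should match those of jobcat.
FREQUENCIES VARIABLES=jobcat jobcat1 jobcat2.
```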
