spss1 finding and managing data for the social sciences

Upload: gura1999

Post on 05-Apr-2018


  • 8/2/2019 SPSS1 Finding and Managing Data for the Social Sciences


    SPSS1 Finding and Managing Data for the Social Sciences

    Tutorial Goal: Introduction to the fundamentals of searching and managing data in SPSS for statistical analysis. Participants learn how to find data on the Internet, understand variables, and manipulate data. The workshop covers descriptive statistics, charts, and tables. Different data formats are discussed.

    Let's start with the basics of data. In statistics, a variable has one of four levels of measurement. All your analyses follow from which level your variable is. The levels are NOIR:

    (N)ominal
    (O)rdinal
    (I)nterval
    (R)atio

    Let's talk about each one.

    Nominal means that the number simply represents a category of objects. There is no measured difference among the objects or people. Some examples are giving states numbers (N.Y. 1, Connecticut 2, R.I. 3), assigning a number for gender (male 1, female 2), or designating college major (History 1, Business 2, Sociology 3). You are just assigning a number to something.

    Ordinal means the larger number for the object is truly larger in some sort of amount. This typically means rank. Some examples are 1st, 2nd, and 3rd places in a contest, or preferences for different movies. However, there is no exactly measured difference among the objects. We don't know definitively how much larger or better 1st is compared to 2nd. We just know 1st is somehow larger than 2nd.

    Interval means, like Ordinal, that there is a rank for the objects or people, but there is also a measurement for the ranking. Some examples are degrees Celsius or Fahrenheit. We know that the difference between 98 and 99 degrees is a difference in the amount of mercury in a thermometer. Also, the difference between 42 and 43 degrees is the same amount as the difference between 98 and 99. However, there is no true zero, which would stand for a complete lack of the quality being measured. 0 degrees does not mean there is no mercury, for example.

    Ratio means, like Interval, that there is a measurement for the ranking, but there is also a true zero. A true zero means that there is a complete lack of the quality being measured. An example is income, where the difference between $10,000 and $11,000 is known and zero means a complete lack of income.

    These levels are very important, and we will be discussing them more as we go on. Nominal and Ordinal are called Nonparametric Data, and Interval and Ratio are called Parametric Data. Basically, if the data can have a mean, they are parametric. The statistical analyses that you can use depend on what level your data are.

    Now, let's get some data online. When you work with data, you need to remember that they come in different digital formats. SPSS has its own format, but there are others out there, such as Excel and SAS. You need to know how to import data sets from a different format into SPSS. The data come in big spreadsheets, where variables run left to right and cases run up and down. A variable is something being measured, such as income, time, or height. A case is the observation, such as a person, country, or company. Sometimes, when the names are too long, the variables are given a code as a label, and sometimes these codes are indecipherable. In order to read the variables, you need a codebook to explain what each variable means.

    [Figure: a data spreadsheet, with Variables running across the columns and Cases running down the rows]

    First, be organized. You might download dozens of files, and you need to keep everything in order. So, let's create one folder to store everything that you download.

    1. Go to your storage device. In my case, I'm going to use my C drive. Right-click on the screen, and in the pop-up menu, select New and left-click on Folder. Call the new folder DATA. Close the storage device and return to the desktop.

    STOP! Keep track of how much space your files take up and how much free space your storage device has. If you fill up your storage device, computations won't work.

    Ok, let's start a project. Let's say we're interested in survey data about American opinions on obesity: for example, how many calories people consume, whether they diet or not, etc. We can find these data by using the Social Sciences Data Resource Page. Please use Internet Explorer as your browser.


    1. In the Library Resource Guides (http://dl.lib.brown.edu/gateway), under Special Subject Guides on the right, go to Statistics and Data.

    2. Under Data By Subject, you can search for data by subject or university department. We want social data, so start looking in Sociology.

    Remember, this resource page is constantly being updated. If you can't find what you're looking for, you can contact me, Tom, at [email protected] or call 863-7978.


    3. On the Sociology page, we are interested in the Inter-university Consortium for Political and Social Research (ICPSR), which has a huge archive of 6,000 studies covering many subjects, such as urban studies, the environment, social indicators, etc.

    4. On the ICPSR homepage, choose the Advanced Search option under Search. This option gives you more possibilities for your search and is a good place to start.


    5. On the Search Full Data Holdings page, you can search just like in a library catalog, where you search for words in data records. In the dropdown menu, you can choose which field you're interested in. Let's just leave the default, any field.

    6. What we want is survey data on obesity from the United States. In the first empty field, type in obesity. In the second empty field, type in United States. Click Search.


    7. In the Search Results page, you can see what data match your search criteria. The number of results found is shown at the top. The results are sorted by relevance according to our search criteria, but you can rearrange the list by Sort by Title or Score by Date, which weights the relevance by date.

    8. If you look closer at the citation, you have a number of options. Let's look at the first one, ABC News/Time Magazine Obesity Poll. Click on description, which contains the metadata for this data set.

    Metadata is data about data. So, it's a detailed description of the data, like a catalog record, and should accompany the data.

    Here you have a record of this data set.


    If you scroll down, you can see 1) a detailed summary of the file and 2) subject terms, which list the terms that this file is categorized under. If you're not satisfied with this data set, check the other data sets that share similar subject terms for comparable files.

    9. 1) Scroll down to Access and Availability and 2) click on summary of holdings. Here the metadata say what format the data are in.


    10. There can be many different formats, but this file is an SPSS portable file. This file is already prepared to be directly imported into SPSS. So, all we need to do is download it, unzip it, and open it in SPSS.

    11. Click back to the Description & Citation page. Scroll to the top of the page and click on the Download tab.

    12. On the Log In page, if you're a New User, click on Create Account and follow the procedure. If not, log in as usual.


    13. After creating an account, you will be prompted to log on through your email. When you do, you might need to search for the data again. You can search for Study No. 4040 and go to the Download page. Here at Step 1, you can select the files you want. Today leave it as All Files, which means we get the SPSS Portable file and the Documentation. Leave Step 2 at All datasets. In Step 3, Add to data cart, click on Add to Data Cart, which prepares the data for download.

    14. You can ignore Step 4, which simply reviews what we have already done. In Step 5, Download cart contents, you see that we added 2 files. Click on Download Data Cart.


    15. You must click on the I Agree button to accept the Terms of Use agreement.

    16. In the File Download dialog, you are prompted to save your file. Click Save. Remember, there are newer versions of zip, so your dialogs might not look the same. Close the browser when the file is downloaded.

    17. In the Save As window, navigate to our DATA folder and click Save.


    18. In the Download complete dialog, click Open Folder. (If the Save As window closes and it doesn't give you a new window, simply go to your DATA folder on your storage device.)

    19. You now have the DATA folder window. In it, you see the folder with a number. In this example, 5165062 was randomly assigned. This folder has been zipped, which means it has been compressed to squeeze a large amount of data into a smaller file. So, we have to unzip it to get to our data. Right-click on the folder and then left-click on Extract All.

    20. You now get an Extraction Wizard. Click Next.

    Wizard: Instructional help in an application or system development environment that guides the user through a series of multiple choice questions to accomplish a task. (http://www.techweb.com/encyclopedia/)


    21. In the next window, Select a Destination, you can rename the folder if you wish. Leave it at 5165062 and click Next.

    22. In the Extraction Complete window, click Finish.

    23. You now have the unzipped folder for our obesity survey data. There are many files within this folder, so let's navigate to the one we need. If you click the folder, you get the 1) ICPSR_04040 folder. Click on the ICPSR_04040 folder, and you have the 2) DS0001 folder. This contains our data, so click on it.


    24. In the DS0001 folder, you have two files, the 04040-0001-Codebook and the 04040-0001-Data.por file.

    First, you have the 04040-0001-Codebook. Codebooks are extremely important for data sets because they contain the metadata, or the data about data. They sometimes have the instrument, or the original survey document, and an explanation of each variable. Double left-click on the Codebook to open it. You see on the right the Table of Contents (TOC), which should explain every detail of the data. In the TOC, click on Poll

    Here you see the actual instrument that was used to collect the data. Spend a minute looking at some of this survey to see how our data were generated. Then you can close the codebook.


    For SPSS, we need the 04040-0001-Data.por file. This contains our data. It is a .por file, the SPSS portable format, which is used to transfer data files between different computers and versions of SPSS. You can now close this DATA folder and the ICPSR window.

    Now that we're organized for our data, let's start SPSS.

    1. Left-click on Start from your Desktop and move your cursor over All Programs, which gives you a menu of all the programs.

    2. Put your cursor over SPSS Inc, which brings up a pop-up menu, and then move the cursor over the SPSS 16.0 pop-up. To start the program, left-click on SPSS 17.0 (SPSS is available on the CIS computers at the Rockefeller Library under the Computational menu).


    3. You now receive the SPSS Data Editor window. Here you display your data and your variable information. In the SPSS for Windows dialog, Open an existing data source is selected. Make sure More Files is highlighted and click OK.

    4. In the Open File dialog, navigate to where you are keeping the obesity survey in the DS0001 folder. The dialog is set by default to .sav, which is another type of file. So, in Files of type, select All Files. Here you see the 04040-0001-Data.por file. Double-click on the .por file to open it.

    Files of type is a good place to look if you have a file that you're not sure you can import. If the file type is listed here, you can convert it into an SPSS file.


    5. Two windows open quickly, first the Output and then the Data Editor. The Output window is a separate file from the data file, and it contains the results from our statistical analyses and notes. You need to toggle between the two windows. The Output file has two parts: 1) the Table of Contents (TOC), and 2) the View. The TOC lists everything and the View shows the results. A Log of all the commands performed is started immediately.

    In the TOC, you can control what's being viewed in the View simply by clicking the - to close the results or the + to open the results. Click the + sign again and open the log in the display. Remember, all the results for the analyses that you are going to do now appear here in the Output. Open the Output book for this tutorial. Be prepared to toggle between the windows, as the Output window will open with the execution of each command.

    The Log will keep making notes of every command. You can turn it off by going into the Edit pull-down menu and left-clicking on Options.

    In the Options window, click on the Viewer tab. You see in the bottom left a box for Display the commands in log. Deselect it and press OK.


    6. You will come back to the Output window later. Save it for now. 1) Click on Save and navigate to where you're keeping the data. 2) Call the file obesity. In the Save as type field, notice that it's being saved as Viewer Files .spv. You will see three types of data files in this tutorial. The .spv extension is for results. 3) Click on Save and go back to the Data Editor.
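
    The menu steps above for opening the portable file can also be written as SPSS syntax (syntax files are introduced later in this tutorial). The following is only a minimal sketch, not part of the original workshop. The folder path is an assumption pieced together from the folder names used in this example, so change it to wherever you extracted the files.

    ```spss
    * Read the SPSS portable (.por) file into the Data Editor.
    * The path is hypothetical and must point to your own DATA folder.
    IMPORT FILE='C:\DATA\5165062\ICPSR_04040\DS0001\04040-0001-Data.por'.
    EXECUTE.
    ```

    IMPORT is the command for portable files; ordinary .sav files are opened with GET FILE instead.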


    You now have the new obesity survey data in SPSS. In the SPSS Data Editor, you have two views: the Data View and the Variable View. You can see the tabs for each view on the bottom left. The Data View is the spreadsheet with all the cases and variables. For example, this obesity data set has 1202 cases and 114 variables. The Variable View has the information about each one of our variables. Let's explore some of the data management functions in SPSS. Let's start with the Variable View.


    Click on the Variable View tab. In the Variable View you have 10 columns of attribute information about your variables.

    If you click on any of the boxes, you see the border becomes bold, meaning that the box is active and you can change its contents.

    But let's just look at some of the important attributes.

    Name, as the name implies, is the name of the variable.

    Type defines what kind of variable it is.

    1. If you click on a box in the column to modify an entry, you see a little box with three dots appear. This means there is a dialog box, or a window with a number of options, for this information. Click on it.

    2. Next you are given the Variable Type dialog. Here you can choose how your numbers are formatted.


    Width sets the number of characters before the decimal that are shown in the column. The actual number can be longer, but only the specified number of characters shows in the column. When you click on the box, you get a pair of scroll arrows. Simply scroll to the desired amount.

    Decimals sets the number of characters after the decimal that are shown in the column. The actual number can be longer, but only the specified number of characters shows in the column. Clicking on the box brings up a scroll menu to choose a number.

    Label is very important since you can define what the name of the variable actually means. Expand the column to see that this data set is managed very well. They wrote out the whole question that was asked for this variable.

    STOP! Believe it or not, sometimes you might get a poorly organized data set that leaves the labels blank. From the data file, you have no idea what any of the variables mean. If that's the case, immediately turn to the codebook or metadata for the explanations.

    If you want to expand a column, move the cursor over the border in the label box. The cursor changes into an arrow. Left-click, hold down the mouse button, and expand the width.

    Values allows you to code your answers. Remember back to the level of measurement section. Everything is assigned a number, and you need to keep track of what those numbers mean.


    1. Click on the Values box for the 3rd variable, tzone. This brings up the three-dot box for a dialog. Click on the three-dot box.

    2. In the Value Labels dialog, you can see what each number means. You can also add a number in the Value field and a definition in the Value Label field.

    STOP! If you are designing your own data set, please take the time to make proper labels and values for your data. You understand everything now, but when you come back to this data set in a few years' time, will you understand everything then? You never know.

    Measure sets the level of measurement for our numbers. This is very important because SPSS reads the measure and only allows you to perform tests that are appropriate for the data.

    1. Click on the box and bring up the scroll arrow.

    2. In the dropdown menu, you can choose one of three levels: Scale, Ordinal and Nominal. SPSS has combined Interval and Ratio in the Scale level (Parametric Data).

    *We are not going to work with the attribute information Missing, Columns and Align in today's lesson, so please explore them on your own.
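
    The Measure column can also be set with the VARIABLE LEVEL command in a syntax file. The sketch below is illustrative only: the choice of Scale for q45 (weight in pounds), Ordinal for q1 (health rating) and Nominal for q921 (gender) is an assumption based on how those variables are described in this tutorial, not on settings read from the data file.

    ```spss
    * Set the level of measurement for three variables.
    * The level assigned to each variable here is an assumption.
    VARIABLE LEVEL q45 (SCALE) /q1 (ORDINAL) /q921 (NOMINAL).
    ```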


    Ok, that was the Variable View part of SPSS. Now let's look at Data View and see how we can manage and modify our data set. Click on the Data View tab. First, let's save our data set.

    1. Left-click on the File menu, and then left-click on Save.

    STOP! It's a good habit to save your work often. Remember, if the software crashes, you lose all your work since the last time you saved.

    2. In the Save Data As dialog, 1) you see that this file is being saved in the folder we downloaded from ICPSR. 2) In File name, type Obesity as the name of the file. 3) Look in Save as type. This file is being saved as a .sav file, which is a data file for SPSS. 4) Click on Save.
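
    For reference, the same save can be done from a syntax file with the SAVE command. This is a sketch under the assumption that the data set is stored in the DS0001 folder used in this example; the full path shown is hypothetical.

    ```spss
    * Save the active data set as an SPSS data file (.sav).
    * The folder path is hypothetical and should match your own DATA folder.
    SAVE OUTFILE='C:\DATA\5165062\ICPSR_04040\DS0001\Obesity.sav'.
    ```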


    Now, let's look at some basic functions in SPSS.

    Value Labels allows you to see what all the numbers in the Data View mean. Simply click on the icon and the numbers are given the verbal explanation from the Values column in Variable View. Click on the icon again to switch it back.

    In SPSS, you can also create new variables. This comes in handy especially if you're creating your own data set.


    1. In the Data View, scroll right to the end of the variable columns.

    2. Right-click onto the label box and bring up the pop-up menu. Left-click onto Insert Variables.

    3. You see in the Data View that a new variable has been created. You can populate the new variable with whatever values you need. Type in 1 for case 1, 2 for case 2 and 2 for case 3.


    4. If you double left-click very rapidly on the title bar of a variable in the Data View, it immediately switches to that variable in the Variable View.

    5. Let's change the name of the new variable. Double left-click on the entry for Name for variable var0001. Let's name our new variable new. Press Enter. You can double left-click on the gray 115 bar to take you back to the Data View.

    Clear allows you to delete a variable.

    1. Back in Data View, let's get rid of this unneeded variable. Right-click on the label box and then left-click on Clear, which deletes the variable.
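
    As a rough syntax equivalent of Clear, the DELETE VARIABLES command removes a variable from the active data set. The sketch assumes the practice variable was renamed to new as described above.

    ```spss
    * Remove the practice variable from the active data set.
    * Assumes the variable was renamed to new in the previous step.
    DELETE VARIABLES new.
    ```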


    STOP! Remember, if you make a mistake, you can always go back a step and Undo. In the Edit menu, left-click on Undo. It shows you your last step after the word Undo, and it takes the data set back to how it was before that last function.

    Move Variables or Cases allows you to literally move the variables around on the spreadsheet. This is especially good if you have a large data set. You might only be interested in a few variables and want them close together so you can work with them more easily.

    1. In our obesity data set, let's imagine we want to move the variable q45 (the participant's weight), which is near the end of the variables towards the right, closer to the beginning of the spreadsheet near the variable respno (participant number). Left-click on the label box once and you get an arrow. Left-click on the label box a second time and you see a box next to the cursor. Be sure to hold the cursor over the title boxes. This means that the variable column can be moved.

    2. Keeping your finger on the left mouse button, move the cursor left on the spreadsheet. You see a red line appear, which tells you that the variable is going to be placed to the right of that line. Drag the variable all the way over to respno. The line at the right of the respno column turns red when you put the cursor over it. This means the variable is going to be placed in the column immediately to its right.


    3. Release the left mouse button and the column drops into the spot.

    You can also move a variable by inserting a new variable and then cutting and pasting the variable into that column. Let's move gender (q921) next to weight.

    1. Select the column where you want the new variable by left-clicking on it. Let's put q921 to the right of q45. Then right-click to bring up the pop-up menu. Left-click on Insert Variable to put in a new variable.

    2. 1) Right-click on the label box of q921 to bring up the pop-up menu. Then left-click on Cut to remove the column. 2) Back next to q45, right-click on the label box in the new empty variable column to bring up the pop-up menu. Left-click on Paste to place the cut variable there.


    Sorting the cases by a variable is another useful function. You can sort the cases from high to low or low to high.

    1. Right-click on q45 and then left-click on Sort Ascending. This sorts the cases from low at the top down to high. Sort Descending sorts from high at the top down to low.
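
    The same sort can be written as syntax with the SORT CASES command. This is a small sketch using the q45 variable from the step above.

    ```spss
    * Sort the cases by weight, lowest value at the top.
    SORT CASES BY q45 (A).
    * Replace (A) with (D) to sort with the highest value at the top.
    ```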

    There are also several good data management functions in the Data menu. Let's explore some of them.

    Weight Cases makes some cases count more than others to compensate for over- or under-sampled groups. For example, if your sample is small, but you know that a certain region represents 40% of the population, you can weight the cases from that region so they count for more. The function treats each case as if it occurred the number of times given by the weight variable.

    1. Left-click the Data menu and then left-click Weight Cases.

    2. In the Weight Cases dialog, you can weight the cases by a variable. 1) In the Current Status display, you see the data are already weighted by the variable weight: Weight cases by weight. 2) If you had to do this manually, you would scroll down in the variable list to Weight and select it, click on the Weight cases by radio button, and then click the arrow to move the Weight variable into the Frequency Variable field. The researchers calculated the variable weight according to the population of each census region. 3) Since the weight is already set, just click Cancel.
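
    In syntax, weighting is switched on and off with the WEIGHT command. The sketch below assumes the weighting variable is named weight, as it appears in the dialog described above. The two commands are alternatives, so run only the one you need.

    ```spss
    * Turn weighting on, using the variable named weight.
    WEIGHT BY weight.

    * Turn weighting off again so every case counts once.
    WEIGHT OFF.
    ```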


    You can also see at the bottom right corner of the Data View that Weight On is displayed.

    Remember, all calculations and analyses will be based on the weight. Whenever you modify your data, all further operations work on that modification.

    Select Cases allows you to select the cases that meet a condition on one or more variables, and it creates a new variable that marks them. For example, from our two variables for gender and weight, we want to select only men who weigh more than 100 lbs.

    1. In the Data menu, left-click on Select Cases.

    2. In the Select Cases dialog, select If condition is satisfied. Click the If button.

    3. In the Select Cases: If calculator, we need to build a conditional statement that will select only men who are over 100 lbs. We need to select the variables from the left, move them into the calculator on the right, and then set up the conditional. First, select Q921 on the left and click the arrow icon to move it into the calculator. Click the equal sign (=) and then 1, which is the code for men. So, that says choose men from the variable gender. Next, we need to add the second part of our statement. Click on the ampersand (&).


    Select Q.45 Respondent WEIGHT IN POUNDS [q45] and click on the arrow to move it to the calculator. Type in the > (greater than) sign and then 100 for weight. This part of the statement says and choose from weight the cases over 100 lbs. Then click Continue.

    Another good data management technique is to right-click on the variable menu in the dialog. This gives you many choices for listing the variables, for example by variable name, by label, alphabetically, or in file order.

    4. Back in the Select Cases dialog, you see the statement has been set. Click OK.

    So, explore the results and scroll down the page. You see that those cases that did not meet the condition are crossed out. If you scroll to the end of the variables to the right, you see a new variable has been created, which has initially been labeled filter_$ (you can change the name in the Variable View). In this variable, 1 means the case met the condition men over 100 lbs and 0 means the case did not meet the condition. You can now do statistics with this new variable.


    5. Please go back into the Select Cases dialog and click Reset, which will set it back to how it was before we performed this function. Click OK. The new variable remains, but the crosses on the cases are removed.
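
    The Select Cases steps above correspond roughly to the following syntax. It is a sketch modeled on the kind of commands SPSS generates for a filter, using the condition built in the calculator (q921 = 1 for men and q45 > 100 for weight); it is not copied from the original workshop.

    ```spss
    * Mark men who weigh more than 100 lbs and filter on that marker.
    USE ALL.
    COMPUTE filter_$ = (q921 = 1 & q45 > 100).
    FILTER BY filter_$.
    EXECUTE.

    * Undo the selection, as Reset does in step 5.
    * The filter_$ variable stays in the data set.
    FILTER OFF.
    USE ALL.
    EXECUTE.
    ```

    The comparison in parentheses evaluates to 1 when a case meets both conditions and 0 when it does not, which is where the values of filter_$ come from.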

    Recode allows you to transform the codes in a variable without manually changing the number for each case. For example, q1 asks how the participant rates his/her health. The answers were 1 Excellent, 2 Good, 3 Not so good and 4 Poor. Let's say we want to collapse these four answers into two groups, good and bad. So, we need to recode answers 1 and 2 into 1 for Good and answers 3 and 4 into 2 for Bad.

    1. In the Transform menu, left-click on Recode into Different Variables. Different Variables makes a new variable with our recoded data and is the safest option in case you make a mistake: you can delete the new variable and start again. Recode into Same Variables changes the actual variable.

    2. In the Recode into Different Variables dialog, 1) select q1 in the variable list on the left and move it to Numeric Variable > Output Variable; 2) in Name, type in the name of the new variable, Health; 3) in Label, type in the description Participant's health statement; 4) click Change to set the new variable. Now, we need to set our values. 5) Click on Old and New Values.


    3. In the Recode into Different Variables: Old and New Values dialog, 1) select Range and type in the range of the values being changed, 1 through 2; 2) in Value, type in the new value, 1; 3) click on Add to set the value recoding.

    4. Please follow the same procedure for range 3 through 4, set the new value at 2, and add it. Then click Continue.

    5. Back in the Recode into Different Variables dialog, click OK.


    If you scroll all the way to the end of the variables, you see there is a new variable with our new code for health.

    6. Now, we need to set the values. Click on the Variable View. 1) In the new Health variable, in the Values column, click on the three-dot icon and bring up the Value Labels dialog. 2) Set the first value. In Value, type in 1. 3) Set the label for the new value. In Value Label, type in Good Health. 4) Click Add.

    7. Follow the same procedure for value 2. In Value, type in 2, and in Value Label, type Bad Health. Click Add. Click OK to finalize the new values.
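
    The whole recoding procedure, including the label and value labels, can be sketched in syntax as follows. The variable names and labels are the ones used in the steps above; the layout is only one reasonable way to write it.

    ```spss
    * Collapse the four health answers in q1 into two groups.
    RECODE q1 (1 thru 2=1) (3 thru 4=2) INTO Health.
    VARIABLE LABELS Health "Participant's health statement".
    VALUE LABELS Health 1 'Good Health' 2 'Bad Health'.
    EXECUTE.
    ```

    Because the result goes INTO a new variable, q1 itself is left unchanged, which matches the Recode into Different Variables choice.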


    Compute Variable allows you to calculate a new variable from the values of variables you already have in a data set. For example, in our data set, variables q4_1 through q4_8 deal with the same subject matter; they are all questions on the survey asking how much of a health problem certain things like AIDS, drug abuse, and obesity are in this country according to the participant (explore the questions in the Variable View). The lower the number, the more important the problem. Let's combine these eight variables into one to try to measure the participant's consideration of Public Health issues. We will call our new variable public_health.

    1. In the Transform menu, left-click on Compute.

    2. In the Compute Variable calculator, 1) type public_health in the Target Variable field; 2) construct the following equation in the Numeric Expression box: (q4_1 + q4_2 + q4_3 + q4_4 + q4_5 + q4_6 + q4_7 + q4_8) / 8. You need to start with the parentheses, and then move each variable over one at a time while clicking a plus sign in between each. Then add a forward slash (/), which is division, after the parentheses, followed by an eight. So, in this equation, we are adding up all the variables that make up the Public Health subject on our survey and then dividing the total by the number of variables, or, in other words, finding the mean. We are doing this for each case. 3) Click OK.

    3. Now we have a new variable, public_health. The lower the number, the more the person is concerned with Public Health issues.
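
    Written as syntax, the calculation above is a single COMPUTE command. This sketch simply restates the equation from step 2.

    ```spss
    * Average the eight public health questions for each case.
    COMPUTE public_health = (q4_1 + q4_2 + q4_3 + q4_4 + q4_5 + q4_6 + q4_7 + q4_8) / 8.
    EXECUTE.
    ```

    With this form, a case that is missing any one of the eight answers gets a missing value for public_health. If the eight variables sit next to each other in the file, which is an assumption about this data set, MEAN(q4_1 TO q4_8) is an alternative that averages whichever answers are present.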


    Creating a New Variable From Two Variables allows you to subsort two variables into one. We want tochange q45 (gender) and q921 (weight) into one variable, newweight. Specifically, in our new variable, wewant to make a combined variable that has four values: 1 is men under 200 lbs, 2 is men 200 lbs and over, 3 iswomen under 200 lbs, and 4 is women 200lbs and over. In order to do this, we have to write a syntax file andrun it. Syntax in SPSS is scripting, and SPSS allows you to do many operations with scripting.

    1. We need to create a new syntax file. In the File pulldown menu, select New and then left-click Syntax.

    2. In the Syntax file, there are 1) a navigation panel that lists the functions and the conditions, and 2) the view where you type the commands.


    3. You have to create our four new values using If statements as given below. Each statement has two parts, the condition and the result. Note: Sometimes as you type these statements, a pop-up menu appears. This offers a list of all commands. If you keep typing, it disappears.

    Syntax: if (q45 < 200 & q921 = 1) newweight = 1.
    What it's saying: If sets up the condition and the parentheses are the parameters of the condition. The parameters are: find the cases in q45 that are under 200 and the cases in q921 that equal 1 (men under 200 lbs). Give the value of 1 in the new variable newweight to the cases that meet those conditions.

    Syntax: if (q45 >= 200 & q921 = 1) newweight = 2.
    What it's saying: If sets up the condition and the parentheses are the parameters of the condition. The parameters are: find the cases in q45 that are equal to or greater than 200 and the cases in q921 that equal 1 (men 200 lbs and over). Give the value of 2 in the new variable newweight to the cases that meet those conditions.

    Syntax: if (q45 < 200 & q921 = 2) newweight = 3.
    What it's saying: If sets up the condition and the parentheses are the parameters of the condition. The parameters are: find the cases in q45 that are under 200 and the cases in q921 that equal 2 (women under 200 lbs). Give the value of 3 in the new variable newweight to the cases that meet those conditions.

    Syntax: if (q45 >= 200 & q921 = 2) newweight = 4.
    What it's saying: If sets up the condition and the parentheses are the parameters of the condition. The parameters are: find the cases in q45 that are greater than or equal to 200 and the cases in q921 that equal 2 (women 200 lbs and over). Give the value of 4 in the new variable newweight to the cases that meet those conditions.

    Syntax: execute.
    What it's saying: Tells SPSS to carry out the If statements above, and denotes the end of the script.

    Don't forget the period at the end of each statement!


    2) In the Run pull-down menu, left-click on All.

    3. You see that in our new variable, newweight, values have been added according to the parameters we set.

    4. You can save syntax files, too, and use them later. 1) In the Syntax1 SPSS Syntax Editor, use the File pulldown menu and left-click Save. 2) In the Save As window, save it as obesity.SPS. SPS is the extension for syntax files. You can close the syntax file.
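The logic of the four If statements can also be read as one small function. This is a rough Python sketch of the same decision rules, not SPSS syntax; the function name is hypothetical, and the gender coding (1 = men, 2 = women) is taken from the statements above.

```python
def newweight(weight_lbs, gender):
    """Return 1-4 following the four If statements, or None when no rule matches.

    gender uses the survey coding from the text: 1 = men, 2 = women.
    """
    if weight_lbs < 200 and gender == 1:
        return 1  # men under 200 lbs
    if weight_lbs >= 200 and gender == 1:
        return 2  # men 200 lbs and over
    if weight_lbs < 200 and gender == 2:
        return 3  # women under 200 lbs
    if weight_lbs >= 200 and gender == 2:
        return 4  # women 200 lbs and over
    return None  # no statement applies, so the new variable stays empty


for weight, gender in [(150, 1), (200, 1), (199, 2), (250, 2)]:
    print(weight, gender, newweight(weight, gender))
```

One thing the sketch makes visible: a code such as -7 is also "under 200", so a case with that code would be placed in group 1 or 3 unless it is excluded first.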


    Now we have modified our data set, and we are interested in some descriptive statistics, which describe and/or summarize the scores in our data set. Descriptives usually deal with central tendency and variance. Let's explore these essential ideas for a moment.


    (Graphic from http://www.maximumiq.com/iq-tests-stats.php)

    An IQ test is a perfect example of central tendency and variance. Your result on an IQ test is literally the comparison of your result with everybody else's who has taken the test. Millions of people take these tests. Very few people score low, and there are very few geniuses around who score high. The majority of us have average IQs. As seen in the graphic above, IQ results, when plotted out, take a normal distribution, where the majority of results cluster in the middle and lower and higher results become less frequent the farther they are from the center.

    The central tendency is usually measured by the mean (all case values added up and then divided by the number of cases). So, a score of 100 on an IQ test is the mean. It's an average intelligence. Remember, the results of the majority of people bunch around 100. Variance is how far the scores fall from the mean. If most of the scores cluster around the mean, then there is low variance. The distribution looks like a bell curve, where most of the results are in the middle, taking the shape of a bell. If the variance is high, the curve in the middle is not as high and the results are more spread out.
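To make the two ideas concrete, here is a small Python sketch (an illustration outside SPSS, with made-up scores) comparing two sets of scores that share the same mean but differ in spread. It uses the sample variance, which divides by the number of cases minus one.

```python
from statistics import mean, variance

tight = [98, 99, 100, 101, 102]    # scores bunched near 100
spread = [70, 85, 100, 115, 130]   # same mean, scores far apart

for name, scores in [("tight", tight), ("spread", spread)]:
    # mean = central tendency; variance = how far scores fall from the mean
    print(name, mean(scores), variance(scores))
```

Both lists average 100, but the second has a much larger variance, which is what a flatter, wider curve looks like in numbers.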

    Now, let's get some descriptive statistics for the variable q45, our participants' weight measurements. Frequencies gives you the number of cases reporting each value of a variable.

    1. In the Analyze menu, select Descriptive Statistics and left-click on Frequencies.

    2. In the Frequencies dialog, select our variable, Q.45 Respondent WEIGHT IN POUNDS, move it into the Variable box with the arrow, and click OK.


    Scroll down to the Frequency results, where you have a few important numbers. 1) The path to the data file is given at the top of any results. 2) In the Statistics chart, N denotes the number of cases. Valid gives the number of cases that were calculated in the Frequencies. Missing are those cases that did not have any value and were not calculated. 3) In the Q.45 Respondent WEIGHT IN POUNDS chart, the leftmost column gives you the values in the cells of the weight variable. That is, it gives you every weight that was reported. 4) Frequency gives you the number of cases reporting this number. 5) Percent gives you the percent of all cases reporting this number. 6) Cumulative Percent gives you the percent of cases that reported this number or any number below it. So, .9 percent of the cases reported Don't Know. The second entry is No Answer, with a Cumulative Percent of 4.3: the percents of cases reporting Don't Know and No Answer add up to 4.3. As you go down the chart, you're adding all the percents together, and the number increases until it reaches 100% at the end. Please close the results in the TOC after each analysis so you don't get overwhelmed.
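The Frequency, Percent, and Cumulative Percent columns can be reproduced by hand. The Python sketch below (an illustration outside SPSS, using a short made-up list of weights) builds the same three columns.

```python
from collections import Counter

weights = [150, 180, 150, 200, 180, 150, 165, 200]  # made-up values

counts = Counter(weights)   # Frequency: number of cases per reported value
n = len(weights)

table = []    # rows of (value, frequency, percent, cumulative percent)
running = 0   # cases counted so far, lowest value first
for value in sorted(counts):
    running += counts[value]
    table.append((value, counts[value], 100 * counts[value] / n, 100 * running / n))

for row in table:
    print(row)
```

The last row always ends at a cumulative percent of 100, because by then every case has been counted.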


    Crosstabs gives a frequency count of one variable by another variable. Let's get the frequency of weight by gender.

    1. In the Analyze dropdown menu, select Descriptive Statistics and then left-click Crosstabs.

    STOP! When you do many different analyses, you get a ton of results. Don't get overwhelmed by all the numbers. In the results, there tend to be only a few numbers that you really need to report for the final product of your analyses. In the tutorials, we concentrate only on the numbers you need.


    2. In the Crosstabs dialog, 1) select Q.45 Respondent WEIGHT IN POUNDS and move it to the Row field. 2) Select Q921. GENDER and move it to the Column field. 3) Click OK.


    3. As you can see in the second chart, you have a frequency count of weight by gender.
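What a crosstab counts can be shown in a few lines. The following Python sketch is an illustration outside SPSS; the (weight, gender) pairs are made up, and gender is assumed to be coded 1 = male, 2 = female as in the survey.

```python
from collections import Counter

# Made-up (weight, gender) pairs; gender coded 1 = male, 2 = female.
cases = [(150, 2), (180, 1), (150, 2), (200, 1), (180, 2), (150, 1)]

cell_counts = Counter(cases)  # one count per (row value, column value) cell

# Print one row per weight, with a column for each gender.
for weight in sorted({w for w, _ in cases}):
    row = [cell_counts[(weight, gender)] for gender in (1, 2)]
    print(weight, row)
```

Each printed row corresponds to one row of the crosstab: a weight, followed by how many men and how many women reported it.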


    Descriptives gives you summary statistics for a variable, such as the mean.


    1. In the Analyze menu, select Descriptive Statistics and left-click on Descriptives.

    2. In the Descriptives dialog, select our variable, Q.45 Respondent WEIGHT IN POUNDS, move it into the Variable box with the arrow, and click OK.

    3. In Output, you receive the Descriptives results. You have four new numbers here that you don't receive in the Frequencies result. First, you are given the range of your results. 1) Minimum is the lowest case value. 2) Maximum is the highest case value. So, you know all your cases range from -7 to 540. 3) Mean is the mean of all the case values. 4) Std. Deviation, which stands for Standard Deviation, is a measure of variance and is explained below.

    Standard Deviation says where a certain amount of cases lie. Basically, if you have a normal distribution (a bell-shaped curve, as seen in our IQ graphic above), about 68% of the cases fall within +/- 1 std. deviation from the mean, about 95% of the cases fall within +/- 2 std. deviations from the mean, and about 99.7% of the cases fall within +/- 3 std. deviations from the mean. So, in our IQ example again, 68% of people fall within about 15 IQ points of the mean (IQ range 85 to 115).

    Back to our weight data set. From our descriptives, we see that the mean of our sample is approximately 164.14 lbs with a standard deviation of approximately 55 lbs. So, if the weights were normally distributed, about 68% of our cases would range approximately from 109 lbs to 219 lbs (164 minus 55 to 164 plus 55). You can also look and see how large the standard deviation is. The smaller the number, the less variance.
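The standard deviation ranges can be checked with a short calculation. This Python sketch (an illustration outside SPSS) assumes a perfectly normal distribution and uses the mean and standard deviation quoted above; the variable names are hypothetical.

```python
from statistics import NormalDist

mean_lbs, sd_lbs = 164.14, 55.0  # figures quoted in the text

standard_normal = NormalDist()  # mean 0, standard deviation 1
bands = {}
for k in (1, 2, 3):
    # Share of a normal distribution lying within k standard deviations of the mean.
    share = standard_normal.cdf(k) - standard_normal.cdf(-k)
    bands[k] = (mean_lbs - k * sd_lbs, mean_lbs + k * sd_lbs, share)
    low, high, _ = bands[k]
    print(f"+/- {k} SD: {low:.2f} to {high:.2f} lbs, about {share:.1%} of cases")
```

Real survey data are rarely perfectly normal, so these shares are a guide for reading the standard deviation, not an exact count of cases.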

    OK, we have a problem. Let's look back at our Descriptive Statistics. What's wrong?

    Look at the Minimum. The values -7 and -5 (Don't know and No answer) are being included in our results as if they were genuinely pounds. So, we need to deselect all the cases that have these values. Refer back to Select Cases on pg. 29. In SPSS, click on Select Cases, select If condition is satisfied, and in the Select Cases: If calculator type the statement q45 > 0, which means select cases in the variable weight only if they're over 0 lbs. Then go back and run the Descriptives for the variable Q.45 Respondent WEIGHT IN POUNDS.


    These results are more accurate. The N, the number of cases, dropped from 1202 to 1150, since the cases with -7 and -5 weren't included. The lowest real weight, 50 lbs, is now our minimum. Also, look at our mean. It rose by more than 7 lbs to 171.78 lbs. The Std. Deviation decreased to 43.131, showing even less variance in our data. Do not reset the Q45 variable; leave it with cases selected.
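The effect of the q45 > 0 condition can be seen on a tiny example. The Python sketch below is an illustration outside SPSS with a made-up column of weights that contains the two codes named in the text (-7 and -5).

```python
from statistics import mean, stdev

raw = [150, 180, -7, 200, 165, -5, 210]  # made-up column; -7 and -5 are codes, not pounds

# Same condition as q45 > 0 in the Select Cases: If calculator.
valid = [w for w in raw if w > 0]

print(len(raw), min(raw), round(mean(raw), 2), round(stdev(raw), 2))
print(len(valid), min(valid), round(mean(valid), 2), round(stdev(valid), 2))
```

As in the SPSS output, dropping the coded cases lowers N, raises the minimum and the mean, and shrinks the standard deviation.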

    Split File allows you to split a variable into groups and then run descriptives on another variable within each group. So, for example, we want to run descriptives on the weight of men and women. With Split File, we can split the variable gender into men and women and then run the descriptives to get the results for each.

    1. In the Data menu, select Split File.

    2. In the Split File dialog, 1) select Compare groups so that the descriptives are presented in one chart. 2) Select the variable that you want to split into groups, Q921 Gender. 3) Click OK.


    3. In the Analyze menu, select Descriptive Statistics and left-click Descriptives.

    4. In the Descriptives dialog, move the variable that you want descriptives for in the groups of men and women over to the Variable field, in this case Q.45 Respondent Weight. Click OK.

    5. In the Descriptives result, you see the statistics for weight by male and female. Please remember to go back to the Split File dialog and reset.
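The idea behind Split File, descriptives computed separately for each group, can be sketched as follows. This is a Python illustration outside SPSS; the cases are made up and gender is assumed to be coded 1 = male, 2 = female.

```python
from collections import defaultdict
from statistics import mean, stdev

# Made-up (gender, weight) cases; gender coded 1 = male, 2 = female.
cases = [(1, 180), (1, 200), (1, 220), (2, 130), (2, 150), (2, 170)]

# Split the weights into one list per gender.
groups = defaultdict(list)
for gender, weight in cases:
    groups[gender].append(weight)

# N, minimum, maximum, mean, and standard deviation for each group.
summary = {
    gender: (len(w), min(w), max(w), mean(w), stdev(w))
    for gender, w in sorted(groups.items())
}
print(summary)
```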


    Explore gives you many of the basic descriptives plus some nice graphics.

    1. In the Analyze menu, select Descriptive Statistics and then left-click Explore.

    2. In the Explore dialog, select our variable, Q.45 Respondent WEIGHT IN POUNDS, and move it into the Variable box with the arrow. Click OK.

    3. In the View, you get a number of results. The Descriptives chart gives you many of the important statistics that we discussed in Frequencies and Descriptives.


    Stop! Positively skewed data is when the long tail of your distribution goes up your scale. Negatively skewed is when the long tail goes down. Your data are considered skewed with a Skewness result over 1 (or, for negative skew, under -1).
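To see how the direction of the long tail shows up in the number, here is a Python sketch (an illustration outside SPSS, with made-up data). It uses the simple moment-based skewness formula, which is an assumption here; SPSS's reported Skewness may differ slightly because of how it adjusts for sample size.

```python
def skewness(values):
    # Simple moment-based skewness: third central moment over
    # the second central moment raised to the power 1.5.
    n = len(values)
    m = sum(values) / n
    m2 = sum((v - m) ** 2 for v in values) / n
    m3 = sum((v - m) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


symmetric = [1, 2, 3, 4, 5]
long_upper_tail = [1, 1, 2, 2, 3, 20]                # tail goes up the scale
long_lower_tail = [-v for v in long_upper_tail]      # mirror image, tail goes down

for name, data in [("symmetric", symmetric),
                   ("long upper tail", long_upper_tail),
                   ("long lower tail", long_lower_tail)]:
    print(name, round(skewness(data), 2))
```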

    The Boxplot is a good graphic as well. It is a depiction of the cases as if they were lined up lowest to highest. 1) The beginning and the end of the range are the start and finish of the I-shaped figure. 2) The bottom and top of the red box are the 25th and 75th percentiles. So, the 25th denotes that 25% of the cases occur below that line, and the 75th denotes that 75% of the cases occur below that line. 3) The thick line in the middle of the box denotes the median, or the exact middle case if lined up lowest to highest. The numbers outside the I-shaped figure are considered outliers, which are cases that are extreme and don't fit into the normal distribution.
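The numbers a boxplot draws can be computed directly. The Python sketch below is an illustration outside SPSS with made-up weights. The cut-off of 1.5 box lengths beyond the box for flagging outliers is a common convention assumed here, and Python's percentile method may give slightly different quartiles than SPSS on real data.

```python
from statistics import quantiles

# Cases lined up lowest to highest (made-up values).
weights = sorted([110, 120, 130, 140, 150, 160, 170, 180, 190, 200, 400])

q1, median, q3 = quantiles(weights, n=4)  # 25th percentile, median, 75th percentile
iqr = q3 - q1                             # height of the box

# Assumed convention: anything more than 1.5 box heights beyond the box is an outlier.
low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
outliers = [w for w in weights if w < low_fence or w > high_fence]

print(q1, median, q3, outliers)
```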


    SPSS offers some other nice functions to visualize data through its Graphs menu. Let's explore some of them. First, let's make a histogram of our weight data.

    1. In the Graphs menu, left-click on Chart Builder.


    2. If you get a Chart Builder dialog, it's simply reminding you that you need to set the levels of measurement correctly or your charts won't look right. Please select Don't show this dialog again and press OK.

    A histogram shows the number of cases that fall within each interval.

    If you're uncertain about something, go to the Help menu and left-click Topics. Click on the Index tab, and then you can type in the subject. The results can give you definitions and info on how to use the function.

    3. In the Chart Builder, 1) click on Histogram. 2) Then double-left click on the first graph, Simple Histogram, in the bottom middle of the window.


    4. 1) The Element Properties window opens on the left, which contains more controls for the graph. You can move it out of your way. 2) Graphs in SPSS are built by dragging and dropping variables from the Variables menu into the Chart preview area. Move Q.45 Respondent WEIGHT IN POUNDS over to the X-Axis? slot at the bottom of the chart.


    5. It is also very descriptive to show the Normal Curve for the histogram. 1) In the Element Properties, select Display normal curve. 2) Finish your graph by pressing OK in the Chart Builder.


    6. In the Output window, you have a histogram of the respondents' weight.
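What the histogram bars count can be sketched in a few lines. This Python illustration (outside SPSS) uses made-up weights and a hypothetical interval width of 50 lbs; SPSS chooses its own intervals.

```python
from collections import Counter

weights = [112, 135, 148, 151, 163, 167, 172, 189, 204, 251]  # made-up values
bin_width = 50  # hypothetical interval width in lbs

# Label each case by the lower edge of its interval, e.g. 163 falls in 150-199.
bins = Counter((w // bin_width) * bin_width for w in weights)

# Draw one text bar per interval; the bar length is the number of cases.
for lower in sorted(bins):
    print(f"{lower}-{lower + bin_width - 1}: {'#' * bins[lower]}")
```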

    Another good graphic is a pie chart, where each pie slice represents a value within a variable. Let's make a pie chart of the percentage of participants by gender.

    1. Go back to the Data Editor and open up the Chart Builder. 1) In the Gallery menu, left-click Pie. 2) Double-left click on the Pie Chart icon, and the pie format appears in the chart preview area. 3) Drag and drop the gender variable into the Slice by? slot.


    2. You also need to select what values are used for the slices. You want to show only men and women, not values such as No Answer or Don't know. 1) In the Element Properties window, select GroupColor (Polar-Interval1) in the Edit Properties of menu. 2) In the Order menu, you see all the labels appear. Select each of the labels except Male and Female and click the exclude button. 3) Click Apply. 4) Back in the Chart Builder, click OK.


    3. In the Output window, you see a pie chart for Male and Female. Let's be more descriptive and show the percentage of the whole that each slice represents.


    4. 1) Double-left click on the pie chart to bring up the Chart Editor. 2) Left-click on the Show Data Labels button to bring up the Properties window with the Data Value Labels tab selected.


    5. 1) In the Not Displayed menu, select the Percent variable and left-click on Move Variable to Contents to move the variable into the Displayed menu. 2) In the Displayed menu, select the Count variable and left-click on exclude to move it down into Not Displayed. 3) Left-click on Apply. The values in each slice of the pie chart are now percentages. Close out of Properties and the Chart Editor.
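The percentages on the pie slices come from a simple calculation once the unwanted categories are excluded. The Python sketch below is an illustration outside SPSS; the data are made up, the coding 1 = Male, 2 = Female follows the survey, and the code used for the excluded category is hypothetical.

```python
from collections import Counter

labels = {1: "Male", 2: "Female"}  # survey coding used in the text
NO_ANSWER = -5                     # hypothetical code for an excluded category

gender = [1, 2, 2, 1, 2, NO_ANSWER, 2, 1, 2]  # made-up column

# Keep only Male and Female, as when excluding the other labels in the Order menu.
kept = [g for g in gender if g in labels]
counts = Counter(kept)

# Each slice is that group's share of the kept cases.
percent = {labels[g]: 100 * counts[g] / len(kept) for g in labels}
print(percent)
```

Because the excluded cases are left out of the total, the two slices add up to 100 percent.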


    A bar chart is also very useful.

    1. Back in the Data View, go to the Graphs menu and left-click on Chart Builder (you can also do this from the Output Viewer).

    2. In the Chart Builder, 1) left-click on Bar. 2) Double-left click on Simple Bar. Q921 Gender should still be in place from the pie chart, now in the X-Axis? slot.


    3. In the Element Properties window, 1) in the Statistic pull-down menu, select Mean. Each gender bar will show the mean weight. 2) Click Apply.


    4. Now select the values that you want to chart. 1) In Edit Properties of, left-click X-Axis1 (Bar1). 2) In the Order menu, exclude all values except Male and Female. 3) Click Apply. 4) Move Q.45 Respondent WEIGHT IN POUNDS into the weight slot. 5) Back in the Chart Builder, click OK.


    5. In the Output window, you have your bar chart of men and women. Do you remember how to show the mean weight number? (In the Chart Editor.)

    Another good function of Output is that you can copy and paste results into a Word document. So, if you're writing an essay, you can just place your stats results right into your document.

    1. Right-click on our histogram and then left-click on Copy.


    2. Open up a Word Document.

    3. In our Word Document, left-click on the page, then right-click and select Paste. The histogram appears, and you can position it as you wish. Please close Document1 and don't save it.

    Finally, you can print the results you want to use. Output allows you to close some of the results you don't want and print the rest. Let's leave the Graph for the histogram open and close the rest. We can then print the results we left open.

    1. There are two ways to print.

    The first way is by clicking on the print icon.


    The second way is by 1) going into the File menu and left-clicking on Print.

    2) In the Printer pulldown, make sure you're printing to the right printer. Click on OK.

    Remember, some of your results can be very long. For example, our Frequency chart for weight is very long and would not fit on one page. Sometimes you need to play with the data and your results, and look at them from different angles to see what is best for you. In our next lesson, we will explore how to do inferential statistics in SPSS. With this form of statistics, you can form and test hypotheses about the whole population from our sample. For now, you can do some further study on what we have just learned.


    Finding and understanding Data

    Milner Library: Finding Statistics - Understanding Statistics (http://www.mlb.ilstu.edu/learn/stat/understanding4.htm)

    Baker Library Guide: Statistics: Understanding Statistics (http://www.library.hbs.edu/guides/statistics/understanding.html)

    Descriptive Statistics

    Introduction to Descriptive Statistics (http://www.mste.uiuc.edu/hill/dstat/dstat.html)

    HyperStat Online: Descriptive Statistics (http://davidmlane.com/hyperstat/A28521.html)

    School of Psychology: University of New England, Chapter 4: Analysing the Data; Part 4: Descriptive Statistics (http://www.une.edu.au/WebStat/unit_materials/c4_descriptive_statistics/)

    SPSS

    SPSS Home Site (http://www.spss.com/)

    Raynald's SPSS Tools (http://www.spsstools.net/)

    If you have any questions, contact me at:

    Thomas Stieve
    863-7978
    [email protected]