spss tutorial i

Upload: raquel-campbell

Post on 02-Jun-2018


    Cathy Chen Marketing Research Term 1, 2014-15


    SPSS Tutorial I (Lecture 6)

    1. Opening up Datasets

    The first thing you will probably want to do is to work with a dataset. First, download the datasets from SMU eLearn to a directory that you have access to (e.g. D:\yourwork). Make sure you remember where you saved the files, because you will need to point SPSS to them again.

    Once your dataset has been downloaded, return to SPSS. If you see the screen as follows, select "More files" from the list and click OK.

    If you do not see this dialog box, select File from the main menu, then Open, and the dialog box should open up. You will also do this whenever you open a new dataset. You can have multiple datasets open, and you can close each of these datasets without exiting SPSS. If, however, you close the final dataset, you risk also closing SPSS entirely; a dialog box will appear warning you that you should not do this.

    Each dataset can be worked on individually. The output will be stored in a single sheet. For example, if you run Descriptive Statistics from the Boston dataset, then the variables for the Boston dataset will appear in the dialog box.


    The organization of data in SPSS

    Data in SPSS is handled and organized very much like an Excel spreadsheet. Each cell represents a data point. The data can be nominal (i.e. categories such as male/female or purple/green/blue), interval (e.g. 1, 2, 3, 4, ... for a Likert-type scale), or ratio (think of this as continuous, such as 122.523). It is also possible to have many other formats, such as free-form text, date/time, currency and so on.

    Typically we will organize the data so that each column represents a variable and each row represents a sampling unit. Sometimes we will use multiple rows for repeated observations on the same sampling unit.
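    The row-and-column layout described above can be mirrored outside SPSS. The following is a minimal Python sketch, not SPSS syntax; the variable names (respondent, gender, satisfaction, spend, week) and all values are invented for illustration.

```python
# Hypothetical survey data: each dict is one row (a sampling unit),
# each key is one column (a variable). Names and values are invented.
rows = [
    {"respondent": 1, "gender": "female", "satisfaction": 4, "spend": 122.523},
    {"respondent": 2, "gender": "male", "satisfaction": 2, "spend": 87.10},
    {"respondent": 3, "gender": "female", "satisfaction": 5, "spend": 140.00},
]


def column(data, name):
    """Return all values of one variable (one column) in row order."""
    return [row[name] for row in data]


# Repeated observations: the same sampling unit appears on several rows,
# distinguished here by a hypothetical "week" variable.
repeated = [
    {"respondent": 1, "week": 1, "spend": 30.0},
    {"respondent": 1, "week": 2, "spend": 45.5},
    {"respondent": 2, "week": 1, "spend": 12.0},
]


def rows_for_unit(data, unit_id):
    """Return every row recorded for a single sampling unit."""
    return [row for row in data if row["respondent"] == unit_id]


print(column(rows, "satisfaction"))
print(len(rows_for_unit(repeated, 1)))
```

    Here gender plays the role of a nominal variable, satisfaction an interval (Likert-type) variable, and spend a ratio variable.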

    As another example, if you want to insert a variable, right-click a column. To insert a row, right-click the grey cell next to the row of interest (see below):

    If you want to change the number in a cell, you can double-click on it to edit or, as in Excel, just press the F2 function key.


    2. Use of Output

    Output windows in SPSS are organized like trees or directories with many levels of subdirectories. You can export this output to other applications by right-clicking on the object you want and then selecting Export. When you run one of the procedures discussed below, look for the output in the output window. You may need to scroll down to find it (it will typically be the last one run).

    You can also delete specified output sections. For example, if you wish to rerun the analysis and don't want the old copy lying around, select the one you don't want from the subdirectories and press the Delete key once.

    3. MKTG103 Lecture Material

    3.1 Visualizing Data (Lecture 6)

    Much of the graphical presentation of data is handled via an interactive tool called Chart Builder. To use this tool, go to the main menu bar, select Graphs and click on Chart Builder. The way this works is that you can use your mouse to drag and drop different objects to be graphed together. There are a wide variety of ways to graph data here, and they are organized in categories (such as line graphs, bar graphs, boxplots etc.). See the illustration:


    3.2 Trying some graphics.

    Try the following graphics we did in class:

    1) Scatter plot: open the file lecture6-boston. From the Graphs menu, select Chart Builder and click once on the category called Scatter/Dot. You should see a number of variables. Select MEDV for the Y-axis (the vertical one) and AGE for the X-axis (the horizontal one). You can add annotations such as legends, colors, titles, labels for the axes etc. with the tabs on the left. Play with some of these and create your own unique scatter plot. When you are done, consider what the relationship between the two variables is.

    Harder: try a scatter plot of CRIM (crime rate) versus MEDV (median value of homes). What do you see? Try a log transformation of the crime rate variable.

    2) Frequency distribution: run a histogram on the variable MEDV. Select bar charts. Drag the variable MEDV to the X-axis. When you are ready to view it, press OK. Interpret the distribution.

    3) Density plot: plot Y=MEDV as a density plot. This is located under Scatter/Dot, but select the plot with the vertical columns of circles. This is very similar to the histogram, but with finer divisions for the bins. Interpret this plot: what do you see that is not obvious in the histogram?
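    Both the histogram and the density plot come down to counting how many observations fall into each bin; the density plot simply uses narrower bins. The Python sketch below illustrates that idea under stated assumptions: the function name bin_counts and the data values are invented, and this is not how SPSS itself draws the charts.

```python
def bin_counts(values, low, high, n_bins):
    """Count how many values fall into each of n_bins equal-width bins on [low, high]."""
    width = (high - low) / n_bins
    counts = [0] * n_bins
    for v in values:
        if low <= v <= high:
            # A value exactly on the top edge is placed in the last bin.
            index = min(int((v - low) / width), n_bins - 1)
            counts[index] += 1
    return counts


# Invented values for illustration only; not taken from the Boston dataset.
values = [12.0, 15.5, 18.2, 19.9, 21.0, 21.7, 22.4, 23.1, 24.8, 27.5, 33.0, 50.0]

coarse = bin_counts(values, 10.0, 50.0, 4)  # wide bins, as in a histogram
fine = bin_counts(values, 10.0, 50.0, 8)    # narrower bins, finer divisions

print(coarse)
print(fine)
```

    Comparing the two lists shows how finer bins can expose gaps or isolated values that a coarse histogram hides.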

    3.3 Descriptive Statistics

    Another way to get a feel for the data is to run some basic descriptive statistics. SPSS has these organized by type under the Analyze menu. Select Descriptive Statistics from this menu and select the one that you want to use.

    4) Present the descriptive statistics for the variable MEDV.
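    For reference, the usual descriptive statistics can be reproduced by hand. This is a minimal Python sketch using the standard library; the values are invented and do not come from the Boston dataset, and the selection of statistics is an assumption about what a typical summary contains.

```python
import statistics

# Invented values for illustration; not taken from the Boston dataset.
values = [12.0, 15.5, 18.2, 19.9, 21.0, 21.7, 22.4, 23.1, 24.8, 27.5, 33.0, 50.0]

summary = {
    "n": len(values),
    "mean": statistics.mean(values),
    "median": statistics.median(values),
    "std_dev": statistics.stdev(values),      # sample standard deviation (n - 1)
    "variance": statistics.variance(values),  # sample variance (n - 1)
    "minimum": min(values),
    "maximum": max(values),
}

for name, value in summary.items():
    print(f"{name}: {value}")
```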

    3.4 Q-Q Plot: visualizing if the data come from a known (e.g. Normal) distribution.

    You can examine a histogram and get a rough sense of whether the data come from a normal distribution or not. A more formal, theoretically sound way of doing this is to use a Q-Q plot.

    The Q-Q (Quantile-Quantile) plot allows you to see visually whether the data appear to come from a normal distribution. It does this by systematically comparing how the data deviate from what you would expect if they were normally distributed. The technique plots the order statistics of the observed data (e.g. each of the percentiles) against the quantiles (think of them as order statistics) of the normal distribution. Where do the parameters (mean and standard deviation) of this normal distribution come from? They are estimated from the same sample of data.
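    The construction just described can be sketched in a few lines of Python. This is an illustration, not SPSS's implementation: the function name normal_qq_points, the data values, and the plotting position (i - 0.5) / n are all assumptions, and other plotting-position formulas exist.

```python
from statistics import NormalDist, mean, stdev


def normal_qq_points(values):
    """Pair each ordered observation with the matching quantile of a fitted normal.

    The normal distribution's mean and standard deviation are estimated
    from the same sample.
    """
    ordered = sorted(values)
    n = len(ordered)
    fitted = NormalDist(mu=mean(ordered), sigma=stdev(ordered))
    points = []
    for i, observed in enumerate(ordered, start=1):
        # Plotting position (i - 0.5) / n is an assumed convention.
        probability = (i - 0.5) / n
        points.append((fitted.inv_cdf(probability), observed))
    return points


# Invented values for illustration only.
values = [12.0, 15.5, 18.2, 19.9, 21.0, 21.7, 22.4, 23.1, 24.8, 27.5, 33.0, 50.0]

for expected, observed in normal_qq_points(values):
    print(f"expected {expected:7.2f}   observed {observed:7.2f}")
```

    If the data were close to normal, the printed pairs would lie near the line expected = observed; large gaps in the tails suggest skewness or outliers.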

    Another version of this is the P-P plot; it plays the same role but plots the probabilities instead of the quantiles. It is also possible to check the data against distributions other than the normal.

    The Q-Q plot is not one of the graphics directly available in SPSS's Chart Builder. To run it, go to Analyze -> Descriptive Statistics -> Q-Q Plots. You should see the following:

    5) Run the Q-Q plot on MEDV first. Later select AGE. What are your conclusions about the distribution of these two variables?

    Harder: Now select CRIM and run a Q-Q plot. What do you see? Can you think of a way to transform and rerun the Q-Q plot? Is there a way to do this in one step?
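    One candidate transformation for a strongly right-skewed variable is the natural log, which section 3.2 also suggests for the crime rate. The Python sketch below shows its effect on invented values (these are not the CRIM values from the dataset); the helper value_range is a hypothetical name.

```python
import math

# Invented, right-skewed values standing in for a crime-rate variable;
# these are NOT taken from the Boston dataset.
crime_rate = [0.01, 0.05, 0.2, 0.9, 3.5, 14.0, 88.0]

# Natural log of each value; every value must be strictly positive.
log_crime_rate = [math.log(x) for x in crime_rate]


def value_range(values):
    """Spread between the largest and smallest value."""
    return max(values) - min(values)


print(value_range(crime_rate))      # raw scale: dominated by the largest value
print(value_range(log_crime_rate))  # log scale: values are spread more evenly
```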

    3.5 One Way and Two Way Chi-Squared: Relationship between Nominal Variables (Lecture 6)

    Command 1:

    a) One Way Chi-Square Test:
        a. Analyze -> Nonparametric Tests -> Chi-Square


    This is useful if you have a nominal variable and you want to test hypotheses about how this data is distributed.

    Exercise: Download the file lecture6_chisq1 (Canopy of Care Charity) and run the one-way Chi-Square test on pressure. Interpret the results: what are your conclusions?
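    To see what the one-way Chi-Square test computes, here is a minimal Python sketch. The counts are invented (they are not from the Canopy of Care file), the null hypothesis of equally likely categories is an assumption, and the closed-form p-value used here is valid only for exactly 2 degrees of freedom.

```python
import math


def chi_square_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Invented counts for a three-category nominal variable (illustration only).
observed = [30, 50, 40]

# Null hypothesis assumed here: all categories are equally likely.
total = sum(observed)
expected = [total / len(observed)] * len(observed)

statistic = chi_square_statistic(observed, expected)
degrees_of_freedom = len(observed) - 1

# With exactly 2 degrees of freedom the chi-square upper-tail probability
# has the closed form exp(-x / 2); other cases need a table or statistics software.
p_value = math.exp(-statistic / 2)

print(statistic, degrees_of_freedom, p_value)
```

    A p-value below the chosen significance level would lead you to reject the hypothesis that the categories occur in the expected proportions.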

    3.6 Lecture 6 (Hypothesis Testing) - Both Means and Proportions

    3.6.1: Test against a theory

    We now move on to hypothesis testing for ratio (continuous, or not nominal) variables, for example income, sales, height, weight, distance from an object, annual precipitation etc. We may have some prior belief about what the value of such a variable should be. For example, we might expect the salary of a brand manager at a Fortune 500 firm to be $100,000 per annum.
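    Testing a sample mean against a prior belief of this kind is a one-sample t-test. The Python sketch below shows the arithmetic under stated assumptions: the salary figures are invented, the function names are hypothetical, and the p-value is obtained by numerically integrating the t density rather than by any SPSS routine.

```python
import math
from statistics import mean, stdev


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    # math.gamma overflows for very large df; fine for small samples like this one.
    coefficient = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coefficient * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p_value(t, df, steps=2000):
    """Two-tailed p-value, integrating the t density from 0 to |t| with Simpson's rule."""
    upper = abs(t)
    if upper == 0:
        return 1.0
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = total * h / 3  # P(0 < T < |t|)
    return max(0.0, 1 - 2 * central_area)


def one_sample_t_test(values, hypothesized_mean):
    """Return (t statistic, degrees of freedom, two-tailed p-value)."""
    n = len(values)
    standard_error = stdev(values) / math.sqrt(n)
    t = (mean(values) - hypothesized_mean) / standard_error
    return t, n - 1, two_sided_p_value(t, n - 1)


# Invented salaries in thousands of dollars (illustration only).
salaries = [96, 104, 99, 110, 92, 101, 98, 105]
t, df, p = one_sample_t_test(salaries, hypothesized_mean=100)
print(f"t = {t:.3f}, df = {df}, two-tailed p = {p:.3f}")
```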

    Command:

    a) One Sample t-test (Test of a single mean and proportion)
        a. Analyze -> Compare Means -> One Sample t-test

    Exercises:

    1. Open the Lecture6-QualityMotors dataset. Quality Motors is an automobile dealership that regularly advertises in its local market area. It has claimed that a certain make and model of car averages 30 miles to a gallon of fuel, and mentions that this figure may vary with driving conditions. A local consumer group wishes to verify the advertising claim. To do so, it selects a sample of recent purchasers of this make and model of automobile. It asks them to drive their cars until two tanks of gasoline are used up and to record the mileage. The group then calculates and records the miles per gallon for each car. The dataset contains the results.

    a. Formulate a statistical hypothesis to test the consumer group's purpose.
    b. Calculate the mean miles per gallon. Compute the sample variance and sample standard deviation.
    c. According to your hypothesis, construct the appropriate statistical test using a .05 significance level.
    d. What is the p value? At what level of confidence do you reject the null hypothesis?

    2. Open the Lecture6_tvshare dataset. A TV station is trying to determine their market share for a talk show program so that they can accurately price commercials for their station. If their market share is the same as the past level of 8%, they will leave prices unchanged. If, however, market share is greater, they will increase prices accordingly.

    a. Can you formulate a hypothesis to help the TV station decide whether to modify their price? Can you test this hypothesis in SPSS?
    b. What will be your conclusion based on this test? First state the hypothesis. Remember the difference between one- and two-tailed tests. Which one is this?
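    A test of a single proportion against a benchmark such as 8% can also be sketched by hand. The Python example below uses a normal-approximation z-test, which is a swapped-in technique and not the SPSS One Sample t-test procedure; the sample counts are invented, and the one-tailed alternative (share greater than the benchmark) is an assumption about how the hypothesis is framed.

```python
import math
from statistics import NormalDist


def one_sample_proportion_z_test(successes, n, hypothesized_share):
    """One-tailed z-test of H0: share = hypothesized_share against H1: share > hypothesized_share."""
    sample_share = successes / n
    standard_error = math.sqrt(hypothesized_share * (1 - hypothesized_share) / n)
    z = (sample_share - hypothesized_share) / standard_error
    p_value = 1 - NormalDist().cdf(z)  # upper tail only
    return z, p_value


# Invented sample: 50 of 500 surveyed households watch the show (illustration only).
z, p_value = one_sample_proportion_z_test(successes=50, n=500, hypothesized_share=0.08)
print(f"z = {z:.3f}, one-tailed p = {p_value:.3f}")
```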

    3.6.2: Test through comparison of two means

    We now examine the question of whether two samples have the same mean or proportion.
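    The two standard ways of comparing means are the independent-samples t-test (two separate groups) and the paired-samples t-test (two measurements on the same units). The Python sketch below computes only the t statistic and degrees of freedom for each; the function names and all data values are invented, and the independent-samples version assumes equal variances in the two groups. The p-value would then come from a t table or from statistical software.

```python
import math
from statistics import mean, variance


def independent_samples_t(group_a, group_b):
    """Pooled-variance t statistic and degrees of freedom (assumes equal variances)."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_variance = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    standard_error = math.sqrt(pooled_variance * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / standard_error, n_a + n_b - 2


def paired_samples_t(first, second):
    """t statistic and degrees of freedom for paired measurements on the same units."""
    differences = [a - b for a, b in zip(first, second)]
    n = len(differences)
    standard_error = math.sqrt(variance(differences) / n)
    return mean(differences) / standard_error, n - 1


# Invented data for illustration only.
hours_group_1 = [14, 9, 12, 15, 10]     # e.g. weekly hours for one group
hours_group_2 = [6, 8, 5, 9, 7]         # e.g. weekly hours for another group
rating_first = [7, 6, 5, 6, 4]          # first rating given by each respondent
rating_second = [6, 4, 4, 5, 3]         # second rating by the same respondents

t_ind, df_ind = independent_samples_t(hours_group_1, hours_group_2)
t_pair, df_pair = paired_samples_t(rating_first, rating_second)
print(f"independent: t = {t_ind:.3f}, df = {df_ind}")
print(f"paired:      t = {t_pair:.3f}, df = {df_pair}")
```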

    Command 1:

    b) Differences in means (Test of differences in means and proportions)
        a. Analyze -> Compare Means -> Independent Samples t-test

    The assumptions for the two-sample means test are that the samples:
    a. are independent;
    b. have equal variances;
    c. are both drawn from a Normal distribution.

    Command 2:

    c) Differences in means (Test of differences in means and proportions)
        a. Analyze -> Compare Means -> Paired Samples t-test

    Exercises:

    1. Open the Lecture6_internet usage dataset. Data from 30 respondents were collected to examine Internet usage behavior. The data include sex (1 = male, 2 = female), familiarity with the Internet (1 = very unfamiliar, 7 = very familiar), Internet usage in hours per week, attitude toward the Internet and attitude toward technology, both measured on a 7-point scale (1 = very unfavorable, 7 = very favorable), and whether the respondents have done shopping or banking on the Internet (1 = Yes, 2 = No).

    a. Suppose we want to determine whether Internet usage was different for males as compared to females. What analysis should be conducted?

    b. Suppose we are interested in determining whether the respondents differed in their attitude toward the Internet and attitude toward technology. What analysis should be conducted?

    2. Open the Lecture6-ad campaign dataset. A new and an old ad campaign were shown to two independent groups. Each group consisted of 64 participants. Within the group who watched the old campaign, 25 indicated they would like to buy the product based on the campaign. For the group who watched the new campaign, 32 indicated they would buy the product. Is the response to the two ad campaigns significantly different from each other? Can you formulate the hypothesis and test it in SPSS?
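    As a cross-check on whatever SPSS reports, the comparison of two proportions can be sketched by hand. The Python example below uses a pooled two-proportion z-test with the counts stated in the exercise (25 of 64 and 32 of 64); the choice of a z-test and a two-tailed alternative is an assumption, not the procedure prescribed by the tutorial, and the function name is hypothetical.

```python
import math
from statistics import NormalDist


def two_proportion_z_test(successes_a, n_a, successes_b, n_b):
    """Two-tailed z-test of H0: both groups share the same underlying proportion."""
    share_a = successes_a / n_a
    share_b = successes_b / n_b
    pooled = (successes_a + successes_b) / (n_a + n_b)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (share_b - share_a) / standard_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Counts stated in the exercise: 25 of 64 (old campaign), 32 of 64 (new campaign).
z, p_value = two_proportion_z_test(25, 64, 32, 64)
print(f"z = {z:.3f}, two-tailed p = {p_value:.3f}")
```

    Compare the printed p-value with your chosen significance level before drawing a conclusion.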