

REYEM AFFAIR

REGRESSION CASE
QUANTITATIVE METHODS II

TO
PROF. ARNAB BASU

ON
OCTOBER 21, 2011

BY
GROUP NO. 5

AKSHAY RAM (1111004)
ARUN PRABU (1111010)
BHARTI VISHAL (1111016)
DHANASHREE VINAYAK SHIRODKAR (1111022)
GHULE NILESH VISHNU (1111028)
AMOL DEVNATH KUMBHARE (1111034)
MUDAVATH SWETHA (1111040)
SUPREET KUMAR (1111046)
RAJA SIMON J (1111052)
SAGAR BEHERA (1111058)
SHREYA SETHI (1111065)
SWATI MURARKA (1111071)
AJUSAL SUGATHAN (1111077)

INDIAN INSTITUTE OF MANAGEMENT, BANGALORE


Table of Contents

1. Executive Summary
2. Understanding of the Problem
3. Model Description
   Model 1
      Prediction Interval vs Confidence Interval
      Stepwise Regression: A Closer Look
      Test of Model: Analysis of Results
   Model 2
      Test of Model: Analysis of Results
   Other Models
4. Conclusions and Recommendations
5. Appendix
   1. Variables Entered/Removed
   2. Model Summary
   3. ANOVA
   4. Coefficients
   5. Residual Statistics


Executive Summary

Reyem Affiar has recently found the condominium described below in Mid-Cambridge that he wants to purchase.

Street Address : 236 Ellery Street
Last Price : $169,000
Area & Area Code : M/9
Bed : 2
Bath : 1
Rooms : 5
Interior : 1040
Condo : $175
Tax : $1121
RC : 1 (restrictions on the monthly rent that the owner may charge)

Even though Affiar can afford the asking price of $169,000, negotiation by the buyer’s agent generally keeps the selling price below the last asking price. Given the above information and the data that Affiar has on condominiums sold in Cambridge over the past five years, we need to help him decide on a fair offer price.

Solution Approach

An estimate of the selling price of the above condominium is needed, so selling price is the dependent variable ‘Y’ of the regression model. First date, close date and the number of days between the two (Days) cannot be part of the independent variable set, since this information is not yet available for the 236 Ellery Street condominium (the sale has not taken place). Further, the condominium of interest lies in area M (9), so one could analyze only the data on the 111 condominiums from the same area and ignore the rest. On the other hand, if we set up dummy variables for the area/area codes, these can be incorporated into the regression model, giving a bigger sample of 456 data points on which to base a more accurate prediction for Affiar. This is explained in detail in the model description. Stepwise regression in SPSS has been adopted for variable selection. This method combines forward selection and backward elimination, which helps guard against the errors that multi-collinearity can introduce into a regression model.
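The dummy-variable idea can be sketched in a few lines of Python. This is a minimal illustration rather than the SPSS setup used for the case: it assumes the area codes run from 1 to 16 and that code 9 (Mid-Cambridge) is the baseline with no dummy of its own, which matches the A1 to A16 variable list (without A9) used in the model description. The function name `area_dummies` is hypothetical.

```python
# Assumption: area codes are the integers 1..16, with code 9 (Mid-Cambridge)
# acting as the baseline category that gets no dummy variable.
AREA_CODES = range(1, 17)
BASELINE = 9


def area_dummies(area_code):
    """Return the 15 dummy variables A1..A16 (without A9) for one condominium."""
    if area_code not in AREA_CODES:
        raise ValueError(f"unknown area code: {area_code}")
    return {f"A{c}": int(c == area_code) for c in AREA_CODES if c != BASELINE}


ellery = area_dummies(9)     # Mid-Cambridge: every dummy is 0
avon_hill = area_dummies(2)  # Avon Hill: only A2 is 1
print(len(ellery), sum(ellery.values()), avon_hill["A2"])
```

With this coding, each area coefficient measures the price difference relative to a Mid-Cambridge condominium with otherwise identical characteristics.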

Figure 11.45 from Pg 571

Understanding of the Problem

Selection of independent variables is the key to arriving at a good regression model. On a first look at the given data, the independent variables that may affect the selling price are first price, last price, number of days between first and last date, location (Area), number of bedrooms, number of bathrooms, number of rooms, interior space, condominium fees, yearly property tax and rent control. We have assumed that the given asking price of $169,000 for the Ellery Street condominium is the last price, since the transaction could possibly happen on the next day (May 4, 1994). This means we do not have information on the first price for the Ellery Street condominium, so we remove first price from the list of possible independent variables. As stated in the Solution Approach, we cannot use the number of days between first and last date as an independent variable either, since the sale has not happened and we do not know the first date on which the condominium was put on sale. Finally, we can intuitively expect a positive correlation between interior space and the number of rooms, bathrooms and bedrooms. Since interior space is representative of all three, it can act as a proxy for them in the regression model and so avoid the issue of multi-collinearity. We also show this through the output in the model description section. Further, one can expect last price and interior space to have positive coefficients, while condominium fees, property taxes and RC would be expected to have negative coefficients. The effect of the dummy variables for area/area codes needs to be explored by running the regression model.

Model Description

Model 1

The model (Appendix) can be described as follows (Exhibit 1):

Sale Price = 0.333*Last Price + 35.947*Tax + 44.967*Interior + 105.108*Condo + 10992.327*RC + 12290.704*A2 + 29804.817*A5 – 27984.595*A12 – 12447.291*A16 – 15967.736

where A2, A5, A12 and A16 are the dummy variables for the areas Avon Hill, East Cambridge, Porter Square and West Cambridge respectively. Each takes the value 1 if the condominium being priced lies in that area and 0 otherwise. For the 236 Ellery Street condominium, which lies in Mid-Cambridge so that all four dummies are 0, we have

Sale Price = 0.333*169000 + 35.947*1121 + 44.967*1040 + 105.108*175 + 10992.327*1 – 15967.736 = 156757.758
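The substitution above can be checked with a short Python sketch. The coefficients are copied from the Model 1 equation; the dictionary layout and the function name `predict_sale_price` are illustrative choices, not part of the SPSS output.

```python
# Model 1 coefficients as reported in the regression equation
COEF = {
    "LastPrice": 0.333,
    "Tax": 35.947,
    "Interior": 44.967,
    "Condo": 105.108,
    "RC": 10992.327,
    "A2": 12290.704,
    "A5": 29804.817,
    "A12": -27984.595,
    "A16": -12447.291,
}
INTERCEPT = -15967.736


def predict_sale_price(features):
    """Evaluate the Model 1 equation; features missing from the dict count as 0."""
    return INTERCEPT + sum(COEF[name] * features.get(name, 0) for name in COEF)


# 236 Ellery Street: Mid-Cambridge, so all area dummies are left at 0
ellery = {"LastPrice": 169000, "Tax": 1121, "Interior": 1040, "Condo": 175, "RC": 1}
model1_estimate = predict_sale_price(ellery)
print(round(model1_estimate, 3))
```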

A 95% prediction interval for the selling price of the 236 Ellery Street condominium is given by:

\[
\begin{aligned}
& 156757.758 \pm t_{0.025,\,(456-10)} \left(30268.70125^{2} + 9.162 \times 10^{8}\right)^{0.5} \\
={}& 156757.758 \pm 1.9653 \times \left(30268.70125^{2} + 9.162 \times 10^{8}\right)^{0.5} \\
={}& 156757.758 \pm 84127.57 \\
={}& \{72630.188,\ 240885.328\}
\end{aligned}
\]

The standard error and MSE are taken from the regression output table (Appendix).
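The interval arithmetic can be reproduced with a short Python sketch. The first margin follows the report's expression exactly. For comparison only, the second margin uses the common textbook form sqrt(MSE + SE(ŷ)²); the report does not list the standard error of the prediction at this particular condominium, so the mean value 4021 from the Residuals Statistics table is used as a stand-in, which is an assumption.

```python
import math

y_hat = 156757.758   # Model 1 point estimate
t_crit = 1.9653      # t[0.025, 446] as used in the report
s = 30268.70125      # standard error of the estimate (Model Summary)
mse = 9.162e8        # residual mean square (ANOVA), approximately s squared
se_mean = 4021.0     # assumption: mean standard error of predicted value

# Margin as computed in the report: s^2 and MSE are both added
report_margin = t_crit * math.sqrt(s ** 2 + mse)
report_interval = (y_hat - report_margin, y_hat + report_margin)

# Textbook form for a new observation, shown for comparison only
textbook_margin = t_crit * math.sqrt(mse + se_mean ** 2)
textbook_interval = (y_hat - textbook_margin, y_hat + textbook_margin)

print(round(report_margin, 2), [round(v, 3) for v in report_interval])
print(round(textbook_margin, 2), [round(v, 3) for v in textbook_interval])
```

Because s² and the MSE are two names for nearly the same number, the report's margin is wider than the textbook one; the recommendations in the report are based on the wider interval.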

Now, a 95% confidence interval for the selling price (conditional mean) of the 236 Ellery Street condominium is given by:

\[
\begin{aligned}
& 156757.758 \pm t_{0.025,\,(456-10)} \,(4021) \\
={}& 156757.758 \pm 1.9653 \times 4021 \\
={}& 156757.758 \pm 7902.471 \\
={}& \{148855.29,\ 164660.23\}
\end{aligned}
\]

The standard error of the mean predicted value is taken from the Residual Statistics table (Appendix).

Exhibit 1: Regression Model Coefficients

Prediction Interval vs Confidence Interval

We have calculated the prediction interval and the confidence interval for E(Sale Price) for the Ellery Street condominium at the given values of the independent variables (see the Executive Summary). While the predicted value and the estimate of the mean value of Y (Sale Price here) are equal, the prediction interval is wider than the confidence interval for E(Y) at the same confidence level: there is more uncertainty about an individual predicted value than about the average value of Y given the values of Xi. Based on the confidence interval, the recommendation for Affiar is not to bid more than the upper limit of $164,660, since he can be 97.5% confident (100% – 5%/2) that the mean final selling price of the condominium would be below this number. So $164,660 is the maximum that he should bid on the condominium. If he were to be more conservative in his bid, he can go by the prediction interval. Since the upper limit of the prediction interval, $240,885, is greater than the asking price of $169,000, his bid should be $169,000 in this case. The maximum he can afford to bid for the house at a 95% confidence level would be $240,885.
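For reference, the two intervals can be written in their general textbook form, which makes the difference in width explicit. The notation is an assumption introduced here: \(\hat{y}_0\) is the fitted value at the condominium's characteristics \(x_0\), \(s^2\) is the residual mean square (MSE), \(SE(\hat{y}_0)\) is the standard error of the fitted mean at \(x_0\), and \(n - k - 1 = 456 - 10\) for Model 1.

```latex
\[
\text{Confidence interval for } E(Y \mid x_0): \qquad
\hat{y}_0 \;\pm\; t_{\alpha/2,\; n-k-1}\; SE(\hat{y}_0)
\]
\[
\text{Prediction interval for a new } Y \text{ at } x_0: \qquad
\hat{y}_0 \;\pm\; t_{\alpha/2,\; n-k-1}\; \sqrt{\,s^{2} + SE(\hat{y}_0)^{2}\,}
\]
```

The extra \(s^2\) term under the square root is the variance of an individual sale price around its mean, so the prediction interval is always the wider of the two.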


Step-wise regression: A closer look

Given the possible set of 23 independent variables (Last Price, Bed, Bath, Rooms, Interior, Condo, Tax, RC, A1, A2, A3, A4, A5, A6, A7, A8, A10, A11, A12, A13, A14, A15, A16), the algorithm starts by finding the most significant single-variable regression model. Last Price, which has the highest F-value and a p-value < pin, enters the model first (note pin = 0.05). The other 22 variables left out of the model are then checked via a partial F-test, and the most significant of them, Tax, is added. The original variable Last Price is now re-evaluated to see whether it still meets the preset retention standard of p-value < pout (note pout = 0.10). Since it meets this criterion, it is retained in the model. The remaining 21 variables outside the model are again checked via a partial F-test, and the most significant, now Interior, enters the model. The variables already in the model, namely Last Price and Tax, are checked again to confirm that they remain significant. The procedure continues until no variable outside the model qualifies to enter and no variable inside it qualifies to leave. For Model 1 this happens on the 9th iteration, as shown in the Appendix. To illustrate how step-wise regression deals with multi-collinearity, a regression of Interior on Rooms was run, and the two variables were found to be highly correlated (Appendix). The step-wise procedure kept the more significant variable, “Interior”, in the final regression model and left out the less significant, highly correlated “Rooms”.
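The enter/re-check loop described above can be sketched in plain Python. This is an illustration under stated assumptions, not the SPSS procedure: it uses fixed partial-F thresholds (3.84 to enter, 2.71 to remove, roughly the large-sample F critical values for 0.05 and 0.10) in place of the p-value criteria pin and pout, and it runs on small made-up data. The function names and the toy variables `x1`, `x2`, `x3` are hypothetical.

```python
def ols_sse(columns, y):
    """Residual sum of squares from regressing y on an intercept plus columns."""
    n = len(y)
    X = [[1.0] * n] + [list(map(float, c)) for c in columns]
    k = len(X)
    # Augmented normal equations [X'X | X'y]
    M = [[sum(a * b for a, b in zip(X[i], X[j])) for j in range(k)]
         + [sum(a * b for a, b in zip(X[i], y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting
    for i in range(k):
        p = max(range(i, k), key=lambda r: abs(M[r][i]))
        M[i], M[p] = M[p], M[i]
        for r in range(k):
            if r != i:
                f = M[r][i] / M[i][i]
                M[r] = [a - f * b for a, b in zip(M[r], M[i])]
    beta = [M[i][k] / M[i][i] for i in range(k)]
    fitted = [sum(b * X[j][r] for j, b in enumerate(beta)) for r in range(n)]
    return sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))


def partial_f(sse_without, sse_with, df_with):
    """Partial F statistic for the one variable that separates the two models."""
    return (sse_without - sse_with) / (sse_with / df_with)


def stepwise(data, y, f_enter=3.84, f_remove=2.71):
    """Forward selection with a backward re-check after every entry."""
    selected, n = [], len(y)
    while True:
        sse_now = ols_sse([data[v] for v in selected], y)
        best, best_f = None, f_enter
        for v in data:  # forward step: strongest candidate above f_enter
            if v in selected:
                continue
            sse_try = ols_sse([data[u] for u in selected + [v]], y)
            f = partial_f(sse_now, sse_try, n - len(selected) - 2)
            if f > best_f:
                best, best_f = v, f
        if best is None:
            return selected
        selected.append(best)
        for v in selected[:-1]:  # backward step: re-test earlier entries
            sse_all = ols_sse([data[u] for u in selected], y)
            sse_drop = ols_sse([data[u] for u in selected if u != v], y)
            if partial_f(sse_drop, sse_all, n - len(selected) - 1) < f_remove:
                selected.remove(v)


# Made-up data: y depends on x1 and x2 only; x3 is unrelated by construction
data = {
    "x1": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12],
    "x2": [1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0],
    "x3": [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8],
}
noise = [0.4, -0.3, 0.2, -0.5, 0.1, 0.3, -0.2, 0.5, -0.4, 0.2, -0.1, -0.2]
y = [10 * a + 50 * b + e for a, b, e in zip(data["x1"], data["x2"], noise)]
print(stepwise(data, y))
```

Keeping the removal threshold looser than the entry threshold, as in the report's pin = 0.05 and pout = 0.10, prevents a variable from being added and dropped in an endless cycle.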

Let us check through residual analysis whether the model’s regression assumptions are satisfied:

The histogram of standardized residuals shown in the figure below suggests that the normality assumption is satisfied, since the residuals appear to be normally distributed. The normal P-P plot confirms the same. Homoscedasticity can be seen from the residual scatter plot, where the residuals are scattered randomly around the mean of 0 with no observable pattern. Finally, the requirement that the independent variables not be strongly correlated with one another is addressed by the step-wise regression technique, which re-tests the variables in the model after each stage (as shown in Figure 1) with pin = 0.05 and pout = 0.10. The algorithm therefore tends to leave out variables that are highly correlated with those already in the model and keeps only the most significant independent variables; the VIF values in the Coefficients table (Appendix) are all below 4.1. The individual plots of the residuals against each independent variable are shown in the Appendix.


Test of Model: Analysis of Results

Significance of model: The ANOVA table in the Appendix shows that the F-value for Model 1 is 395.860 with a p-value of .000. Since the p-value < 0.05, we reject the null hypothesis (β1 = … = β9 = 0), so at least one βi is significant. We then look at the coefficients table to check that the individual coefficients are significantly different from zero. As the coefficients table for Model 1 shows, the p-values for the coefficients are all less than 0.05 (the alpha value). Hence we reject the null hypothesis βi = 0 for each βi, and the coefficients are significant. Finally, we look at the Adjusted R² (which corrects for the increase in R² caused by adding independent variables) as a goodness-of-fit measure. The high Adjusted R² of 0.886 in this case (Appendix) suggests that 88.6% of the variation in Sale Price is explained by the regression model.
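The F-value and Adjusted R² quoted for Model 1 can be recomputed from the sums of squares in the final step of the ANOVA table (Appendix). The sketch below assumes the rounded values printed there, so the results agree with the SPSS figures only to about three significant digits.

```python
n, k = 456, 9            # observations and predictors in the final Model 1

# Final-step sums of squares as printed in the ANOVA table (rounded)
ss_regression = 3.264e12
ss_residual = 4.086e11
ss_total = 3.673e12

msr = ss_regression / k            # mean square for regression
mse = ss_residual / (n - k - 1)    # mean square error, 446 degrees of freedom
f_stat = msr / mse

r2 = 1 - ss_residual / ss_total
# Adjusted R^2 penalizes R^2 for the number of predictors used
adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k - 1)

print(round(f_stat, 1), round(r2, 3), round(adj_r2, 4))
```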


Model 2:

In Model 1, we accounted for the areas/area codes of the condominiums by starting the step-wise regression with the 15 dummy variables. One could well argue that condominiums outside Mid-Cambridge should not be considered in the analysis. Hence step-wise regression was also run with only the 111 data points from Mid-Cambridge condominiums. The input independent variables were Last Price, Bed, Bath, Rooms, Interior, Condo, Tax and RC, and the regression was carried out with pin = 0.05 and pout = 0.10. As the Appendix shows, Last Price and RC were the only independent variables with a significant impact on Selling Price (based on the step-wise partial F-test). The model can be summarized as below:

Selling Price = 0.96 * Last Price + 1935.903 * RC – 2181.178

For the Ellery Street condominium, we have:

Selling Price = 0.96 * 169000 + 1935.903 * 1 – 2181.178

= $161,994.725

As for Model 1, a 95% prediction interval for the selling price of the 236 Ellery Street condominium is given by:

\[
\begin{aligned}
& 161994.725 \pm t_{0.025,\,(111-3)} \left(4422.945^{2} + 1.956 \times 10^{7}\right)^{0.5} \\
={}& 161994.725 \pm 1.98217 \times \left(4422.945^{2} + 1.956 \times 10^{7}\right)^{0.5} \\
={}& 161994.725 \pm 12398.064 \\
={}& \{149596.661,\ 174392.789\}
\end{aligned}
\]

The standard error and MSE are taken from the regression output table (Appendix).

Now, a 95% confidence interval for the selling price (conditional mean) of the 236 Ellery Street condominium is given by:

\[
\begin{aligned}
& 161994.725 \pm t_{0.025,\,(111-3)} \,(698.994) \\
={}& 161994.725 \pm 1.98217 \times 698.994 \\
={}& 161994.725 \pm 1385.525 \\
={}& \{160609.2,\ 163380.25\}
\end{aligned}
\]

The standard error of the mean predicted value is taken from the Residual Statistics table (Appendix).

As explained for Model 1, there is more uncertainty about an individual predicted value than about the average value of Y given the values of Xi. Based on the confidence interval, the recommendation for Affiar is not to bid more than the upper limit of $163,380, since he can be 97.5% confident (100% – 5%/2) that the mean final selling price of the condominium would be below this number. So $163,380 is the maximum that he should bid on the condominium. If he were to be more conservative in his bid, he can go by the prediction interval. Since the upper limit of the prediction interval, $174,393, is greater than the asking price of $169,000, his bid should be $169,000 in this case. The maximum he can afford to bid for the house at a 95% confidence level would be $174,393.
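The Model 2 estimate, its confidence interval and the resulting bid can be put together in a short sketch. The coefficients, t-value and standard error are those quoted in this section; the helper `interval` and the rule of capping the bid at the asking price are illustrative assumptions that mirror the reasoning above.

```python
def interval(center, t_crit, std_err):
    """Symmetric interval: center plus or minus t_crit * std_err."""
    margin = t_crit * std_err
    return center - margin, center + margin


asking_price = 169_000
rc = 1

# Model 2: Selling Price = 0.96 * Last Price + 1935.903 * RC - 2181.178
estimate = 0.96 * asking_price + 1935.903 * rc - 2181.178

# 95% confidence interval for the mean selling price (t with 108 df)
ci_low, ci_high = interval(estimate, 1.98217, 698.994)

# Assumed decision rule: bid up to the upper confidence limit,
# but never above the seller's asking price
recommended_bid = min(ci_high, asking_price)

print(round(estimate, 3), round(ci_low, 2), round(ci_high, 2), round(recommended_bid))
```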

Coefficients (a)

B and Std. Error are the unstandardized coefficients; Beta is the standardized coefficient.

Model  Variable     B           Std. Error   Beta    t         Sig.
1      (Constant)   -544.824    1357.461             -.401     .689
       LastPrice    .958        .008         .996    123.128   .000
2      (Constant)   -2181.178   1541.383             -1.415    .160
       LastPrice    .960        .008         .998    124.529   .000
       RC           1935.903    909.479      .017    2.129     .036

a. Dependent Variable: SalePrice

Let us check through residual analysis whether the model’s regression assumptions are satisfied:

The histogram of standardized residuals shown in the figure below suggests that the normality assumption is satisfied, since the residuals appear to be normally distributed. The normal P-P plot confirms the same. Homoscedasticity can be seen from the residual scatter plot, where the residuals are scattered randomly around the mean of 0 with no observable pattern. Finally, the requirement that the independent variables not be strongly correlated with one another is addressed by the step-wise regression technique, which re-tests the variables in the model after each stage (as shown in Figure 1) with pin = 0.05 and pout = 0.10. The algorithm therefore tends to leave out variables that are highly correlated with those already in the model and keeps only the most significant independent variables. The individual plots of the residuals against each independent variable are shown in the Appendix.

The step-wise regression method works the same way as explained for Model 1. Here only 2 iterations were required to arrive at the final model, as shown in the Appendix.

Test of Model: Analysis of Results

Significance of model: The ANOVA table in the Appendix shows that the F-value for Model 2 is 7828 with a reported p-value of 0. Since the p-value < 0.05, we reject the null hypothesis (β1 = β2 = 0), so at least one βi is significant. We then look at the coefficients table to check that the individual coefficients are significantly different from zero. As the coefficients table for Model 2 shows, the p-values for the coefficients of Last Price and RC are less than 0.05 (the alpha value). Hence we reject the null hypothesis βi = 0 for each of them, and the coefficients are significant. Finally, we look at the Adjusted R² (which corrects for the increase in R² caused by adding independent variables) as a goodness-of-fit measure. The high Adjusted R² of 0.993 in this case (Appendix) suggests that 99.3% of the variation in Sale Price is explained by the regression model.


Other Models:

In addition to the above 2 best-fit models, a number of other regression models with different combinations of input independent variables were tried. For instance, areas were grouped by location (with the help of the map provided) to form a smaller number of dummy variables (e.g., grouping Agassiz, Harvard Square and Radcliffe). Several such groupings were tried to see how area is best fitted into the model. ‘Rooms’ was also tried as a proxy for Interior (because of their high correlation, as seen in the Appendix). Each model was assessed on its R² value, the significance of its coefficients and its residual plots, and the best 2 models have been presented in the case solution. In each model, the given price for the Ellery Street condominium has been taken as the Last Price, as stated before.

Conclusions and Recommendations

Two regression models were presented to fit the given data in order to predict the sale price of the 236 Ellery Street condominium. The offer prices that Affiar should consider under the two models are summarized in the table below:

          Mean Selling   Prediction Interval ($)     Confidence Interval ($)   Recommended       Max. Conservative
          Price ($)                                                            bid price ($)     bid price ($)
Model 1   156757.758     {72630.188, 240885.328}     {148855.29, 164660.23}    164,660           240,885
Model 2   161,994.725    {149596.661, 174392.789}    {160609.2, 163380.25}     163,380           174,393

Comparing the Adjusted R² values of the two models, Model 2 explains 99.3% of the variation in Sale Price against Model 1’s 88.6%, so one might be tempted to use Model 2. On a closer look, however, Last Price and RC are the only independent variables in Model 2. In this case there is not a large difference between the prices recommended by the two models, but in reality a buyer cannot base an offer on the seller’s stated Last Price alone. A number of other factors, such as interior space, tax, apartment maintenance fee and area, need to be considered. Model 1 makes a more comprehensive attempt at the best possible regression fit by using the maximum number of data points. Hence the recommendation is to go by Model 1; but in the specific case of the Ellery Street condominium, since the predicted selling prices from the two models do not differ much, it is left to Affiar to make an initial offer of either $164,660 or $163,380.

Appendix

Model 1

Variables Entered/Removed (a)

Model   Variables Entered   Variables Removed
1       Last Price          .
2       Tax                 .
3       Interior            .
4       Condo               .
5       A12                 .
6       A5                  .
7       RC                  .
8       A16                 .
9       A2                  .

Method (all steps): Stepwise (Criteria: Probability-of-F-to-enter <= .050, Probability-of-F-to-remove >= .100).

a. Dependent Variable: Sale Price

Model Summary (j)

Model   R       R Square   Adjusted   Std. Error of   R Square   F Change   df1   df2   Sig. F
                           R Square   the Estimate    Change                            Change
1       .872a   .760       .759       44066.10445     .760       1437.412   1     454   .000
2       .925b   .856       .855       34200.57311     .096       300.700    1     453   .000
3       .930c   .866       .865       33044.26474     .010       33.258     1     452   .000
4       .935d   .875       .873       31966.94753     .009       31.979     1     451   .000
5       .938e   .880       .879       31308.34591     .005       20.174     1     450   .000
6       .940f   .884       .882       30851.68238     .004       14.420     1     449   .000
7       .941g   .886       .884       30574.87347     .002       9.167      1     448   .003
8       .942h   .887       .885       30404.43434     .002       6.037      1     447   .014
9       .943i   .889       .886       30268.70125     .001       5.018      1     446   .026

a. Predictors: (Constant), Last Price
b. Predictors: (Constant), Last Price, Tax
c. Predictors: (Constant), Last Price, Tax, Interior
d. Predictors: (Constant), Last Price, Tax, Interior, Condo
e. Predictors: (Constant), Last Price, Tax, Interior, Condo, A12
f. Predictors: (Constant), Last Price, Tax, Interior, Condo, A12, A5
g. Predictors: (Constant), Last Price, Tax, Interior, Condo, A12, A5, RC
h. Predictors: (Constant), Last Price, Tax, Interior, Condo, A12, A5, RC, A16
i. Predictors: (Constant), Last Price, Tax, Interior, Condo, A12, A5, RC, A16, A2
j. Dependent Variable: Sale Price

ANOVA (j)

Model                Sum of Squares   df    Mean Square   F         Sig.
1       Regression   2.791E12         1     2.791E12      1.437E3   .000a
        Residual     8.816E11         454   1.942E9
        Total        3.673E12         455
2       Regression   3.143E12         2     1.571E12      1.343E3   .000b
        Residual     5.299E11         453   1.170E9
        Total        3.673E12         455
3       Regression   3.179E12         3     1.060E12      970.531   .000c
        Residual     4.935E11         452   1.092E9
        Total        3.673E12         455
4       Regression   3.212E12         4     8.030E11      785.781   .000d
        Residual     4.609E11         451   1.022E9
        Total        3.673E12         455
5       Regression   3.232E12         5     6.463E11      659.386   .000e
        Residual     4.411E11         450   9.802E8
        Total        3.673E12         455
6       Regression   3.245E12         6     5.409E11      568.279   .000f
        Residual     4.274E11         449   9.518E8
        Total        3.673E12         455
7       Regression   3.254E12         7     4.649E11      497.265   .000g
        Residual     4.188E11         448   9.348E8
        Total        3.673E12         455
8       Regression   3.260E12         8     4.074E11      440.754   .000h
        Residual     4.132E11         447   9.244E8
        Total        3.673E12         455
9       Regression   3.264E12         9     3.627E11      395.860   .000i
        Residual     4.086E11         446   9.162E8
        Total        3.673E12         455

a. Predictors: (Constant), LastPrice
b. Predictors: (Constant), LastPrice, Tax
c. Predictors: (Constant), LastPrice, Tax, Interior
d. Predictors: (Constant), LastPrice, Tax, Interior, Condo
e. Predictors: (Constant), LastPrice, Tax, Interior, Condo, A12
f. Predictors: (Constant), LastPrice, Tax, Interior, Condo, A12, A5
g. Predictors: (Constant), LastPrice, Tax, Interior, Condo, A12, A5, RC
h. Predictors: (Constant), LastPrice, Tax, Interior, Condo, A12, A5, RC, A16
i. Predictors: (Constant), LastPrice, Tax, Interior, Condo, A12, A5, RC, A16, A2
j. Dependent Variable: SalePrice

Coefficients (a)

B and Std. Error are the unstandardized coefficients; Beta is the standardized coefficient; Lower Bound and Upper Bound give the 95% confidence interval for B; Tolerance and VIF are the collinearity statistics.

Model  Variable     B            Std. Error  Beta    t        Sig.   Lower Bound   Upper Bound   Tolerance  VIF
1      (Constant)   38849.701    4052.438            9.587    .000   30885.838     46813.564
       LastPrice    .720         .019        .872    37.913   .000   .683          .758          1.000      1.000
2      (Constant)   23233.629    3271.562            7.102    .000   16804.307     29662.951
       LastPrice    .416         .023        .504    18.154   .000   .371          .461          .414       2.416
       Tax          47.547       2.742       .481    17.341   .000   42.158        52.935        .414       2.416
3      (Constant)   2954.638     4728.282            .625     .532   -6337.506     12246.782
       LastPrice    .385         .023        .466    16.921   .000   .341          .430          .391       2.555
       Tax          42.937       2.767       .434    15.516   .000   37.499        48.375        .379       2.636
       Interior     33.058       5.732       .127    5.767    .000   21.793        44.323        .614       1.628
4      (Constant)   -8782.305    5022.983            -1.748   .081   -18653.660    1089.051
       LastPrice    .351         .023        .424    15.336   .000   .306          .396          .363       2.753
       Tax          33.071       3.195       .335    10.350   .000   26.792        39.351        .266       3.755
       Interior     43.555       5.848       .167    7.448    .000   32.062        55.047        .552       1.810
       Condo        123.044      21.759      .148    5.655    .000   80.284        165.805       .405       2.467
5      (Constant)   -6822.788    4938.803            -1.381   .168   -16528.769    2883.192
       LastPrice    .365         .023        .442    16.136   .000   .321          .410          .356       2.809
       Tax          31.758       3.143       .321    10.104   .000   25.581        37.935        .264       3.788
       Interior     42.998       5.729       .165    7.506    .000   31.740        54.256        .552       1.811
       Condo        118.486      21.334      .143    5.554    .000   76.559        160.414       .404       2.472
       A12          -37401.549   8327.091    -.074   -4.492   .000   -53766.363    -21036.736    .974       1.026
6      (Constant)   -4031.904    4921.946            -.819    .413   -13704.814    5641.006
       LastPrice    .348         .023        .421    15.299   .000   .303          .393          .342       2.923
       Tax          32.865       3.111       .332    10.564   .000   26.752        38.979        .262       3.821
       Interior     43.420       5.646       .167    7.690    .000   32.324        54.516        .552       1.812
       Condo        99.625       21.602      .120    4.612    .000   57.171        142.078       .383       2.610
       A12          -35648.270   8218.611    -.071   -4.338   .000   -51799.991    -19496.550    .971       1.029
       A5           24339.409    6409.480    .068    3.797    .000   11743.104     36935.714     .802       1.247
7      (Constant)   -14313.181   5943.400            -2.408   .016   -25993.587    -2632.776
       LastPrice    .342         .023        .414    15.101   .000   .297          .386          .339       2.948
       Tax          34.463       3.128       .349    11.018   .000   28.316        40.610        .254       3.933
       Interior     44.305       5.603       .170    7.907    .000   33.293        55.317        .550       1.817
       Condo        104.991      21.481      .126    4.888    .000   62.774        147.207       .380       2.628
       A12          -28972.746   8438.023    -.058   -3.434   .001   -45555.768    -12389.725    .905       1.105
       A5           29784.464    6601.659    .084    4.512    .000   16810.400     42758.528     .742       1.347
       RC           10435.164    3446.591    .056    3.028    .003   3661.669      17208.658     .740       1.352
8      (Constant)   -14679.636   5912.150            -2.483   .013   -26298.698    -3060.575
       LastPrice    .337         .023        .407    14.883   .000   .292          .381          .336       2.974
       Tax          35.642       3.147       .361    11.325   .000   29.457        41.827        .248       4.027
       Interior     44.354       5.572       .170    7.960    .000   33.404        55.304        .550       1.817
       Condo        104.674      21.362      .126    4.900    .000   62.691        146.656       .380       2.628
       A12          -29037.901   8391.027    -.058   -3.461   .001   -45528.663    -12547.139    .905       1.105
       A5           29005.214    6572.515    .081    4.413    .000   16088.348     41922.080     .741       1.350
       RC           11485.488    3453.935    .062    3.325    .001   4697.521      18273.455     .729       1.373
       A16          -13478.468   5485.758    -.040   -2.457   .014   -24259.547    -2697.390     .951       1.052
9      (Constant)   -15967.736   5913.780            -2.700   .007   -27590.071    -4345.402
       LastPrice    .333         .023        .403    14.763   .000   .289          .377          .335       2.988
       Tax          35.947       3.136       .364    11.462   .000   29.783        42.110        .248       4.035
       Interior     44.967       5.554       .173    8.097    .000   34.052        55.882        .549       1.821
       Condo        105.108      21.268      .127    4.942    .000   63.311        146.906       .380       2.629
       A12          -27984.595   8366.791    -.056   -3.345   .001   -44427.826    -11541.364    .902       1.108
       A5           29804.817    6552.903    .084    4.548    .000   16926.416     42683.218     .738       1.354
       RC           10992.327    3445.556    .059    3.190    .002   4220.785      17763.869     .726       1.378
       A16          -12447.291   5480.634    -.037   -2.271   .024   -23218.366    -1676.216     .944       1.059
       A2           12290.704    5486.742    .036    2.240    .026   1507.625      23073.784     .967       1.034

a. Dependent Variable: SalePrice
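The two collinearity columns carry the same information, since VIF is the reciprocal of Tolerance. The sketch below checks this for the final (step 9) model using the values printed in the Coefficients table; small differences are expected because the table rounds Tolerance to three decimals. The cut-off of 5 is a common rule of thumb and an assumption here, not something the report states.

```python
# (Tolerance, VIF) pairs from step 9 of the Coefficients table
final_model = {
    "LastPrice": (0.335, 2.988),
    "Tax": (0.248, 4.035),
    "Interior": (0.549, 1.821),
    "Condo": (0.380, 2.629),
    "A12": (0.902, 1.108),
    "A5": (0.738, 1.354),
    "RC": (0.726, 1.378),
    "A16": (0.944, 1.059),
    "A2": (0.967, 1.034),
}

# VIF recomputed as 1 / Tolerance for each predictor
recomputed_vif = {name: 1 / tol for name, (tol, _) in final_model.items()}
largest = max(recomputed_vif, key=recomputed_vif.get)

for name, (tol, vif) in final_model.items():
    print(f"{name:10s} reported VIF {vif:.3f}  recomputed {recomputed_vif[name]:.3f}")
print("largest VIF:", largest)
```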

Residuals Statistics (a)

                                    Minimum       Maximum      Mean          Std. Deviation   N
Predicted Value                     2.1894E4      7.3736E5     1.7108E5      84699.37571      456
Std. Predicted Value                -1.761        6.686        .000          1.000            456
Standard Error of Predicted Value   1971.030      2.458E4      4.021E3       1982.252         456
Adjusted Predicted Value            1.6813E4      1.1794E6     1.7253E5      95574.81320      456
Residual                            -3.59573E5    1.37644E5    .00000        29967.84529      456
Std. Residual                       -11.879       4.547        .000          .990             456
Stud. Residual                      -20.352       4.861        -.017         1.268            456
Deleted Residual                    -1.05539E6    1.57295E5    -1.45182E3    55783.52632      456
Stud. Deleted Residual              -76.135       4.990        -.139         3.664            456
Mahal. Distance                     .932          298.983      8.980         16.348           456
Cook's Distance                     .000          80.153       .179          3.753            456
Centered Leverage Value             .002          .657         .020          .036             456

a. Dependent Variable: SalePrice


Interior Vs Rooms – Regression results showing correlation

SUMMARY OUTPUT

Regression Statistics
Multiple R          0.775952808
R Square            0.60210276
Adjusted R Square   0.601226335
Standard Error      217.7411745
Observations        456

ANOVA
             df    SS            MS         F          Significance F
Regression   1     32571418.05   32571418   686.9981   6.7719E-93
Residual     454   21524693.45   47411.22
Total        455   54096111.51

            Coefficients   Standard Error   t Stat     P-value    Lower 95%      Upper 95%
Intercept   -76.7538578    42.08789622      -1.82366   0.068861   -159.4651166   5.957400971
Rooms       235.8872688    8.999672999      26.21065   6.77E-93   218.2010847    253.5734529
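A few lines of Python show how the figures in this output relate to one another. The values are copied from the summary above; the function name `predicted_interior` is hypothetical, and applying the fitted line to the 5-room Ellery Street condominium is only an illustration.

```python
import math

r_square = 0.60210276
intercept, slope = -76.7538578, 235.8872688
t_rooms = 26.21065

# With a single predictor, Multiple R is the absolute correlation
# between Interior and Rooms
multiple_r = math.sqrt(r_square)

# With a single predictor, the overall F equals the squared t statistic
f_from_t = t_rooms ** 2


def predicted_interior(rooms):
    """Interior space implied by the fitted line for a given room count."""
    return intercept + slope * rooms


print(round(multiple_r, 4), round(f_from_t, 2), round(predicted_interior(5), 1))
```

For comparison, the listed interior space of the 5-room Ellery Street condominium is 1040.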

[Figure: Normal Probability Plot (x-axis: Sample Percentile; y-axis: Interior)]

[Figure: Rooms Residual Plot (x-axis: Rooms; y-axis: Residuals)]