What decides the price of used cars?
Group 1
Jessica Aguirre, Keith Cody, Rui Feng, Jennifer Griffeth, Joonhee Lee, Hans-Jakob Lothe, Teng Wang
How we got data
Collected from kbb.com (Kelley Blue Book)
Used a random number generator
First collected 140 sets of data from various types of cars
Then collected 160 sets of data from Toyota Camrys
Brand Population
Models
Average Selling Price by Brand
Assumptions
Random sample is representative of population
All prices are the selling price
Residuals are homoskedastic
Residuals are normally distributed
The variables we choose affect the price of used cars: age, color, etc.
Preparations
Created dummy variables, e.g.
Transmission: automatic = 0, manual = 1
Color
Type
Engine (V4 = 4, V8 = 8, etc.)
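The coding described on this slide can be sketched in a few lines of Python. This is only an illustration, not the group's actual procedure: the record layout, the field names `transmission` and `cylinders`, and the helper `encode_car` are assumptions made for the example. Only the transmission coding (automatic = 0, manual = 1) and the numeric engine coding come from the slide.

```python
# Coding stated on the slide: automatic = 0, manual = 1
TRANSMISSION_CODES = {"automatic": 0, "manual": 1}


def encode_car(record):
    """Turn one raw car record (hypothetical layout) into numeric regressors."""
    return {
        "TRANSMISSION": TRANSMISSION_CODES[record["transmission"]],
        # Engine is entered as the cylinder count itself (4, 6, 8, ...)
        "ENGINE": int(record["cylinders"]),
    }


# Hypothetical example record, not taken from the collected data
example = {"transmission": "manual", "cylinders": 8}
encoded = encode_car(example)
print(encoded)
```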
All Cars: Regression of price against independent variables (age, color, engine, miles and transmission)
Dependent Variable: PRICE
Method: Least Squares
Date: 11/29/10 Time: 16:43
Sample: 1 140
Included observations: 140
Variable       Coefficient   Std. Error   t-Statistic   Prob.
AGE            -671.2805     191.5316     -3.504803     0.0006
COLOR          151.6366      156.6386     0.968067      0.3348
ENGINE         1793.689      292.1268     6.140105      0.0000
MILES          -3798.259     590.6794     -6.430323     0.0000
TRANSMISSION   1462.702      1248.129     1.171916      0.2433
C              48055.05      6425.417     7.478900      0.0000
R-squared            0.555991    Mean dependent var      16859.54
Adjusted R-squared   0.539424    S.D. dependent var      6831.670
S.E. of regression   4636.365    Akaike info criterion   19.76316
Sum squared resid    2.88E+09    Schwarz criterion       19.88923
Log likelihood       -1377.421   F-statistic             33.55917
Durbin-Watson stat   1.676795    Prob(F-statistic)       0.000000
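The next regression keeps only the variables that are significant at p < 0.05. A minimal sketch of that screening step, using the Prob. values from the output above, might look like the following. The helper name `significant` and the dictionary layout are assumptions for illustration; the original work was done in EViews.

```python
# p-values copied from the Prob. column of the All Cars regression output
P_VALUES = {
    "AGE": 0.0006,
    "COLOR": 0.3348,
    "ENGINE": 0.0000,
    "MILES": 0.0000,
    "TRANSMISSION": 0.2433,
}


def significant(p_values, alpha=0.05):
    """Return the variable names whose p-value is below alpha."""
    return [name for name, p in p_values.items() if p < alpha]


kept = significant(P_VALUES)
print(kept)  # ['AGE', 'ENGINE', 'MILES']
```

COLOR and TRANSMISSION fall out at this threshold, which matches the variable list of the reduced regression.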
All Cars: Regression of price against significant independent variables (p<0.05)
Dependent Variable: PRICE
Method: Least Squares
Date: 11/29/10 Time: 16:40
Sample: 1 140
Included observations: 140
Variable   Coefficient   Std. Error   t-Statistic   Prob.
AGE        -630.8791     190.0333     -3.319834     0.0012
ENGINE     1751.229      289.7728     6.043457      0.0000
LNMILE     -4013.936     576.0638     -6.967866     0.0000
C          51372.11      6106.150     8.413175      0.0000
R-squared            0.547257    Mean dependent var      16859.54
Adjusted R-squared   0.537271    S.D. dependent var      6831.670
S.E. of regression   4647.190    Akaike info criterion   19.75407
Sum squared resid    2.94E+09    Schwarz criterion       19.83812
Log likelihood       -1378.785   F-statistic             54.79716
Durbin-Watson stat   1.655862    Prob(F-statistic)       0.000000
Price = -630.8791 * AGE + 1751.229 * ENGINE - 4013.936 * LNMILE + 51372.11
Some reasons why this model fails
Color is randomly assigned a number (red = 9, blue = 7, etc.), so its coefficient has no meaningful interpretation
Engines: coding 4 cylinder = 4, V8 = 8 assumes the V8 adds twice as much to the price as the 4 cylinder
We suspect that pooling many different models leads to a low R-squared
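One common alternative to an arbitrary color number is a set of 0/1 indicator variables, one per category with a single category left out as the baseline. This is not what the group did (the new model simply disregards color); the sketch below only shows the idea, and the color list and the helper `one_hot` are hypothetical.

```python
def one_hot(value, categories):
    """Return one 0/1 indicator per category, omitting the first as baseline."""
    return {f"COLOR_{c.upper()}": int(value == c) for c in categories[1:]}


# Hypothetical category list; "black" acts as the omitted baseline
COLORS = ["black", "blue", "red"]

red_car = one_hot("red", COLORS)
black_car = one_hot("black", COLORS)
print(red_car)    # {'COLOR_BLUE': 0, 'COLOR_RED': 1}
print(black_car)  # {'COLOR_BLUE': 0, 'COLOR_RED': 0}
```

With this coding each color gets its own coefficient, so the regression no longer assumes that red (9) is "more" of something than blue (7).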
Our solution: New model
New model where we look at one model and brand (Toyota Camry), only two engines (4 cylinder and 6 cylinder), and disregard color
Dummy variable for engine: 6 cylinder = 1, 4 cylinder = 0
We also introduce a new variable called trim
Dummy variable for trim: luxury = 1, standard = 0
Toyota Camry
Most Popular Car in America*
* Motor Trend: http://www.motortrend.com/features/auto_news/2010/112_1004_america_top_10_best_selling_vehicle_comparison_2009_2000/index.html
Camry Price Histogram
Toyota Camry: Regression of price against independent variables (age, engine, mileage, trim and transmission)
Dependent Variable: PRICE
Method: Least Squares
Date: 11/29/10 Time: 20:09
Sample: 1 160
Included observations: 160
Variable       Coefficient   Std. Error   t-Statistic   Prob.
AGE            -625.4328     64.45118     -9.703978     0.0000
ENGINE         917.0942      324.9508     2.822256      0.0054
MILEAGE        -0.051027     0.005406     -9.438689     0.0000
TRIM           1972.208      309.7351     6.367400      0.0000
TRANSMISSION   967.9415      1104.742     0.876170      0.3823
C              17888.66      1141.740     15.66789      0.0000
R-squared            0.828216    Mean dependent var      14937.87
Adjusted R-squared   0.822638    S.D. dependent var      3587.486
S.E. of regression   1510.845    Akaike info criterion   17.51551
Sum squared resid    3.52E+08    Schwarz criterion       17.63082
Log likelihood       -1395.240   F-statistic             148.4947
Durbin-Watson stat   1.275033    Prob(F-statistic)       0.000000
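The adjusted R-squared and F-statistic in the output above follow from R-squared, the number of observations n and the number of regressors k (excluding the constant) by the standard OLS formulas. The sketch below recomputes both as a consistency check; the function names are illustrative, and small differences in the last digit come from R-squared being rounded in the printed output.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R-squared for n observations and k regressors plus a constant."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


def f_statistic(r2, n, k):
    """F-statistic for the joint significance of the k regressors."""
    return (r2 / k) / ((1.0 - r2) / (n - k - 1))


# Values from the Camry regression with five regressors
R2, N, K = 0.828216, 160, 5

adj = adjusted_r2(R2, N, K)
f_stat = f_statistic(R2, N, K)
print(f"adjusted R-squared: {adj:.6f}")
print(f"F-statistic: {f_stat:.4f}")
```

Both results agree with the reported 0.822638 and 148.4947 to within rounding.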
Toyota Camry: Regression of price against independent variables (age, engine, mileage and trim)
Dependent Variable: PRICE
Method: Least Squares
Date: 11/29/10 Time: 20:10
Sample: 1 160
Included observations: 160
Variable   Coefficient   Std. Error   t-Statistic   Prob.
AGE        -631.9880     63.96748     -9.879833     0.0000
ENGINE     949.8378      322.5527     2.944753      0.0037
MILEAGE    -0.051251     0.005396     -9.497941     0.0000
TRIM       1977.688      309.4398     6.391189      0.0000
C          18866.11      242.7181     77.72850      0.0000
R-squared            0.827359    Mean dependent var      14937.87
Adjusted R-squared   0.822904    S.D. dependent var      3587.486
S.E. of regression   1509.713    Akaike info criterion   17.50798
Sum squared resid    3.53E+08    Schwarz criterion       17.60408
Log likelihood       -1395.638   F-statistic             185.7048
Durbin-Watson stat   1.286429    Prob(F-statistic)       0.000000
Price = -631.9880 * AGE + 949.8378 * ENGINE -0.051251 * MILEAGE + 1977.688 * TRIM + 18866.11
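To make the fitted equation concrete, the sketch below plugs a car into it. The coefficients are the ones reported above; the example car (5 years old, 60,000 miles) is hypothetical, and reading AGE as years and MILEAGE as miles is an assumption, since the slides do not state the units.

```python
# Coefficients from the Camry regression of PRICE on AGE, ENGINE, MILEAGE, TRIM
COEF = {"AGE": -631.9880, "ENGINE": 949.8378, "MILEAGE": -0.051251, "TRIM": 1977.688}
INTERCEPT = 18866.11


def predict_price(age, engine, mileage, trim):
    """Fitted price; engine is 1 for 6 cylinder, trim is 1 for luxury, else 0."""
    return (
        INTERCEPT
        + COEF["AGE"] * age
        + COEF["ENGINE"] * engine
        + COEF["MILEAGE"] * mileage
        + COEF["TRIM"] * trim
    )


# Hypothetical car: 5 years old, 60,000 miles, 4 cylinder, standard trim
base = predict_price(age=5, engine=0, mileage=60000, trim=0)
# Same car with the 6 cylinder engine and luxury trim
loaded = predict_price(age=5, engine=1, mileage=60000, trim=1)
print(f"{base:.2f} {loaded:.2f}")
```

The gap between the two predictions is the sum of the ENGINE and TRIM coefficients, about 2,928, because both enter as 0/1 dummies.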
All Cars: mileage against price
[Figure: scatter plot of MILEAGE (0 to 120,000) against PRICE (0 to 40,000)]
R-Square ≈ 22%
Toyota Camrys: mileage against price
[Figure: scatter plot of MILEAGE (0 to 200,000) against PRICE (0 to 25,000)]
R-Square ≈ 66%
Alternative Model
PRICE^(1/2) = -0.0002263673136*MILEAGE + 4.59824795*ENGINE - 2.952776402*AGE + 7.704044111*TRIM + 139.1536581
Dependent Variable: NewPRICE
Method: Least Squares
Date: 11/30/10 Time: 12:16
Sample: 1 160
Included observations: 160
Variable   Coefficient   Std. Error   t-Statistic   Prob.
MILEAGE    -0.000226     2.23E-05     -10.16426     0.0000
ENGINE     4.598248      1.331262     3.454051      0.0007
AGE        -2.952776     0.264011     -11.18429     0.0000
TRIM       7.704044      1.277142     6.032253      0.0000
C          139.1537      1.001764     138.9087      0.0000
R-squared            0.851118    Mean dependent var      121.1827
Adjusted R-squared   0.847276    S.D. dependent var      15.94425
S.E. of regression   6.230994    Akaike info criterion   6.527700
Sum squared resid    6017.920    Schwarz criterion       6.623799
Log likelihood       -517.2160   F-statistic             221.5239
Durbin-Watson stat   1.222753    Prob(F-statistic)       0.000000
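Because this model is fitted to the square root of price (NewPRICE), a prediction has to be squared to get back to dollars. The sketch below does that for a hypothetical car; the function names are illustrative, and the units of AGE and MILEAGE are assumed to be years and miles. Squaring the fitted value is a simple back-transformation and is not exactly the expected price, so treat the result as approximate.

```python
def predict_sqrt_price(age, engine, mileage, trim):
    """Fitted PRICE^(1/2) from the alternative model's reported coefficients."""
    return (
        -0.0002263673136 * mileage
        + 4.59824795 * engine
        - 2.952776402 * age
        + 7.704044111 * trim
        + 139.1536581
    )


def predict_price(age, engine, mileage, trim):
    """Back-transform to the price scale by squaring the fitted root."""
    return predict_sqrt_price(age, engine, mileage, trim) ** 2


# Hypothetical car: 5 years old, 60,000 miles, 4 cylinder, standard trim
root = predict_sqrt_price(age=5, engine=0, mileage=60000, trim=0)
price = predict_price(age=5, engine=0, mileage=60000, trim=0)
print(f"sqrt scale: {root:.4f}, price scale: {price:.2f}")
```

For this car the fitted root is about 110.8, which squares to a price of roughly 12,300.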
New Price vs Original Price
[Figure: residual, actual and fitted values for the PRICE^(1/2) model, observations 1 to 160]
[Figure: residual, actual and fitted values for the original PRICE model, observations 1 to 160]
Conclusions
As expected, older, higher-mileage cars are worth less than newer, lower-mileage cars
Bigger engines and nicer levels of trim cost more
Our Camry model explains about 83% of the variation in price (R-squared 0.827, adjusted 0.823)
What we learned from this project
Communication can be difficult
EViews is amazingly fun and can be useful in analyzing social and economic phenomena
Thanks!