Post on 31-Jan-2016
Creating Graphs on Saturn
GOPTIONS DEVICE=png HTITLE=2 HTEXT=1.5 GSFMODE=replace;
PROC REG DATA=agebp;
  MODEL sbp = age;
  PLOT sbp*age;
RUN;

This will create the file sasgraph.png.
1. Transfer the file to your PC (binary mode)
2. Open Word
3. Choose Insert picture from file

PROC REG DATA=agebp LP;
  MODEL sbp = age;
  PLOT sbp*age;
RUN;
Multiple Linear Regression
• More than 1 independent variable
  – See how combinations of several variables are associated with and can predict the dependent variable. How much of the total variability can be explained?
  – Control for confounding (interested in the effect of one variable but want to “adjust” for another variable)
  – Explore interactions
PROC REG DATA=datasetname;
  MODEL depvar = x1;
  MODEL depvar = x1 x2;
  MODEL depvar = x1 x2 x3;
RUN;
Questions Explored Using Multiple Regression
• How much of the variation in test scores among school districts can be explained by several district characteristics?
• Is calcium intake related to BP independent of age?
• Is the relationship between age and BP the same for men and women?
Reminder
• Y variable is continuous and is normally distributed for each combination of X’s with the same variability
• X variables can be continuous or indicator variables and do not need to be normally distributed
2 Factors
1. $Y = \beta_0 + \beta_1 X_1$
2. $Y = \beta_0 + \beta_2 X_2$
3. $Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2$
• Do you get the same slope in models 1 and 3?
Control for confounding
Both SLR models for each cohort significant
Overall not significant
(negative confounding)
Multiple Regression Equation
The equation that describes how the mean value of $y$ is related to $x_1, x_2, \ldots, x_p$:
$E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_p x_p$
$\beta_0$ = mean of $y$ when all x variables are equal to 0
$\beta_i$ = change in mean $y$ corresponding to a 1-unit change in $x_i$, holding all other predictors fixed
Implied: The impact of $x_1$ is the same for each of the other values of $x_2, x_3, \ldots, x_p$
Multiple Regression Model
The equation that describes how the dependent variable $y$ is related to the independent variables $x_1, x_2, \ldots, x_p$ and an error term is called the multiple regression model.
$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_p x_p + \varepsilon$
$\varepsilon$ reflects how individuals deviate from others with the same values of the x’s
Estimated Multiple Regression Equation
The estimated multiple regression equation is:
$\hat{y} = b_0 + b_1 x_1 + b_2 x_2 + \cdots + b_p x_p$
$b_i$ estimates $\beta_i$
$\hat{y}$ is the estimated (or predicted) value for a set of x’s
Estimation
Least Squares Criterion
$\min \sum_i (y_i - \hat{y}_i)^2$
Find the best multidimensional plane
Computation of Coefficient Values
The formulas for the regression coefficients $b_0, b_1, b_2, \ldots, b_p$ involve the use of matrix algebra. We will use SAS to perform the calculations.
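For reference, the least squares solution has a compact matrix form. The notation in this sketch is introduced here and is not from the slides: $\mathbf{X}$ is the $n \times (p+1)$ matrix whose first column is all ones and whose remaining columns hold the predictors, $\mathbf{y}$ is the vector of observed responses, and $\mathbf{b}$ is the vector $(b_0, b_1, \ldots, b_p)$. The formula assumes $\mathbf{X}^{\top}\mathbf{X}$ is invertible.

```latex
\mathbf{b} = (\mathbf{X}^{\top}\mathbf{X})^{-1}\mathbf{X}^{\top}\mathbf{y},
\qquad
\hat{\mathbf{y}} = \mathbf{X}\mathbf{b}
```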
Testing for Significance: Global Test
Hypotheses
  $H_0$: $\beta_1 = \beta_2 = \cdots = \beta_p = 0$
  $H_a$: One or more of the parameters is not equal to zero.
Test Statistic
  $F = \mathrm{MSR}/\mathrm{MSE}$
Rejection Rule
  Reject $H_0$ if $F > F_\alpha$, where $F_\alpha$ is based on an $F$ distribution with $p$ d.f. in the numerator and $n - p - 1$ d.f. in the denominator.
Testing for Significance: Individual $\beta$’s
Hypotheses
  $H_0$: $\beta_i = 0$
  $H_a$: $\beta_i \neq 0$
Test Statistic
  $t = b_i / s_{b_i}$
Rejection Rule
  Reject $H_0$ for small or large $t$
Meaning: Is $X_i$ related to Y after taking into account all other variables in the model?
Possibilities
• X1 is related to Y alone, but after adjusting for X2, X1 is no longer related to Y
• X1 is not related to Y alone, but after adjusting for X2, X1 is related to Y
• Relation of X1 with Y gets stronger after adjusting for X2
• Relation of X1 with Y gets weaker after adjusting for X2
Pulmonary Function Example
• Dependent Variable: Forced Expired Volume (FEV1.0)
• Independent Variables:
  – Age of person
  – Smoking status of person
• Questions:
  – Is age related to FEV independent of smoking status?
  – Is smoking status related to FEV independent of age?
  – How much of the variability in FEV is explained by age and smoking combined?
Model for FEV Example
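$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2$
$X_1$ = smoking status (1=smoker, 0=nonsmoker)
$X_2$ = age
Smokers: $\mathrm{FEV} = \beta_0 + \beta_1 + \beta_2\,\mathrm{age}$
Non-smokers: $\mathrm{FEV} = \beta_0 + \beta_2\,\mathrm{age}$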
Interpretation of Parameters
Smokers: $\mathrm{FEV} = \beta_0 + \beta_1 + \beta_2\,\mathrm{age}$
Non-smokers: $\mathrm{FEV} = \beta_0 + \beta_2\,\mathrm{age}$
$\beta_1$ is the effect of smoking for fixed levels of age
$\beta_2$ is the effect of age pooled over smokers and non-smokers
This model assumes the relation of age to FEV is the same for smokers and non-smokers
DATA fev;
  INFILE DATALINES;
  INPUT age smk fev;
  DATALINES;
28 1 4.0
30 1 3.9
30 1 3.7
31 1 3.6
54 0 2.9
More data
PROC MEANS;
  VAR fev;
  CLASS smk;
RUN;

The MEANS Procedure
Analysis Variable : fev

smk   N Obs    N        Mean      Std Dev      Minimum      Maximum
0        15   15   3.6000000    0.4208834    2.9000000    4.3000000
1        15   15   3.2933333    0.5257195    2.2000000    4.0000000
PROC CORR DATA=fev;
RUN;

Pearson Correlation Coefficients, N = 30
Prob > |r| under H0: Rho=0

            age        smk        fev
age     1.00000   -0.12788   -0.73024
                    0.5007     <.0001
smk    -0.12788    1.00000   -0.31620
         0.5007                0.0887
fev    -0.73024   -0.31620    1.00000
         <.0001     0.0887
PROC REG;
  MODEL fev = age smk;
RUN;

Dependent Variable: fev
Analysis of Variance

                                Sum of        Mean
Source                   DF    Squares      Square   F Value   Pr > F
Model (SSR)               2    4.96510     2.48255     32.08   <.0001
Error (SSE)              27    2.08957     0.07739
Corrected Total (SST)    29    7.05467

Root MSE          0.27819    R-Square    0.7038
Dependent Mean    3.44667    Coeff Var   8.07136

F Value: tests $H_0$: $\beta_1 = 0$; $\beta_2 = 0$
R-Square: proportion of variance explained by both variables
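As a check on how the pieces of this listing fit together, the summary statistics can be recomputed from the sums of squares, using $p = 2$ predictors and $n = 30$ observations. The numbers are taken from the output above, and small differences come from rounding in the listing.

```latex
\begin{aligned}
\mathrm{MSR} &= \frac{\mathrm{SSR}}{p} = \frac{4.96510}{2} = 2.48255 \\
\mathrm{MSE} &= \frac{\mathrm{SSE}}{n - p - 1} = \frac{2.08957}{27} \approx 0.07739 \\
F &= \frac{\mathrm{MSR}}{\mathrm{MSE}} = \frac{2.48255}{0.07739} \approx 32.08 \\
R^2 &= \frac{\mathrm{SSR}}{\mathrm{SST}} = \frac{4.96510}{7.05467} \approx 0.7038 \\
\text{Root MSE} &= \sqrt{\mathrm{MSE}} = \sqrt{0.07739} \approx 0.2782
\end{aligned}
```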
PROC REG;
  MODEL fev = age smk;
  MODEL fev = age;
  MODEL fev = smk;
RUN;

Parameter Estimates

MODEL fev = age smk (R2 = .7038)
              Parameter    Standard
Variable  DF   Estimate       Error   t Value   Pr > |t|
Intercept  1    5.58114     0.27653     20.18     <.0001
age        1   -0.04702     0.00634     -7.42     <.0001
smk        1   -0.40384     0.10242     -3.94     0.0005

MODEL fev = age (R2 = .5333)
              Parameter    Standard
Variable  DF   Estimate       Error   t Value   Pr > |t|
Intercept  1    5.24787     0.32456     16.17     <.0001
age        1   -0.04382     0.00775     -5.66     <.0001

MODEL fev = smk (R2 = .1000)
              Parameter    Standard
Variable  DF   Estimate       Error   t Value   Pr > |t|
Intercept  1    3.60000     0.12295     29.28     <.0001
smk        1   -0.30667     0.17388     -1.76     0.0887
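The t values in these listings are each estimate divided by its standard error, and comparing the smoking coefficient across models shows what adjustment does. The worked lines below use only numbers from the output (up to rounding). The predicted values at age 40 are an illustration chosen here, not a result from the slides.

```latex
\begin{aligned}
t_{\text{age}} &= \frac{-0.04702}{0.00634} \approx -7.42, \qquad
t_{\text{smk}} = \frac{-0.40384}{0.10242} \approx -3.94 \\
\text{Unadjusted smoking effect} &= 3.29333 - 3.60000 = -0.30667
  \quad (\text{difference in group means},\ p = 0.0887) \\
\text{Age-adjusted smoking effect} &= -0.40384 \quad (p = 0.0005) \\
\widehat{\mathrm{FEV}}_{\text{non-smoker, age 40}} &= 5.58114 - 0.04702(40) \approx 3.70 \\
\widehat{\mathrm{FEV}}_{\text{smoker, age 40}} &= 5.58114 - 0.04702(40) - 0.40384 \approx 3.30
\end{aligned}
```

The smoking effect is larger in magnitude and clearly significant once age is held fixed, which matches the "gets stronger after adjusting" possibility listed earlier.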
PROC REG;
  MODEL fev = age smk;
RUN;
PROC REG;
  MODEL fev = age;
  WHERE smk = 0;
RUN;
PROC REG;
  MODEL fev = age;
  WHERE smk = 1;
RUN;

All subjects (fev = age smk)
              Parameter    Standard
Variable  DF   Estimate       Error   t Value   Pr > |t|
Intercept  1    5.58114     0.27653     20.18     <.0001
age        1   -0.04702     0.00634     -7.42     <.0001
smk        1   -0.40384     0.10242     -3.94     0.0005

Non-smokers (smk = 0)
              Parameter    Standard
Variable  DF   Estimate       Error   t Value   Pr > |t|
Intercept  1    5.24764     0.38050     13.79     <.0001
age        1   -0.03911     0.00887     -4.41     0.0007

Smokers (smk = 1)
              Parameter    Standard
Variable  DF   Estimate       Error   t Value   Pr > |t|
Intercept  1    5.50002     0.36163     15.21     <.0001
age        1   -0.05508     0.00885     -6.22     <.0001
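The separate fits give different age slopes for non-smokers (-0.03911) and smokers (-0.05508), whereas the combined model assumes a single common slope. One way to test whether that difference is real is to add an age-by-smoking product term to the combined model. The following is a minimal sketch, not part of the original slides; the dataset name fev_int and the variable name age_smk are hypothetical names introduced for illustration.

```sas
/* Sketch: build an interaction term, since PROC REG's MODEL statement
   does not accept product terms such as age*smk directly.
   fev_int and age_smk are hypothetical names. */
DATA fev_int;
  SET fev;
  age_smk = age * smk;   /* equals age for smokers, 0 for non-smokers */
RUN;

PROC REG DATA=fev_int;
  /* The age_smk coefficient estimates the difference in age slopes
     (smokers minus non-smokers); its t test asks whether the relation
     of age to FEV is the same in both groups. */
  MODEL fev = age smk age_smk;
RUN;
QUIT;
```

Because this model lets each group have its own intercept and slope, its estimates should reproduce the two separate fits: the age_smk coefficient would be about -0.05508 - (-0.03911) = -0.01597. Only the test of that coefficient is new information.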