Regression and Analysis of Variance
Linear Models in R
What do you know already?
Regression
  Continuous dependent variable
  Continuous independent variable
  Assumptions
    Normality
    Independence
    Constant variance: N(0, σ²)
  Linear or curvilinear
ANOVA
  Continuous dependent variable
  Discrete independent variable
  Assumptions
    Normality
    Independence
    Constant variance: N(0, σ²)
    Factor level variances are equal
Linear Models
Regression and ANOVA (and, in fact, ANCOVA) are all related mathematically to one another.
Exactly the same mathematics is used throughout.
The only difference is the type (and number) of independent variables that you are working with.
The base assumptions are required for all linear models.
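The point that only the type of independent variable changes can be sketched in R, where the same lm() function fits all three models. This is an illustration only: it uses the built-in ToothGrowth dataset, which is an assumption for demonstration and not data from these slides, and the object names fit_reg, fit_aov and fit_anc are made up for the example.

```r
# Illustration only: ToothGrowth ships with R and is NOT data from these slides.
data(ToothGrowth)

# Regression: continuous response, continuous predictor
fit_reg <- lm(len ~ dose, data = ToothGrowth)

# ANOVA: continuous response, discrete predictor (supp is a factor)
fit_aov <- lm(len ~ supp, data = ToothGrowth)

# ANCOVA: one continuous and one discrete predictor
fit_anc <- lm(len ~ dose + supp, data = ToothGrowth)

# The design matrices show the shared structure: a factor simply
# becomes an indicator (0/1) column alongside the intercept.
colnames(model.matrix(fit_reg))
colnames(model.matrix(fit_aov))
colnames(model.matrix(fit_anc))

# The same anova() call produces the variance table for each fit.
anova(fit_reg)
anova(fit_aov)
anova(fit_anc)
```

In each case the fitting routine and the output tables are the same; only the columns of the design matrix differ.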
What procedure are we going to use to analyse linear model data?
Wagga House Prices
A Wagga Wagga real estate agent wishes to use data from 30 recent house sales to predict future selling prices ($000) from land area (m²).
The data was collected from the internet from any real estate listings that included the land size and the listing price.
Most of the included listings were for 2 bedroom, 1 bathroom and 1 garage houses.
Call:
lm(formula = Price ~ Land, data = dat)
Residuals:
     Min       1Q   Median       3Q      Max
-169.486  -57.992    1.337   68.666  169.565
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  53.9420    56.8561   0.949    0.351
Land          0.6202     0.0931   6.662 3.14e-07 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 92.94 on 28 degrees of freedom
Multiple R-squared: 0.6132, Adjusted R-squared: 0.5994
F-statistic: 44.39 on 1 and 28 DF, p-value: 3.141e-07
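As a rough sketch of how the printed coefficients might be used, the arithmetic below builds the fitted line Price = 53.9420 + 0.6202 × Land by hand. The coefficient values are copied from the summary output; the 600 m² land area and all variable names are hypothetical, chosen only for illustration.

```r
# Values copied from the printed summary of dat.lm
b0       <- 53.9420   # intercept ($000)
b1       <- 0.6202    # slope ($000 per m^2 of land)
se_b1    <- 0.0931    # standard error of the slope
df_resid <- 28        # residual degrees of freedom

# Hypothetical land area, for illustration only
land_new <- 600

# Predicted selling price in $000 from the fitted line
price_hat <- b0 + b1 * land_new
price_hat

# The t value for Land is the estimate divided by its standard error
t_slope <- b1 / se_b1
t_slope

# Approximate 95% confidence interval for the slope
t_crit   <- qt(0.975, df_resid)
ci_slope <- b1 + c(-1, 1) * t_crit * se_b1
ci_slope
```

With the fitted object itself available, predict(dat.lm, newdata = data.frame(Land = 600)) and confint(dat.lm) should give the same quantities without hand-copying rounded values.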
anova(dat.lm)
Analysis of Variance Table
Response: Price
          Df Sum Sq Mean Sq F value    Pr(>F)
Land       1 383397  383397  44.385 3.141e-07 ***
Residuals 28 241863    8638
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
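The ANOVA table and the regression summary describe the same fit. A minimal sketch, using only the sums of squares and degrees of freedom printed in the table (variable names are made up for the example), recovers the F statistic, R-squared, adjusted R-squared and residual standard error:

```r
# Values copied from the printed ANOVA table
ss_land  <- 383397   # sum of squares explained by Land
ss_resid <- 241863   # residual sum of squares
df_land  <- 1
df_resid <- 28

# Mean squares are sums of squares divided by their degrees of freedom
ms_land  <- ss_land / df_land
ms_resid <- ss_resid / df_resid

# F statistic and its p-value (upper tail of the F distribution)
f_value <- ms_land / ms_resid
p_value <- pf(f_value, df_land, df_resid, lower.tail = FALSE)

# R-squared is the proportion of total variation explained by Land
r_squared     <- ss_land / (ss_land + ss_resid)
adj_r_squared <- 1 - (1 - r_squared) * (df_land + df_resid) / df_resid

# Residual standard error is the square root of the residual mean square
rse <- sqrt(ms_resid)

c(F = f_value, p = p_value, R2 = r_squared, adjR2 = adj_r_squared, RSE = rse)
```

Because there is a single predictor, the F value is also the square of the t value for Land in the coefficient table (6.662² ≈ 44.38).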
Bottlenose Dolphins
Neonate bottlenose dolphins produce many sounds just after birth. Prior to suckling these sounds intensify, and then, as the neonate prepares to feed, the sounds cease. This silent interval is called the latency period (LP). It is thought that the LP is related to the suckling frequency. A study was conducted to collect information about the length of the LP and the suckling frequency, with the aim of defining this relationship if it exists.
Johne’s Disease
To eliminate Johne’s disease from an infected farm or to prevent transmission, it is essential that susceptible animals are not exposed to an environment contaminated with the bacterium. The bacterium causing Johne’s disease is capable of persisting in the environment for long periods due to the high lipid content in its cell wall and the metabolic inactivity of the organism. Factors that could influence the survival of the bacterium in the soil, including temperature, pH, organic matter, exposure to ultraviolet light and moisture content, were investigated under controlled conditions.
Johne’s Disease continued
This experiment involved trays of contaminated soil randomised to 12 unique treatments, involving changes to the pH, UV light and moisture content. They are uniquely labelled Treatments 1 to 12. The treatments were allocated to the trays of soil in a completely randomised fashion so that each treatment was replicated 5 times. The response measured was the ln(number of bacteria) remaining, as an indication of the effectiveness of the treatment. The aim of the experiment is to determine the “best” treatment for removing the bacterium from the soil.
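The completely randomised allocation described above (12 treatments, 5 replicates each, 60 trays in total) could be generated along these lines. This is a sketch only: the seed, the tray numbering and the column names Tray, Treatment and LogCount are all assumptions, and no response data are invented.

```r
set.seed(1)  # arbitrary seed so the allocation can be reproduced

n_treat <- 12  # unique treatments
n_rep   <- 5   # replicates per treatment

# Shuffle the 60 treatment labels across the 60 trays
layout <- data.frame(
  Tray      = seq_len(n_treat * n_rep),
  Treatment = factor(sample(rep(seq_len(n_treat), each = n_rep)))
)

# Every treatment should appear exactly 5 times
table(layout$Treatment)

# Degrees of freedom for the one-way ANOVA on this design
df_treat <- n_treat - 1             # 11
df_resid <- n_treat * (n_rep - 1)   # 48

# Once the response has been recorded in a (hypothetical) column LogCount,
# the analysis would use the same lm()/anova() calls as for regression:
# fit <- lm(LogCount ~ Treatment, data = layout)
# anova(fit)
```

Treating Treatment as a factor is what makes this an ANOVA rather than a regression; the fitting function is unchanged.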