Chapter 3 - Linear Regression, Question 8

In [2]:
# Read the Auto data, treating "?" as NA, then drop rows with missing values
Auto = read.csv("../../datasets/Auto.csv", header = T, na.strings = "?")
Auto = na.omit(Auto)
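
As a quick sanity check (a sketch, not part of the original answer), the size of the cleaned data frame can be inspected; it should contain 392 complete rows, which matches the 390 residual degrees of freedom (392 observations minus 2 estimated coefficients) reported in the model summary below.

dim(Auto)   # expect 392 rows after dropping observations with missing horsepower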

(a)

In [3]:
# Fit a simple linear regression of mpg on horsepower
lm.model = lm(mpg ~ horsepower, data = Auto)
summary(lm.model)
Call:
lm(formula = mpg ~ horsepower, data = Auto)

Residuals:
     Min       1Q   Median       3Q      Max 
-13.5710  -3.2592  -0.3435   2.7630  16.9240 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)    
(Intercept) 39.935861   0.717499   55.66   <2e-16 ***
horsepower  -0.157845   0.006446  -24.49   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 4.906 on 390 degrees of freedom
Multiple R-squared:  0.6059,	Adjusted R-squared:  0.6049 
F-statistic: 599.7 on 1 and 390 DF,  p-value: < 2.2e-16

(i)

The large F-statistic (599.7) and its very small p-value (< 2.2e-16) show that there is a strong relationship between the predictor and the response: we can reject the null hypothesis that the coefficient of horsepower is zero.
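
For reference, a minimal sketch (assumed code, not part of the original answer) of how the F-statistic and the relevant p-values can be pulled out of the fitted model object:

s <- summary(lm.model)
s$fstatistic                                 # F value with numerator and denominator df
pf(s$fstatistic[1], s$fstatistic[2], s$fstatistic[3],
   lower.tail = FALSE)                       # p-value of the overall F-test
s$coefficients["horsepower", "Pr(>|t|)"]     # t-test p-value for the horsepower slope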

(ii)

In [14]:
mean(Auto$mpg)
23.4459183673469

The residual standard error is 4.906 and the mean response is 23.446, giving a percentage error of about 20.9% (4.906 / 23.446 ≈ 0.209). The R-squared value is 0.6059, so about 60.6% of the variability in mpg is explained by the model.
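
A short sketch of how those two figures can be reproduced from the fitted model (assuming R >= 3.3 for sigma(); summary(lm.model)$sigma gives the same value on older versions):

sigma(lm.model) / mean(Auto$mpg)   # relative error: 4.906 / 23.446, roughly 0.209
summary(lm.model)$r.squared        # roughly 0.6059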

(iii)

The coefficient of horsepower is negative (-0.158), so mpg decreases as horsepower increases. The relationship between the predictor and the response is therefore negative.
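
The slope can also be read directly off the coefficient vector, as in this small sketch:

coef(lm.model)["horsepower"]   # roughly -0.158: mpg drops by about 0.16 per unit of horsepower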

(iv)

In [8]:
# Predicted mpg for horsepower = 98
predict(lm.model,data.frame(horsepower=98))
1: 24.4670771525124
In [10]:
#95% confidence interval
predict(lm.model,data.frame(horsepower=98),interval="confidence")
       fit      lwr      upr
1 24.46708 23.97308 24.96108
In [11]:
#95% prediction interval
predict(lm.model,data.frame(horsepower=98),interval="prediction")
       fit     lwr      upr
1 24.46708 14.8094 34.12476
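
Both intervals are centred on the same fitted value of about 24.47 mpg, but the prediction interval is much wider because it accounts for the irreducible error of a single new observation as well as the uncertainty in the estimated line. A small sketch comparing the widths (the object names are assumptions; the widths come from the output above):

conf.int <- predict(lm.model, data.frame(horsepower = 98), interval = "confidence")
pred.int <- predict(lm.model, data.frame(horsepower = 98), interval = "prediction")
conf.int[, "upr"] - conf.int[, "lwr"]   # about 0.99 mpg wide
pred.int[, "upr"] - pred.int[, "lwr"]   # about 19.3 mpg wide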

(b)

In [12]:
# Scatter plot of mpg against horsepower, with the least squares line overlaid
plot(Auto$horsepower, Auto$mpg)
abline(lm.model)

(c)

In [13]:
# 2x2 grid of the standard lm diagnostic plots
par(mfrow=c(2,2))
plot(lm.model)

The Residuals vs Fitted plot shows a clear pattern (curvature) rather than a random scatter, which indicates that the relationship between the response and the predictor is non-linear.
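
One way to examine that pattern more closely (a sketch, not part of the original answer) is to plot the residuals against the fitted values directly and overlay a smoothed trend line:

plot(fitted(lm.model), residuals(lm.model),
     xlab = "Fitted values", ylab = "Residuals")
lines(lowess(fitted(lm.model), residuals(lm.model)), col = "red")   # smoothed trend
abline(h = 0, lty = 2)   # residuals should scatter evenly around zero if the fit is adequate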
