Chapter 3 Linear Regression, Question 10
library(ISLR)
Carseats = na.omit(Carseats)
summary(Carseats)
lm.model = lm(Sales~Price+Urban+US,data=Carseats)
summary(lm.model)
The coefficient of Price is statistically significant, given its very small p-value. Because Sales is recorded in thousands of units, each $1 increase in price is associated with a decrease in sales of about 54 units on average (about 5,400 units per $100), holding Urban and US fixed. The qualitative variables Urban and US enter the model as the dummy variables UrbanYes and USYes. UrbanYes = 1 if the store is in an urban area and 0 otherwise; USYes = 1 if the store is in the US and 0 otherwise. The p-value for the t-statistic of the UrbanYes coefficient is larger than the 5% significance level, so this coefficient is not statistically significant: there is no evidence of a relationship between car seat sales and whether the store is in an urban area. The p-value for USYes, however, is far below 5%, so its coefficient is statistically significant. The coefficient is positive, which means that US stores sell about 1,201 more units on average than non-US stores.
Sales = 13.04 - 0.054 Price - 0.022 UrbanYes + 1.2 USYes
For Price and USYes we can reject the null hypothesis that their coefficients are equal to 0, as the p-values of their t-statistics are very low.
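The interpretation above can be checked directly from the fitted coefficients. The following is a minimal sketch, not part of the original solution: it refits the same model so that it runs on its own, and the names `fit`, `units_per_dollar`, `units_us_vs_non` and `p_values` are introduced here for illustration only. It relies on the fact that Sales in the Carseats data is measured in thousands of units.

```r
library(ISLR)  # provides the Carseats data set

# Refit the model from the text so that this block runs on its own.
fit <- lm(Sales ~ Price + Urban + US, data = Carseats)
coefs <- coef(fit)

# Sales is in thousands of units, so multiply by 1000 to express effects in units.
units_per_dollar <- unname(coefs["Price"]) * 1000  # change in units sold per $1 of Price
units_us_vs_non  <- unname(coefs["USYes"]) * 1000  # US minus non-US stores, in units

# p-values of the coefficient t-tests, taken from the summary table.
p_values <- summary(fit)$coefficients[, "Pr(>|t|)"]

print(round(c(per_dollar = units_per_dollar, us_effect = units_us_vs_non)))
print(p_values < 0.05)  # TRUE where the coefficient is significant at the 5% level
```

Working in units rather than thousands makes the size of each effect easier to read off, and the logical vector at the end shows at a glance which coefficients pass the 5% threshold.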
lm.model_small = lm(Sales~Price+US,data=Carseats)
summary(lm.model_small)
anova(lm.model,lm.model_small)
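According to the ANOVA test, the p-value of the F-statistic is larger than 5%. We therefore fail to reject the null hypothesis that the coefficient dropped from the larger model (UrbanYes) is 0. Removing Urban does not significantly worsen the fit, so the smaller model is preferred.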
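To see what the ANOVA comparison computes, the F-statistic can be rebuilt by hand from the residual sums of squares of the two models. This is a sketch under the assumption that the two models are nested, as they are here; the names `full`, `small`, `f_stat` and `p_val` are chosen for this example and do not appear in the original code.

```r
library(ISLR)  # provides the Carseats data set

full  <- lm(Sales ~ Price + Urban + US, data = Carseats)
small <- lm(Sales ~ Price + US, data = Carseats)

# Residual sums of squares of both fits.
rss_full  <- sum(resid(full)^2)
rss_small <- sum(resid(small)^2)

# Number of coefficients dropped from the full model (here 1, UrbanYes).
q <- df.residual(small) - df.residual(full)

# Partial F-statistic and its upper-tail p-value.
f_stat <- ((rss_small - rss_full) / q) / (rss_full / df.residual(full))
p_val  <- pf(f_stat, q, df.residual(full), lower.tail = FALSE)

# anova() reports the same quantities in the second row of its table.
tab <- anova(small, full)
print(c(F_manual = f_stat, F_anova = tab$F[2], p_manual = p_val))
```

Because only one coefficient is dropped, this F-test is equivalent to the t-test on UrbanYes in the full model, which is why both give the same p-value.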
confint(lm.model_small, level=0.95)
plot(predict(lm.model_small),rstudent(lm.model_small))
All studentized residuals lie between -3 and 3, so the plot gives no evidence of outliers.
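The same check can be done numerically rather than by reading the plot. A small sketch, with the model refitted so the block is self-contained; the cutoff of 3 is the usual rule of thumb, and `small`, `r` and `flagged` are names introduced here.

```r
library(ISLR)  # provides the Carseats data set

small <- lm(Sales ~ Price + US, data = Carseats)

# Studentized residuals, one per observation.
r <- rstudent(small)
print(range(r))

# Indices of observations whose studentized residual exceeds 3 in absolute value.
flagged <- which(abs(r) > 3)
print(length(flagged))
```

An empty `flagged` vector corresponds to the conclusion drawn from the plot.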
par(mfrow=c(2,2))
plot(lm.model_small)
The Residuals vs Leverage plot shows that some observations have high leverage.
p=2
n=nrow(Carseats)
(p+1)/n
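The average leverage is (p+1)/n = 3/400 = 0.0075. Exceeding this average slightly is not unusual, but observations whose leverage is far above 0.0075 should be suspected of being high-leverage points, and the plot shows several of them.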
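The leverage values themselves can be extracted with `hatvalues()` and compared with the average. The sketch below is illustrative only: the cutoff of three times the average leverage is an assumed rule of thumb, not something stated in the exercise, and the names `h`, `avg_lev` and `high` are introduced here.

```r
library(ISLR)  # provides the Carseats data set

small <- lm(Sales ~ Price + US, data = Carseats)

h <- hatvalues(small)            # leverage statistic of each observation
p <- length(coef(small)) - 1     # number of predictors (2)
n <- nrow(Carseats)
avg_lev <- (p + 1) / n           # average leverage; always equals mean(h)

# Assumed rule of thumb: flag observations with more than 3x the average leverage.
high <- which(h > 3 * avg_lev)

print(avg_lev)
print(sum(h > avg_lev))               # observations merely above the average
print(sort(h[high], decreasing = TRUE))  # observations far above the average
```

Comparing the two counts shows why the average alone is a weak cutoff: many observations sit above it, while only a few exceed it by a wide margin.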