Chapter 3 Linear Regression Question 11
# Simulate data from the true model y = 2x + noise
set.seed(1)
x = rnorm(100)
y = 2*x + rnorm(100)
# (a) Regress y onto x without an intercept
lm.model = lm(y ~ x + 0)
summary(lm.model)
plot(x, y)
abline(lm.model)
coef(summary(lm.model))
The p-value for the t-statistic of the coefficient is near zero, so we can reject the null hypothesis that the coefficient is zero: the estimate is statistically significant. The estimated coefficient is also close to the true value of 2 used to generate the data.
# (b) Regress x onto y without an intercept
lm.model_x_y = lm(x ~ y + 0)
summary(lm.model_x_y)
coef(summary(lm.model_x_y))
The p-value for the t-statistic is again near zero, so we can reject the null hypothesis that the coefficient of y is zero.
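(a) and (b) describe the same linear relationship written in two forms: (a) y = 2x + error and (b) x = 0.5(y - error). Note, however, that the least-squares slope in (b) is $\sum xy / \sum y^{2}$, which is not the reciprocal of the slope in (a). Because the noise in y inflates $\sum y^{2}$, the fitted slope in (b) is pulled below 0.5 (toward roughly 0.4 when x and the error both have unit variance).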
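The relationship between the two fitted slopes can be checked numerically. The following is a minimal sketch that regenerates the same simulated data; the names `b_yx` and `b_xy` are introduced here purely for illustration.

```r
# Regenerate the simulated data used in this exercise
set.seed(1)
x = rnorm(100)
y = 2*x + rnorm(100)

# No-intercept slopes as fitted by lm()
b_yx = unname(coef(lm(y ~ x + 0)))  # y onto x
b_xy = unname(coef(lm(x ~ y + 0)))  # x onto y

# Both should agree with the closed-form no-intercept estimates
print(all.equal(b_yx, sum(x*y) / sum(x^2)))
print(all.equal(b_xy, sum(x*y) / sum(y^2)))

# The product of the slopes equals (sum xy)^2 / (sum x^2 * sum y^2),
# which cannot exceed 1 (Cauchy-Schwarz), so the slopes are not reciprocals
# unless the fit is perfect
print(b_yx * b_xy)
```

If the two slopes were exact reciprocals, the last value printed would be 1; any noise in the data pushes it below 1.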
$$
\beta = \frac{\sum xy}{\sum x^{2}} \qquad SE(\beta) = \sqrt{\frac{\sum (y - x\beta)^{2}}{(n-1)\sum x^{2}}}
$$

$$
\begin{aligned}
\text{t-statistic} = \frac{\beta}{SE(\beta)} &= \frac{\sum xy}{\sum x^{2}} \cdot \sqrt{\frac{(n-1)\sum x^{2}}{\sum (y - x\beta)^{2}}} \\
&= \frac{\sqrt{n-1}\,\sum xy}{\sqrt{\sum x^{2} \sum (y - x\beta)^{2}}} \\
&= \frac{\sqrt{n-1}\,\sum xy}{\sqrt{\sum x^{2} \sum (y^{2} - 2xy\beta + x^{2}\beta^{2})}} \\
&= \frac{\sqrt{n-1}\,\sum xy}{\sqrt{\sum x^{2} \left(\sum y^{2} - 2\beta \sum xy + \beta^{2} \sum x^{2}\right)}} \\
&= \frac{\sqrt{n-1}\,\sum xy}{\sqrt{\sum x^{2} \left(\sum y^{2} - 2\frac{\sum xy}{\sum x^{2}} \sum xy + \left(\frac{\sum xy}{\sum x^{2}}\right)^{2} \sum x^{2}\right)}} \\
&= \frac{\sqrt{n-1}\,\sum xy}{\sqrt{\sum x^{2} \sum y^{2} - 2\left(\sum xy\right)^{2} + \left(\sum xy\right)^{2}}} \\
&= \frac{\sqrt{n-1}\,\sum xy}{\sqrt{\sum x^{2} \sum y^{2} - \left(\sum xy\right)^{2}}}
\end{aligned}
$$
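# (d) Compute the t-statistic directly from the closed-form expression
n = length(x)
X = matrix(data=x, nrow=n, ncol=1)
Y = matrix(data=y, nrow=n, ncol=1)
# Named t_stat rather than t to avoid masking base R's t() (transpose)
t_stat = (sqrt(n-1)*sum(X*Y))/sqrt(sum(X^2)*sum(Y^2)-(sum(X*Y))^2)
print(t_stat)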
The t-statistic calculated from the formula matches the t-statistic reported in the model summary.
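Rather than comparing printed values by eye, the agreement can be tested programmatically. This is a sketch under the assumption that the data are regenerated with the same seed; `beta_hat`, `se_hat`, `t_formula`, and `t_lm` are illustrative names, not part of the original solution.

```r
# Regenerate the simulated data used in this exercise
set.seed(1)
x = rnorm(100)
y = 2*x + rnorm(100)
n = length(x)

# t-statistic reported by lm() for the no-intercept fit
fit = lm(y ~ x + 0)
t_lm = coef(summary(fit))["x", "t value"]

# Intermediate quantities from the derivation
beta_hat = sum(x*y) / sum(x^2)
se_hat = sqrt(sum((y - x*beta_hat)^2) / ((n - 1) * sum(x^2)))

# Final closed-form expression
t_formula = sqrt(n - 1) * sum(x*y) / sqrt(sum(x^2) * sum(y^2) - sum(x*y)^2)

# Both routes should agree with lm() up to floating-point tolerance
print(all.equal(beta_hat / se_hat, t_lm))
print(all.equal(t_formula, t_lm))
```

Using `all.equal()` instead of `==` allows for small floating-point differences between the hand computation and `lm()`.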
The t-statistic formula derived in (d) is symmetric in x and y: every term is either a product of x and y or a product of $\sum x^{2}$ and $\sum y^{2}$, so swapping x and y leaves the value unchanged.
(sqrt(n-1)*sum(X*Y))/sqrt(sum(X^2)*sum(Y^2)-(sum(X*Y))^2) = (sqrt(n-1)*sum(Y*X))/sqrt(sum(Y^2)*sum(X^2)-(sum(Y*X))^2)
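The symmetry can also be confirmed by wrapping the expression in a small function and calling it with the arguments in both orders. This is only an illustrative sketch; `t_no_intercept` is a hypothetical helper name, and the data are regenerated with the same seed.

```r
# Hypothetical helper: t-statistic for regression of v onto u without an intercept
t_no_intercept = function(u, v) {
  n = length(u)
  sqrt(n - 1) * sum(u*v) / sqrt(sum(u^2) * sum(v^2) - sum(u*v)^2)
}

# Regenerate the simulated data used in this exercise
set.seed(1)
x = rnorm(100)
y = 2*x + rnorm(100)

# Swapping the arguments should not change the result
t_xy_order = t_no_intercept(x, y)
t_yx_order = t_no_intercept(y, x)
print(all.equal(t_xy_order, t_yx_order))

# Both no-intercept regressions should report this same t-statistic
t_a = coef(summary(lm(y ~ x + 0)))["x", "t value"]
t_b = coef(summary(lm(x ~ y + 0)))["y", "t value"]
print(all.equal(t_a, t_b))
print(all.equal(t_a, t_xy_order))
```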
# (f) Repeat both regressions with an intercept
lm.model1 = lm(y ~ x)
summary(lm.model1)
lm.model2 = lm(x ~ y)
summary(lm.model2)
In both regressions with an intercept, y onto x and x onto y, the t-statistic for $\beta_1$ is the same: 18.56.
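The equality with an intercept can be checked in code as well. One way to see why it holds is that, in simple linear regression, the slope t-statistic can be written in terms of the sample correlation r as $r\sqrt{n-2}/\sqrt{1-r^{2}}$, and correlation is symmetric in x and y. The sketch below assumes the same simulated data; `t_yx`, `t_xy`, and `t_cor` are illustrative names.

```r
# Regenerate the simulated data used in this exercise
set.seed(1)
x = rnorm(100)
y = 2*x + rnorm(100)
n = length(x)

# Slope t-statistics from the two regressions with an intercept
t_yx = coef(summary(lm(y ~ x)))["x", "t value"]
t_xy = coef(summary(lm(x ~ y)))["y", "t value"]

# Same statistic expressed through the sample correlation
r = cor(x, y)
t_cor = r * sqrt(n - 2) / sqrt(1 - r^2)

print(all.equal(t_yx, t_xy))
print(all.equal(t_yx, t_cor))
```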