Chapter 5: Resampling Methods, Question 7

In [2]:
library(ISLR)
names(Weekly)
  1. 'Year'
  2. 'Lag1'
  3. 'Lag2'
  4. 'Lag3'
  5. 'Lag4'
  6. 'Lag5'
  7. 'Volume'
  8. 'Today'
  9. 'Direction'

(a)

In [4]:
glm.model = glm(Direction~Lag1+Lag2,data=Weekly,family=binomial)
summary(glm.model)
Call:
glm(formula = Direction ~ Lag1 + Lag2, family = binomial, data = Weekly)

Deviance Residuals: 
   Min      1Q  Median      3Q     Max  
-1.623  -1.261   1.001   1.083   1.506  

Coefficients:
            Estimate Std. Error z value Pr(>|z|)    
(Intercept)  0.22122    0.06147   3.599 0.000319 ***
Lag1        -0.03872    0.02622  -1.477 0.139672    
Lag2         0.06025    0.02655   2.270 0.023232 *  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 1496.2  on 1088  degrees of freedom
Residual deviance: 1488.2  on 1086  degrees of freedom
AIC: 1494.2

Number of Fisher Scoring iterations: 4

(b)

In [22]:
cv.glm.model = glm(Direction~Lag1+Lag2,data=Weekly,family=binomial,subset=-1)

(c)

In [23]:
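# Predicted probability of 'Up' for the held-out first observation
prob = predict(cv.glm.model,newdata=Weekly[1,],type="response")
# Classify as 'Up' when the probability exceeds 0.5, otherwise 'Down'
# (named pred rather than class to avoid masking base R's class())
pred = if(prob>0.5) 'Up' else 'Down'
pred == Weekly$Direction[1]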
FALSE

The first observation was not classified correctly: the model predicts 'Up', but the true direction is 'Down'.

(d)

In [27]:
err=0
for(i in 1:nrow(Weekly)){
    # Fit on every observation except the i-th
    loo.model = glm(Direction~Lag1+Lag2,data=Weekly,family=binomial,subset=-i)
    # Predicted probability of 'Up' for the held-out i-th observation
    prob = predict(loo.model,newdata=Weekly[i,],type="response")
    pred = if(prob>0.5) 'Up' else 'Down'
    # Count a misclassification
    if(pred != Weekly$Direction[i]) err=err+1
}
err
490

LOOCV misclassified 490 of the 1089 observations.
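
The leave-one-out loop in (d) can also be wrapped in a reusable function. What follows is a minimal sketch rather than part of the original solution: `loocv_errors`, its `threshold` argument, and the simulated `demo` data frame are hypothetical names introduced here for illustration. It assumes a two-level factor response whose second level is the "positive" class, as with `Direction` (levels 'Down', 'Up').

```r
# Hypothetical helper (not in the original notebook): count LOOCV
# misclassifications for a logistic regression at a given threshold.
loocv_errors <- function(formula, data, threshold = 0.5) {
  response <- all.vars(formula)[1]          # name of the response column
  lvls <- levels(data[[response]])          # e.g. c("Down", "Up")
  wrong <- vapply(seq_len(nrow(data)), function(i) {
    # Fit on all rows except row i
    fit <- glm(formula, data = data[-i, , drop = FALSE], family = binomial)
    # Predicted probability of the second factor level for row i
    prob <- predict(fit, newdata = data[i, , drop = FALSE], type = "response")
    pred <- if (prob > threshold) lvls[2] else lvls[1]
    pred != as.character(data[[response]][i])
  }, logical(1))
  sum(wrong)
}

# Simulated stand-in data (an assumption; this is NOT the Weekly data set)
set.seed(1)
n <- 60
x <- rnorm(n)
y <- factor(ifelse(runif(n) < plogis(0.8 * x), "Up", "Down"),
            levels = c("Down", "Up"))
demo <- data.frame(x = x, y = y)

demo_errors <- loocv_errors(y ~ x, demo)    # number of LOOCV misclassifications
demo_errors / n                             # LOOCV error rate on the demo data
```

With ISLR loaded, calling `loocv_errors(Direction ~ Lag1 + Lag2, Weekly)` should reproduce the count obtained by the loop above, since it performs the same fit-and-predict steps.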

(e)

In [28]:
# LOOCV estimate of the test error rate, in percent
err/nrow(Weekly)*100
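44.9954086317723

The LOOCV estimate of the test error rate is about 45% (490/1089), only modestly better than the 50% expected from random guessing.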
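
As a cross-check on a hand-written loop like the one in (d), the `boot` package's `cv.glm()` can compute the same leave-one-out estimate when given a misclassification cost function. The sketch below runs on simulated data so that it is self-contained; the data frame `sim`, the function name `misclass_cost`, and the coefficients used to generate the data are assumptions made for illustration, not part of the original solution.

```r
library(boot)  # provides cv.glm()

# Simulated stand-in data (an assumption; this is NOT the Weekly data set)
set.seed(42)
n <- 80
sim <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
sim$y <- factor(
  ifelse(runif(n) < plogis(0.3 - 0.5 * sim$x1 + 0.7 * sim$x2), "Up", "Down"),
  levels = c("Down", "Up")
)

fit <- glm(y ~ x1 + x2, data = sim, family = binomial)

# Misclassification cost at a 0.5 threshold:
# r is the 0/1 response, p the predicted probability
misclass_cost <- function(r, p = 0) mean(abs(r - p) > 0.5)

# K defaults to the number of rows (leave-one-out);
# delta[1] is the raw cross-validation estimate
loocv_rate <- as.numeric(cv.glm(sim, fit, cost = misclass_cost)$delta[1])

# Manual leave-one-out loop for comparison
manual_wrong <- sapply(seq_len(n), function(i) {
  fit_i <- glm(y ~ x1 + x2, data = sim[-i, ], family = binomial)
  p_i <- predict(fit_i, newdata = sim[i, ], type = "response")
  (p_i > 0.5) != (sim$y[i] == "Up")
})
manual_rate <- mean(manual_wrong)

c(cv.glm = loocv_rate, manual = manual_rate)  # the two estimates should agree
```

For the model in this exercise, the analogous call would be `cv.glm(Weekly, glm(Direction~Lag1+Lag2,data=Weekly,family=binomial), cost = misclass_cost)$delta[1]`, which should agree with `err/nrow(Weekly)` as a proportion rather than a percentage.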