Chapter 8 Tree-Based Methods - Question 9
library(ISLR)
summary(OJ)
set.seed(1)
train = sample(1:nrow(OJ),800)
library(tree)
oj.model = tree(Purchase~.,data=OJ,subset=train)
summary(oj.model)
The tree uses the predictors LoyalCH, PriceDiff, SpecialCH, and ListPriceDiff. It has 8 terminal nodes, and its training error rate is 16.5%.
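These numbers can also be pulled out programmatically instead of read off the printout. A sketch, assuming oj.model has been fit as above; for classification trees, summary() on a tree object carries a misclass component:

```r
# summary() of a classification tree returns (among other things) the number of
# misclassified training samples and the training set size.
s = summary(oj.model)
s$misclass                   # c(number misclassified, n)
s$misclass[1] / s$misclass[2]  # training error rate
```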
oj.model
Node number 20 is a terminal node, which is indicated by the asterisk symbol (*). It corresponds to the region defined by the split SpecialCH < 0.5 and contains 70 training samples. Its deviance is 60.89, and the tree's prediction for every sample in this region is MM. The proportion of samples in this region that actually belong to MM is about 0.84, and the remaining 0.16 belong to CH.
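The same per-node details can be inspected directly on the fitted object. A sketch, assuming oj.model from above; in the tree package, the frame component holds one row per node:

```r
# Each row of the frame is one node: the splitting variable ("<leaf>" for a
# terminal node), the node size n, the deviance, and the predicted class yval.
oj.model$frame[, c("var", "n", "dev", "yval")]

# Restrict to the terminal nodes only:
subset(oj.model$frame, var == "<leaf>")
```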
plot(oj.model)
text(oj.model)
At the root node the variable LoyalCH is used to split the predictor space into two regions, which indicates that LoyalCH is the most important variable in the model. Its splitting value is 0.5: samples with a LoyalCH value less than 0.5 follow the left branch of the tree, and the rest follow the right branch. These two regions are divided further at certain splitting values, as shown in the figure above. At the bottom of the tree there are 8 terminal nodes, each of which predicts a class for the samples that reach it.
yhat = predict(oj.model,newdata=OJ[-train,],type="class")
table(yhat,OJ$Purchase[-train])
mean(yhat!=OJ$Purchase[-train])*100
The test error rate is 22.5%.
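The same rate can be read off the confusion matrix above: the off-diagonal entries are the misclassified test samples. A sketch, assuming yhat and train from the code above:

```r
conf = table(yhat, OJ$Purchase[-train])
# Off-diagonal counts are misclassifications, so this reproduces
# mean(yhat != OJ$Purchase[-train]) * 100
(1 - sum(diag(conf)) / sum(conf)) * 100
```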
cv.oj.model = cv.tree(oj.model)
cv.oj.model
The tree with 6 terminal nodes gives the lowest cross-validated deviance and is therefore the best model in this case.
plot(cv.oj.model$size,cv.oj.model$dev,xlab="size",ylab="deviance",type="b")
According to the graph, the tree with 6 terminal nodes gives the lowest cross-validated deviance. (Note that cv.tree was called with its default FUN = prune.tree, so it reports deviance rather than the misclassification rate; to cross-validate on classification error directly, pass FUN = prune.misclass.)
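Rather than reading the minimum off the plot, the optimal size can be extracted programmatically. A sketch, assuming cv.oj.model from above; note that which.min returns the first minimum if several sizes tie:

```r
# Pick the tree size whose cross-validated deviance is smallest.
best.size = cv.oj.model$size[which.min(cv.oj.model$dev)]
best.size  # 6 for the run reported above
```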
pruned.oj.model = prune.tree(oj.model,best=6)
plot(pruned.oj.model)
text(pruned.oj.model)
summary(pruned.oj.model)
summary(oj.model)
The training error rate of the pruned tree is 0.17, which is slightly higher than the 0.165 of the unpruned tree.
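The two training error rates can also be computed side by side, rather than read from the two summary() printouts. A sketch, assuming the fitted trees and train index from above; train.err is a hypothetical helper name:

```r
# Hypothetical helper: training misclassification rate of a fitted tree
train.err = function(model) {
  pred = predict(model, newdata = OJ[train, ], type = "class")
  mean(pred != OJ$Purchase[train])
}
train.err(oj.model)         # unpruned tree
train.err(pruned.oj.model)  # pruned tree, slightly higher
```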
#test error rate of unpruned tree
yhat = predict(oj.model,newdata=OJ[-train,],type="class")
mean(yhat!=OJ$Purchase[-train])*100
#test error rate of pruned tree
yhat = predict(pruned.oj.model,newdata=OJ[-train,],type="class")
mean(yhat!=OJ$Purchase[-train])*100
The pruned tree gives a lower test error rate.
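To close, the two test error rates can be collected into one small table for comparison. A sketch, assuming the fitted trees and train index from above; test.err is a hypothetical helper name:

```r
# Hypothetical helper: test misclassification rate (in %) of a fitted tree
test.err = function(model) {
  pred = predict(model, newdata = OJ[-train, ], type = "class")
  mean(pred != OJ$Purchase[-train]) * 100
}
data.frame(tree = c("unpruned", "pruned"),
           test.error.pct = c(test.err(oj.model), test.err(pruned.oj.model)))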