Chapter 5: Resampling Methods, Q9

In [1]:
library(MASS)
summary(Boston)
      crim                zn             indus            chas        
 Min.   : 0.00632   Min.   :  0.00   Min.   : 0.46   Min.   :0.00000  
 1st Qu.: 0.08204   1st Qu.:  0.00   1st Qu.: 5.19   1st Qu.:0.00000  
 Median : 0.25651   Median :  0.00   Median : 9.69   Median :0.00000  
 Mean   : 3.61352   Mean   : 11.36   Mean   :11.14   Mean   :0.06917  
 3rd Qu.: 3.67708   3rd Qu.: 12.50   3rd Qu.:18.10   3rd Qu.:0.00000  
 Max.   :88.97620   Max.   :100.00   Max.   :27.74   Max.   :1.00000  
      nox               rm             age              dis        
 Min.   :0.3850   Min.   :3.561   Min.   :  2.90   Min.   : 1.130  
 1st Qu.:0.4490   1st Qu.:5.886   1st Qu.: 45.02   1st Qu.: 2.100  
 Median :0.5380   Median :6.208   Median : 77.50   Median : 3.207  
 Mean   :0.5547   Mean   :6.285   Mean   : 68.57   Mean   : 3.795  
 3rd Qu.:0.6240   3rd Qu.:6.623   3rd Qu.: 94.08   3rd Qu.: 5.188  
 Max.   :0.8710   Max.   :8.780   Max.   :100.00   Max.   :12.127  
      rad              tax           ptratio          black       
 Min.   : 1.000   Min.   :187.0   Min.   :12.60   Min.   :  0.32  
 1st Qu.: 4.000   1st Qu.:279.0   1st Qu.:17.40   1st Qu.:375.38  
 Median : 5.000   Median :330.0   Median :19.05   Median :391.44  
 Mean   : 9.549   Mean   :408.2   Mean   :18.46   Mean   :356.67  
 3rd Qu.:24.000   3rd Qu.:666.0   3rd Qu.:20.20   3rd Qu.:396.23  
 Max.   :24.000   Max.   :711.0   Max.   :22.00   Max.   :396.90  
     lstat            medv      
 Min.   : 1.73   Min.   : 5.00  
 1st Qu.: 6.95   1st Qu.:17.02  
 Median :11.36   Median :21.20  
 Mean   :12.65   Mean   :22.53  
 3rd Qu.:16.95   3rd Qu.:25.00  
 Max.   :37.97   Max.   :50.00  

(a)

In [2]:
mean(Boston$medv)
22.5328063241107

(b)

In [3]:
sd(Boston$medv)/sqrt(length(Boston$medv))
0.408861147497535

(c)

In [4]:
library(boot)
set.seed(2)
mean.function = function(data,index){
    d = data$medv
    return(mean(d[index]))
}
bootstrap = boot(data=Boston,statistic=mean.function,R=1000)
bootstrap
ORDINARY NONPARAMETRIC BOOTSTRAP


Call:
boot(data = Boston, statistic = mean.function, R = 1000)


Bootstrap Statistics :
    original      bias    std. error
t1* 22.53281 -0.01969308   0.4192166

The bootstrap standard error of the mean (0.419) is close to the formula-based estimate from (b) (0.409).
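To make the comparison concrete, the resampling that boot() performs can be sketched by hand with sample(). This is only an illustrative sketch: the names boot_means, se_boot and se_formula are introduced here and are not part of the original notebook, and the exact value depends on the seed and R version, so it will not necessarily reproduce the boot() output above digit for digit.

```r
library(MASS)  # provides the Boston data set

set.seed(2)    # arbitrary seed, used only for reproducibility
medv <- Boston$medv
n <- length(medv)
R <- 1000      # number of bootstrap replicates, matching the boot() call

# Draw n observations with replacement and record the mean, R times
boot_means <- replicate(R, mean(sample(medv, size = n, replace = TRUE)))

se_boot <- sd(boot_means)         # bootstrap standard error of the mean
se_formula <- sd(medv) / sqrt(n)  # formula-based estimate from (b)

c(bootstrap = se_boot, formula = se_formula)
```

The standard deviation of the resampled means plays the role of the "std. error" column that boot() reports.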

(d)

In [5]:
#95% confidence interval
print("lower bound")
22.533-2*0.419
print("upper bound")
22.533+2*0.419
[1] "lower bound"
21.695
[1] "upper bound"
23.371

In [6]:
t.test(Boston$medv)
	One Sample t-test

data:  Boston$medv
t = 55.111, df = 505, p-value < 2.2e-16
alternative hypothesis: true mean is not equal to 0
95 percent confidence interval:
 21.72953 23.33608
sample estimates:
mean of x 
 22.53281 

The approximate 95% confidence interval from the bootstrap, [21.695, 23.371], is very close to the interval from t.test, [21.730, 23.336].
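Rather than typing the rounded estimate and standard error by hand, the same interval can be computed from the boot object. The sketch below assumes the same statistic function as in (c); the names est, se, ci_2se and ci are introduced here for illustration. It also calls boot.ci() from the boot package to obtain normal and percentile intervals as a cross-check. Exact numbers depend on the seed and R version.

```r
library(MASS)
library(boot)

set.seed(2)
mean.function <- function(data, index) mean(data$medv[index])
bootstrap <- boot(data = Boston, statistic = mean.function, R = 1000)

est <- bootstrap$t0          # estimate on the original data
se <- sd(bootstrap$t[, 1])   # bootstrap standard error

# Approximate 95% interval: estimate plus or minus two standard errors
ci_2se <- c(lower = est - 2 * se, upper = est + 2 * se)
ci_2se

# Normal-approximation and percentile intervals from the same replicates
ci <- boot.ci(bootstrap, conf = 0.95, type = c("norm", "perc"))
ci$normal    # columns: level, lower, upper
ci$percent   # columns: level, two order-statistic indices, lower, upper
```

The percentile interval does not rely on the sampling distribution being symmetric, which makes it a useful check on the two-standard-error rule.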

(e)

In [7]:
median(Boston$medv)
21.2

(f)

In [9]:
median.function = function(data,index){
    d = data$medv
    return(median(d[index]))
}
boot(data=Boston,statistic = median.function,R=1000)
ORDINARY NONPARAMETRIC BOOTSTRAP


Call:
boot(data = Boston, statistic = median.function, R = 1000)


Bootstrap Statistics :
    original  bias    std. error
t1*     21.2 -0.0414   0.3793151

The median is 21.2 with a bootstrap standard error of 0.379, which is small relative to the estimate (about 1.8% of the median).

(g)

In [13]:
quantile(Boston$medv, 0.1)
10%: 12.75

(h)

In [14]:
quantile.function = function(data,index){
    d = data$medv
    return(quantile(d[index],0.1))
}
boot(data=Boston,statistic=quantile.function,R=1000)
ORDINARY NONPARAMETRIC BOOTSTRAP


Call:
boot(data = Boston, statistic = quantile.function, R = 1000)


Bootstrap Statistics :
    original  bias    std. error
t1*    12.75  0.0325   0.4990018

The 10th percentile is 12.75 with a bootstrap standard error of 0.499, which is small relative to the estimate (about 3.9% of the 10th percentile).
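The statistic functions in (f) and (h) differ only in the probability passed to the quantile, so they can be folded into one function. The sketch below uses a hypothetical helper, quantile.stat, with an extra argument prob, and relies on boot() forwarding additional named arguments to the statistic. It then expresses each standard error as a fraction of its estimate; the names boot_median, boot_p10 and rel_se are illustrative only.

```r
library(MASS)
library(boot)

# Hypothetical helper: quantile of medv at probability `prob`
# for the bootstrap sample selected by `index`
quantile.stat <- function(data, index, prob) {
    unname(quantile(data$medv[index], probs = prob))
}

set.seed(2)
# Extra named arguments (here `prob`) are passed on to the statistic
boot_median <- boot(data = Boston, statistic = quantile.stat, R = 1000, prob = 0.5)
boot_p10 <- boot(data = Boston, statistic = quantile.stat, R = 1000, prob = 0.1)

# Standard error as a fraction of the corresponding estimate
rel_se <- c(median = sd(boot_median$t[, 1]) / boot_median$t0,
            p10 = sd(boot_p10$t[, 1]) / boot_p10$t0)
rel_se
```

Comparing the two ratios shows how much less precisely a tail quantile is estimated than the median from the same sample.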