Midterm 1 22S:166, Fall 2006 Name: Solutions Copy and paste your R code and its numeric output for each question into this text file. Submit by emailing to kcowles@stat.uiowa.edu 1. a. BaP <- read.table("/group/ftp/pub/kcowles/datasets/BaP.txt",header=T) attach(BaP) b. library(e1071) skewness(indoor) [1] 1.618915 c. Function to compute .75 quantile: q75 <- function(x) {quantile(x, 0.75) } Function to do jackknife: > jack function( data, func ) { n <- length(data) thetahatn <- func( data ) thetahat <- rep(0,n) thetawig <- rep(0,n) # pseudovalues for (i in 1:n) { thetahat[i] <- func( data[-i] ) thetawig[i] <- n * thetahatn - (n-1) * thetahat[i] } thetahat.dot <- mean(thetahat) thetawig.dot <- mean(thetawig) se.thetan <- sqrt(sum((thetawig - thetawig.dot)^2) / (n * (n-1))) se.thetan2 <- sqrt((n-1) *sum((thetahat - thetahat.dot)^2) / n) unbiased2 <- n * thetahatn - (n-1) * thetahat.dot list(thetahatn = thetahatn, thetahat = thetahat, thetawig = thetawig, se.thetan = se.thetan, se.thetan2 = se.thetan2, thetawig.dot = thetawig.dot, unbiased2 = unbiased2) } Output: > jack( indoor, q75 ) $thetahatn 75% 73.75 *** This is the value from complete dataset *** $thetahat [1] 75 75 75 75 75 75 75 75 75 75 70 70 70 70 $thetawig [1] 57.5 57.5 57.5 57.5 57.5 57.5 57.5 57.5 57.5 57.5 122.5 122.5 [13] 122.5 122.5 $se.thetan [1] 8.14411 *** This is the std error, as estimated by jackknife *** $se.thetan2 [1] 8.14411 $thetawig.dot [1] 76.07143 *** This is the jackknife unbiased estimate *** $unbiased2 75% 76.07143 2. R code and output from both analyses goes here: Function for use by "boot" function: > q75b function( x, index) { quantile( x[index], 0.75 ) } > library(boot) Run with 100 bootstrap samples: > myboot <- boot( indoor, q75b, R=100) > myboot ORDINARY NONPARAMETRIC BOOTSTRAP Call: boot(data = indoor, statistic = q75b, R = 100) Bootstrap Statistics : original bias std. error t1* 73.75 7.375 34.60979 > boot.ci(myboot, type=c("basic","perc")) BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS Based on 100 bootstrap replicates CALL : boot.ci(boot.out = myboot, type = c("basic", "perc")) Intervals : Level Basic Percentile 95% (-40.00, 98.25 ) ( 49.25, 187.50 ) Calculations and Intervals on Original Scale Some basic intervals may be unstable Some percentile intervals may be unstable Run with 5000 bootstrap samples: > myboot <- boot( indoor, q75b, R=5000) > myboot ORDINARY NONPARAMETRIC BOOTSTRAP Call: boot(data = indoor, statistic = q75b, R = 5000) Bootstrap Statistics : original bias std. error t1* 73.75 14.57375 47.38112 > boot.ci(myboot, type=c("basic","perc")) BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS Based on 5000 bootstrap replicates CALL : boot.ci(boot.out = myboot, type = c("basic", "perc")) Intervals : Level Basic Percentile 95% (-72.5, 102.5 ) ( 45.0, 220.0 ) Calculations and Intervals on Original Scale Your sentence goes here: The bias and standard error, as well as the confidence intervals, change quite a lot between 100 and 5000 bootstrap datasets. 100 datasets is not enough for stable estimation in this problem. 3. a. > func2 function(x) { 4 * x^4 - 2 * x^3 + 3*x -3 } > > bisection function( func, interval, width, maxiters) { # width is width of final interval to be returned (which contains root) a <- min(interval) b <- max(interval) fa <- func(a) fb <- func(b) if( fa * fb > 0 ) # endpoints of original interval have same sign stop( "Function may not have a root on this interval" ) else { diff <- b - a iters <- 0 while( diff > width && iters < maxiters ) { iters <- iters + 1 midpoint <- (b + a) / 2 fm <- func(midpoint) if (fm * fa < 0) { b <- midpoint fb <- fm } else { a <- midpoint fa <- fm } diff <- b - a } } list( a,b,iters ) } > bisection( func2, interval=c(-0.75,1), width=0.01, maxiters=50) [[1]] [1] 0.7949219 [[2]] [1] 0.8017578 [[3]] [1] 8 b. > uniroot( func2, interval=c(-0.75,1), maxiter=50) $root [1] 0.7980354 $f.root [1] -3.298224e-06 $iter [1] 7 $estim.prec [1] 6.103516e-05