Assignment 4

Uploaded by

hsarpong15
RMI 8300

Assignment 4
Please show your work clearly to get full credit

1. (a) Use the rnorm() function to generate a predictor X of length n = 100, as
well as a noise vector 𝜖 of length n = 100.

(b) Generate a response vector Y of length n = 100 according to the model

(c) Use the regsubsets() function to perform best subset selection in order to choose the
best model containing the predictors X, X², ..., X¹⁰. What is the best model obtained
according to Cp, BIC, and adjusted R²? Show some plots to provide evidence for your
answer, and report the coefficients of the best model obtained. Note you will need to
use the data.frame() function to create a single data set containing both X and Y.

(d) Repeat (c), using forward stepwise selection and also using backward stepwise
selection. How does your answer compare to the results in (c)?

(e) Now fit a lasso model to the simulated data, again using X, X², ..., X¹⁰ as
predictors. Use cross-validation to select the optimal value of λ. Create plots of the
cross-validation error as a function of λ. Report the resulting coefficient estimates, and
discuss the results obtained.
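
The workflow in parts (a) to (e) could be set up roughly as in the sketch below. It is only a starting point under stated assumptions: the model in part (b) is not shown in this copy of the assignment, so the cubic form and its coefficients here are placeholders, as are the seed and the object names (xmat, dat, fit_best, and so on). The leaps package is assumed for regsubsets() and glmnet for the lasso.

```r
library(leaps)   # provides regsubsets()
library(glmnet)  # provides cv.glmnet()

set.seed(1)              # ASSUMPTION: any seed works; fixed only for reproducibility
n   <- 100
x   <- rnorm(n)          # (a) predictor X
eps <- rnorm(n)          # (a) noise vector

# (b) ASSUMPTION: placeholder cubic model; substitute the model from the assignment
y <- 1 + 2 * x - 3 * x^2 + 0.5 * x^3 + eps

# (c) one data frame holding Y and the ten predictors X, X^2, ..., X^10
xmat <- sapply(1:10, function(k) x^k)
colnames(xmat) <- paste0("x", 1:10)
dat <- data.frame(y = y, xmat)

fit_best <- regsubsets(y ~ ., data = dat, nvmax = 10)
sm <- summary(fit_best)
k_cp    <- which.min(sm$cp)      # model size with the smallest Cp
k_bic   <- which.min(sm$bic)     # model size with the smallest BIC
k_adjr2 <- which.max(sm$adjr2)   # model size with the largest adjusted R^2

plot(sm$bic, type = "b", xlab = "Number of predictors", ylab = "BIC")
points(k_bic, sm$bic[k_bic], col = "red", pch = 19)
coef(fit_best, k_bic)            # coefficients of the BIC-selected model

# (d) same call, with a stepwise search instead of an exhaustive one
fit_fwd <- regsubsets(y ~ ., data = dat, nvmax = 10, method = "forward")
fit_bwd <- regsubsets(y ~ ., data = dat, nvmax = 10, method = "backward")
k_fwd <- which.min(summary(fit_fwd)$bic)
k_bwd <- which.min(summary(fit_bwd)$bic)
coef(fit_fwd, k_fwd)
coef(fit_bwd, k_bwd)

# (e) lasso (alpha = 1) with lambda chosen by cross-validation
cv_lasso <- cv.glmnet(xmat, y, alpha = 1)
plot(cv_lasso)                        # cross-validation error against log(lambda)
coef(cv_lasso, s = "lambda.min")      # coefficients at the CV-optimal lambda
```

The same plot() and points() pattern can be repeated for sm$cp and sm$adjr2 to cover all three criteria asked for in (c).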

2. Use the College data set from the ISLR package, loaded with library(ISLR).

(a) Split the data set into a training set and a test set.

(b) Fit a linear model using least squares on the training set, and report the test
error obtained.

(c) Fit a ridge regression model on the training set, with λ chosen by cross-
validation. Report the test error obtained.

(d) Fit a lasso model on the training set, with λ chosen by cross-validation. Report
the test error obtained, along with the number of non-zero coefficient estimates.
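
One possible outline for parts (a) to (d) is sketched below. The assignment does not name the response variable, so Apps (number of applications received) is used here purely as an assumption; the 50/50 split, the seed, the use of test mean squared error as the "test error", and all object names are likewise illustrative choices. The glmnet package is assumed for ridge and lasso.

```r
library(ISLR)    # provides the College data set
library(glmnet)  # provides cv.glmnet()

set.seed(1)      # ASSUMPTION: fixed only so the split is reproducible

# (a) ASSUMPTION: a random 50/50 split into training and test sets
train_idx <- sample(nrow(College), nrow(College) %/% 2)
train <- College[train_idx, ]
test  <- College[-train_idx, ]

# (b) least squares; ASSUMPTION: Apps is the response
fit_lm <- lm(Apps ~ ., data = train)
mse_lm <- mean((test$Apps - predict(fit_lm, newdata = test))^2)

# glmnet needs numeric matrices; model.matrix() expands the factor Private
# into a dummy column, and [, -1] drops the intercept column
x_train <- model.matrix(Apps ~ ., data = train)[, -1]
x_test  <- model.matrix(Apps ~ ., data = test)[, -1]

# (c) ridge regression (alpha = 0), lambda chosen by cross-validation
cv_ridge   <- cv.glmnet(x_train, train$Apps, alpha = 0)
pred_ridge <- predict(cv_ridge, newx = x_test, s = "lambda.min")
mse_ridge  <- mean((test$Apps - pred_ridge)^2)

# (d) lasso (alpha = 1), lambda chosen by cross-validation
cv_lasso   <- cv.glmnet(x_train, train$Apps, alpha = 1)
pred_lasso <- predict(cv_lasso, newx = x_test, s = "lambda.min")
mse_lasso  <- mean((test$Apps - pred_lasso)^2)

# number of non-zero lasso coefficients, not counting the intercept
beta_lasso <- coef(cv_lasso, s = "lambda.min")
n_nonzero  <- sum(beta_lasso[-1, 1] != 0)

c(lm = mse_lm, ridge = mse_ridge, lasso = mse_lasso)
n_nonzero
```

Using s = "lambda.1se" instead of "lambda.min" would give a sparser, more heavily regularized fit; which one to report is a judgment call worth stating in the write-up.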
