Hey folks,

I just implemented 5-fold cross-validation to determine the optimal penalty value for a ridge regression **(code below)**. I am using the **lm.ridge** function from **library(MASS)**.

I double-checked the results of my own 5-fold cross-validation function against the generalised cross-validation (GCV) built into lm.ridge. To my surprise, the optimal penalty values are quite far from each other (8.11 vs. 3.51, a difference of about 4.6).

It got me curious about why the results are so far apart. Can the difference in the optimal lambda be explained by the difference between the two methods?

    # Ridge regression
    library(MASS)
    set.seed(3)

    grid = 10^seq(10, -2, length = 100)  # grid of lambda (penalty) values
    ridge_res = rep(NA, length(grid))    # CV error for each lambda

    # k-fold cross-validation for ridge regression at a single lambda
    cross_val_ridge = function(data, k, lambda) {
      set.seed(1)  # student number as seed; gives the same folds for every lambda
      cv_index = sample(rep(1:k, length = nrow(data)), nrow(data))
      cv_test_e = rep(NA, k)  # test MSE per fold
      for (i in 1:k) {
        cv_train = data[cv_index != i, ]
        cv_test = data[cv_index == i, ]
        cv_lm = lm.ridge(MEDV ~ ., data = cv_train, lambda = lambda)
        # compute prediction by hand: intercept + X %*% beta
        # (assumes all predictors are numeric columns, as in the original by-hand sum)
        beta = coef(cv_lm)
        x_test = as.matrix(cv_test[, names(beta)[-1]])
        pred.ridge = beta[1] + drop(x_test %*% beta[-1])
        cv_test_e[i] = mean((cv_test$MEDV - pred.ridge)^2)
      }
      return(mean(cv_test_e))
    }

    for (j in seq_along(grid)) {
      ridge_res[j] = cross_val_ridge(train, k = 5, lambda = grid[j])
    }
    which.min(ridge_res)
    grid[76]  # optimal lambda value as per my own 5-fold CV = 8.111308

    # double check using generalized CV
    ridge = lm.ridge(MEDV ~ ., data = train, lambda = grid)
    which.min(ridge$GCV)
    grid[79]  # optimal lambda value as per GCV = 3.511192
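In case it helps anyone reproduce the comparison without my `train` data, here is a minimal sketch on simulated data. Everything in it is an assumption made for illustration (the data-generating process, the names `sim` and `cv_ridge`, the smaller lambda grid); it is not my actual dataset and it may or may not show the same gap.

```r
# Minimal sketch: 5-fold CV vs. GCV for lm.ridge on simulated data.
# sim, cv_ridge and the data-generating process are hypothetical.
library(MASS)

set.seed(1)
n = 200; p = 5
X = matrix(rnorm(n * p), n, p)
colnames(X) = paste0("x", 1:p)
sim = data.frame(X, y = drop(X %*% c(2, -1, 0.5, 0, 0)) + rnorm(n))

grid = 10^seq(3, -2, length = 50)  # lambda grid (smaller than in the post)

# k-fold CV error of lm.ridge for every lambda in the grid at once
cv_ridge = function(data, k, lambda) {
  vars = setdiff(names(data), "y")               # predictor columns
  fold = sample(rep(1:k, length = nrow(data)))   # random fold labels
  err = matrix(NA, k, length(lambda))
  for (i in 1:k) {
    fit = lm.ridge(y ~ ., data = data[fold != i, ], lambda = lambda)
    test = data[fold == i, ]
    # coef(fit) has one row per lambda: intercept first, then slopes
    pred = cbind(1, as.matrix(test[, vars])) %*% t(coef(fit))
    err[i, ] = colMeans((test$y - pred)^2)       # test MSE per lambda
  }
  colMeans(err)                                  # average over folds
}

cv_err = cv_ridge(sim, k = 5, lambda = grid)
gcv = lm.ridge(y ~ ., data = sim, lambda = grid)$GCV

lambda_cv = grid[which.min(cv_err)]
lambda_gcv = grid[which.min(gcv)]
c(cv = lambda_cv, gcv = lambda_gcv)
```

Plotting `cv_err` and `gcv` against `log10(grid)` shows how flat each curve is around its minimum. The two curves are on different scales, so only the location of the minimum is comparable, not the values themselves.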