A Complete Error Analysis of the K-fold Cross Validation for Regularized Empirical Risk Minimization in High Dimensions.
Published in Work in progress, 2025
This paper studies the error of k-fold cross validation (k-CV) in estimating the out-of-sample error of regularized empirical risk minimization (R-ERM) in the proportional high-dimensional setting, where the number of observations $n$ and the number of parameters $p$ go to infinity at the same rate. We provide a stochastic bound for the MSE of k-CV under mild assumptions. Contrary to the common belief that the MSE decreases as the number of folds $k$ increases, we find that, for fixed $n$ and $p$, it stops decreasing once $k$ exceeds a certain threshold. The manuscript will be finished and submitted soon.
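For readers unfamiliar with the estimator under study, the following is a minimal sketch of k-fold cross validation applied to one instance of R-ERM. It assumes ridge regression with squared loss purely for illustration; the paper's setting is more general, and the function names `ridge_fit` and `kfold_cv_error`, the simulated data, and the choice $n = 200$, $p = 100$ are all hypothetical and not taken from the manuscript.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Ridge estimate: argmin_b ||y - X b||^2 + lam * ||b||^2 (illustrative R-ERM)."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

def kfold_cv_error(X, y, lam, k, rng):
    """Average of the k per-fold mean squared errors on the held-out observations."""
    n = X.shape[0]
    # Randomly split the n indices into k folds of (nearly) equal size.
    folds = np.array_split(rng.permutation(n), k)
    fold_errors = []
    for held_out in folds:
        train = np.setdiff1d(np.arange(n), held_out)
        beta_hat = ridge_fit(X[train], y[train], lam)
        residuals = y[held_out] - X[held_out] @ beta_hat
        fold_errors.append(np.mean(residuals ** 2))
    return float(np.mean(fold_errors))

# Simulated data with n and p of the same order (assumed values, for illustration only).
rng = np.random.default_rng(0)
n, p = 200, 100
X = rng.standard_normal((n, p)) / np.sqrt(p)
y = X @ rng.standard_normal(p) + rng.standard_normal(n)

for k in (2, 5, 10, 20):
    print(k, kfold_cv_error(X, y, lam=1.0, k=k, rng=rng))
```

A single run like this shows only one realization of the k-CV estimate for each $k$. Studying its MSE would require repeating the experiment over many simulated datasets and comparing against the true out-of-sample error.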