# Asymptotics of Cross-Validation

```bibtex
@article{Austern2020AsymptoticsOC,
  title   = {Asymptotics of Cross-Validation},
  author  = {Morgane Austern and Wenda Zhou},
  journal = {arXiv: Statistics Theory},
  year    = {2020}
}
```

Cross-validation is a central tool for evaluating the performance of machine learning and statistical models. However, despite its ubiquitous role, its theoretical properties are still not well understood. We study the asymptotic properties of the cross-validated risk for a large class of models. Under stability conditions, we establish a central limit theorem and Berry-Esseen bounds, which enable us to compute asymptotically accurate confidence intervals. Using our results, we paint a big…
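As a concrete illustration of the quantity studied here, the following minimal sketch (not code from the paper) computes a K-fold cross-validated risk together with the naive normal-approximation confidence interval that a central limit theorem of this kind licenses. The `fit` and `loss` arguments are placeholders for an arbitrary learner and loss; treating the held-out losses as independent is precisely the step that the paper's stability conditions are needed to justify.

```python
import numpy as np

def kfold_cv_ci(X, y, fit, loss, k=5, z=1.96, seed=0):
    """K-fold cross-validated risk with a naive normal-approximation CI.

    fit(X, y) must return a predictor f with f(X) -> predictions;
    loss(y, yhat) must return per-example losses.
    """
    n = len(y)
    idx = np.random.default_rng(seed).permutation(n)
    losses = np.empty(n)
    for fold in np.array_split(idx, k):          # k disjoint held-out folds
        train = np.setdiff1d(idx, fold)
        f = fit(X[train], y[train])
        losses[fold] = loss(y[fold], f(X[fold]))
    risk = losses.mean()
    # Treats held-out losses as i.i.d.; a CLT for the CV risk is what
    # would make this interval asymptotically valid.
    half = z * losses.std(ddof=1) / np.sqrt(n)
    return risk, (risk - half, risk + half)

# Illustrative example: least-squares regression with squared loss.
rng = np.random.default_rng(1)
X = rng.standard_normal((200, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.standard_normal(200)
ols = lambda A, b: (lambda Z, w=np.linalg.lstsq(A, b, rcond=None)[0]: Z @ w)
sq = lambda t, p: (t - p) ** 2
risk, (lo, hi) = kfold_cv_ci(X, y, ols, sq)
```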

#### 6 Citations

Uniform Consistency of Cross-Validation Estimators for High-Dimensional Ridge Regression

- Computer Science
- AISTATS
- 2021

It is shown that ridge tuning via minimization of generalized or leave-one-out cross-validation asymptotically almost surely delivers the optimal level of regularization for predictive accuracy, whether it be positive, negative, or zero.
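The tuning criteria named in this result can be sketched concretely: for ridge regression, leave-one-out CV has a closed form through the hat matrix, and GCV replaces each leverage by its average. This is the standard textbook identity, not notation from the cited paper.

```python
import numpy as np

def ridge_loocv(X, y, lam):
    """Exact leave-one-out CV error for ridge via the hat-matrix
    identity (y_i - yhat_{-i}) = (y_i - yhat_i) / (1 - H_ii);
    no refitting required."""
    p = X.shape[1]
    H = X @ np.linalg.solve(X.T @ X + lam * np.eye(p), X.T)
    resid = y - H @ y
    return np.mean((resid / (1.0 - np.diag(H))) ** 2)

def ridge_gcv(X, y, lam):
    """Generalized CV: the same formula with each leverage H_ii
    replaced by the average leverage trace(H)/n."""
    n, p = X.shape
    H = X @ np.linalg.solve(X.T @ X + lam * np.eye(p), X.T)
    resid = y - H @ y
    return np.mean(resid ** 2) / (1.0 - np.trace(H) / n) ** 2
```

Minimizing either criterion over `lam` is the tuning rule whose asymptotic optimality the citation above establishes.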

Cross-validation: what does it estimate and how well does it do it?

- Mathematics
- 2021

Cross-validation is a widely-used technique to estimate prediction error, but its behavior is complex and not fully understood. Ideally, one would like to think that cross-validation estimates the…

Leave Zero Out: Towards a No-Cross-Validation Approach for Model Selection

- Computer Science
- ArXiv
- 2020

The proposed validation approach is suitable for a wide range of learning settings because both the augmentation and the out-of-sample estimation are independent of the learning process, as demonstrated by extensive evaluation on multiple datasets, models, and tasks.

How Flexible is that Functional Form? Quantifying the Restrictiveness of Theories

- Computer Science, Economics
- SSRN Electronic Journal
- 2020

A new way to quantify the restrictiveness of an economic model, based on how well the model fits simulated, hypothetical data sets, is proposed and used to evaluate two widely used behavioral models.

Cross-validation Confidence Intervals for Test Error

- Mathematics, Computer Science
- NeurIPS
- 2020

This work develops central limit theorems for cross-validation and consistent estimators of its asymptotic variance under weak stability conditions on the learning algorithm. Together, these results…

Local asymptotics of cross-validation in least-squares density estimation

- Mathematics
- 2021

In model selection, several types of cross-validation are commonly used and many variants have been introduced. While consistency of some of these methods has been proven, their rate of convergence…

#### References

Showing 1-10 of 12 references

Cross-Validation and Mean-Square Stability

- Mathematics, Computer Science
- ICS
- 2011

A new measure of algorithm stability, called mean-square stability, is presented, which is weaker than most stability notions described in the literature, and encompasses a large class of algorithms including bounded SVM regression and regularized least-squares regression.

A Swiss Army Infinitesimal Jackknife

- Computer Science, Mathematics
- AISTATS
- 2019

A linear approximation to the dependence of the fitting procedure on the weights is used, producing results that can be faster than repeated refitting by an order of magnitude and supporting the application of the infinitesimal jackknife to a wide variety of practical problems in machine learning.

Beating the hold-out: bounds for K-fold and progressive cross-validation

- Mathematics, Computer Science
- COLT '99
- 1999

It is shown that, for any nontrivial learning problem and any learning algorithm that is insensitive to example ordering, the k-fold estimate is strictly more accurate than a single hold-out estimate on 1/k of the data for 2 < k < n (k = n is leave-one-out), in terms of its variance and all higher moments.

Near-Optimal Bounds for Cross-Validation via Loss Stability

- Mathematics, Computer Science
- ICML
- 2013

This work introduces a new, weak measure called loss stability and relates cross-validation performance to this measure; this relationship is shown to be near-optimal, quantitatively improving the best previous bounds on cross-validation.

Concentration Inequalities - A Nonasymptotic Theory of Independence

- Mathematics, Computer Science
- Concentration Inequalities
- 2013

Deep connections with isoperimetric problems are revealed, while special attention is paid to applications to suprema of empirical processes.

The Elements of Statistical Learning: Data Mining, Inference, and Prediction, 2nd Edition

- Mathematics, Computer Science
- Springer Series in Statistics
- 2009

This major new edition features many topics not covered in the original, including graphical models, random forests, ensemble methods, least angle regression and path algorithms for the lasso, non-negative matrix factorization, and spectral clustering.

Approximate Leave-One-Out for Fast Parameter Tuning in High Dimensions

- Mathematics, Computer Science
- ICML
- 2018

Two frameworks are proposed for obtaining a computationally efficient approximation (ALO) of the leave-one-out cross-validation (LOOCV) risk for nonsmooth losses and regularizers, and the equivalence of the two approaches under smoothness conditions is proved.

Fundamentals of Stein's method

- Mathematics
- 2011

This survey article discusses the main concepts and techniques of Stein's method for distributional approximation by the normal, Poisson, exponential, and geometric distributions, and also its…

The Elements of Statistical Learning: Data Mining, Inference, and Prediction

- Mathematics
- 2004

In the words of the authors, the goal of this book was to "bring together many of the important new ideas in learning, and explain them in a statistical framework." The authors have been quite…