In this note, we provide an elementary analysis of the prediction error of ridge regression with random design. The proof is short and self-contained. In particular, it bypasses Rudelson’s deviation inequality for covariance matrices through a combination of exchangeability arguments, matrix perturbation, and operator convexity.
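To fix ideas, the setting described in the abstract can be written out as follows; the display below is a standard formulation in notation of our own choosing, not a quotation from the paper. Given n i.i.d. pairs (x_i, y_i) in ℝ^d × ℝ and a regularization parameter λ > 0, the ridge estimator and the prediction error under study are

\[
\widehat{\theta}_\lambda
  = \operatorname*{arg\,min}_{\theta \in \mathbb{R}^d}
    \bigg\{ \frac{1}{n} \sum_{i=1}^{n} \big( y_i - \langle \theta, x_i \rangle \big)^2
            + \lambda \|\theta\|^2 \bigg\}
  = \big( \widehat{\Sigma} + \lambda I \big)^{-1} \frac{1}{n} \sum_{i=1}^{n} y_i x_i ,
\qquad
\widehat{\Sigma} = \frac{1}{n} \sum_{i=1}^{n} x_i x_i^\top ,
\]
\[
\mathbb{E}_x \big[ \langle \widehat{\theta}_\lambda - \theta^*, x \rangle^2 \big]
  = \big\| \widehat{\theta}_\lambda - \theta^* \big\|_{\Sigma}^2 ,
\qquad
\Sigma = \mathbb{E}\big[ x \, x^\top \big] ,
\]

where θ* minimizes the population squared risk, x is a fresh covariate independent of the sample, and ∥v∥_Σ² = ⟨Σv, v⟩. “Random design” refers to the covariates x_i being drawn at random from a distribution, rather than fixed in advance.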
Mourtada, Jaouad; Rosasco, Lorenzo
@article{CRMATH_2022__360_G9_1055_0, author = {Mourtada, Jaouad and Rosasco, Lorenzo}, title = {An elementary analysis of ridge regression with random design}, journal = {Comptes Rendus. Math\'ematique}, pages = {1055--1063}, publisher = {Acad\'emie des sciences, Paris}, volume = {360}, number = {G9}, year = {2022}, doi = {10.5802/crmath.367}, language = {en}, url = {https://geodesic-test.mathdoc.fr/articles/10.5802/crmath.367/} }
TY - JOUR AU - Mourtada, Jaouad AU - Rosasco, Lorenzo TI - An elementary analysis of ridge regression with random design JO - Comptes Rendus. Mathématique PY - 2022 SP - 1055 EP - 1063 VL - 360 IS - G9 PB - Académie des sciences, Paris UR - https://geodesic-test.mathdoc.fr/articles/10.5802/crmath.367/ DO - 10.5802/crmath.367 LA - en ID - CRMATH_2022__360_G9_1055_0 ER -
%0 Journal Article %A Mourtada, Jaouad %A Rosasco, Lorenzo %T An elementary analysis of ridge regression with random design %J Comptes Rendus. Mathématique %D 2022 %P 1055-1063 %V 360 %N G9 %I Académie des sciences, Paris %U https://geodesic-test.mathdoc.fr/articles/10.5802/crmath.367/ %R 10.5802/crmath.367 %G en %F CRMATH_2022__360_G9_1055_0
Mourtada, Jaouad; Rosasco, Lorenzo. An elementary analysis of ridge regression with random design. Comptes Rendus. Mathématique, Volume 360 (2022) no. G9, pp. 1055-1063. doi: 10.5802/crmath.367. https://geodesic-test.mathdoc.fr/articles/10.5802/crmath.367/
[1] Strong converse for identification via quantum channels, IEEE Trans. Inf. Theory, Volume 48 (2002) no. 3, pp. 569-579 | DOI | Zbl | MR
[2] Theory of reproducing kernels, Trans. Am. Math. Soc., Volume 68 (1950) no. 3, pp. 337-404 | Zbl | MR | DOI
[3] Non-strongly-convex smooth stochastic approximation with convergence rate O(1/n), Advances in Neural Information Processing Systems 26 (2013), pp. 773-781
[4] Local Rademacher complexities, Ann. Stat., Volume 33 (2005) no. 4, pp. 1497-1537 | Zbl | MR
[5] Benign overfitting in linear regression, Proc. Natl. Acad. Sci. USA, Volume 117 (2020) no. 48, pp. 30063-30070 | Zbl | MR | DOI
[6] Optimal Rates for Regularization of Statistical Inverse Learning Problems, Found. Comput. Math., Volume 18 (2018) no. 4, pp. 971-1013 | Zbl | MR | DOI
[7] Optimal rates for the regularized least-squares algorithm, Found. Comput. Math., Volume 7 (2007) no. 3, pp. 331-368 | Zbl | MR | DOI
[8] Trace inequalities and quantum entropy: an introductory course, Entropy and the quantum (Contemporary Mathematics), Volume 529, American Mathematical Society, 2010, pp. 73-140 | MR | DOI | Zbl
[9] PAC-Bayesian bounds for the Gram matrix and least squares regression with a random design (2016) (https://arxiv.org/abs/1603.05229)
[10] Model selection for regularized least-squares algorithm in learning theory, Found. Comput. Math., Volume 5 (2005) no. 1, pp. 59-85 | MR | DOI
[11] Nonparametric stochastic approximation with large step-sizes, Ann. Stat., Volume 44 (2016) no. 4, pp. 1363-1399 | Zbl | MR
[12] Empirical Processes in M-estimation, Cambridge University Press, 1999
[13] Average stability is invariant to data preconditioning. Implications to exp-concave empirical risk minimization, J. Mach. Learn. Res., Volume 18 (2018) no. 222, pp. 1-13 | Zbl | MR
[14] Application of Ridge Analysis to Regression Problems, Chem. Eng. Prog., Volume 58 (1962), pp. 54-59
[15] Random design analysis of Ridge regression, Found. Comput. Math., Volume 14 (2014) no. 3, pp. 569-600 | Zbl | MR
[16] Parallelizing stochastic gradient descent for least squares regression: Mini-batching, averaging, and model misspecification, J. Mach. Learn. Res., Volume 18 (2018) no. 223, pp. 1-42 | Zbl | MR
[17] Local Rademacher complexities and oracle inequalities in risk minimization, Ann. Stat., Volume 34 (2006) no. 6, pp. 2593-2656 | Zbl | MR
[18] Oracle Inequalities in Empirical Risk Minimization and Sparse Recovery Problems: École d’Été de Probabilités de Saint-Flour, Lecture Notes in Mathematics, 2033, Springer, 2011
[19] Concentration inequalities and moment bounds for sample covariance operators, Bernoulli, Volume 23 (2017) no. 1, pp. 110-133 | Zbl | MR
[20] Fast rates for exp-concave empirical risk minimization, Advances in Neural Information Processing Systems 28 (2015), pp. 1477-1485
[21] Performance of empirical risk minimization in linear aggregation, Bernoulli, Volume 22 (2016) no. 3, pp. 1520-1534 | Zbl | MR
[22] Non commutative Khintchine and Paley inequalities, Ark. Mat., Volume 29 (1991) no. 1, pp. 241-260 | Zbl | DOI
[23] Some applications of concentration inequalities to statistics, Ann. Fac. Sci. Toulouse, Math., Volume 9 (2000) no. 2, pp. 245-303 | Zbl | MR | DOI
[24] On the performance of kernel classes, J. Mach. Learn. Res., Volume 4 (2003), pp. 759-771 | MR
[25] Exact minimax risk for linear least squares, and the lower tail of sample covariance matrices, Ann. Stat., Volume 50 (2022) no. 4, pp. 2157-2178 | MR
[26] An improper estimator with optimal excess risk in misspecified density estimation and logistic regression, J. Mach. Learn. Res., Volume 23 (2022) no. 31, pp. 1-49 | MR
[27] Sums of random Hermitian matrices and an inequality by Rudelson, Electron. Commun. Probab., Volume 15 (2010), pp. 203-212 | Zbl | MR
[28] The lower tail of random quadratic forms with applications to ordinary least squares, Probab. Theory Relat. Fields, Volume 166 (2016) no. 3, pp. 1175-1194 | Zbl | MR | DOI
[29] Learning with incremental iterative regularization, Advances in Neural Information Processing Systems 28 (2015), pp. 1630-1638
[30] Random vectors in the isotropic position, J. Funct. Anal., Volume 164 (1999) no. 1, pp. 60-72 | Zbl | MR | DOI
[31] Shannon sampling II: Connections to learning theory, Appl. Comput. Harmon. Anal., Volume 19 (2005) no. 3, pp. 285-302 | Zbl | MR | DOI
[32] Learning theory estimates via integral operators and their approximations, Constr. Approx., Volume 26 (2007) no. 2, pp. 153-172 | Zbl | MR | DOI
[33] Support Vector Machines, Springer, 2008
[34] Optimal rates for regularized least squares regression, Proc. 22nd Conference on Learning Theory (2009), pp. 79-93
[35] Solution of incorrectly formulated problems and the regularization method, Sov. Math., Dokl., Volume 4 (1963), pp. 1035-1038 | Zbl
[36] User-friendly tail bounds for sums of random matrices, Found. Comput. Math., Volume 12 (2012) no. 4, pp. 389-434 | Zbl | MR | DOI
[37] Benign overfitting in ridge regression (2020) (https://arxiv.org/abs/2009.14286)
[38] Suboptimality of constrained least squares and improvements via non-linear predictors (2021) (https://arxiv.org/abs/2009.09304)
[39] Introduction to the non-asymptotic analysis of random matrices, Cambridge University Press (2012), pp. 210-268
[40] Learning from examples as an inverse problem, J. Mach. Learn. Res., Volume 6 (2005) no. 5, pp. 883-904 | Zbl | MR
[41] Spline Models for Observational Data, 59, Society for Industrial and Applied Mathematics, 1990 | DOI
[42] On early stopping in gradient descent learning, Constr. Approx., Volume 26 (2007) no. 2, pp. 289-315 | Zbl | MR
[43] Dimension-free bounds for sums of independent matrices and simple tensors via the variational principle (2021) (https://arxiv.org/abs/2108.08198)