Seminars
Department of Mathematics
Seminar of 29 May 2024
Urte Adomaityte
Heavy tails in high dimensions: relaxing data assumptions in exact asymptotics for classification and robust regression
mathematical physics
interdisciplinary
We characterise the exact asymptotic performance of high-dimensional classification and robust regression estimators under convex loss and regularisation assumptions. Using tools from replica theory, our analysis covers a large family of data distribution assumptions, including any power-law tail, and allows us to determine cases where Gaussian data universality breaks. For classification, we characterise the learning of a mixture of clouds by studying the generalisation performance of the obtained estimator, analyse the role of regularisation and analytically derive the data separability transition. For robust regression, we provide an exact asymptotic characterisation of the recovery of a planted estimator under heavy-tailed contamination of covariates and label noise. We show that, unlike in the classical regime of small dimension-to-data sample ratio, regularisation becomes necessary for the Huber loss estimator to achieve optimality under heavy-tailed contamination in the modern high-dimensional regime, and we derive decay rates for the estimation error of ridge regression.