July 24, 2025
Gian Paolo Leonardi
Seminar on mathematical physics and probability
Quantization is a key technique for reducing the memory footprint and computational cost of deep learning models. However, traditional quantization methods lack general theoretical guarantees. Moreover, they typically overlook the geometry induced on the parameter space by the structure of the model and by the training dynamics.
Our main theoretical contribution is to identify an appropriate metric for projecting weights onto a quantization grid after training. More precisely, we consider suitably scaled, over-parametrized deep neural networks with L layers, whose parameters are initialized as i.i.d. standard normal variables and subsequently trained with gradient descent. We then rigorously prove that the natural quantization metric is given by the Gauss-Newton seminorm whenever the final point of the training dynamics satisfies suitable sparsity assumptions. Specifically, we quantify the contribution of this seminorm with high probability over the initialization as the dimension of the parameter space grows.
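The abstract does not spell out the precise definition it uses; in the standard convention (an assumption here), for a network f_\theta with output Jacobian J(\theta) at the trained parameters \theta, the Gauss-Newton seminorm of a weight perturbation v is

\[ \|v\|_{\mathrm{GN}} \;=\; \|J(\theta)\,v\|_2 \;=\; \sqrt{v^{\top} J(\theta)^{\top} J(\theta)\,v}, \]

so projecting onto the quantization grid in this seminorm penalizes a perturbation by its first-order effect on the network output rather than by its raw Euclidean size. It is a seminorm, not a norm, because J(\theta)^{\top} J(\theta) is positive semidefinite and may have a nontrivial kernel.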
Based on this theoretical result, we propose a novel post-training quantization algorithm, called GeoPTQ, which is shown to outperform classical quantization schemes in preliminary experiments.
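The abstract does not describe GeoPTQ itself; as a rough illustration of what quantizing in a Gauss-Newton seminorm (rather than coordinate-wise in the Euclidean metric) can mean, here is a minimal, hypothetical sketch in Python. The uniform grid step, the greedy coordinate sweeps, and the function names are assumptions made for illustration, not the authors' algorithm.

import numpy as np

def gn_seminorm(J, v):
    # Gauss-Newton seminorm ||v||_GN = ||J v||_2, with J the Jacobian of the
    # network outputs with respect to the parameters at the trained point.
    return np.linalg.norm(J @ v)

def quantize_gn(w, J, step, sweeps=3):
    # Project weights w onto the uniform grid {k * step : k integer},
    # greedily reducing the Gauss-Newton seminorm of the quantization error.
    # Hypothetical sketch, not the authors' GeoPTQ algorithm.
    q = step * np.round(w / step)  # start from nearest-grid rounding
    for _ in range(sweeps):
        for i in range(len(w)):
            best = gn_seminorm(J, q - w)
            for delta in (-step, step):  # try one grid step up or down
                cand = q.copy()
                cand[i] += delta
                if gn_seminorm(J, cand - w) < best:
                    best = gn_seminorm(J, cand - w)
                    q = cand
    return q

# Toy example: 4 outputs, 6 parameters.
rng = np.random.default_rng(0)
J = rng.normal(size=(4, 6))
w = rng.normal(size=6)
q = quantize_gn(w, J, step=0.25)
naive = 0.25 * np.round(w / 0.25)
print("nearest-grid GN error:", gn_seminorm(J, naive - w))
print("GN-aware GN error:   ", gn_seminorm(J, q - w))

Plain nearest-grid rounding minimizes the Euclidean error; the greedy search above instead trades a larger Euclidean error for a smaller output perturbation as measured through J.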
This is joint work with Massimiliano Datres (LMU Munich) and Andrea Agazzi (Univ. Bern).