2025 Seminar
January 09, 2025
In this seminar, I will talk about Objective Function Free Optimization (OFFO) in the context of pruning the parameters of a given model. OFFO algorithms are methods in which the objective function is never computed; instead, they rely only on derivative information, i.e., on the gradient in the first-order case. I will give an overview of the main OFFO methods, focusing on adaptive algorithms such as Adagrad, Adam, RMSprop, and ADADELTA, gradient methods that share the common characteristic of depending only on current and past gradient information to adaptively determine the step size at each iteration. Next, I will briefly discuss the most popular pruning approaches. As the name implies, pruning a model, typically a neural network, refers to the process of reducing its size and complexity by removing parameters that are considered unnecessary for its performance. Pruning has emerged as a compression technique for neural networks, an alternative to matrix and tensor factorization or quantization. I will mainly focus on pruning-aware methods that use specific rules to classify parameters as relevant or irrelevant at each iteration, enhancing convergence to a solution of the problem at hand that is robust to pruning the irrelevant parameters after training. Finally, I will introduce a novel deterministic algorithm that is both adaptive and pruning-aware, based on a modification of the Adagrad scheme, which converges to a solution robust to pruning with complexity of order $\log(k)/k$. I will illustrate some preliminary results on different applications.
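For reference (this is not part of the abstract), the standard diagonal Adagrad iteration mentioned above can be written componentwise as

$$
x_{k+1} = x_k - \frac{\alpha \, g_k}{\sqrt{\epsilon + \sum_{j=0}^{k} g_j^2}}, \qquad g_j = \nabla f(x_j),
$$

where $\alpha > 0$ is a fixed scaling and $\epsilon > 0$ a small safeguard: the step size for each coordinate depends only on the current and past gradients, never on objective values, which is what makes the method OFFO. The seminar's modified, pruning-aware variant of this scheme is not specified in the abstract.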
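As a rough illustration of the pruning-aware idea, the sketch below combines an Adagrad step with a hypothetical magnitude-based relevance rule (threshold `tau`); the actual classification rule used in the seminar's algorithm is not given in the abstract, so this is an assumption for illustration only.

```python
import numpy as np

def adagrad_prune_step(x, grad, accum, lr=0.1, eps=1e-8, tau=1e-3):
    """One Adagrad step followed by a hypothetical relevance rule."""
    # Accumulate squared gradients (standard Adagrad bookkeeping).
    accum = accum + grad * grad
    # Per-coordinate step size shrinks as gradient information accumulates.
    x = x - lr * grad / np.sqrt(eps + accum)
    # Hypothetical rule: coordinates below tau in magnitude are
    # classified as irrelevant and zeroed out (pruned).
    mask = np.abs(x) > tau
    return x * mask, accum, mask

# Toy run on f(x) = 0.5 * ||x||^2, whose gradient at x is x itself.
x = np.array([1.0, -0.5, 1e-4])
accum = np.zeros_like(x)
for _ in range(100):
    x, accum, mask = adagrad_prune_step(x, grad=x, accum=accum)
```

The point of classifying parameters during the iterations, rather than only after training, is that the iterates are steered toward a solution that remains good once the irrelevant coordinates are removed.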