List of seminars in the seminar series
“SEMINARS IN MATHEMATICAL PHYSICS AND BEYOND”

Note: this is the second part of a two-part seminar. AI is progressing at a remarkable speed, but we still lack a clear understanding of basic concepts in cognition (e.g. learning, understanding, abstraction, awareness, intelligence). I shall argue that research focused on understanding how learning machines, such as LLMs or deep neural networks, do what they do sidesteps the key issue by defining these concepts from the outset. For example, statistical learning is based on a classification of problems (supervised/unsupervised, classification, regression, etc.) and addresses the resulting optimisation problem (maximisation of the likelihood, minimisation of errors, etc.). Learning entails, first of all, detecting what is worth learning, 1) from very few samples, and 2) without knowing a priori why that data makes sense. This requires a quantitative notion of relevance that can distinguish data that make sense from meaningless noise. I will first introduce and discuss the notion of relevance. Next, I will argue that learning differs from understanding, where the latter implies integrating data that make sense into a pre-existing representation. This representation should be abstract, i.e. its properties should be independent of the data, precisely because it needs to represent data from widely different domains. This is what enables higher cognitive functions that we perform all the time, such as drawing analogies and relating data learned independently in widely different domains. Such a representation should be flexible and continuously adaptable as more data or more resources become available. I will show that such an abstract representation can be defined as the fixed point of a renormalisation group transformation, and that it coincides with a model that can be derived from the principle of maximal relevance.
I will provide empirical evidence that the representations of simple neural networks approach this universal model as the network is trained on a broader and broader domain of data. Overall, the aim of the seminar is to support the idea that central issues in cognition can also be approached by studying very simple models, without necessarily understanding large machine-learning models.
Note: this is the first part of a two-part seminar. AI is progressing at a remarkable speed, but we still lack a clear understanding of basic concepts in cognition (e.g. learning, understanding, abstraction, awareness, intelligence). I shall argue that research focused on understanding how learning machines, such as LLMs or deep neural networks, do what they do sidesteps the key issue by defining these concepts from the outset. For example, statistical learning is based on a classification of problems (supervised/unsupervised, classification, regression, etc.) and addresses the resulting optimisation problem (maximisation of the likelihood, minimisation of errors, etc.). Learning entails, first of all, detecting what is worth learning, 1) from very few samples, and 2) without knowing a priori why that data makes sense. This requires a quantitative notion of relevance that can distinguish data that make sense from meaningless noise. I will first introduce and discuss the notion of relevance. Next, I will argue that learning differs from understanding, where the latter implies integrating data that make sense into a pre-existing representation. This representation should be abstract, i.e. its properties should be independent of the data, precisely because it needs to represent data from widely different domains. This is what enables higher cognitive functions that we perform all the time, such as drawing analogies and relating data learned independently in widely different domains. Such a representation should be flexible and continuously adaptable as more data or more resources become available. I will show that such an abstract representation can be defined as the fixed point of a renormalisation group transformation, and that it coincides with a model that can be derived from the principle of maximal relevance.
I will provide empirical evidence that the representations of simple neural networks approach this universal model as the network is trained on a broader and broader domain of data. Overall, the aim of the seminar is to support the idea that central issues in cognition can also be approached by studying very simple models, without necessarily understanding large machine-learning models.
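The quantitative notion of relevance invoked in these abstracts can be illustrated with a toy computation. The definition sketched below, the entropy of the frequency-of-frequencies of a sample, is an assumption drawn from the related literature on maximal relevance, not necessarily the exact definition used in the seminar.

```python
# Hedged sketch (assumed definition): relevance as the entropy of the
# frequency distribution of a discrete sample,
#   H[K] = -sum_k p_k * log(p_k),  with  p_k = k * m_k / N,
# where m_k is the number of distinct states observed exactly k times
# and N is the sample size.
from collections import Counter
from math import log

def relevance(sample):
    """Entropy of the frequency-of-frequencies of a discrete sample."""
    N = len(sample)
    k_of_state = Counter(sample)           # k_s: occurrences of each state
    m_of_k = Counter(k_of_state.values())  # m_k: number of states seen k times
    return -sum((k * m / N) * log(k * m / N) for k, m in m_of_k.items())

# A featureless sample, where every state occurs once, carries zero
# relevance: nothing stands out from noise. A sample with a broad range
# of frequencies scores higher.
noise = list(range(8))               # all states distinct
structured = [0, 0, 0, 0, 1, 1, 2, 3]
print(relevance(noise) == 0.0)
print(relevance(structured) > relevance(noise))
```

In this picture, data "make sense" when their frequency spectrum is broad, which is what the principle of maximal relevance rewards.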
Matrix denoising is central to signal processing and machine learning. Its analysis, when the matrix to be inferred has a factorised structure with rank growing proportionally to its dimension, remains a challenge, except when the matrix is rotationally invariant. In that case the information-theoretically optimal estimator, called the rotational invariant estimator, is known and its performance is rigorously controlled. Beyond this setting, few results are available. The reason is that the model is not a usual spin system, because of the growing rank, nor a matrix model, because of the lack of rotation symmetry; it is rather a hybrid between the two, a "matrix glass". In this talk I shall illustrate our progress towards understanding Bayesian matrix denoising when the hidden signal is a factorised matrix XX⊺ that is not rotationally invariant. Monte Carlo simulations suggest the existence of a denoising-factorisation transition separating a phase where denoising with the rotational invariant estimator remains optimal, owing to universality properties of the same nature as in random matrix theory, from one where universality breaks down and better denoising is possible, though algorithmically hard, by exploiting the signal's prior and factorised structure. We also argue that it is only beyond the transition that factorisation, i.e. estimating X itself, becomes possible, up to sign and permutation ambiguities. On the theoretical side, we combine different mean-field techniques in order to access the minimum mean-square error and the mutual information. Interestingly, our alternative method yields equations that can be reproduced using the replica approach of Sakata and Kabashima, which was long deemed wrong. Using numerical insights, we then delimit the portion of the phase diagram where this mean-field theory is reliable, and correct it using universality where it is not. Our ansatz matches the numerics well once finite-size effects are accounted for.
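The rotational invariant estimator mentioned above keeps the eigenvectors of the noisy observation and only adjusts its eigenvalues. The following is a minimal sketch of that idea for a factorised signal XX⊺ plus symmetric Gaussian noise; the crude hard-thresholding rule at the semicircle bulk edge is an illustrative assumption, not the optimal shrinkage function discussed in the talk.

```python
# Hedged sketch of the rotational-invariant-estimator idea: diagonalise the
# noisy observation Y, keep its eigenvectors, and shrink its eigenvalues.
import numpy as np

rng = np.random.default_rng(0)
n, r, sigma = 200, 5, 1.0

# Factorised signal S = XX^T (rescaled so its top eigenvalues are O(sqrt(n)))
X = rng.standard_normal((n, r))
S = X @ X.T / np.sqrt(n)

# Symmetric (GOE-like) noise with bulk spectrum supported on [-2*sigma, 2*sigma]
W = rng.standard_normal((n, n))
Y = S + sigma * (W + W.T) / np.sqrt(2 * n)

# Rotationally invariant denoising: eigenvalue shrinkage in Y's eigenbasis.
# Here: crude hard thresholding at the bulk edge (illustrative choice only).
lam, V = np.linalg.eigh(Y)
xi = np.where(np.abs(lam) > 2.0 * sigma, lam, 0.0)
S_hat = (V * xi) @ V.T       # sum_i xi_i v_i v_i^T

mse_raw = np.mean((Y - S) ** 2)
mse_rie = np.mean((S_hat - S) ** 2)
print(mse_rie < mse_raw)     # shrinkage should beat the raw observation
```

Note that any estimator of this form is invariant under rotations of the data, which is exactly why, in the phase where universality breaks down, it can be beaten by estimators that exploit the prior and the factorised structure of XX⊺.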