List of seminars in the series
“SEMINARS IN MATHEMATICAL PHYSICS AND BEYOND”

December 12, 2024
In classical algorithms, tools such as the overlap gap property and free energy barriers are used to prove lower bounds for algorithms that are local, stable, or low-degree. In this talk, we review quantum algorithms for Gibbs sampling and show, via a general quantum bottleneck lemma, that they face analogous obstructions. When applied to Metropolis-like algorithms and classical Hamiltonians, our result reproduces classical slow-mixing arguments. Unlike previous techniques for bounding the mixing times of quantum Gibbs samplers, however, our bottleneck lemma provides bounds for non-commuting Hamiltonians. We apply it to systems such as random classical CSPs, quantum code Hamiltonians, and the transverse-field Ising model. Key to our work are two notions of distance, which we use to measure the locality of quantum samplers and to construct the bottleneck.
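As background for the classical side of this picture, here is a minimal sketch (not from the talk; model, sizes and seed are my own choices) of single-spin-flip Metropolis sampling of the Curie-Weiss Gibbs measure, where the free energy barrier between the two magnetised sectors makes local dynamics mix exponentially slowly.

```python
import numpy as np

def metropolis_curie_weiss(n=200, beta=1.5, steps=200_000, seed=0):
    """Single-spin-flip Metropolis sampling of the Curie-Weiss model
    H(s) = -(1/n) * sum_{i<j} s_i s_j.  For large n and beta > 1 the Gibbs
    measure is bimodal and local dynamics mix exponentially slowly."""
    rng = np.random.default_rng(seed)
    s = rng.choice([-1, 1], size=n)
    M = s.sum()                              # running total magnetisation
    mags = np.empty(steps)
    for t in range(steps):
        i = rng.integers(n)
        dE = 2.0 * s[i] * (M - s[i]) / n     # energy change of flipping s_i
        if dE <= 0 or rng.random() < np.exp(-beta * dE):
            M -= 2 * s[i]
            s[i] = -s[i]
        mags[t] = M / n
    return mags

m = metropolis_curie_weiss()
# With a free-energy bottleneck between the +m* and -m* sectors, the sign
# of the magnetisation essentially never flips over the whole run.
print("fraction of time with m > 0:", (m > 0).mean())
```

The bottleneck here is the set of configurations with magnetisation near zero, whose Gibbs weight is exponentially small in n; the quantum lemma discussed in the talk plays the analogous role for non-commuting Hamiltonians.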
December 17, 2024
Matrix denoising is central to signal processing and machine learning. Its analysis when the matrix to infer has a factorised structure, with a rank growing proportionally to its dimension, remains a challenge, except when it is rotationally invariant. In that case, the information-theoretically optimal estimator, called the rotational invariant estimator, is known and its performance is rigorously controlled. Beyond this setting, few results are available. The reason is that the model is neither a usual spin system, because of the growing rank, nor a matrix model, due to the lack of rotation symmetry, but rather a hybrid between the two: a "matrix glass". In this talk I shall illustrate our progress towards understanding Bayesian matrix denoising when the hidden signal is a factored matrix XX⊺ that is not rotationally invariant. Monte Carlo simulations suggest the existence of a denoising-factorisation transition separating a phase where denoising using the rotational invariant estimator remains optimal, due to universality properties of the same nature as in random matrix theory, from one where universality breaks down and better denoising is possible by exploiting the signal's prior and factorised structure, though this is algorithmically hard. We also argue that it is only beyond the transition that factorisation, i.e. estimating X itself, becomes possible up to sign and permutation ambiguities. On the theoretical side, we combine different mean-field techniques in order to access the minimum mean-square error and the mutual information. Interestingly, our alternative method yields equations that can be reproduced using the replica approach of Sakata and Kabashima, which had long been deemed wrong. Using numerical insights, we then delimit the portion of the phase diagram where this mean-field theory is reliable, and correct it using universality where it is not. Our ansatz matches the numerics well once finite-size effects are accounted for.
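For readers unfamiliar with the rotational invariant estimator, the sketch below (my own illustration, with an assumed additive Gaussian noise model Y = XX⊺/n + √Δ Z and arbitrary sizes) compares the raw observation with the oracle RIE, which keeps the eigenvectors of Y and replaces each eigenvalue by its overlap with the signal; in practice the optimal eigenvalue shrinkage is computed from Y alone via random matrix theory rather than from this oracle.

```python
import numpy as np

rng = np.random.default_rng(0)
n, r, delta = 400, 200, 0.5              # dimension, rank ~ n, noise level

# Factored signal S = X X^T / n with extensive rank (r proportional to n).
X = rng.normal(size=(n, r))
S = X @ X.T / n

# Symmetric Gaussian (GOE-like) noise with variance ~1/n per entry.
G = rng.normal(size=(n, n))
Z = (G + G.T) / np.sqrt(2 * n)
Y = S + np.sqrt(delta) * Z

# Oracle rotational invariant estimator: keep the eigenbasis of Y and
# replace each eigenvalue by the overlap of its eigenvector with S.
evals, V = np.linalg.eigh(Y)
xi = np.einsum('ik,ik->k', V, S @ V)     # xi_k = v_k^T S v_k
S_rie = (V * xi) @ V.T

mse = lambda A: np.mean((A - S) ** 2)
print(f"MSE of raw observation Y : {mse(Y):.5f}")
print(f"MSE of oracle RIE        : {mse(S_rie):.5f}")
```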
February 7, 2025
Transfer learning (TL) is a well-established machine learning technique for boosting the generalization performance on a specific (target) task using information gained from a related (source) task, and it crucially depends on the ability of a network to learn useful features. I will present recent work that leverages analytical progress in the proportional regime of deep learning theory (i.e. the limit where the size of the training set P and the size of the hidden layers N are taken to infinity while keeping their ratio P/N finite) to develop a novel statistical-mechanics formalism for TL in Bayesian neural networks. I will show how such a single-instance Franz-Parisi formalism yields an effective theory for TL in one-hidden-layer fully-connected neural networks. Unlike the (lazy-training) infinite-width limit, where TL is ineffective, in the proportional limit TL occurs through a renormalized source-target kernel that quantifies their relatedness and determines whether TL is beneficial for generalization.
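As a toy illustration of feature transfer (my own sketch, not the Franz-Parisi formalism of the talk; architecture, sizes, seed and the ridge readout are arbitrary choices), the snippet below trains a one-hidden-layer network on a source task, reuses its frozen hidden features on a related target task with few examples, and compares against untrained random features. Whether and how much transfer helps depends on these choices.

```python
import numpy as np

rng = np.random.default_rng(1)
d, n_hidden, p_src, p_tgt = 20, 100, 400, 40

# Related teacher vectors: the target teacher is a perturbation of the source one.
w_src = rng.normal(size=d)
w_tgt = w_src + 0.3 * rng.normal(size=d)

def make_task(w, p):
    X = rng.normal(size=(p, d))
    return X, np.tanh(X @ w / np.sqrt(d))

Xs, ys = make_task(w_src, p_src)
Xt, yt = make_task(w_tgt, p_tgt)
Xe, ye = make_task(w_tgt, 2000)                 # target test set

def train_two_layer(X, y, epochs=2000, lr=0.05):
    """Full-batch gradient descent on a one-hidden-layer tanh network (MSE loss)."""
    W = rng.normal(size=(d, n_hidden)) / np.sqrt(d)
    a = rng.normal(size=n_hidden) / np.sqrt(n_hidden)
    for _ in range(epochs):
        H = np.tanh(X @ W)                      # hidden activations
        err = H @ a - y
        grad_a = H.T @ err / len(y)
        grad_W = X.T @ ((err[:, None] * a) * (1 - H**2)) / len(y)
        a -= lr * grad_a
        W -= lr * grad_W
    return W

def ridge_readout(H, y, lam=1e-2):
    """Closed-form ridge regression for the readout weights."""
    return np.linalg.solve(H.T @ H + lam * np.eye(H.shape[1]), H.T @ y)

W_src = train_two_layer(Xs, ys)                              # features learned on the source task
W_rand = rng.normal(size=(d, n_hidden)) / np.sqrt(d)         # untrained features

for name, W in [("transferred features", W_src), ("random features", W_rand)]:
    a = ridge_readout(np.tanh(Xt @ W), yt)
    test_mse = np.mean((np.tanh(Xe @ W) @ a - ye) ** 2)
    print(f"{name:>20s}: target test MSE = {test_mse:.4f}")
```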
When studying Laplace eigenfunctions on compact manifolds, their localisation or delocalisation properties at large eigenvalues are strongly related to the dynamics of the geodesic flow. In this talk, I will be interested in delocalisation phenomena, through the study of L∞ norms of eigenfunctions, on manifolds of negative curvature. After recalling the existing results and conjectures, I will show how these results can be improved by adding small random perturbations to the Laplacian. I will also present some deterministic improvements, in the case of manifolds of constant curvature. These are joint works with Martin Vogel, and with Yann Chaubet.
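For context, the L∞ bounds in question can be recalled as follows (standard normalisation, stated here from memory as background rather than as part of the talk):

```latex
% Setting: (M,g) compact of dimension d, -\Delta u_\lambda = \lambda^2 u_\lambda,
% \|u_\lambda\|_{L^2} = 1.
% General bound (Avakumovic--Levitan--Hormander):
\[
  \|u_\lambda\|_{L^\infty} \le C\, \lambda^{\frac{d-1}{2}},
\]
% logarithmic improvement under non-positive curvature (Berard):
\[
  \|u_\lambda\|_{L^\infty} \le C\, \frac{\lambda^{\frac{d-1}{2}}}{\sqrt{\log \lambda}} .
\]
```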
Nonequilibrium systems are ubiquitous, from swarms of living organisms to machine learning algorithms. While much of statistical physics has focused on predicting emergent behavior from microscopic rules, a growing question is the inverse problem: how can we guide a nonequilibrium system toward a desired state? This challenge becomes particularly daunting in high-dimensional or complex systems, where classical control approaches often break down. In this talk, I will integrate methods from optimal control theory with techniques from soft matter and statistical physics to tackle this problem in two broad classes of nonequilibrium systems: active matter—focusing on multimodal strategies in animal navigation and mechanical confinement of active fluids—and learning systems, where I will apply control theory to identify optimal learning principles for neural networks. Together, these approaches point toward a general framework for controlling nonequilibrium dynamics across systems and scales.
April 30, Wednesday
Davide Pastorello
Mathematical physics seminar
10:00, Aula G3 Mineralogia
After an introduction to the notion of quantum generative adversarial networks (qGANs), I will summarize a recent quantum tomography protocol for constructing a classical estimate of a quantum state by performing repeated measurements on an n-qubit system. I will then discuss the convergence of the protocol with respect to a quantum version of the first-order Wasserstein distance, inspired by the theory of optimal mass transport. In particular, I will show how this convergence result allows us to conclude that a qGAN can be equivalently trained using classical estimators of quantum states instead of quantum data. This fact is important in practice, as it enables the training of quantum models without requiring direct access to quantum memory or coherent quantum data streams.
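As a minimal concrete instance of building a classical estimate of a quantum state from repeated measurements (a single-qubit linear-inversion sketch of my own, far simpler than the n-qubit protocol and the Wasserstein analysis of the talk):

```python
import numpy as np

rng = np.random.default_rng(2)

# Pauli matrices.
I2 = np.eye(2)
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]])
sz = np.array([[1, 0], [0, -1]], dtype=complex)

# An arbitrary, slightly mixed single-qubit state.
psi = np.array([np.cos(0.4), np.exp(1j * 0.7) * np.sin(0.4)])
rho_true = 0.9 * np.outer(psi, psi.conj()) + 0.1 * I2 / 2

def measure_pauli(rho, P, shots, rng):
    """Simulate `shots` projective measurements of the Pauli observable P
    on state rho and return the empirical expectation value."""
    evals, evecs = np.linalg.eigh(P)
    probs = np.real([evecs[:, k].conj() @ rho @ evecs[:, k] for k in range(2)])
    outcomes = rng.choice(evals, size=shots, p=probs / probs.sum())
    return outcomes.mean()

shots = 5000
ex, ey, ez = (measure_pauli(rho_true, P, shots, rng) for P in (sx, sy, sz))

# Classical (linear-inversion) estimate of the state from empirical averages;
# it need not be exactly positive semidefinite at finite shot number.
rho_est = 0.5 * (I2 + ex * sx + ey * sy + ez * sz)
print("trace distance:", 0.5 * np.sum(np.abs(np.linalg.eigvalsh(rho_est - rho_true))))
```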
Graphs are a powerful data structure for representing relational data and are widely used to describe complex real-world systems. Probabilistic Graphical Models (PGMs) and Graph Neural Networks (GNNs) can both leverage graph-structured data, but their inner workings differ, so how do they compare in capturing the information contained in networked datasets? We address this question through a link prediction task, conducting three main experiments on both synthetic and real networks: one focuses on how PGMs and GNNs handle input features, while the other two investigate their robustness to noisy features and to increasing heterophily of the graph. PGMs do not necessarily require features on nodes, while GNNs cannot exploit the network edges alone, so the choice of input features matters. We find that GNNs are outperformed by PGMs when input features are low-dimensional or noisy, mimicking many real scenarios where node attributes might be scalar or noisy. We also find that PGMs are more robust than GNNs as the heterophily of the graph increases. Finally, to assess performance beyond prediction tasks, we compare the two frameworks in terms of their computational complexity and interpretability.
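To make the experimental set-up concrete, here is a toy version (my own sketch with simple stand-ins: a common-neighbours score in place of a PGM and a logistic regression on pair features in place of a GNN; graph size, homophily levels, feature noise and seed are arbitrary):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(3)
n, p_in, p_out = 200, 0.10, 0.02                        # homophilous two-block graph

spin = 2 * rng.integers(0, 2, size=n) - 1               # +/-1 community labels
feats = spin[:, None] + 0.8 * rng.normal(size=(n, 4))   # noisy node features

# Sample an undirected two-block (SBM-like) adjacency matrix.
P = np.where(spin[:, None] == spin[None, :], p_in, p_out)
A = np.triu((rng.random((n, n)) < P).astype(int), 1)
A = A + A.T

# Hold out ~20% of the edges as test positives, plus as many true non-edges.
edges = np.argwhere(np.triu(A, 1))
held = edges[rng.random(len(edges)) < 0.2]
A_train = A.copy()
A_train[held[:, 0], held[:, 1]] = A_train[held[:, 1], held[:, 0]] = 0
nonedges = np.argwhere(np.triu(1 - A - np.eye(n, dtype=int), 1))
neg = nonedges[rng.choice(len(nonedges), size=len(held), replace=False)]
pairs = np.vstack([held, neg])
y = np.r_[np.ones(len(held)), np.zeros(len(neg))]

# (a) Purely structural score: common neighbours in the training graph.
cn = (A_train @ A_train)[pairs[:, 0], pairs[:, 1]]

# (b) Purely feature-based score: logistic regression on Hadamard pair features,
#     trained on remaining edges vs. sampled non-edges (overlap with the test
#     non-edges is ignored in this toy).
pair_feats = lambda q: feats[q[:, 0]] * feats[q[:, 1]]
tr_pos = np.argwhere(np.triu(A_train, 1))
tr_neg = nonedges[rng.choice(len(nonedges), size=len(tr_pos), replace=False)]
pairs_tr = np.vstack([tr_pos, tr_neg])
ytr = np.r_[np.ones(len(tr_pos)), np.zeros(len(tr_neg))]
clf = LogisticRegression(max_iter=1000).fit(pair_feats(pairs_tr), ytr)

print("AUC, common neighbours :", round(roc_auc_score(y, cn), 3))
print("AUC, feature-based LR  :",
      round(roc_auc_score(y, clf.predict_proba(pair_feats(pairs))[:, 1]), 3))
```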
June 9, Monday
Note: this is the first part of a two-part seminar. AI is progressing at a remarkable speed, but we still don’t have a clear understanding of basic concepts in cognition (e.g. learning, understanding, abstraction, awareness, intelligence). I shall argue that research focused on understanding how learning machines such as LLMs or deep neural networks do what they do sidesteps the key issue by defining these concepts from the outset. For example, statistical learning is based on a classification of problems (supervised/unsupervised, classification, regression, etc.) and addresses the resulting optimisation problem (maximisation of the likelihood, minimisation of errors, etc.). Learning entails first of all detecting what makes sense to be learned, 1) from very few samples and 2) without knowing a priori why those data make sense. This requires a quantitative notion of relevance that can distinguish data that make sense from meaningless noise. I will first introduce and discuss the notion of relevance. Next I will claim that learning differs from understanding, where the latter implies integrating data that make sense into a pre-existing representation. The properties of this representation should be abstract, i.e. independent of the data, precisely because it needs to represent data from widely different domains. This is what enables higher cognitive functions that we perform all the time, like drawing analogies and relating data learned independently from widely different domains. Such a representation should be flexible and continuously adaptable as more data or more resources become available. I will show that such an abstract representation can be defined as the fixed point of a renormalisation group transformation, and that it coincides with a model that can be derived from the principle of maximal relevance. I will provide empirical evidence that the representations of simple neural networks approach this universal model as the network is trained on a broader and broader domain of data. Overall, the aim of the seminar is to support the idea that an approach to central issues in cognition is also possible by studying very simple models, and does not necessarily require understanding large machine learning models.
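Since the argument hinges on a quantitative notion of relevance, the sketch below illustrates one formalisation used in this line of work, the entropy of the frequency-of-frequencies of a sample (this is my assumption about the definition intended in the talk; the datasets and sizes are arbitrary): it is small for unstructured noise and larger for samples with broad, Zipf-like frequency distributions.

```python
import numpy as np
from collections import Counter

def resolution_and_relevance(sample):
    """Resolution H[s] and relevance H[k] of a sample of discrete states:
      H[s] = -sum_s (k_s/M) log(k_s/M),     k_s = # occurrences of state s,
      H[k] = -sum_k (k m_k/M) log(k m_k/M), m_k = # states seen exactly k times.
    """
    M = len(sample)
    k_s = np.array(list(Counter(sample).values()))
    H_s = -np.sum((k_s / M) * np.log(k_s / M))
    m_k = Counter(k_s)
    H_k = -sum((k * m / M) * np.log(k * m / M) for k, m in m_k.items())
    return H_s, H_k

rng = np.random.default_rng(4)
M = 10_000
# "Structured" data: a Zipf-like distribution over states, versus pure noise:
# uniform sampling over many equally likely states.
zipf = rng.zipf(2.0, size=M)
noise = rng.integers(0, 50_000, size=M)

for name, sample in [("Zipf-like sample", zipf), ("uniform noise", noise)]:
    H_s, H_k = resolution_and_relevance(list(sample))
    print(f"{name:>17s}: resolution = {H_s:.2f} nats, relevance = {H_k:.2f} nats")
```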
Note: this is the second part of a two-part seminar; the abstract is the same as for the first part above.