Seminar of 2025
9 June 2025
Matteo Marsili
In the seminar series: SEMINARS IN MATHEMATICAL PHYSICS AND BEYOND
Mathematical physics seminar
Note: this is the second part of a two-part seminar.
AI is progressing at a remarkable speed, but we still do not have a clear understanding of basic concepts in cognition (e.g. learning, understanding, abstraction, awareness, intelligence, etc.).
I shall argue that research focused on understanding how learning machines such as LLMs or deep neural networks do what they do sidesteps the key issue, because it defines these concepts from the outset. For example, statistical learning starts from a classification of problems (supervised/unsupervised, classification, regression, etc.) and addresses the resulting optimisation problem (maximisation of the likelihood, minimisation of errors, etc.). Learning, instead, entails first of all detecting what it makes sense to learn, 1) from very few samples, and 2) without knowing a priori why those data make sense. This requires a quantitative notion of relevance that can distinguish data that makes sense from meaningless noise. I will first introduce and discuss this notion of relevance.
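As background, the abstract itself does not define relevance. One quantitative definition, assumed here from the speaker's earlier published work (Marsili & Roudi, Physics Reports, 2022), takes the relevance of a sample to be the entropy of its frequency-of-frequencies distribution, and its resolution to be the entropy of the empirical frequencies. The Python sketch below illustrates that assumed definition; it is not material from the seminar, and the helper name is arbitrary.

import numpy as np
from collections import Counter

def resolution_and_relevance(sample):
    """Empirical resolution H[s] and relevance H[K] of a sample.

    Definitions assumed from Marsili & Roudi (Physics Reports, 2022):
    resolution is the entropy of the empirical frequencies k_s / M,
    relevance is the entropy of the frequency-of-frequencies k m_k / M.
    Illustrative sketch only; not code from the seminar.
    """
    M = len(sample)
    counts = Counter(sample)                    # k_s: occurrences of outcome s
    freqs = np.array(list(counts.values())) / M
    H_s = -np.sum(freqs * np.log(freqs))        # resolution H[s]
    m_k = Counter(counts.values())              # m_k: outcomes seen exactly k times
    kmM = np.array([k * m for k, m in m_k.items()]) / M
    H_K = -np.sum(kmM * np.log(kmM))            # relevance H[K]
    return H_s, H_K

# Compare a structured (Zipf-like) sample with uniform noise of the same size:
rng = np.random.default_rng(0)
print(resolution_and_relevance(rng.zipf(2.0, size=1000)))          # Zipf-like data
print(resolution_and_relevance(rng.integers(0, 1000, size=1000)))  # uniform noise

Only numpy and the standard library are needed; in the cited framework it is the frequency-of-frequencies entropy, not the usual sample entropy, that separates structured samples from noise.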
Next I will argue that learning differs from understanding: the latter implies integrating data that make sense into a pre-existing representation. The properties of this representation should be abstract, i.e. independent of the data, precisely because it needs to represent data from widely different domains. This is what enables higher cognitive functions that we perform all the time, such as drawing analogies and relating data learned independently in widely different domains. Such a representation should also be flexible and continuously adaptable as more data or more resources become available. I will show that such an abstract representation can be defined as the fixed point of a renormalisation group transformation, and that it coincides with a model singled out by the principle of maximal relevance.
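For reference, and again assuming the definitions of the earlier work cited above rather than anything stated in this abstract, the principle of maximal relevance can be phrased as a variational problem over the frequency-of-frequencies m_k of a sample of size M:

\max_{\{m_k\}} \hat{H}[K] \quad \text{at fixed} \quad \hat{H}[s],
\qquad \hat{H}[K] = -\sum_k \frac{k\,m_k}{M}\,\log\frac{k\,m_k}{M},
\qquad \hat{H}[s] = -\sum_s \frac{k_s}{M}\,\log\frac{k_s}{M},

whose solutions display power-law frequency distributions m_k \propto k^{-1-\mu}, with Zipf's law at \mu = 1.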
I will provide empirical evidence that the representations of simple neural networks approach this universal model as they are trained on broader and broader domains of data.
Overall, the aim of the seminar is to support the idea that central issues in cognition can also be approached by studying very simple models, without necessarily having to understand large machine-learning models.