December 4, 2025
Seminar on mathematical physics, interdisciplinary
3:00 PM (15:00)
Aula Magna Antropologia, via Selmi 3
We explore whether a Transformer neural network can learn the deterministic sequence of rooted planar trees generated by iterated prime factorization of the natural numbers. By encoding each integer as a tree, the resulting sequence (named the Natural Text, NT) forms an arithmetic pseudo-text with rich, measurable statistical and combinatorial structure. We train a GPT-2-style Transformer from scratch on the first $10^{11}$ elements of this sequence and evaluate its performance on next-token and masked-token prediction tasks. The model exhibits a partial grasp of the underlying generative grammar, capturing non-trivial patterns and long-range correlations. These results suggest that modern language models may be capable of learning structural properties intrinsic to arithmetic, extending their applicability beyond conventional empirical datasets.
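The abstract does not specify the exact tree encoding used in the talk, but a standard way to map natural numbers bijectively onto rooted trees via iterated prime factorization is the Matula-Goebel construction: the tree of 1 is a single node, and the root of the tree of $n > 1$ has one child subtree $T(k)$ for every prime factor $p_k$ of $n$ (with multiplicity), where $p_k$ denotes the $k$-th prime. The sketch below is illustrative only and assumes this encoding; the names `matula_tree`, `prime_factors`, and `prime_index` are ours, not from the seminar.

```python
def is_prime(n):
    """Trial-division primality test (sufficient for small illustrative inputs)."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def prime_factors(n):
    """Prime factors of n with multiplicity, in ascending order."""
    factors = []
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        factors.append(n)
    return factors

def prime_index(p):
    """Index k of the prime p, counting 2 as the 1st prime."""
    return sum(1 for m in range(2, p + 1) if is_prime(m))

def matula_tree(n):
    """Rooted tree of n as nested tuples: each child of n's root is the
    tree of the index of one prime factor (iterated factorization)."""
    if n == 1:
        return ()  # base case: 1 maps to a single leaf node
    return tuple(matula_tree(prime_index(p)) for p in prime_factors(n))

# Examples: 2 = p_1 gives a root with one leaf child; 4 = 2*2 gives two.
print(matula_tree(1))  # ()
print(matula_tree(2))  # ((),)
print(matula_tree(6))  # ((), ((),))
```

Serializing such trees (e.g. as balanced-parenthesis strings) turns the sequence of all natural numbers into exactly the kind of arithmetic pseudo-text a Transformer can be trained on.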
Back to the seminars page of the Department of Mathematics of Bologna