Seminars of 2025
December 4, 2025
Alessandro Breccia
in the seminar series: SEMINARS IN MATHEMATICAL PHYSICS AND BEYOND
Mathematical physics seminar, interdisciplinary
We explore whether a Transformer neural network can learn the deterministic sequence of rooted planar trees generated by iterated prime factorization of the natural numbers. By encoding each integer as a tree, the resulting sequence (named Natural Text, NT) forms an arithmetic pseudo-text with rich, measurable statistical and combinatorial structure. We train a GPT-2–style Transformer from scratch on the first $10^{11}$ elements of this sequence and evaluate its performance on next-token and masked-token prediction tasks. The model exhibits a partial grasp of the underlying generative grammar, capturing non-trivial patterns and long-range correlations. These results suggest that modern language models may be capable of learning structural properties intrinsic to arithmetic, extending their applicability beyond conventional empirical datasets.
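The abstract does not spell out the tree encoding, but one plausible construction for "integer as rooted planar tree via iterated prime factorization" is a Matula–Goebel-style map: 1 becomes a leaf, and each prime factor $p_k$ of $n$ (with multiplicity) contributes the subtree of its prime index $k$. The sketch below, with hypothetical helper names `prime_factors`, `prime_index`, and `tree`, illustrates that idea; the encoding actually used in the talk may differ.

```python
def prime_factors(n: int) -> list[int]:
    """Prime factors of n with multiplicity, in ascending order."""
    factors = []
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        factors.append(n)
    return factors

def prime_index(p: int) -> int:
    """1-based index of the prime p (2 -> 1, 3 -> 2, 5 -> 3, ...)."""
    count = 0
    n = 2
    while True:
        if all(n % d for d in range(2, int(n**0.5) + 1)):
            count += 1
            if n == p:
                return count
        n += 1

def tree(n: int):
    """Rooted planar tree of n as nested tuples (Matula-style sketch):
    1 is a leaf; each prime factor p contributes the subtree of its index."""
    if n == 1:
        return ()
    return tuple(tree(prime_index(p)) for p in prime_factors(n))
```

For example, `tree(2)` is a root with one leaf child, while `tree(4) = 2 * 2` is a root with two leaf children; concatenating `tree(1), tree(2), tree(3), ...` yields a deterministic tree sequence of the kind the abstract calls Natural Text.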