Seminar: Thursday, 21 May 2026
Jodi Dianetti
Part of the series: STOCHASTICS AND APPLICATIONS
Probability seminar
Time: 11:00
Venue: Aula Arzelà
Online seminar • meeting link
(Meeting ID: 363 719 926 275 962, Passcode: yP9hT6CJ)
In this talk we discuss optimal stopping problems in continuous time and state space from a reinforcement learning perspective. We begin by formulating the stopping problem using randomized stopping times, where the decision maker's control is the probability of stopping within a given time: specifically, a bounded, non-decreasing, càdlàg control process. To encourage exploration and facilitate learning, we introduce a regularized version of the problem, penalized by the cumulative residual entropy of the randomized stopping time. The regularized problem takes the form of an (n+1)-dimensional degenerate singular stochastic control problem with finite fuel. We address it through the dynamic programming principle, which enables us to identify the unique optimal exploratory strategy. For the specific case of a real option problem, we derive a semi-explicit solution to the regularized problem, which allows us to assess the impact of entropy regularization and to analyze the vanishing entropy limit. Finally, we propose a reinforcement learning algorithm based on policy iteration and show policy improvement results for it.
This talk is based on joint work with Giorgio Ferrari and Renyuan Xu.