STEP-RL: Specializing TEmporal Planning using Reinforcement Learning 

11 March 2024 | Sala Stringa - Online | 11:00 | Andrea Micheli (FBK-PSO)


Planning - devising a strategy to achieve a desired objective - is one of the basic forms of intelligence. Temporal planning studies the automated synthesis of strategies when time and temporal constraints matter. It is one of the most strategic fields of Artificial Intelligence, with applications in autonomous robotics, logistics, flexible production, and many other areas. Historically, research on temporal planning has followed a general-purpose framework: a generic engine searches for the strategy by reasoning on the problem statement (i.e., the starting condition and the desired objective) and on a formal model of the domain (i.e., the possible actions). Despite substantial progress in recent years, domain-independent temporal planning still suffers from scalability issues and fails to deal with real-world problems. The alternative is to devise ad-hoc, domain-specific solutions that, although efficient, are costly to develop, hard to maintain, and often inapplicable in non-nominal situations.

The STEP-RL ERC project will study the foundations of a new approach to temporal planning that is both domain-independent and efficient. The idea is to adopt a framework based on Reinforcement Learning, in which a domain-independent temporal planner is specialized with respect to the domain at hand. STEP-RL continuously improves its ability to solve temporal planning problems by learning from experience, thus becoming increasingly efficient through self-adaptation.

In this talk, I will present the concept and the ideas that we will tackle in the STEP-RL ERC project, which has just officially started, together with some preliminary results we have already achieved.

Andrea Micheli is the head of the "Planning Scheduling and Optimization" research unit at Fondazione Bruno Kessler, Trento, Italy. His research focuses on the development and technology transfer of automated planning technologies. He obtained his PhD in Computer Science from the University of Trento in 2016. His PhD thesis, titled "Planning and Scheduling in Temporally Uncertain Domains", won several awards, including the EurAI Best Dissertation Award and an honorable mention for the ICAPS Best Dissertation Award. He currently works in the field of temporal planning and is the main developer of the TAMER planner. He is also the lead developer of the pysmt open-source project, which aims to provide a standard Python API for satisfiability modulo theories (SMT) solvers. Andrea coordinated the AIPlan4EU project, which aims to remove the access barriers to automated planning technology and to bring such technology to the European AI On-Demand Platform. He has authored more than 30 papers in the Formal Methods and Artificial Intelligence fields. Andrea recently won an ERC Starting Grant for researching novel solutions that combine temporal planning and reinforcement learning.