On timeline-based games and their complexity

In timeline-based planning, domains are described as sets of independent, but interacting, components, whose behaviour over time (the set of timelines) is governed by a set of temporal constraints. A distinguishing feature of timeline-based planning systems is the ability to integrate planning with execution by synthesising control strategies for flexible plans. However, flexible plans can only represent temporal uncertainty, while more complex forms of nondeterminism are needed to deal with a wider range of real-world domains. In this paper, we propose a novel game-theoretic approach to timeline-based planning problems, generalising the state of the art while uniformly handling temporal uncertainty and nondeterminism. We define a general concept of timeline-based game and we show that the notion of winning strategy for these games is strictly more general than that of control strategy for dynamically controllable flexible plans. Moreover, we show that the problem of establishing the existence of such winning strategies is 2EXPTIME-complete.


Introduction
Timeline-based planning is an approach to planning originally proposed in the context of planning and scheduling of space operations. The approach was outlined by Muscettola et al. [40], and deployed soon after in the HSTS system [39], used to schedule and control the operations of the Hubble Space Telescope.
Timeline-based planning follows a different modelling perspective, when compared to action-based planning paradigms à la STRIPS [26]. In the timeline paradigm, there is no explicit separation among states, actions, and goals; rather, the domain is modelled as a set of independent, but interacting, components, whose behaviour over time, described by the timelines, is governed by a set of temporal constraints, called synchronisation rules. The solution plan consists of a set of timelines describing a possible behaviour of the system's components that satisfies all the rules. This is a more declarative point of view than that of common action-based languages such as PDDL, since it is focused on what has or has not to happen, instead of on what the agent has to do to achieve a given goal. Furthermore, the modelling of the system can be subdivided among multiple knowledge engineers and domain experts, since the timelines of distinct components can be separately modelled, and the resulting models can better reflect the architecture of the combined system.
In the last decades, timeline-based planning has been adopted and deployed in a number of systems developed by space agencies on both sides of the Atlantic. Systems developed following this paradigm include EUROPA [44], EUROPA 2 [3,5], and ASPEN [15], developed by NASA, and APSI-TRF, developed for the European Space Agency [11,25,28]. These systems have been repeatedly employed for mid- to long-term mission planning [9,10,16,17,29,41], but the approach was also used to handle on-board autonomy [16,29,41]. Recently, the timeline-based approach has also been implemented in the PLATINUm system [47,48], a general purpose framework which is being employed in cooperative robotics [13] and assistive robotics tasks [14].
One of the flagship features of timeline-based systems, which makes them particularly suited for such domains, is the ability to integrate the planning phase with the execution of the plan. Timeline-based planning domains often model real-time systems, whose constraints heavily depend on the precise timing of execution of the tasks. However, ensuring precise timing is often not possible, because of the inherent temporal uncertainty that arises in the interaction with the environment. The controller executing the plan can handle temporal uncertainty by the use of flexible plans, i.e., sets of different plans that differ in the execution time of the tasks.
Despite the practical success of the approach, deployed in many complex real-world problems, little work had been done on timeline-based planning from a foundational perspective until very recently. The concept of timelines and the main features of the paradigm have been characterised by different authors [6,8,21,27]. A formal framework capturing the concept of timeline-based planning, including aspects regarding uncertainty and controllability issues of plans, was defined a few years ago by Cialdea Mayer et al. [19]. Building on this framework, recent work explored the timeline-based paradigm from a formal perspective, comparing the expressiveness of timeline-based and action-based modelling languages and studying the computational complexity of the involved planning problems [23,24,31,32]. Gigante [30] provides a comprehensive reference on the recent work on the formal properties of timeline-based planning.
Such recent work was initially confined to timeline-based planning problems where no uncertainty about the interaction with the environment was considered. In this paper, we add uncertainty back into the picture. Timeline-based planning systems generally focus on handling temporal uncertainty, but nondeterminism is not supported in general terms: even the behaviour of external variables, completely controlled by the environment, is known up to temporal uncertainty only. The choice of concentrating on temporal reasoning and temporal uncertainty only is coherent with the history and scope of timeline-based systems. However, timeline-based modelling languages are expressive enough to model complex scenarios, such as those faced in cooperative robotics applications [13], that involve non-temporal nondeterminism, such as uncertainty about the task the environment will perform at a certain point in time. In such cases, current systems often employ a re-planning stage as part of their execution cycle (see, e.g., [46]): any mismatch between the expected and actual behaviours of the environment results in a revision of the flexible plan, which then can resume execution. Unfortunately, the cost of such a re-planning phase may be incompatible with the requirements of real-time execution and, more importantly, if a wrong choice is made by the original flexible plan, the re-planning might happen too late to be able to recover a controllable state of the system. Hence, knowledge engineers have to explicitly account for this problem if they want to avoid unnecessary failures and costly re-planning during execution, which make the system less effective and more complex to use.
To address such limitations, this paper introduces the novel concept of timeline-based planning game, a game-theoretic generalisation of the timeline-based planning problem with uncertainty, which uniformly deals with both temporal uncertainty and general nondeterminism: the controller tries to satisfy the given temporal constraints no matter what the choices of the environment are. We compare the proposed games with the current approaches based on flexible plans. In particular, we show how current timeline-based modelling languages can express problems that, only seeming to involve temporal uncertainty at first, in fact model scenarios which would require the controller to handle non-temporal nondeterminism. We show that these problems do not admit dynamically controllable flexible plans (as defined in [19]), but do admit winning strategies when viewed as instances of timeline-based games. Then, we prove that establishing the existence of a winning strategy for a given timeline-based planning game is 2EXPTIME-complete.
The paper is structured as follows. Section 2 defines timeline-based planning, including the concepts of flexible plan and dynamic controllability, borrowed from the formal framework provided by Cialdea Mayer et al. [19], which forms the basis of our analysis. Then, Section 3 discusses in detail a few limitations of the current approach based on flexible plans, motivating the rest of the paper. Section 4 addresses these issues by defining timeline-based games, and proving that the existence of a winning strategy for such games subsumes, and is strictly more general than, the existence of dynamically controllable flexible plans. Finally, Section 5 shows that the problem of establishing whether such a strategy exists is 2EXPTIME-complete. Section 6 concludes the paper by summarising its results and discussing future lines of research and open problems.

[Figure 1: example state variables. Recoverable value labels with duration bounds: Earth [1, +∞], Slewing [30,30], Science [36,58], Comm [30,50].]

Timeline-based planning
In this section, we formally define timeline-based planning problems. At first (Section 2.1), we introduce problems that are not concerned with any uncertainty in the interaction with the external environment. Such an ingredient will be added in Section 2.2, by means of the notion of flexible plan.

Timelines, plans, synchronisation rules, and timeline-based planning problems
In our setting, interesting properties of the modelled system are represented by state variables.

Definition 1 (State variable).
A state variable is a tuple $x = (V_x, T_x, D_x, \gamma_x)$, where $V_x$ is the finite domain of $x$, $T_x : V_x \to 2^{V_x}$ is the value transition function of $x$, $D_x : V_x \to \mathbb{N} \times (\mathbb{N} \cup \{+\infty\})$ is the duration function of $x$, and $\gamma_x : V_x \to \{c, u\}$ is the controllability tag.
Definition 1 can be interpreted as follows. The transition function specifies which values can follow any other during the evolution of the variable. The duration function $D_x$ maps any value $v \in V_x$ into a pair $(d^{x=v}_{min}, d^{x=v}_{max})$, whose components respectively specify the minimum and maximum duration of any time interval (more precisely, of any token, as defined below) where $x = v$. The maximum duration can be infinite ($d^{x=v}_{max} = +\infty$), in which case there is no upper bound to how long the variable can hold the given value. The controllability tag comes into play when handling uncertainty. Intuitively, it states whether the duration of any token where $x = v$ is controllable by the system ($\gamma_x(v) = c$) or not ($\gamma_x(v) = u$).
Two example state variables, belonging to the domain of the operations of a satellite orbiting a planet (a scenario elicited from the ESA Mars Express mission [12]), are depicted in Fig. 1. The first variable ($x_p$) represents the pointing mode of the satellite, i.e., whether it is pointing towards Earth, doing maintenance, doing scientific measurements, slewing between the direction facing Earth and the direction facing the underlying planet, or transmitting some information. The domain of the variable thus consists of the five depicted values, and the transition function, which can be visualised as a state machine, states which tasks can follow each other. The minimum and maximum duration of each value are reported inside the bubbles. The second variable ($x_v$) represents the visibility window of the Earth ground station, which determines when the station is visible for transmitting. In this example, $\gamma_{x_p}(\mathsf{Comm}) = u$, i.e., the Comm value is uncontrollable, meaning that the system can decide when to start communicating but can neither decide nor predict how much time the transmission will require. All the values of the variable $x_v$ are uncontrollable since, of course, the satellite cannot decide when the ground stations are visible. In the rest of the section, the controllability tag will be ignored; it will be considered again in Section 2.2.
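As an illustration of Definition 1, a state variable can be sketched as a small record holding its four components. The class name `StateVariable`, its field names, the transition relation, and all tags other than that of Comm are assumptions made for this example; the figures for the real domain are those of Fig. 1, which this sketch does not claim to reproduce.

```python
from dataclasses import dataclass
from math import inf


@dataclass
class StateVariable:
    """A state variable x = (V_x, T_x, D_x, gamma_x) in the sense of Definition 1."""
    name: str
    domain: set        # V_x: finite set of values
    transitions: dict  # T_x: value -> set of values allowed to follow it
    durations: dict    # D_x: value -> (d_min, d_max); d_max may be math.inf
    tags: dict         # gamma_x: value -> "c" (controllable) or "u" (uncontrollable)

    def is_well_formed(self) -> bool:
        """Check that T_x, D_x and gamma_x are total on V_x and stay inside it."""
        for v in self.domain:
            if v not in self.transitions or v not in self.durations or v not in self.tags:
                return False
            if not self.transitions[v] <= self.domain:
                return False
            d_min, d_max = self.durations[v]
            if d_min < 0 or d_min > d_max:
                return False
            if self.tags[v] not in ("c", "u"):
                return False
        return True


# Hypothetical pointing-mode variable: the transition relation, the
# Maintenance bounds and most tags are invented for illustration only.
x_p = StateVariable(
    name="x_p",
    domain={"Earth", "Slewing", "Science", "Comm", "Maintenance"},
    transitions={
        "Earth": {"Slewing", "Comm", "Maintenance"},
        "Slewing": {"Earth", "Science"},
        "Science": {"Slewing"},
        "Comm": {"Earth"},
        "Maintenance": {"Earth"},
    },
    durations={
        "Earth": (1, inf), "Slewing": (30, 30), "Science": (36, 58),
        "Comm": (30, 50), "Maintenance": (90, 90),
    },
    tags={"Earth": "c", "Slewing": "c", "Science": "c", "Comm": "u", "Maintenance": "c"},
)
```

Representing $T_x$ as a dictionary of successor sets mirrors the state-machine reading of the transition function, and `math.inf` stands for the unbounded maximum duration.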
The evolution over time of the values of each state variable is modelled by the timelines, which are the core concept of the whole formalism.
Definition 2 (Timeline). A token for a state variable $x$ is a triple $\tau = (x, v, d)$, where $v \in V_x$ is the value taken by the variable, and $d \in \mathbb{N}^+$ (positive integers) is the duration of the token. A timeline for a state variable $x$ is a finite sequence $\tau = \langle \tau_1, \ldots, \tau_k \rangle$ of tokens for $x$.
A timeline thus represents how a state variable changes its value over time in terms of a sequence of time intervals where the variable keeps the same value. Figure 1b shows two example timelines for the state variables $x_p$ and $x_v$. Note that $d \in \mathbb{N}^+$, i.e., the duration of tokens cannot be zero. For any token $\tau_i = (x, v_i, d_i)$ in a timeline $\tau = \langle \tau_1, \ldots, \tau_k \rangle$, we define $\text{start-time}(\tau, i) = \sum_{j=1}^{i-1} d_j$ and $\text{end-time}(\tau, i) = \text{start-time}(\tau, i) + d_i$. When there is no ambiguity about which timeline we refer to, we write $\text{start-time}(\tau_i)$ and $\text{end-time}(\tau_i)$ to denote, respectively, $\text{start-time}(\tau, i)$ and $\text{end-time}(\tau, i)$. Note that two consecutive tokens can hold the same value.
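The start and end times of the tokens of a scheduled timeline can be sketched as running sums of durations. The sketch below assumes that the first token starts at time 0 and that each token starts exactly when its predecessor ends; the function names and the example timeline are hypothetical.

```python
from itertools import accumulate


def end_times(timeline):
    """End time of each token: running sum of the durations."""
    return list(accumulate(d for (_x, _v, d) in timeline))


def start_times(timeline):
    """Start time of each token: its end time minus its own duration."""
    return [e - d for e, (_x, _v, d) in zip(end_times(timeline), timeline)]


# Hypothetical timeline for x_p, as a list of tokens (x, v, d).
timeline = [("x_p", "Earth", 10), ("x_p", "Slewing", 30), ("x_p", "Science", 40)]
starts = start_times(timeline)
ends = end_times(timeline)
```

Here `starts` and `ends` list the endpoints token by token, so the i-th token occupies the interval between `starts[i]` and `ends[i]`.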
Let us denote by $T_{SV}$ the set of all the possible timelines for the set of variables in $SV$. A plan is a set of timelines describing the evolution of the considered set of state variables.
Definition 3 (Plan). Let $SV$ be a set of state variables. A plan over $SV$ is a function $\pi : SV \to T_{SV}$ that maps each variable to the timeline describing its behaviour, such that all the timelines have the same total duration.
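The only structural requirement that Definition 3 places on a plan is that all timelines span the same total duration. A minimal check of this condition, with hypothetical function names and example values (the value `NotVisible` in particular is an assumption), could look as follows.

```python
def horizon(timeline):
    """Total duration of a timeline given as a list of tokens (x, v, d)."""
    return sum(d for (_x, _v, d) in timeline)


def is_plan(assignment):
    """True if all timelines in the variable -> timeline map share one horizon."""
    return len({horizon(tl) for tl in assignment.values()}) <= 1


# Hypothetical plan over the two example variables.
pi = {
    "x_p": [("x_p", "Earth", 10), ("x_p", "Comm", 30), ("x_p", "Earth", 20)],
    "x_v": [("x_v", "NotVisible", 5), ("x_v", "Visible", 40), ("x_v", "NotVisible", 15)],
}
```

The check deliberately ignores transition and duration functions, which Definition 3 does not mention.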
We denote by tokens($\pi$) the set of tokens in $\pi(SV)$ and by tokens($\pi(x_i)$) the set of tokens in $\pi(x_i)$.
The behaviour of a system over time, described by a set of state variables, is governed by a set of temporal constraints, called synchronisation rules. Let $\mathcal{N} = \{a, b, \ldots\}$ be an arbitrary set of token names. The building blocks of synchronisation rules are atomic temporal relations.
Definition 4 (Atomic temporal relation). An atomic temporal relation (atom for short) over $\mathcal{N}$ is defined by the following grammar:
$$\langle atom \rangle := \langle term \rangle \leq_{[l,u]} \langle term \rangle \qquad \langle term \rangle := \mathrm{start}(a) \mid \mathrm{end}(a) \mid t$$
where $t, l \in \mathbb{N}$, $u \in \mathbb{N} \cup \{+\infty\}$, and $a \in \mathcal{N}$. Terms of the form $t$, with $t \in \mathbb{N}$, are called timestamps.
Notice that the only meaningful atoms are those where at least one of the two terms is not a timestamp. Let $a$ and $b$ be two token names. Examples of atoms are $\mathrm{start}(b) \leq 5$, $\mathrm{start}(a) \leq_{[3,7]} \mathrm{end}(b)$, and $\mathrm{start}(a) \leq_{[0,+\infty]} \mathrm{start}(b)$. Intuitively, a token name $a$ refers to a specific token in a timeline, and $\mathrm{start}(a)$ and $\mathrm{end}(a)$ to its endpoints. Then, an atom such as $\mathrm{start}(a) \leq_{[l,u]} \mathrm{end}(b)$ constrains $a$ to start before the end of $b$, and the distance between the two endpoints to lie between the lower and upper bounds $l$ and $u$. Atoms are grouped into quantified clauses called existential statements.
Definition 5 (Existential statement). Given a set $SV$ of state variables, an existential statement over $SV$ is a statement of the following form:
$$\exists a_1[x_1 = v_1]\, a_2[x_2 = v_2] \ldots a_n[x_n = v_n] \,.\, C$$
where $C$ is a conjunction of atoms (a clause), $n \in \mathbb{N}$, $a_1, \ldots, a_n \in \mathcal{N}$, $x_1, \ldots, x_n \in SV$, and $v_i \in V_{x_i}$ for all $1 \leq i \leq n$.
The blocks of the form $a_i[x_i = v_i]$ are called quantifiers. All the token names appearing in the atoms inside the clause $C$ of an existential statement $E \equiv \exists a_1[x_1 = v_1] \ldots a_n[x_n = v_n] \,.\, C$, that do not appear in the quantifier prefix, are said to be free in $E$, and all those that do appear are said to be bound. An existential statement is closed if it does not contain free token names. Note that the quantifier prefix may as well be empty.
The syntax of synchronisation rules is defined as follows.
Definition 6 (Synchronisation rules). Given a set of state variables $SV$, a synchronisation rule over $SV$ is an expression matching the following grammar:
$$\langle rule \rangle := a_0[x_0 = v_0] \to \langle body \rangle \;\mid\; \top \to \langle body \rangle \qquad \langle body \rangle := E_1 \lor \ldots \lor E_m$$
where $a_0 \in \mathcal{N}$, $x_0 \in SV$, $v_0 \in V_{x_0}$, $E_1, \ldots, E_m$ are existential statements over $SV$, and the only token name appearing free in the body is $a_0$, and only in rules of the first form.
In rules of the first form, the quantifier in the head is called trigger; rules of the second form are called trigger-less rules. Intuitively, a synchronisation rule requires that, whenever a token exists that satisfies the trigger, at least one of the disjuncts (existential statements) is satisfied, i.e., there exist other tokens, as specified in the quantifier prefix, such that the corresponding clause is satisfied. Trigger-less rules have a trivial universal quantification, which means that they only ask for the existence of some tokens, as specified by the existential statements.
Consider, for instance, the timelines in Fig. 1b, and two synchronisation rules over the two example variables. The first rule expresses an essential guarantee for the satellite system represented by the two example variables, namely, that when the spacecraft is communicating with Earth, the ground station is visible. The timelines in Fig. 1b satisfy this constraint, since the time interval corresponding to the execution of the token where $x_p = \mathsf{Comm}$ is contained in the one of the token where $x_v = \mathsf{Visible}$. The second rule instructs the system to transmit data back to Earth after every measurement session, within a certain time bound.
A trigger-less rule can instead be used to state the goal of the system, namely, to perform some scientific measurement at all:
$$\top \to \exists a[x_p = \mathsf{Science}]$$
Some simple syntactic sugar can be introduced on top of the basic syntax. A strict version of unbounded atoms can be added by writing $T < T'$ for $T \leq_{[1,+\infty]} T'$. Moreover, one can force two endpoints to coincide in time by writing $\mathrm{start}(a) = \mathrm{start}(b)$ for $\mathrm{start}(a) \leq_{[0,0]} \mathrm{start}(b)$, and two tokens to coincide by writing $a = b$ for $\mathrm{start}(a) = \mathrm{start}(b) \land \mathrm{end}(a) = \mathrm{end}(b)$. More generally, all of Allen's interval relations [1] can be expressed in terms of these basic temporal relations.
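As an illustration of the last remark, three of Allen's relations between tokens $a$ and $b$ can be written with bounded atoms as follows. This is a sketch rather than the paper's own encoding; it relies on integer time, so that "strictly before" amounts to a distance of at least 1.

```latex
\begin{align*}
a \text{ meets } b  &\;\equiv\; \mathrm{end}(a) \leq_{[0,0]} \mathrm{start}(b) \\
a \text{ before } b &\;\equiv\; \mathrm{end}(a) \leq_{[1,+\infty]} \mathrm{start}(b) \\
a \text{ during } b &\;\equiv\; \mathrm{start}(b) \leq_{[1,+\infty]} \mathrm{start}(a)
                      \;\land\; \mathrm{end}(a) \leq_{[1,+\infty]} \mathrm{end}(b)
\end{align*}
```

The remaining relations follow the same pattern, combining distance-0 atoms for coinciding endpoints with distance-at-least-1 atoms for strict orderings.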
We conclude the section by providing the notion of timeline-based planning problem. As a preliminary step, we give a formal account of the semantics of synchronisation rules, to back up their intuitive meaning.
Definition 7 (Semantics of atomic temporal relations). An atomic evaluation is a function $\lambda : \mathcal{N} \to \mathbb{N}^2$ that maps each token name $a$ to a pair $\lambda(a) = (s, e)$ of natural numbers. Given a term $T$ and an atomic evaluation $\lambda$, the evaluation of $T$ induced by $\lambda$, denoted $[\![T]\!]_\lambda$, is defined as follows:
• $[\![t]\!]_\lambda = t$ for any $t \in \mathbb{N}$;
• for any $a \in \mathcal{N}$, if $\lambda(a) = (s, e)$, then $[\![\mathrm{start}(a)]\!]_\lambda = s$ and $[\![\mathrm{end}(a)]\!]_\lambda = e$.
Given an atomic temporal relation $\alpha \equiv T \leq_{[l,u]} T'$ and an atomic evaluation $\lambda$, we say that $\lambda$ satisfies $\alpha$, written $\lambda \models \alpha$, if and only if $l \leq [\![T']\!]_\lambda - [\![T]\!]_\lambda \leq u$.
Given a clause $C \equiv \alpha_1 \land \ldots \land \alpha_k$, by extension we write $\lambda \models C$ if $\lambda \models \alpha_i$ for all $1 \leq i \leq k$. Atomic evaluations are extracted from tokens when trying to satisfy a whole existential statement.
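A short worked example may help to fix the direction of the difference. It reads an atom $T \leq_{[l,u]} T'$ as bounding the distance from $T$ to $T'$, in line with the intuitive explanation given for atoms; the evaluation $\lambda$ below is chosen arbitrarily for illustration.

```latex
\begin{align*}
&\lambda(a) = (0, 10), \qquad \lambda(b) = (5, 20) \\
&\mathrm{start}(a) \leq_{[3,7]} \mathrm{end}(b):
  && 20 - 0 = 20 \notin [3,7] && \text{not satisfied} \\
&\mathrm{start}(a) \leq_{[0,+\infty]} \mathrm{start}(b):
  && 5 - 0 = 5 \in [0,+\infty] && \text{satisfied} \\
&\mathrm{start}(b) \leq_{[3,7]} \mathrm{end}(a):
  && 10 - 5 = 5 \in [3,7] && \text{satisfied}
\end{align*}
```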
Definition 8 (Semantics of existential statements). Let $\pi$ be a plan over a set of state variables $SV$, and let $E \equiv \exists a_1[x_1 = v_1] \ldots a_n[x_n = v_n] \,.\, C$ be an existential statement. A function $\eta : \mathcal{N} \to \text{tokens}(\pi)$ mapping any token name to a token belonging to the plan $\pi$ is called token mapping.
We say that $\pi$ satisfies $E$ with the token mapping $\eta$, written $\pi \models_\eta E$, if, for all $1 \leq i \leq n$, there is a token $\tau_i \in \text{tokens}(\pi(x_i))$ such that $\eta(a_i) = \tau_i$ and $\text{val}(\tau_i) = v_i$, and $\lambda \models C$ for an atomic evaluation $\lambda$ such that $\lambda(a_i) = (\text{start-time}(\tau_i), \text{end-time}(\tau_i))$ for all $1 \leq i \leq n$.
Given a synchronisation rule $R \equiv a_0[x_0 = v_0] \to E_1 \lor \ldots \lor E_m$, we say that $\pi$ satisfies $R$, written $\pi \models R$, if for any token $\tau_0 \in \pi(x_0)$, if $\text{val}(\tau_0) = v_0$, then there is an existential statement $E_i$ and a token mapping $\eta$ such that $\eta(a_0) = \tau_0$ and $\pi \models_\eta E_i$. For a trigger-less rule $R \equiv \top \to E_1 \lor \ldots \lor E_m$, $\pi \models R$ if there exist an existential statement $E_i$ and a token mapping $\eta$ such that $\pi \models_\eta E_i$.
Definition 9 (Timeline-based planning problems). A timeline-based planning problem is a pair $P = (SV, S)$, where $SV$ is a set of state variables, and $S$ is a set of synchronisation rules over $SV$. A plan $\pi$ over $SV$ is a solution plan for $P$ if and only if $\pi \models R$ for all synchronisation rules $R \in S$.
Let $P$ be a timeline-based planning problem. It has been shown that the problem of establishing whether $P$ admits a solution plan is EXPSPACE-complete [30,32].
Definition 9 captures the deterministic variant of the problem, where there is no support for modelling the uncertainty coming from the interaction with the external world. The next section defines timeline-based planning problems with uncertainty, which account for this important feature.
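The satisfaction relation of Definitions 7 and 8 can be sketched as a brute-force check that tries every assignment of tokens to quantified names. Everything in the block is an assumption made for illustration: the encoding of atoms as tuples `(T, l, u, T')`, the function names, the value `NotVisible`, the example plans, and the encoding of the "communication only while visible" rule, which is one possible reading of its prose description rather than the rule given in the paper.

```python
from itertools import accumulate, product
from math import inf


def intervals(plan):
    """Map each variable to a list of (value, start, end) triples."""
    out = {}
    for var, tokens in plan.items():
        ends = list(accumulate(d for _v, d in tokens))
        out[var] = [(v, e - d, e) for (v, d), e in zip(tokens, ends)]
    return out


def term_value(term, lam):
    """Evaluate a timestamp (int) or an endpoint ("start"|"end", name)."""
    if isinstance(term, int):
        return term
    endpoint, name = term
    return lam[name][0 if endpoint == "start" else 1]


def holds(clause, lam):
    """lam satisfies every atom (T, l, u, T'): l <= T' - T <= u."""
    return all(l <= term_value(t2, lam) - term_value(t1, lam) <= u
               for (t1, l, u, t2) in clause)


def satisfies_statement(ivs, quantifiers, clause, lam):
    """Try all choices of tokens for the quantifiers (name, var, value)."""
    candidates = [[(s, e) for (v, s, e) in ivs[var] if v == value]
                  for (_name, var, value) in quantifiers]
    for choice in product(*candidates):
        lam2 = dict(lam)
        lam2.update({q[0]: se for q, se in zip(quantifiers, choice)})
        if holds(clause, lam2):
            return True
    return False


def satisfies_rule(plan, trigger, statements):
    """trigger is (name, var, value), or None for a trigger-less rule."""
    ivs = intervals(plan)
    if trigger is None:
        return any(satisfies_statement(ivs, q, c, {}) for q, c in statements)
    name, var, value = trigger
    return all(
        any(satisfies_statement(ivs, q, c, {name: (s, e)}) for q, c in statements)
        for (v, s, e) in ivs[var] if v == value)


# Plans map each variable to a list of (value, duration) pairs.
plan_ok = {
    "x_p": [("Earth", 10), ("Comm", 30), ("Earth", 20)],
    "x_v": [("NotVisible", 5), ("Visible", 40), ("NotVisible", 15)],
}
plan_bad = dict(plan_ok, x_v=[("NotVisible", 15), ("Visible", 30), ("NotVisible", 15)])

# a[x_p = Comm] -> exists b[x_v = Visible] . start(b) <= start(a) and end(a) <= end(b)
trigger = ("a", "x_p", "Comm")
statements = [([("b", "x_v", "Visible")],
               [(("start", "b"), 0, inf, ("start", "a")),
                (("end", "a"), 0, inf, ("end", "b"))])]
```

The enumeration is exponential in the number of quantifiers and is only meant to mirror the definitions, not to suggest a practical decision procedure.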

Timeline-based planning with uncertainty
In this section, we extend the above definitions with the notion of temporal uncertainty, defining timeline-based planning problems with uncertainty. We basically follow the way in which they are presented in Cialdea Mayer et al. [19]. The ability to deal with this form of uncertainty, integrating the planning and execution phases, is one of the key features of timeline-based planning systems, which usually employ the concept of flexible timeline and, consequently, of flexible plan.
A flexible timeline can be viewed as a set of timelines that differ only in the precise timings of the start and the end of the tokens therein, embodying some temporal uncertainty about the events described by the timeline.
Hence, flexible timelines provide an uncertainty range for the end time and the duration of each flexible token of the timeline. Note that each flexible token reports a range for its end time, rather than its start time, because in this way it can explicitly constrain its horizon. Tokens and timelines as specified in Definition 2 are also called scheduled tokens and scheduled timelines. Similarly to the notation used for scheduled timelines, the set of all the possible flexible timelines for the set of state variables $SV$ is denoted as $F_{SV}$. Let $x = (V_x, T_x, D_x, \gamma_x)$ be a state variable. We now focus on the controllability tag $\gamma_x$, which has been ignored in Section 2.1. The controllability tag tells, for each value of the domain of each variable, whether the duration of tokens that hold the given value is under the control of the planner or not.
Given a flexible timeline $\tau = \langle \tau_1, \ldots, \tau_k \rangle$, with […]
Figure 2 shows a flexible timeline for the example state variable $x_p$ of Fig. 1, and one of its instances. We are now ready to define the concept of flexible plan.
Definition 11 (Flexible plan). Given a set of state variables $SV$, a flexible plan over $SV$ is a pair $\Pi = (\pi, R)$, where $\pi : SV \to F_{SV}$ is a function providing a flexible timeline $\pi(x)$ for each state variable $x$, and $R$ is a set of atoms (Definition 4) using as token names the set of tokens of the timelines in $\pi$.
Intuitively, the flexible plan $\Pi = (\pi, R)$ represents a set of instances of the flexible timelines of $\pi$ which, additionally, satisfy the constraints imposed by the atoms included in $R$.
Definition 12 (Instances of flexible plans). Let $\Pi = (\pi, R)$ be a flexible plan over $SV$. A plan $\pi'$ is an instance of $\Pi$ if $\pi'(x)$ is an instance of $\pi(x)$ for all $x \in SV$, and all the atoms in $R$ are satisfied by the atomic evaluation $\lambda$ such that $\lambda(\tau) = (\text{start-time}(\tau), \text{end-time}(\tau))$ for all tokens $\tau$ of $\pi'(x)$, for any $x \in SV$.
To understand the role of the $R$ component in Definition 11, consider the example given in Fig. 3, which shows flexible timelines $\tau_x = \langle \tau^x_0, \tau^x_1, \tau^x_2 \rangle$ and $\tau_y = \langle \tau^y_0, \tau^y_1, \tau^y_2 \rangle$ for two variables $x$ and $y$, that have to be constrained by the shown synchronisation rule. The lower part of the picture shows some example instances of the flexible timelines. Given how the token $\tau^x_1$ is instantiated, not all the possible instances of the timeline for $y$ are valid according to the considered rule. The first example instantiation violates the rule, while the second one satisfies it. This happens because a simple set of flexible timelines misses the key information that $\tau^y_1$ cannot start before the end of $\tau^x_1$. A flexible plan satisfying such a rule would then have to provide additional constraints ensuring this fact, such as $R = \{\mathrm{end}(\tau^x_1) = \mathrm{start}(\tau^y_1)\}$ or $R = \{\mathrm{end}(\tau^x_1) \leq_{[5,10]} \mathrm{start}(\tau^y_1)\}$.
We are now ready to introduce the timeline-based planning problem with uncertainty, as an extension of the timeline-based planning problem of Definition 9. We first provide the definition of the problem and of the flexible solution plan, and then discuss in detail their meaning and structure.
Definition 13 (Timeline-based planning problem with uncertainty). A timeline-based planning problem with uncertainty is defined as a tuple $P = (SV_C, SV_E, S, O)$, where:
1. $SV_C$ and $SV_E$ are the sets of, respectively, the controlled and the external variables;
2. $S$ is a set of synchronisation rules over $SV_C \cup SV_E$;
3. $O = (\pi_E, R_E)$ is a flexible plan, called the observation, specifying the behaviour of external variables.
Definition 14 (Flexible solution plan). Let $P = (SV_C, SV_E, S, O)$, with $O = (\pi_E, R_E)$, be a timeline-based planning problem with uncertainty. A flexible solution plan for $P$ is a flexible plan $\Pi = (\pi, R)$ over $SV_C \cup SV_E$ such that:
1. $\Pi$ agrees with $O$, i.e., $\pi(x) = \pi_E(x)$ for each $x \in SV_E$, and $R_E \subseteq R$;
2. the plan does not restrict the duration of uncontrollable tokens, i.e., for any state variable $x$ and any flexible token of $\pi(x)$ holding an uncontrollable value $v$, the duration range of the token coincides with that specified by $D_x(v)$;
3. any instance of $\Pi$ is a solution plan for the timeline-based planning problem $P' = (SV_C \cup SV_E, S)$, and there is at least one such instance.
The definitions above are worth a detailed explanation. The timeline-based planning problem with uncertainty considers two different sources of uncertainty: the behaviour of external variables, and the duration of uncontrollable tokens. In contrast to the simple problem without uncertainty (see Definition 9), the set of state variables is split into the controlled variables $SV_C$ and the external variables $SV_E$. The behaviour of external variables cannot be constrained by the planner in any way, hence any solution plan is constrained to replicate the flexible timelines given by the observation $O$, which is a flexible plan describing their behaviour. Since $O$ is a flexible plan, there is temporal uncertainty on the start and end times of the involved tokens, but the behaviour of the variables is otherwise known beforehand to the planner. Despite the name, borrowed from Cialdea Mayer et al. [19], the observation $O$ is more an a priori description of how the external variables will behave during the execution of the plan, up to the given temporal uncertainty on the precise timing of the events. The intended role of the external variables, then, is not so much that of independent components interacting with the planned system, but rather, of external entities useful to represent given facts and invariants that the planner has to account for during the search for a solution.
As an example, consider a satellite seeking the right time to transmit data to Earth. When modelling this scenario as a timeline-based planning problem with uncertainty, the window of visibility of Earth's ground stations can be represented as an external variable with a suitable observation. Note that the exact timing of when each station will effectively be available can be uncertain, but the visibility time slots are usually scheduled months in advance, and the planner does not have to account for any variability in that regard. Hence, specifying the expected behaviour of the environment as a flexible plan is usually enough when external variables are used in this way, but not when a more general specification of the environment behaviour is needed, as we will see in the next section.
The second considered source of temporal uncertainty comes from tokens holding uncontrollable values. The duration of such tokens cannot be decided by the planner, and thus their minimum and maximum duration in the flexibility range of the timeline has to coincide with that specified by the duration function of the variable. The planner can, however, decide which tokens to start and when, on controlled variables, even if $\gamma_x(v) = u$ (the uncontrollability is specifically limited to the duration of the token). It is worth noting how the formalism has intentionally been tailored to consider only temporal uncertainty, both with regard to external variables and to uncontrollable tokens.

Controllability of flexible plans
As already pointed out, the timeline-based approach to planning is specifically targeted at the integration between the planning phase and the execution of the plan. Hence, it is important to ensure that, once a flexible plan is found for a timeline-based planning problem with uncertainty, the plan can be effectively executed. This is not a trivial requirement given the presence of uncontrollable tokens, whose duration is decided during execution and is unknown beforehand. Definition 14, indeed, ensures that any scheduled instance of the plan is a solution for the problem, but it does not guarantee that (1) such an instance exists for any possible choice of the duration of uncontrollable tokens, and (2) at any time during the execution, the correct choice to keep following an instance of the plan depends only on events that have already happened and information already known.
For this reason, we must take into consideration the controllability of flexible plans, i.e., the property of being effectively executable by a controller. There are three major kinds of controllability that one may want to ensure on a flexible plan, depending on the application, which can be intuitively defined as follows.
Weak controllability. For any possible choice of the duration of uncontrollable tokens, there is an instance of the flexible plan respecting that choice.
Strong controllability. There is a way of instantiating controllable tokens that results in a valid instance of the flexible plan, whatever the duration of uncontrollable ones.
Dynamic controllability. A strategy exists to choose how to instantiate each token, which, at any given point in time, can keep the execution in a valid instance of the plan, based only on past events.
Such concepts have been formalised by Cialdea Mayer et al. [19] for flexible plans, but the ideas and terminology date back to the work on simple temporal networks with uncertainty (STNU) [50], which faces very similar problems. In this paper, we are mostly concerned with dynamic controllability, as it represents the most reactive scenario, where the controller can react in real-time to what happens around it in order to achieve its goals or guarantee its safety requirements.
In the following, we briefly recall how dynamically controllable flexible plans can be formally defined. Recall that the set of tokens of all the timelines of a plan is denoted by tokens($\pi$). We extend this notation by distinguishing among the set of tokens of all the timelines of a flexible plan $\Pi$, denoted by tokens($\Pi$), the set of uncontrollable tokens of $\Pi$, denoted by tokens$_U$($\Pi$), and the set of controllable ones, denoted by tokens$_C$($\Pi$).
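The difference between weak and strong controllability is essentially a swap of quantifiers, which a toy example can make concrete. The sketch below is a drastic simplification and not the formal definition: it assumes a single uncontrollable duration `u`, a single controllable duration `c`, and a hypothetical predicate `valid(u, c)` standing for "the resulting schedule is an instance of the flexible plan".

```python
def weakly_controllable(unc, ctl, valid):
    """For every environment choice u there is some controller choice c."""
    return all(any(valid(u, c) for c in ctl) for u in unc)


def strongly_controllable(unc, ctl, valid):
    """A single controller choice c works for every environment choice u."""
    return any(all(valid(u, c) for u in unc) for c in ctl)


# Hypothetical ranges: the uncontrollable token lasts 3 to 5 time units,
# the controllable one may last 1 to 10 time units.
unc = [3, 4, 5]
ctl = range(1, 11)

# The two tokens together must last exactly 10 time units.
exact = lambda u, c: u + c == 10
# The two tokens together must last at most 10 time units.
at_most = lambda u, c: u + c <= 10

weak_exact = weakly_controllable(unc, ctl, exact)
strong_exact = strongly_controllable(unc, ctl, exact)
strong_at_most = strongly_controllable(unc, ctl, at_most)
```

With the exact-sum constraint the controller must know `u` before choosing `c`, so only the weak property holds; with the upper-bound constraint a fixed small `c` works for every `u`. Dynamic controllability sits in between, since the controller may use only the part of `u` already observed.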

Definition 15 (Situations and relevant situations). Let $P = (SV_C, SV_E, S, O)$, with $O = (\pi_E, R_E)$, be a timeline-based planning problem with uncertainty, and let $\Pi = (\pi, R)$ be a flexible plan. A situation for $\Pi$ is a map $\omega : \text{tokens}_U(\Pi) \to \mathbb{N}$ assigning a duration to each uncontrollable token of $\Pi$. A situation $\omega$ is said to be relevant if any instance of $\Pi$ in $\omega(\Pi)$ satisfies the constraints of $R_E$.
A situation represents the choices of the environment for the duration of uncontrollable tokens, both of controlled and external variables. Given a flexible plan $\Pi = (\pi, R)$, we denote by $\omega(\Pi)$ the set of instances of $\Pi$ where the duration of uncontrollable tokens is the one dictated by $\omega$. Relevant situations are those where the external variables actually follow the behaviour described by the observation $O$. Since the controller is allowed to assume that this happens, only relevant situations are considered.
Let us denote by Ω Π the set of relevant situations for Π. If situations represent the decisions of the environment about the duration of uncontrollable tokens, then scheduling functions define the controller's counterpart, deciding how to execute the whole plan.
Definition 16 (Scheduling function). Given a timeline-based planning problem P = (SV C , SV E , S, O) and a flexible plan Π = (π, R) for P , a scheduling function for Π is a map θ : tokens(Π) → N, that assigns an end time to each token in Π, such that the resulting scheduled plan θ(Π) is an instance of Π.
We are now ready to formally define the different concepts of controllability introduced above. We start from the two simplest ones.
Definition 17 (Weak and strong controllability). Let $\Pi$ be a flexible plan. Then, we say that $\Pi$ is:
1. weakly controllable if there exists an execution strategy $\varsigma$ for $\Pi$;
2. strongly controllable if there is an execution strategy $\varsigma$ for $\Pi$ such that $\varsigma(\omega) = \varsigma(\omega')$ for all $\omega, \omega' \in \Omega_\Pi$.
Given a scheduling function $\theta$, let $\theta_{<t}$ be a function mapping any token $\tau$ such that $\theta(\tau) < t$ to its duration. Such a function can be viewed as a description of the evolution of the system up to time $t$, ignoring any token that does not end before it. By exploiting it, we can define dynamic execution strategies, and dynamically controllable flexible plans, i.e., plans that admit such strategies.
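The restriction of a scheduling function to the past can be sketched as follows. The representation is an assumption made for illustration: tokens are identified by hypothetical string names, `theta` maps each name to its end time, `timelines` lists the names of each variable in order, and durations are recovered as differences between consecutive end times, with the first token starting at 0.

```python
def history_before(theta, timelines, t):
    """Durations of the tokens that end strictly before time t."""
    history = {}
    for tokens in timelines.values():
        previous_end = 0
        for token in tokens:
            end = theta[token]
            if end < t:
                history[token] = end - previous_end
            previous_end = end
    return history


# Hypothetical scheduled plan with a single variable x and three tokens.
timelines = {"x": ["x0", "x1", "x2"]}
theta = {"x0": 5, "x1": 12, "x2": 20}

seen_at_12 = history_before(theta, timelines, 12)
seen_at_13 = history_before(theta, timelines, 13)
```

A token ending exactly at time `t` is not yet part of the history, which reflects the strict inequality in the description of the restricted function.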

Limitations of the current approach
In this section, we point out some limitations of the current approach to uncertainty in timeline-based planning, based on flexible plans, that we described in the previous section. The whole discussion revolves around the notion of nondeterminism. The design of most timeline-based planning systems, and, in particular, of the formal framework by Cialdea Mayer et al. [19], has been intentionally tailored to the handling of temporal uncertainty, i.e., uncertainty about when things will happen, disregarding general forms of nondeterminism, i.e., uncertainty about what will happen. According to the definitions given in Section 2.2, indeed, flexible plans are intrinsically sequential objects, that cannot represent any choice about how the execution of the plan can proceed if not regarding the timing of events (once more, this has been an intentional design choice of these systems).
In the meantime, the action-based planning community studied how to handle general nondeterminism quite extensively in the past years, following different approaches such as, for instance, reactive planning systems [4], deductive planning [45], model checking [20], and, especially, fully observable nondeterministic planning (FOND planning) [7,37,38,42]. However, these approaches to nondeterministic action-based planning do not support flexible plans and temporal uncertainty, and do not account for controllability issues. Recently, SMT-based techniques have been exploited to deal with uncontrollable durations in strong temporal planning [22], but dynamic controllability issues are not addressed.
It seems therefore that the two worlds have evolved in different and incomparable ways. On the one hand, timeline-based planning supports temporal uncertainty, but it does not consider general nondeterminism; on the other hand, action-based planning deals with general nondeterminism, but it does not explicitly support temporal uncertainty.
As a matter of fact, it is worth observing that the explicit focus of timeline-based planning on temporal uncertainty does not mean that handling general nondeterminism is not needed in the common application scenarios of these systems. However, the history of timeline-based planning, with its roots in scheduling and control theory, naturally led over time to this formulation. As explained in Section 2.2, the external variables in timeline-based planning problems with uncertainty are used to express known facts about what will happen, rather than components of a fully-fledged external entity running alongside the planned system. To this end, planning problems include a flexible plan, the observation, describing the behaviour of external variables up to the given temporal flexibility. The definition of the various forms of controllability then assumes that the behaviour of the environment follows what is stated by the observation. This is perfectly fine in some scenarios, such as the satellite control example. In others, however, the approach can be limiting. As an example, in the collaborative robotics domains for which the PLATINUm planning system was designed [47], the controlled system has to cooperate with human agents, and thus truly reactive behaviour is required and strong assumptions about the environment's choices are not possible. To cope with application domains of this nature, many timeline-based systems employ a feedback loop between the planning and execution phases, which includes a failure manager that senses when the execution is deviating from the assumed observation, and triggers a re-planning phase if necessary, devising a new flexible plan and a dynamic execution strategy that can be used to resume execution. Unfortunately, the re-planning phase can be expensive to perform on-the-fly, limiting the real-time reactivity of the system.
Even ignoring the above issue, the relationships between temporal uncertainty, nondeterminism, and timeline-based planning languages turn out to be more complex than anticipated. As a matter of fact, even explicitly focusing on temporal uncertainty, timeline-based planning languages are still able to express scenarios where handling nondeterminism in a more general way is required. Consider, for instance, a timeline-based planning problem with uncertainty P = (SV C , SV E , S, O), with a single controlled state variable x ∈ SV C , with V x = {v 1 , v 2 , v 3 }, SV E = ∅, and S consisting of the following rules: [1,10] for all v i ∈ V x , and that tokens where x = v 1 are uncontrollable, i.e., The rules require the controller to start the execution with a token where x = v 1 , followed by a token where either x = v 2 or x = v 3 depending on the duration of the first token. This scenario is, intuitively, trivial to control. The system must execute x = v 1 as a first token due to the second rule. Then, the environment controls its duration, and the system simply has to wait for the token to end, and then execute either x = v 2 or x = v 3 depending on how long the first token lasted. However, there are no flexible plans that represent this simple strategy, since each given plan must fix the value of every token in advance.
To guarantee the satisfaction of the rules, the value to assign to x on the second token must be chosen during the execution, but this is not possible because of the sequential nature of flexible plans. In this case, therefore, the problem would be considered as unsolvable, even if the goals stated by the rules seem simple to achieve.
The above simple scenario shows that the inherently sequential nature of flexible plans does not allow one to express the need for a choice to be made during execution other than regarding the timings of events. However, the syntax of the language supports the modelling of scenarios where making qualitative choices depending on the environment's nondeterministic behaviour is needed. Note that this situation is different from that of deterministic action-based languages such as PDDL. In these languages, nondeterminism is not supported and simply cannot enter the picture. To allow one to model nondeterministic behaviours, PDDL has to be extended with syntactic elements useful for the purpose, like, e.g., the anyof keyword for nondeterministic effects. In this case, instead, the basic syntax of the language is sufficient to express such scenarios, but their possible solutions cannot be represented. We may say that dynamically controllable flexible plans do not provide a complete semantics for timeline-based planning with uncertainty. One may suppose that this expressive power comes from disjunctions in synchronisation rules, which come into play in the above example, but results such as the encoding of action-based temporal planning given by Gigante [30] show that their presence is essential even to express simple deterministic scenarios, and thus the gap cannot be filled by removing them.
It can be easily seen that scenarios like the above one would immediately arise when trying to encode any kind of nondeterministic action-based problem such as fully observable nondeterministic (FOND) planning problems. Hence, it is impossible to extend the aforementioned encoding of action-based temporal planning to nondeterministic planning. The notable observation, however, is that a syntactic representation of a FOND planning problem as a timeline-based planning problem would be perfectly feasible, similarly to the encoding for classical planning given in [30], but such an encoding would lack a proper semantics, corresponding to FOND policies, to express the solutions to the problem.
In this paper, we propose and systematically study an extension to timeline-based planning problems with uncertainty, called timeline-based games, which addresses both the issues outlined above by treating temporal uncertainty and general nondeterminism in a uniform way.
Timeline-based games are two-player turn-based perfect-information games where the players play by executing the start and end endpoints of tokens, building a set of timelines. The first player, representing the controller, wins the game if it can build a solution plan for a given timeline-based planning problem, independently from the behaviour of the second player, which represents the environment.
In the next section, we first define the structure of timeline-based games, and then we show that they can capture the semantics of timeline-based planning problems with uncertainty, in the sense that for any such problem there is a game where the controller has a winning strategy if and only if the problem admits a dynamically controllable flexible plan. Moreover, we demonstrate that they strictly subsume the approach based on flexible plans, by showing how the problematic example given above can be modelled by means of a timeline-based game that admits a winning strategy for the controller. Finally, we address the problem of finding a winning strategy for such games, showing that the problem of deciding whether the controller has a winning strategy for a given timeline-based planning game is 2EXPTIME-complete (Section 5). The decision procedure heavily exploits the machinery of matching records defined by Gigante [30].

Timeline-based games
This section introduces timeline-based games, our game-theoretic approach to the handling of uncertainty in timeline-based planning. We first describe their general structure, including the winning condition, and then go into detail on how they relate to dynamically controllable flexible plans and the issues brought up in the previous section.
Intuitively, a timeline-based game is a turn-based, two-player game played by the controller, Charlie, and the environment, Eve. By playing the game, the players progressively build the timelines of a scheduled plan (see Definition 3). At each round, each player makes a move deciding which tokens to start and/or to end and at which time. Both players are constrained by a set D of domain rules, which describe the basic rules governing the world. Domain rules replace the observation carried by timeline-based planning problems with uncertainty (Definition 13), and generalise it by allowing one to freely model the interaction between the system and the environment. Note that domain rules are not intended to be Eve's (nor Charlie's) goals, but, rather, a set of background facts about how the world works that can be assumed to hold at any time. Since no player can violate D, the strategy of each player may safely assume the validity of such rules. In addition, Charlie is responsible for satisfying a set S of system rules, which describe the rules governing the controlled system, including its goals. Charlie wins if, assuming Eve behaves according to the domain rules, he manages to construct a plan satisfying the system rules. In contrast, Eve wins if, while satisfying the domain rules, she prevents Charlie from winning, either by forcing him to violate some system rule, or by indefinitely postponing the fulfilment of his goals.

Partial plans
Players play the game by building a set of timelines, that is, a plan, in turns. Hence, we need a way to describe the partial results of this turn-based plan-building activity. We call them partial plans, that is, incomplete plans under construction.
We start with the concept of event sequence, a different representation of a plan, easier to manipulate from a formal standpoint. Representing plans as event sequences is at the core of recent complexity results about timeline-based planning problems [30,32]. In particular, we follow here the exposition given by Gigante [30].
In event sequences, instead of focusing on the single timelines as building blocks, plans are flattened over a single sequence of events that mark the start/end of tokens.
Definition 19 (Event sequence). Let SV be a set of state variables, and let A SV be the set of all the terms, called actions, of the form start(x, v) or end(x, v), where x ∈ SV and v ∈ V x .
An event sequence over SV is a sequence µ = µ 1 , . . . , µ n of pairs µ i = (A i , δ i ), called events, where A i ⊆ A SV is a non-empty set of actions, and δ i ∈ N + , such that, for any x ∈ SV:
1. for all 1 ≤ i ≤ n, if start(x, v) ∈ A i for some v ∈ V x , then there are no start(x, v′) in any µ j before the closest µ k with k > i, if any exists at all, such that end(x, v) ∈ A k ;
2. for all 1 ≤ i ≤ n, if end(x, v) ∈ A i for some v ∈ V x , then there are no end(x, v′) in any µ j after the closest µ k with k < i, if any exists at all, such that start(x, v) ∈ A k .
Intuitively, an event µ i = (A i , δ i ) consists of a set A i of actions describing the start or the end of some tokens, happening δ i time steps after the previous one. Event sequences collect events to describe a whole plan.
By Definition 19, a started token is not required to end before the end of the sequence, and a token can end without the corresponding starting action ever having appeared before. In this case, the event sequence is said to be open for the variable x whose start/end event is missing. In event sequences where this does not happen, called closed event sequences, both the endpoints of all tokens are specified.
Definition 20 (Open and closed event sequences). An event sequence µ = µ 1 , . . . , µ n is closed on the right (left) for a variable x if, for each 1 ≤ i ≤ n, whenever start(x, v) ∈ A i (end(x, v) ∈ A i ), there is a j > i (j < i) such that end(x, v) ∈ A j (start(x, v) ∈ A j ). Otherwise, µ is open on the right (left) for x. An event sequence is simply open or closed (to the right or to the left) if it is respectively open or closed (to the right or to the left) for any variable x.
Note that the empty event sequence is closed on both sides for any variable. Moreover, on closed event sequences, the first event only contains start(x, v) actions and the last event only contains end(x, v) actions, one for each variable x.
Given an event sequence µ = µ 1 , . . . , µ n over a set of state variables SV, where µ i = (A i , δ i ), we define δ(µ) = Σ_{1<i≤n} δ i , that is, δ(µ) is the time passed between the start and the end of the sequence (its duration). The amount of time spanning a subsequence, written as δ i,j when µ is clear from context, is then δ(µ [i...j] ) = Σ_{i<k≤j} δ k .
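As an illustration of the notions just introduced, the following Python sketch represents an event sequence as a list of pairs (A i , δ i ) and computes its duration, the time spanned by a subsequence, and whether it is closed on either side for a given variable. The encoding of actions as triples and all function names are assumptions of this sketch, and the closedness checks follow the informal description above (every started token is eventually ended, and every ended token was previously started).

```python
from typing import FrozenSet, List, Tuple

Action = Tuple[str, str, str]          # (kind, variable, value); kind is "start" or "end"
Event = Tuple[FrozenSet[Action], int]  # (A_i, delta_i)


def duration(mu: List[Event]) -> int:
    """delta(mu): sum of delta_i for 1 < i <= n (delta_1 is ignored)."""
    return sum(delta for _, delta in mu[1:])


def span(mu: List[Event], i: int, j: int) -> int:
    """delta_{i,j}: sum of delta_k for i < k <= j, with 1-based indices."""
    return sum(delta for _, delta in mu[i:j])


def closed_right(mu: List[Event], x: str) -> bool:
    """Every start(x, v) is matched by an end(x, v) in a later event."""
    return all(
        any(("end", x, v) in later for later, _ in mu[i + 1:])
        for i, (actions, _) in enumerate(mu)
        for kind, var, v in actions
        if kind == "start" and var == x
    )


def closed_left(mu: List[Event], x: str) -> bool:
    """Every end(x, v) is matched by a start(x, v) in an earlier event."""
    return all(
        any(("start", x, v) in earlier for earlier, _ in mu[:i])
        for i, (actions, _) in enumerate(mu)
        for kind, var, v in actions
        if kind == "end" and var == x
    )


# A closed sequence with two consecutive tokens for x, lasting 4 and 3 time steps.
mu = [
    (frozenset({("start", "x", "v1")}), 1),
    (frozenset({("end", "x", "v1"), ("start", "x", "v2")}), 4),
    (frozenset({("end", "x", "v2")}), 3),
]
prefix = mu[:2]  # the second token has started but not yet ended
```

Here prefix is closed on the left but open on the right for x, which is the shape that plans under construction take during a game.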
As a consequence of their definition, closed sequences can be directly mapped to plans.
In our context, we can assume w.l.o.g. that the rules in S and D do not use pointwise atoms [30]. In this way, we can forget about any absolute time reference and reason only in terms of distance between events. In mapping an event sequence to the plan it represents, the value of δ 1 of the first event µ 1 = (A 1 , δ 1 ) is ignored, since it would represent the time passed after a non-existent previous event. By fixing an arbitrary value for δ 1 , the converse mapping from plans to event sequences can also be defined. Hence, given a plan π, we denote by µ_π the event sequence such that π_{µ_π} = π. By admitting open event sequences, we can represent plans that are under construction, which was our original need.
Definition 22 (Partial plan). Let SV be a set of state variables. A partial plan over SV is an event sequence µ over SV, closed on the left.
Partial plans can be either open or closed on the right depending on the particular moment of the game, but they are always closed on the left. Since there is no ambiguity, we will simply say open or closed to mean open or closed on the right.

The game arena
Let us start by defining the key notion of timeline-based games.
Definition 23 (Timeline-based game). A timeline-based game is a tuple G = (SV C , SV E , S, D), where SV C and SV E are the sets of, respectively, the controlled and the external variables, and S and D are two sets of synchronisation rules, respectively called system and domain rules, involving variables from SV C and SV E .
A partial plan for G is a partial plan over the state variables SV C ∪ SV E . Let Π G be the set of all possible partial plans for G, or simply Π when there is no ambiguity. It is worth stressing again that the plan being built by the players, represented by the partial plan, is a scheduled plan, not a flexible one. The uncertainty is moved to the ignorance about what the next moves of Eve will be at each step. Recall that δ(µ) denotes the duration of µ, that is, the distance in time between the last and the first events of the sequence, hence in our settings it can be interpreted as the time elapsed from the start of the game.
Since ε is a closed event sequence and δ(ε) = 0, the empty partial plan ε is a good starting point for the game. Players incrementally build a partial plan, starting from ε, by playing actions that specify which tokens to start and/or end, producing an event that extends the event sequence, or complementing the already existing last event of the sequence. Recall from Definition 19 that actions are terms of the form start(x, v) or end(x, v), where x ∈ SV and v ∈ V x , and that the set of possible actions over SV is denoted as A SV , here just A for simplicity. Actions of the former kind are called starting actions, and those of the latter kind are called ending actions. Then, we partition all the available actions into those that are playable by either of the two players.
Definition 24 (Partition of player actions). The set A of available actions over the set of state variables SV = SV C ∪ SV E is partitioned into the set A C of Charlie's actions, and the set A E of Eve's actions, defined as:
A C = {start(x, v) | x ∈ SV C , v ∈ V x } ∪ {end(x, v) | x ∈ SV, v ∈ V x , γ x (v) = c}
A E = {start(x, v) | x ∈ SV E , v ∈ V x } ∪ {end(x, v) | x ∈ SV, v ∈ V x , γ x (v) = u}
Hence, players can start tokens for the variables that they own, and end the tokens that hold values that they control. It is worth noting that, in contrast to the original definition of timeline-based planning problems with uncertainty (Definition 13), Definition 24 admits cases where x ∈ SV E and γ x (v) = c for some v ∈ V x , that is, cases where Charlie may control the duration of a variable that belongs to Eve. This situation is symmetrical to the more common one where Eve controls the duration of a variable that belongs to Charlie (i.e., uncontrollable tokens), and we have no need to impose any asymmetry.
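The partition can be sketched in a few lines of Python. The sketch assumes, following the prose reading of Definition 24, that starting actions go to the owner of the variable and ending actions go to the player controlling the value. The representation of γ as a dictionary with values "c" and "u", and the function name, are hypothetical.

```python
from itertools import chain


def partition_actions(sv_c, sv_e, values, gamma):
    """Split all actions into Charlie's set A_C and Eve's set A_E.

    values[x] lists the domain V_x of variable x; gamma[(x, v)] is "c" if the
    end of tokens with x = v is controllable and "u" otherwise (assumed encoding).
    """
    a_c, a_e = set(), set()
    for x in chain(sv_c, sv_e):
        for v in values[x]:
            # Starting actions belong to the owner of the variable.
            (a_c if x in sv_c else a_e).add(("start", x, v))
            # Ending actions belong to whoever controls the value's duration.
            (a_c if gamma[(x, v)] == "c" else a_e).add(("end", x, v))
    return a_c, a_e


# Charlie owns x, Eve owns y; Eve decides when x = v1 ends,
# while Charlie decides when Eve's token y = w ends.
a_c, a_e = partition_actions(
    sv_c={"x"},
    sv_e={"y"},
    values={"x": ["v1", "v2"], "y": ["w"]},
    gamma={("x", "v1"): "u", ("x", "v2"): "c", ("y", "w"): "c"},
)
```

The example includes both the usual case of an uncontrollable token of a controlled variable and the symmetric case admitted by the definition.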
Actions are combined into moves that can start/end multiple tokens at once.

Definition 25 (Moves).
A move m C for Charlie is a term of the form wait(δ C ) or play(A C ), where δ C ∈ N and A C is a non-empty set of Charlie's actions consisting either only of starting actions or only of ending actions. A move m E for Eve is a term of the form play(A E ) or play(δ E , A E ), where δ E ∈ N and A E is a possibly empty set of Eve's actions consisting either only of starting actions or only of ending actions.
Two different aspects of the mechanics of the game influence the above definitions. First, moves such as play(A C ) and play(δ E , A E ) can play either start(x, v) actions only or end(x, v) actions only. A move of the former kind is called a starting move, while a move of the latter kind is called an ending move. Note that empty moves play(δ E , ∅) can be considered either starting or ending moves. Moreover, we consider wait moves as ending moves. In some sense, starting and ending moves have to be alternated during the game.
Second, the two players can play the two different sets of moves defined above, hence we denote as M C the set of moves playable by Charlie, and as M E the set of moves playable by Eve. Charlie can choose to play some actions to start/end a set of tokens, by playing a play(A C ) move, or to do nothing and wait a certain amount of time by playing a wait(δ C ) move. Charlie plays first at each round, as will be formally stated later, and Eve can reply to Charlie's move by playing a play(A E ) move in response to a play(A C ) move by Charlie, and a play(δ E , A E ) move in response to a wait(δ C ) move by Charlie. If Charlie plays a play(A C ) move, the given actions are applied immediately, in a sense made precise later, and Eve replies by specifying what happens to her variables at the same time point. Instead, if Charlie plays a wait(δ C ) move to wait some amount of time δ C , there is no reason why Eve should be forced to wait the same amount of time without doing anything, so she can play a play(δ E , A E ) move, specifying an amount of time δ E ≤ δ C , so that actions in A E will be applied accordingly, interrupting the wait of Charlie, who can then reply to Eve's actions in a timely manner. This is formalised by the following notion of round.

Definition 26 (Round).
A round ρ is a pair (m C , m E ) ∈ M C × M E of moves such that:
1. m C and m E are either both starting or both ending moves;
2. either ρ = (play(A C ), play(A E )), or ρ = (wait(δ C ), play(δ E , A E )), with δ E ≤ δ C .
A starting (ending) round is one made of starting (ending) moves. Note that since Charlie cannot play empty moves and wait moves are considered ending moves, each round is unambiguously either a starting or an ending round. We can now define how a round is applied to the current partial plan to obtain the new one.
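The two conditions of Definition 26 can be checked mechanically. The Python sketch below is one possible encoding, in which the class names Wait, Play and TimedPlay and the helper names are hypothetical. Empty action sets are treated as compatible with both kinds of round, extending to play(∅) what the text says about play(δ E , ∅); this extension is an assumption of the sketch.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple

Action = Tuple[str, str, str]  # (kind, variable, value); kind is "start" or "end"


@dataclass(frozen=True)
class Wait:            # wait(delta_C), Charlie only
    delta: int


@dataclass(frozen=True)
class Play:            # play(A), either player
    actions: FrozenSet[Action]


@dataclass(frozen=True)
class TimedPlay:       # play(delta_E, A_E), Eve only
    delta: int
    actions: FrozenSet[Action]


def kinds(move) -> set:
    """Return the kinds ("starting", "ending") a move can be counted as."""
    if isinstance(move, Wait):
        return {"ending"}                 # wait moves are ending moves
    present = {kind for kind, _, _ in move.actions}
    if not present:
        return {"starting", "ending"}     # empty moves fit both kinds
    if present == {"start"}:
        return {"starting"}
    if present == {"end"}:
        return {"ending"}
    return set()                          # mixed actions: not a legal move


def is_round(m_c, m_e) -> bool:
    """Check the two conditions of Definition 26 on a pair of moves."""
    if not kinds(m_c) & kinds(m_e):       # condition 1: same kind
        return False
    if isinstance(m_c, Play) and isinstance(m_e, Play):
        return len(m_c.actions) > 0       # Charlie cannot play empty moves
    if isinstance(m_c, Wait) and isinstance(m_e, TimedPlay):
        return m_e.delta <= m_c.delta     # Eve may interrupt, not outlast
    return False


start_x = frozenset({("start", "x", "go")})
start_y = frozenset({("start", "y", "go")})
end_y = frozenset({("end", "y", "go")})
```

For instance, is_round(Wait(3), TimedPlay(2, end_y)) holds, since Eve interrupts Charlie's wait after two time steps, while is_round(Wait(3), TimedPlay(4, end_y)) does not.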
Definition 27 (Outcome of rounds). Let µ = µ 1 , . . . , µ n be a partial plan, with µ n = (A n , δ n ), let ρ = (m C , m E ) be a round, let δ E and δ C be the time increments of the moves, with δ C = δ E = 1 for play(A) moves, and let A E and A C be the set of actions of the two moves (A C is empty if m C is a wait move).
The outcome of ρ on µ is the event sequence ρ(µ) defined as follows:
1. if ρ is a starting round, then ρ(µ) = µ <n µ′ n , where µ′ n = (A n ∪ A C ∪ A E , δ n );
2. if ρ is an ending round, then ρ(µ) = µµ′, where µ′ = (A C ∪ A E , δ E ).
We say that ρ is applicable to µ if:
a) the above construction is well-defined, i.e., ρ(µ) is a valid event sequence by Definition 19;
b) ρ is an ending round if and only if µ is open for all variables.
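The construction of the outcome of a round can be sketched as follows in Python, with event sequences encoded as lists of pairs (A i , δ i ). The function name and the encoding are assumptions of this sketch, and applicability (Items a and b) is deliberately not checked. The definition does not spell out what a starting round does on the empty plan ε, which has no last event; the sketch assumes that it creates the first event with an arbitrary δ 1 = 1.

```python
def outcome(mu, kind, a_c, a_e, delta_e=1):
    """Return rho(mu) for a round of the given kind ("starting" or "ending").

    a_c and a_e are the action sets of Charlie's and Eve's moves (a_c is empty
    for a wait move); delta_e is Eve's time increment, 1 for play(A) moves.
    The input sequence is not modified.
    """
    played = frozenset(a_c) | frozenset(a_e)
    if kind == "starting":
        if not mu:
            # Assumption: on the empty plan the first event is created,
            # with an arbitrary delta_1 (ignored when mapping to a plan).
            return [(played, 1)]
        *prefix, (a_n, delta_n) = mu
        # The starting actions complete the last event; no time passes.
        return prefix + [(a_n | played, delta_n)]
    if kind == "ending":
        # A new event is appended delta_e time steps after the last one.
        return list(mu) + [(played, delta_e)]
    raise ValueError("kind must be 'starting' or 'ending'")


mu1 = outcome([], "starting", {("start", "x", "go")}, {("start", "y", "go")})
mu2 = outcome(mu1, "ending", {("end", "x", "go")}, {("end", "y", "go")})
mu3 = outcome(mu2, "starting", {("start", "x", "stop")}, {("start", "y", "go")})
# Charlie waits, and Eve interrupts the wait after two time steps.
mu4 = outcome(mu3, "ending", set(), {("end", "y", "go")}, delta_e=2)
```

After these four rounds the sequence has three events: the second one collects the ending and the starting actions contributed by two consecutive rounds at the same time point, and the third one lies two time steps later.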
We say that a single move by either player is applicable to µ if there is a move for the other player such that the resulting round is applicable to µ.
Together, Definitions 26 and 27 define the mechanics of the game, which can now be fully clarified. The game starts from the empty partial plan ε, and players play in turn, composing a round from the move of each one, which is applied to the current partial plan to obtain the new one. Let µ be the current partial plan. At each step of the game, both players can either stop the execution of a set of tokens, by playing an ending round, or start the execution of a set of others, by playing a starting round (Item 1 of Definition 26). This does not mean that at each time point in the constructed plan only one of the two things can happen, but that the ending and starting actions of each event are contributed separately in two phases. When a starting round is played, its actions are added to the last event of µ (since no time amount needs to be specified, starting rounds can only consist of play(A) moves). In contrast, when an ending round is played, the corresponding actions form an event that is appended to µ, so that δ(ρ(µ)) > δ(µ). Then, the next round, which must be a starting round by Item b) of Definition 27, can start the new tokens following the ones that were just closed. Note that Items a) and b) of Definition 27 together ensure a) that the played actions make sense with regard to the current partial plan being built (for instance, a token can be closed only if it was open, see Definition 19), and b) that time cannot stall, by forcing starting rounds to be immediately followed by ending ones.

The winning condition
It is now time to define the notion of strategy for each player, and of winning strategy for Charlie.

Definition 28 (Strategies).
A strategy for Charlie is a function σ C : Π → M C that maps any given partial plan µ to a move m C applicable to µ. A strategy for Eve is a function σ E : Π × M C → M E that maps a partial plan µ and a move m C ∈ M C applicable to µ to a move m E such that ρ = (m C , m E ) is applicable to µ.
A sequence ρ = ρ 0 , . . . , ρ n of rounds is called a play of the game. A play is said to be played according to some strategy σ C for Charlie, if, starting from the initial partial plan µ 0 = ε, it holds that ρ i = (σ C (Π i−1 ), m i E ), for some m i E , for all 0 < i ≤ n, and to be played according to some strategy σ E for Eve if ρ i = (m i C , σ E (Π i−1 , m i C )), for all 0 < i ≤ n. It can be seen that for any pair of strategies (σ C , σ E ) and any n ≥ 0, there is a unique play ρ n (σ C , σ E ) of length n played according to both σ C and σ E .
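The uniqueness of the play determined by a pair of strategies follows from the fact that both strategies are functions, as the following Python sketch illustrates. It is a schematic rendering only: the function play, the encoding of moves as tuples, and the toy strategies are hypothetical, and the way a round updates the partial plan is passed as a parameter rather than implemented according to Definition 27.

```python
def play(sigma_c, sigma_e, apply_round, n, initial=()):
    """Return the n rounds played according to both strategies, and the final plan."""
    mu = initial
    rounds = []
    for _ in range(n):
        m_c = sigma_c(mu)        # Charlie moves first, seeing only the plan so far
        m_e = sigma_e(mu, m_c)   # Eve replies, also seeing Charlie's move
        rounds.append((m_c, m_e))
        mu = apply_round(mu, (m_c, m_e))
    return rounds, mu


# Toy stand-ins: the "partial plan" is just the tuple of rounds played so far.
def apply_round(mu, rho):
    return mu + (rho,)


def sigma_c(mu):
    # Alternate starting rounds (play) and ending rounds (wait two steps).
    return ("play", "start") if len(mu) % 2 == 0 else ("wait", 2)


def sigma_e(mu, m_c):
    # Reply in kind; interrupt Charlie's waits after one time step.
    return ("play", "start") if m_c[0] == "play" else ("play", 1, "end")


rounds, final = play(sigma_c, sigma_e, apply_round, 4)
```

Running the function twice with the same strategies yields the same play, and the play of length 2 is a prefix of the play of length 4.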
Note that, according to our definition of strategy, Charlie can base his decisions only on the previous rounds of the game, not including Eve's move at the current round. However, Charlie can still react immediately, in some sense, to decide which token to start after an uncontrollable one closed by Eve, because of the alternation between starting and ending rounds. Hence Charlie can choose the starting actions of an event depending on the ending actions of that same event, but the contrary is not true: after Eve closes a token, Charlie has to wait at least one time step to react to that move with an ending action. This design choice is crucial to replicate and capture the semantics of dynamically controllable flexible plans, as will be detailed in Section 4.4.
As for the winning condition, we have to formalise the intuition given at the beginning of the section, regarding the role of domain rules and system rules. Charlie wins if, assuming domain rules are respected, he manages to satisfy the system rules no matter how Eve plays.
Let G = (SV C , SV E , S, D) be a timeline-based game. To evaluate the satisfaction of the two sets of rules over the current partial plan, we proceed as follows. First, we define from G two timeline-based planning problems (as per Definition 9), P D = (SV, D) and P S = (SV, S). Then, given a partial plan µ, we consider the scheduled plan π µ corresponding to the event sequence µ′ obtained by closing µ at time δ(µ), i.e., by completing the last event of µ in such a way as to close any open token. Then, we say that a partial plan µ, and the play ρ such that µ = ρ(ε), are admissible if π µ |= P D , i.e., if the partial plan satisfies the domain rules, and are successful if π µ |= P S , i.e., if the partial plan satisfies the system rules.
Definition 29 (Admissible strategy for Eve). A strategy σ E for Eve is admissible if for each strategy σ C for Charlie, there is k ≥ 0 such that the play ρ k (σ C , σ E ) is admissible.
Definition 30 (Winning strategy for Charlie). Let σ C be a strategy for Charlie. We say that σ C is a winning strategy for Charlie if for any admissible strategy σ E for Eve, there exists n ≥ 0 such that the play ρ n (σ C , σ E ) is successful.
We say that Charlie wins the game G if he has a winning strategy, while Eve wins the game if a winning strategy does not exist.
As an example, consider a timeline-based game G = (SV C , SV E , S, D) with two variables x ∈ SV C and y ∈ SV E , V x = V y = {go, stop}, unit duration, and the sets of rules defined as follows: Here, Charlie's ultimate goal is to realise x = stop, but this can only happen after Eve has realised y = stop. This is guaranteed to happen, since we consider only admissible strategies. Hence, the winning strategy for Charlie only chooses x = go until Eve chooses y = stop, and then wins by executing x = stop. If D were instead empty, a winning strategy would not exist, since a strategy that never chooses y = stop would be admissible. This would therefore be a case where Charlie loses because Eve can indefinitely postpone his victory. This is a simple example of a kind of problem and solution that is not approachable with flexible plans. The next section will formalise and prove the greater generality of this approach.
As pointed out in Section 3, the inherent sequentiality of flexible plans poses some limitations in interactive application scenarios, where a feedback loop involving a re-planning phase is often needed. In contrast, in timeline-based games, the interaction between the environment and the controlled system can be modelled in a rich and expressive way, by means of the domain rules. Note that domain rules can mention both controlled and external variables, allowing for the specification of complex dynamics and interactions. A winning strategy for such a game is then able to cope, in full generality, with any execution of such a specification, without the need for any sort of re-planning (hence without the need to implement the planner itself as part of the executive system). Of course, as in any model-based approach, the modelling task is crucial, as a badly modelled game would result in a strategy unable to really react to the environment when executed in the real world. However, this is rather a problem of knowledge engineering and domain modelling: as long as the world is correctly modelled, the game-theoretic approach avoids the need for a run-time feedback loop involving any kind of re-planning phase.
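The go/stop example can be explored with a small simulation. The Python sketch below is a strong abstraction made only for illustration: it ignores rounds, tokens and the syntax of rules, treats each time step as a pair of simultaneous choices, and encodes only the prose reading of the example (Charlie may choose x = stop only after Eve has realised y = stop). All names are hypothetical, and a bounded simulation of this kind illustrates the behaviour of a strategy without proving that it is winning.

```python
def charlie(history):
    """Choose x = stop only once Eve has already realised y = stop."""
    return "stop" if any(y == "stop" for _, y in history) else "go"


def eve_stopping_at(k):
    """An Eve strategy playing y = go for k steps and y = stop afterwards."""
    return lambda history: "stop" if len(history) >= k else "go"


def eve_never_stops(history):
    """An Eve strategy that would be admissible only if D were empty."""
    return "go"


def first_success(eve, max_steps):
    """Return the first step at which x = stop holds, or None within the bound."""
    history = []
    for step in range(max_steps):
        x = charlie(history)   # Charlie sees only the previous steps
        y = eve(history)
        history.append((x, y))
        if x == "stop":
            return step
    return None


soon = first_success(eve_stopping_at(0), 10)
later = first_success(eve_stopping_at(3), 10)
never = first_success(eve_never_stops, 10)
```

Against an Eve that eventually chooses y = stop, Charlie reaches x = stop one step later; against an Eve that never does so, which the domain rules of the example rule out, he never succeeds within the bound.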
Moreover, the high complexity of the strategy existence problem, proved in the next section, does not by itself undermine any of the above claims: the search for a winning strategy and the synthesis of a controller implementing such a strategy are done off-line, while during execution a blind execution of the strategy suffices. The costly re-planning phases, in contrast, take place during execution, impairing the applicability of the approach to real-time scenarios.

Timeline-based games and flexible plans
Let us compare now the concept of dynamic controllability of flexible plans, as defined in [19], with the existence of winning strategies for timeline-based planning games.
The first step is to back the claim of the greater generality of the latter with respect to the former. We prove that, given a flexible solution plan for a timeline-based planning problem with uncertainty, we can reduce the problem of the dynamic controllability of the plan to the existence of a winning strategy for a particular game. To this aim, we need a way to represent as a game any given planning problem with uncertainty together with its flexible plan. Intuitively, this can be done by encoding the observations O into suitable domain rules. The game associated with a problem therefore mimics the exact setting described by it. What follows shows how such a game is built and which relationship exists between its winning strategies and dynamically controllable flexible plans for the original problem.
Theorem 1 (Winning strategies vs. dynamic controllability). Let P be a timeline-based planning problem with uncertainty, and suppose that P admits a flexible solution plan Π. Then, a timeline-based game G P,Π can be built, in polynomial time, such that Π is dynamically controllable iff Charlie has a winning strategy for G P,Π .
Proof. Let P = (SV C , SV E , S, O) be a timeline-based planning problem with uncertainty and let Π = (π, R) be a flexible solution plan for P . We can build an equivalent timeline-based game G P,Π = (SV C , SV E , S, D), by keeping SV C and SV E unchanged, and suitably encoding the observation O and the flexible plan Π into, respectively, the set of domain rules D and of system rules S. In this way, Eve's behaviour will be constrained to follow what is dictated by the observation, replicating the semantics of timeline-based planning problems with uncertainty, and the behaviour of Charlie will follow by construction what is stated by the flexible plan.
To proceed, let SV E = {x 1 , . . . , x n }, O = (π E , R E ), and τ i = π E (x i ) = τ i 1 , . . . , τ i ki , for some k i and all Such a rule can be written as follows: Adding the above rule to the set D of domain rules ensures that any admissible play of the game follows the observation O. In a completely similar way, we can encode the flexible plan Π into a rule to add to the system rules S. Note that, by definition of flexible plan, following the plan satisfying R is sufficient to satisfy the set S of problem rules, which thus can be discarded and replaced by the single rule that encodes the plan. Now, we prove that Charlie has a winning strategy for G P,Π if and only if Π is dynamically controllable. (−→). Suppose that there exists a dynamic execution strategy ς for Π. We show how to obtain a winning strategy σ for G P,Π by combining the flexible plan with the dynamic execution strategy. The strategy σ is built as follows. Let µ be an event sequence. Situations describe the duration of all the uncontrollable tokens in the plan, and thus we cannot directly construct a situation from the current partial plan, since only the duration of tokens ended before δ(µ) is known. However, a relevant situation ω can be obtained by choosing the duration of missing tokens arbitrarily, as long as the result projects an instance of the observation O. The winning strategy we are looking for can assume that Charlie is playing against an admissible Eve strategy, and hence the existence of such a relevant situation is guaranteed. Then, the resulting scheduling function θ = ς(ω) can be used to decide the next move. Among all the flexible tokens in Π whose instance is currently open in Then, if t = δ(µ) + 1, the strategy plays the end of those tokens, i.e., σ(µ) = play(A C ), where A C = end(x 1 , v 1 ), . . . , end(x k , v k ). Otherwise, it is not yet time to end them, and thus σ(µ) = wait(t δ ), with t δ = t − δ(µ). 
Note that the arbitrary completion of the situation ω for future tokens was only a formal obligation: since ς is a dynamic execution strategy (Definition 18), the consequent choice only depended on the tokens ended before δ(µ), anyway.
(←−). We now show that if a winning strategy σ C for G P,Π exists, then there exists a dynamic execution strategy ς for Π, defined as follows. Let ω be a relevant situation. An admissible strategy σ E for Eve is induced by ω as follows. For variables x ∈ SV E , the strategy σ E starts tokens in the order specified by O, and ends them with the timings specified by ω. Since ω is relevant, we are sure that a valid instance of O is obtained, hence σ E is admissible, as it satisfies the domain rule in D that encodes O. Then, for variables y ∈ SV C , if Charlie builds an instance of Π, then σ E ends the uncontrollable tokens of the plan according to ω, behaving arbitrarily otherwise. Note that D does not involve any variable in SV C , and thus the behaviour of the strategy on those variables does not affect its admissibility. Now, since σ C is a winning strategy, and σ E is admissible, there is a natural number k such that the play ρ k of k rounds played according to σ C and σ E produces an event sequence µ = ρ k (ε) that satisfies D and S. Since S faithfully encodes the flexible plan Π, the plan π µ induced by µ is an instance of Π. Hence, we can define ς(ω) as the scheduling function θ such that, for each token τ in Π which ends at time t τ in π µ , θ(τ ) = t τ . Since π µ is an instance of Π, θ is a scheduling function for Π. Moreover, θ is based on µ, which is the result of playing the strategy σ C . By how the game is defined, at each ending round, σ C has access only to the history of the game up to the previous time step. Hence, let τ be a token and t = θ(τ ). The decision by θ of ending τ at time t only depends on the prefix of µ happening before t, and thus, given any other situation ω′, with θ′ = ς(ω′), if θ′ <t = θ <t , we have that θ′(τ ) = t as well, by construction. This confirms that ς is a dynamic execution strategy for Π.
Theorem 1 shows that, given a flexible solution plan Π, we can decide its dynamic controllability by looking for a winning strategy for the game G P,Π . More generally, given the timeline-based planning problem with uncertainty P = (SV C , SV E , S, O), we can similarly build a game G P = (SV C , SV E , S, D) such that the existence of a dynamically controllable flexible plan for P implies the existence of a winning strategy for G P . This is done by encoding the observation O into the set of domain rules D exactly as done in Theorem 1, while using the problem rules S unchanged as system rules, without constraining the game to any specific plan. Then, if a plan exists, and it is dynamically controllable, it can be checked that a winning strategy for G P must exist as well.
Corollary 1 (Generality of timeline-based games). Let P be a timeline-based planning problem with uncertainty. Then, a timeline-based game G P can be built, in polynomial time, such that if P admits a dynamically controllable flexible solution plan, then Charlie has a winning strategy for G P .
The converse is not true, however, because winning strategies for timeline-based games are strictly more expressive than flexible plans. Hence, there can be some problems P that do not have any dynamically controllable flexible plan, but such that there is a winning strategy for G P . This is the case with the example problem discussed in Section 3, which has an easy winning strategy when seen as a game, while it has no dynamically controllable flexible plan. We can encode the example problem P with the game G P , in which the shown synchronisation rules are included as system rules, and the set of domain rules is empty (since there are no external variables and thus the observation is empty as well). The winning strategy is simple: after playing start(x, v 1 ) at the beginning, Charlie only has to wait for Eve to play end(x, v 1 ), and then play start(x, v 2 ) or start(x, v 3 ) according to the current timestamp. Therefore, one can prove the following theorem.
Theorem 2. A timeline-based planning problem with uncertainty P exists such that there are no dynamically controllable flexible plans for P , but Charlie has a winning strategy for the associated planning game G P .

Complexity of finding winning strategies
In previous sections, we introduced and formally defined the notion of timeline-based game, and showed how the existence of a winning strategy for such a game subsumes the existence of a dynamically controllable flexible plan for the equivalent timeline-based planning problem with uncertainty. In this section, we show that deciding whether such a strategy exists is a 2EXPTIME-complete problem.

Finite representation of game plays
From the definitions given in Section 4 and, in particular, the definitions of strategies for the two players (Definition 28), it can be seen that a timeline-based game provides an implicit representation for a potentially infinite state space consisting of all possible partial plans Π. To solve the game, we first reduce the game to a finite game. The key observation here is that, although each synchronisation rule can potentially speak about events arbitrarily far in the past and in the future, a finite representation of the history of the game is possible. The same issue has been met already in the study of the computational complexity of the plan existence problem for timeline-based planning problems [30,32], leading to the development of a few conceptual tools that we are going to reuse here: a graph-theoretic representation of synchronisation rules, called the rule graphs, and a data structure, called matching records, that, by using rule graphs, can finitely represent an infinite set of similar partial plans.
What follows briefly recaps a minimal set of definitions that make it possible to understand how these concepts can be leveraged to obtain a finite state space for timeline-based games. The exposition is borrowed from [30], where many additional details can be found.
Then, the rule graph of E is an edge-labelled graph G E = (V, E, β) where: 1. the set of nodes V is made of terms (as per Definition 4) such that: Intuitively, the rule graph G E for an existential statement E has a node for each term that appears in E, including both endpoints of each mentioned token, and an edge for each temporal constraint imposed between any two nodes. An edge e is said to be unbounded, if β(e) = (l, +∞) for some l ∈ N, and bounded otherwise. Note that the nodes representing the token a 0 quantified in the trigger of the rule are included in the rule graph of all the existential statements of the rule. Given a rule graph G = (V, E) and an event sequence µ = µ 1 , . . . , µ n , a matching function γ : V → [1, . . . , n] can be used to match the nodes of G to the events of µ. If a matching function γ exists such that all the temporal constraints are satisfied (with satisfaction defined in the standard way), written µ, γ |= G, then we say that G matches over µ, written µ |= G. If start(a 0 ) appears in µ i (a rule containing E would be triggered by µ i ), we write µ, γ |= i G if γ(start(a 0 )) = i, and µ |= i G if there exists such a γ. In general, we can rephrase the satisfaction of synchronization rules in terms of matching of rule graphs.
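As a rough illustration of the relation µ, γ |= G, the following Python sketch checks a candidate matching function against the edge bounds of a rule graph. The representation is an assumption made only for this example: nodes are plain strings, β is a dictionary from edges (u, v) to pairs (lower, upper), an event sequence is reduced to the list of absolute timestamps of its events, and an edge is read as the constraint lower ≤ time(γ(v)) − time(γ(u)) ≤ upper. The check that each node is mapped to an event actually containing the corresponding start or end action is omitted.

```python
import math
from typing import Dict, List, Tuple

Edge = Tuple[str, str]       # (source term, target term); hypothetical encoding
Bounds = Tuple[int, float]   # (lower, upper); upper may be math.inf


def matches(beta: Dict[Edge, Bounds], times: List[int], gamma: Dict[str, int]) -> bool:
    """Check the temporal constraints of a rule graph under a matching function.

    `times[i - 1]` is the timestamp of the i-th event (events are 1-indexed,
    as in gamma : V -> [1, ..., n]).
    """
    for (u, v), (lower, upper) in beta.items():
        distance = times[gamma[v] - 1] - times[gamma[u] - 1]
        if not (lower <= distance <= upper):
            return False
    return True


# A bounded edge followed by an unbounded one.
beta = {
    ("start(a0)", "end(a0)"): (2, 5),
    ("end(a0)", "start(a1)"): (0, math.inf),
}
times = [0, 3, 10]
good = {"start(a0)": 1, "end(a0)": 2, "start(a1)": 3}
bad = {"start(a0)": 1, "end(a0)": 3, "start(a1)": 3}
print(matches(beta, times, good), matches(beta, times, bad))
```

Deciding µ |= G then amounts to searching for some γ that passes this check, which the sketch does not attempt.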
We can thus reason about the satisfaction of the synchronization rules of the game in terms of matching of the rule graphs of their existential statements. A few useful observations can be made about rule graphs. Given a rule graph G E = (V, E, β), a subgraph of G E is a graph G′ = (V′, E′, β′) such that V′ ⊆ V , E′ ⊆ E, and β′ = β| E′ . A subgraph of a rule graph can itself be seen as the rule graph of a simpler existential statement. A particularly important kind of subgraph is the bounded component: a subgraph that is connected by bounded edges. Given a bounded component B of G, we can compute the maximum distance, denoted window(B), between any two events involved in the matching of B on any event sequence. A simple upper bound on window(B) is given by the sum of the upper bounds of all the bounded edges of B. A much tighter bound can be computed as shown in [30]. Here, we are interested in highlighting that window(B) is at most exponential in the size of G, since we consider numeric coefficients to be expressed in binary notation. Then, we can extend the concept to a generic set S of synchronization rules, by defining window(S) as the sum of window(B) for all the bounded components B of all the rule graphs of every rule in S. Note that window(S) ∈ O(2 |S| ), where |S| is the size of S (defined in the natural way).
The definition of window(S) allows us to recall a basic property of event sequences.
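The simple upper bound mentioned above can be sketched in Python as follows. This is only an illustration of the coarse bound (the sum of the upper bounds of the bounded edges), not the tighter computation of [30]; the dictionary-based encoding of rule graphs and all function names are assumptions introduced for the example.

```python
import math
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple

Edge = Tuple[str, str]
Bounds = Tuple[int, float]   # (lower, upper); upper is math.inf if unbounded


def bounded_components(nodes: Iterable[str], beta: Dict[Edge, Bounds]) -> List[Set[str]]:
    """Group nodes that are connected through bounded edges (union-find)."""
    parent = {v: v for v in nodes}

    def find(v: str) -> str:
        while parent[v] != v:
            parent[v] = parent[parent[v]]   # path halving
            v = parent[v]
        return v

    for (u, v), (_, upper) in beta.items():
        if upper != math.inf:               # only bounded edges connect nodes
            parent[find(u)] = find(v)

    groups = defaultdict(set)
    for v in parent:
        groups[find(v)].add(v)
    return list(groups.values())


def naive_window(component: Set[str], beta: Dict[Edge, Bounds]) -> float:
    """Coarse bound on window(B): sum of the upper bounds of B's bounded edges."""
    return sum(upper for (u, v), (_, upper) in beta.items()
               if upper != math.inf and u in component and v in component)


def naive_window_of_rules(graphs: List[Tuple[Set[str], Dict[Edge, Bounds]]]) -> float:
    """Coarse bound on window(S): sum over all bounded components of all rule graphs."""
    return sum(naive_window(b, beta)
               for nodes, beta in graphs
               for b in bounded_components(nodes, beta))


nodes = {"a", "b", "c", "d"}
beta = {("a", "b"): (0, 4), ("b", "c"): (1, math.inf), ("c", "d"): (2, 3)}
print(naive_window_of_rules([(nodes, beta)]))
```

Since an upper bound written with b binary digits can be as large as 2^b − 1, the value returned by this sum can be exponential in the size of the input, which is the point of the remark on binary notation.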
Proposition 2 (Bounded distance in event sequences [30]). Let S be a set of synchronization rules over a set of state variables SV, and let µ = µ 1 , . . . , µ n be an event sequence satisfying S. Then, there exists another event sequence µ′ = µ′ 1 , . . . , µ′ m , satisfying S, such that δ i ≤ window(S) for all i ≥ 0.
Intuitively, the distance between two events of an event sequence does not need to exceed window(S) because no rule in the set has any way to discriminate two consecutive events so far in time.
The above-introduced concepts allow us to define the notion of matching record. Intuitively, given a set of synchronisation rules S and any event sequence µ, the matching record [µ] of µ is a structure of bounded size that allows us to effectively test whether µ satisfies any of the rules in S. Furthermore, given an event µ, it is possible to effectively build the matching record [µµ] starting from [µ].
Definition 32 (Matching record). Let S be a set of synchronization rules over a set SV of state variables, and let µ = µ 1 , . . . , µ n be an event sequence over SV, closed to the left, such that δ(µ) ≥ 2 window(S).
A detailed account of the above definition can be found in [30]. Here, we will briefly summarize the role of the components of a matching record [µ] = (ω, Γ, ∆). The first component is a suffix ω of the actual event sequence µ, which is considered as composed of two parts, ω − and ω + , each spanning at least a window(S) amount of time. ω records the recent history of the sequence in an exact way. The rest of the sequence does not need to be stored completely, as the essential information about it is represented by Γ and ∆. In particular, Γ records which parts of the rule triggered inside ω − match over µ, including those that matched in the past, before the recent history recorded by ω. For each newly triggered rule, Γ is queried to know which parts of the rule matched in the distant past. Then, ∆ records the parts that matched in each instance of the rule triggered in the whole µ. The missing parts will need to be satisfied in the future in order to fulfill the rule. In both cases, the subgraphs recorded are required to only have unbounded outgoing edges, or, in other words, to be only made of whole bounded components. Since ω − and ω + span at least window(S), this ensures that quantitative constraints can be fully matched inside ω when building Γ and ∆.
The above definition is relatively abstract, but matching records can nonetheless be computationally manipulated in useful ways, as formally stated by the next proposition.
Proposition 3 (Matching records (see Gigante [30], Chapter 4)). Let S be a set of synchronization rules over a set SV of state variables, and let µ be an event sequence over SV. The following statements hold:
1. the size of [µ] is at most exponential in the size of S;
2. given [µ] and R ∈ S, whether µ |= R can be decided in exponential time;
3. given [µ] and an event µ, whether µµ would be a valid event sequence can be checked in polynomial time;
4. given [µ] and an event µ, the matching record [µµ] can be effectively built in exponential time.

Deciding the existence of winning strategies
We can use matching records to reduce the state space of our games to a finite size, that is, given a timeline-based game, we can build a structure representing a finite-state equivalent game. In particular, we can build a turn-based synchronous game structure, as introduced by Alur et al. [2].
Definition 33 (Turn-based synchronous game structure). A turn-based synchronous game structure is a tuple S = ⟨P, Q, Σ, ν, λ, R⟩, where P = {1, . . . , k} is the set of players, Q is the finite set of states, Σ is the finite set of propositions, ν : Q → 2^Σ specifies the set ν(q) of propositions true at any state q ∈ Q, λ : Q → P is a function telling which player owns any given state, and R ⊆ Q × Q is the transition relation.
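For concreteness, Definition 33 can be transcribed into a small Python data structure. This is a minimal sketch: the field names are chosen for readability and are not taken from Alur et al. [2], and the totality check (every state has a successor) is an extra assumption added here so that every finite history can be extended to an infinite path.

```python
from dataclasses import dataclass
from typing import Dict, Hashable, Set, Tuple

State = Hashable


@dataclass
class GameStructure:
    players: Set[int]                         # P
    states: Set[State]                        # Q
    propositions: Set[str]                    # Sigma
    valuation: Dict[State, Set[str]]          # nu : Q -> 2^Sigma
    owner: Dict[State, int]                   # lambda : Q -> P
    transitions: Set[Tuple[State, State]]     # R, a subset of Q x Q

    def successors(self, q: State) -> Set[State]:
        """States reachable from q in one step."""
        return {t for (s, t) in self.transitions if s == q}

    def is_well_formed(self) -> bool:
        """Sanity checks; totality of R is an assumption of this sketch."""
        labelled = all(self.valuation.get(q, set()) <= self.propositions
                       for q in self.states)
        owned = all(self.owner.get(q) in self.players for q in self.states)
        closed = all(s in self.states and t in self.states
                     for (s, t) in self.transitions)
        total = all(self.successors(q) for q in self.states)
        return labelled and owned and closed and total


g = GameStructure(
    players={1, 2},
    states={"a", "b"},
    propositions={"d", "w"},
    valuation={"a": {"d"}, "b": set()},
    owner={"a": 1, "b": 2},
    transitions={("a", "b"), ("b", "a")},
)
print(g.is_well_formed(), g.successors("a"))
```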
Turn-based synchronous game structures, simply called game structures hereinafter, represent games where players play in turn, not concurrently: each state q ∈ Q is owned by the player λ(q), who moves whenever the game reaches that state. A path of the game is an infinite sequence of states q = q 0 , q 1 , . . . such that (q i , q i+1 ) ∈ R for all i ≥ 0. Given a player a ∈ P, a strategy for a is a function f a : Q + → Q that maps any non-empty finite prefix q 0 , . . . , q n of a path (the history of the game play), where λ(q n ) = a, to the next state f a (q 0 , . . . , q n ), chosen among the successors of q n . A path such that q i+1 = f a (q 0 , . . . , q i ) for any q i such that λ(q i ) = a is said to be played according to the strategy f a . Given a set of players A ⊆ P, and a set of strategies F A , one for each a ∈ A, the sequence q is played according to F A if it is played according to all the strategies in F A .
4. the valuation ν is such that for all q ∈ Q 2 , ν(q) = ∅, and for all
6. the transition relation is bipartite, relating only states from Q 1 to Q 2 or vice versa, and is defined as:
In order to formally tie to S G the existence of winning strategies for G, we introduce a logical formulation of Definition 28 in terms of alternating-time temporal logic (ATL) or, more precisely, its extension ATL * . Introduced by Alur et al. [2], ATL and ATL * are strategic logics that are interpreted over concurrent game structures, of which turn-based synchronous structures are a special case. Given a set P = {1, . . . , k} of players and a finite set Σ of propositions, the syntax of ATL * is given in terms of state formulas and path formulas, defined as follows:
ATL * formulas are all the state formulas defined above. Given a game structure S = ⟨P, Q, Σ, ν, λ, R⟩ and a state q ∈ Q, the formula ⟨⟨A⟩⟩ψ holds over S and q, written S, q |= ⟨⟨A⟩⟩ψ, if there exists a set of strategies F A , one for each a ∈ A, such that S, q |= ψ for all paths q = q, q′, . . . starting from q played according to F A . The other connectives and temporal operators are defined as expected. See Alur et al. [2] for the complete semantics of the logic.
Lemma 2 (ATL * formulation of winning strategies). Let G = (SV C , SV E , S, D) be a timeline-based game, and let S G be its associated game structure. Then, Charlie has a winning strategy for G iff it holds that S G , [ε] |= ⟨⟨1⟩⟩(Fd → Fw).
Proof. (−→). First, we prove that if Charlie has a winning strategy for G, then the given formula holds on the [ε] state of S G . Let σ C : Π → M C be such a strategy. A strategy for Player 1 of S G is a function σ 1 : Q + → Q. Let q = (q 0 , . . . , q n ) be a finite prefix of a path in S G where q 0 = [ε]. Suppose λ(q n ) = 1, i.e., it is Player 1's turn to play. Given the particular transition relation of S G , this means that n is even. By construction, states in even positions q 0 , q 2 , q 4 , . . . are a sequence of matching records [µ 0 ], [µ 2 ], . . . , whereas for the odd ones, q 1 , q 3 , q 5 , . . . , we have q i = ([µ i−1 ], m C ), where m C is a move for Charlie applicable to µ i−1 . Vice versa, states in odd positions are related to even ones by moves for Eve. Hence, from the path we can reconstruct the actual event sequence µ n built by the full play of the game. Let m C = σ C (µ n ). We can define a strategy σ 1 for Player 1 in S G as σ 1 (q) = ([µ n ], m C ). Now, let w = w 0 , w 1 , . . . be a path played according to the strategy σ 1 defined above and some strategy σ 2 for Player 2. We show that w |= Fd → Fw. Suppose Fd holds. This means that there is k ≥ 0 such that S G , w k |= d. Note that k is even, since, by definition, only states in Q 1 are labelled, hence let w k = [µ k ]. Then, by construction, it follows that domain rules are satisfied by µ k , hence σ 2 corresponds to an admissible strategy for Eve in the game G. Since σ C is a winning strategy, we know that any play played according to it and any admissible strategy leads to a position k′ such that the rules in S are satisfied by µ k′ , and consequently w k′ is labelled by w, meaning that the path w satisfies Fw.
We can conclude that all paths starting from [ε] and played according to σ 1 satisfy Fd → Fw, hence S G , [ε] |= ⟨⟨1⟩⟩(Fd → Fw).
(←−). Conversely, let us show that if the formula holds on S G , then a winning strategy for Charlie exists on G. If ⟨⟨1⟩⟩(Fd → Fw) holds on [ε], there exists a strategy σ 1 for Player 1 such that for all paths q = q 0 , q 1 , . . . played according to σ 1 , it holds that S G , q |= Fd → Fw. From σ 1 , we can define a corresponding strategy σ C for Charlie similarly to the converse operation defined above: for each µ, a path prefix w is defined that reconstructs µ, and then σ C (µ) is the move selected by σ 1 (w). Then, we can see that σ C is a winning strategy: a play played according to an admissible strategy σ E for Eve would correspond to a path u = u 0 , u 1 , . . . , with u 0 = [ε], such that S G , u k |= d for some k ≥ 0, which means that the domain rules are satisfied at the k-th round of the play of G. Since the path u is played according to σ 1 , i.e., the play is played according to σ C , there is a k′ such that S G , u k′ |= w (because Fd → Fw holds), which means in turn that system rules are satisfied as well at the k′-th game round. Hence, σ C is a winning strategy.
Everything is now in place to prove the complexity of finding a winning strategy for a given game.
Theorem 3 (Complexity of finding winning strategies). Whether a timeline-based game G admits a winning strategy for Charlie can be decided in doubly exponential time.
Proof. Let G be a timeline-based game. Thanks to Lemma 2, we can verify whether G admits a winning strategy for Charlie by building the corresponding turn-based synchronous game structure S G and checking whether S G , [ε] |= ⟨⟨1⟩⟩(Fd → Fw). It is known from Alur et al. [2] that model checking an ATL * formula on a concurrent game structure has polynomial time complexity in the size of the structure, for formulas of bounded size. This is our case, since the formula that we need to check is fixed, and always the same for any G. Hence, as the structure can be built in doubly exponential time (Lemma 1), such is the complexity of checking whether G admits a winning strategy for Charlie.
The ATL * formula used by Theorem 3 is fixed and very simple, and most of the complexity of the procedure comes from the construction of the game structure. However, having framed the problem in logical terms gives us much flexibility in how to extend the current setting to more complex or expressive variants, or to different winning conditions. Exploring this potential is left as future work.
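To make the fixed winning condition more tangible, the following Python sketch decides, on an explicitly given finite turn-based structure, from which states Player 1 can enforce Fd → Fw. It is not the general ATL * model-checking procedure of Alur et al. [2] that the proof relies on; it is a direct fixpoint computation tailored to this one formula, swapped in for illustration. The sketch first computes the states from which Player 1 can force a visit to a w-state, and then the largest set of states from which Player 1 can either reach that region or avoid d forever. The encoding (dictionaries `owner` and `succ`, sets `d_states` and `w_states`) is hypothetical, and the transition relation is assumed to be total.

```python
from typing import Dict, Hashable, Iterable, Set

State = Hashable


def attractor(states: Iterable[State], owner: Dict[State, int],
              succ: Dict[State, Set[State]], target: Set[State],
              player: int) -> Set[State]:
    """States from which `player` can force the play into `target`."""
    attr = set(target)
    changed = True
    while changed:
        changed = False
        for q in states:
            if q in attr:
                continue
            if owner[q] == player:
                forced = any(s in attr for s in succ[q])
            else:
                forced = bool(succ[q]) and all(s in attr for s in succ[q])
            if forced:
                attr.add(q)
                changed = True
    return attr


def winning_region(states: Set[State], owner: Dict[State, int],
                   succ: Dict[State, Set[State]], d_states: Set[State],
                   w_states: Set[State], player: int = 1) -> Set[State]:
    """States from which `player` can enforce Fd -> Fw (assumes total succ)."""
    reach_w = attractor(states, owner, succ, w_states, player)
    win = set(states)
    changed = True
    while changed:                      # greatest fixpoint: shrink until stable
        changed = False
        for q in list(win):
            if q in reach_w:
                continue                # Fw can be forced, so q is winning
            if owner[q] == player:
                can_stay = any(s in win for s in succ[q])
            else:
                can_stay = all(s in win for s in succ[q])
            if q in d_states or not can_stay:
                win.discard(q)          # d seen without a way to force w
                changed = True
    return win


states = {0, 1, 2, 3, 4}
owner = {0: 1, 1: 2, 2: 1, 3: 1, 4: 2}
succ = {0: {1, 4}, 1: {2, 3}, 2: {2}, 3: {3}, 4: {4}}
print(winning_region(states, owner, succ, d_states={3}, w_states={2}))
```

In the toy structure, Player 1 wins from state 0 by moving to state 4, where d never holds, while from state 1 the opponent can move to the d-state 3, from which w is unreachable. Both loops run in time polynomial in the number of states and transitions, in line with the observation that the cost of the overall procedure is dominated by the construction of the structure.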

Finding whether winning strategies exist is 2EXPTIME-complete
We will now prove that deciding whether a winning strategy exists for Charlie in a given timeline-based game is 2EXPTIME-hard (hence 2EXPTIME-complete, as well). The proof is based on a reduction from a particular kind of tiling game, introduced by Chlebus [18] as a 2-player variant of common tiling problems. Problems of this kind have long been used as a source of reductions to study the computational complexity of many problems in logic and combinatorics [34,35,36,43,49,51].

Definition 35 (Tiling structures and tilings).
A tiling structure is a tuple T = (T, t 0 , t * , H, V, n), where T is a set of elements called tiles, t 0 ∈ T is the initial tile, t * ∈ T is the final tile, H, V ⊆ T × T are the horizontal and vertical adjacency relations, and n ∈ N + is a positive number, encoded in binary.
A k-tiling of the tiling structure T , for k > 1, is a function f : [n] × [k] → T , mapping any position (x, y) of the rectangle of size n × k to a tile f (x, y) ∈ T such that:
The exponential rectangle tiling problem is the problem of deciding, given a tiling structure T , if there exists a k-tiling for T for some k > 1. The problem is known to be EXPSPACE-complete [49]. The exponential rectangle tiling game is a 2-player variant of the problem, where a player, the Constructor , tries to build a k-tiling for a given tiling structure, and the opposite player, the Saboteur , tries to prevent this from happening. The two players play in turn, starting from Constructor , choosing one tile at a time, starting from the one in position (0, 0), filling one row after the other. A strategy for Constructor is a function σ : T * → T that, given the sequence of tiles positioned up to the current time, gives the next tile to play. Given a tiling structure T , a winning strategy lets Constructor build a k-tiling, for some k > 1, no matter which tiles are chosen by Saboteur .
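The turn order and the row-by-row filling of the rectangle can be sketched in Python as follows. The sketch makes some assumptions that are not fixed by the text above: the i-th tile played (counting from 0) is placed at column i mod n of row i div n, pairs in H are read as (left tile, right tile), and pairs in V as (tile in the previous row, tile in the next row). The conditions on the initial and final tiles are deliberately left out.

```python
from typing import List, Set, Tuple

Tile = str


def position(i: int, n: int) -> Tuple[int, int]:
    """Position (x, y) of the i-th tile played, filling one row after the other."""
    return (i % n, i // n)


def mover(i: int) -> str:
    """Players alternate on every tile, starting from Constructor."""
    return "Constructor" if i % 2 == 0 else "Saboteur"


def respects_adjacency(tiles: List[Tile], n: int,
                       H: Set[Tuple[Tile, Tile]], V: Set[Tuple[Tile, Tile]]) -> bool:
    """Check a (possibly partial) row-major sequence of tiles against H and V."""
    for i, tile in enumerate(tiles):
        x, y = position(i, n)
        if x > 0 and (tiles[i - 1], tile) not in H:
            return False    # clashes with the tile on its left
        if y > 0 and (tiles[i - n], tile) not in V:
            return False    # clashes with the tile in the previous row
    return True


H = {("a", "b")}
V = {("a", "a"), ("b", "b")}
print(respects_adjacency(["a", "b", "a", "b"], 2, H, V))
print([(mover(i), position(i, 2)) for i in range(4)])
```

Because n is encoded in binary, a single row of the rectangle can be exponentially long in the size of the tiling structure, so an explicit list such as `tiles` is useful only for small illustrative instances.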
Given a tiling structure T , the problem of deciding whether Constructor has a winning strategy can be seen to be 2EXPTIME-complete: Chlebus [18] proves that the problem is EXPTIME-complete if n is encoded in unary, while here it is encoded in binary. Following the same proof with this difference, we can obtain the 2EXPTIME-completeness result. We can now prove how to reduce tiling games to timeline-based games.
Theorem 4 (Deciding the existence of winning strategies is 2EXPTIME-hard). Let G = (SV C , SV E , S, D) be a timeline-based game. The problem of deciding whether Charlie has a winning strategy for G is 2EXPTIME-hard.
Proof. As anticipated, the proof goes by reduction from exponential rectangle tiling games. We prove that, given any tiling structure T = (T, t 0 , t * , H, V, n), we can build in polynomial time a corresponding timeline-based game G = (SV C , SV E , S, D) such that Charlie has a winning strategy for G if and only if Constructor has a winning strategy in the tiling game over T . The encoding requires a binary counter, repeatedly counting from 0 to n. The bits of the counter are represented by variables c 1 , . . . , c m ∈ SV E , where m = ⌈log 2 (n)⌉. The binary variables are uncontrollable, i.e., γ(c i ) = u for all 1 ≤ i ≤ m, all have domain V ci = {0, 1}, and have trivial transition function and unit duration, i.e., T ci (0) = T ci (1) = {0, 1} and D ci (0) = D ci (1) = (1, 1) for all 1 ≤ i ≤ m. It can be seen that, with a polynomial number of synchronisation rules of polynomial size, it is possible to force these variables to encode the correct behaviour of the counter. Such rules are placed in D, so that the evolution of the counter is completely handled by Eve. Other rules can look at these variables to query the current value of the counter.
Then, the rectangle to be tiled is represented by two variables x ∈ SV C and y ∈ SV E , defined as follows. For x = (V x , T x , D x , γ x ) ∈ SV C , the domain is defined as V x = {t even , t odd | t ∈ T }, i.e., an even and an odd version of each tile t ∈ T . The transition function forces a strict alternation between even and odd values, i.e., T x (t even ) = {t′ odd | t′ ∈ T } and T x (t odd ) = {t′ even | t′ ∈ T } for each t ∈ T . Any token for x is controllable and is forced to be of unit duration, i.e., D x (v) = (1, 1) and γ x (v) = c for each v ∈ V x . In contrast, the domain of y contains a single value for each tile, with the addition of a special symbol ⊥, i.e., V y = T ∪ {⊥}, and any token for y is uncontrollable and forced to last two units of time, i.e., D y (v) = (2, 2) and γ y (v) = u for all v ∈ V y . The transition function is trivial, defined as T y (v) = V y for all v ∈ V y .
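The two variables can be written down explicitly for a small set of tiles. The following Python sketch builds the domain, transition function, duration bounds and controllability tags of x and y as plain dictionaries; the tuple-based encoding of values such as ("a", "even") and the string "bottom" standing for ⊥ are assumptions of the example.

```python
from typing import Dict, Set, Tuple

BOTTOM = "bottom"   # stands for the special symbol ⊥ (hypothetical encoding)


def build_x(tiles: Set[str]):
    """Charlie's variable x: even/odd copies of each tile, unit duration."""
    domain = {(t, parity) for t in tiles for parity in ("even", "odd")}
    flip = {"even": "odd", "odd": "even"}
    transition = {(t, p): {(s, flip[p]) for s in tiles} for (t, p) in domain}
    duration: Dict[Tuple[str, str], Tuple[int, int]] = {v: (1, 1) for v in domain}
    tag = {v: "c" for v in domain}          # controllable tokens
    return domain, transition, duration, tag


def build_y(tiles: Set[str]):
    """Eve's variable y: one value per tile plus ⊥, duration two."""
    domain = set(tiles) | {BOTTOM}
    transition = {v: set(domain) for v in domain}   # trivial transition function
    duration = {v: (2, 2) for v in domain}
    tag = {v: "u" for v in domain}          # uncontrollable tokens
    return domain, transition, duration, tag


Vx, Tx, Dx, gx = build_x({"a", "b"})
Vy, Ty, Dy, gy = build_y({"a", "b"})
print(sorted(Tx[("a", "even")]), sorted(Vy))
```

Both constructions are linear or quadratic in the number of tiles, in line with the requirement that the reduction be computable in polynomial time.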
With these variables and suitable synchronisation rules, we can simulate the tiling game. Tiles chosen by Charlie (in the role of Constructor ) are directly put on his timeline. Tiles chosen by Eve (in the role of Saboteur ) are put on her timeline, and then replicated by Charlie, in order to turn the timeline for x into a row-major representation of the current partially tiled rectangular area. The separation between even and odd values in the domain of x is needed to tell Charlie whether in the current turn it is time to freely choose the next tile, or to blindly replicate Eve's choice. As remarked in Section 4, Charlie needs at least one time step of delay to replicate Eve's moves, while Eve can reply immediately. For this reason, Eve's tokens last two time steps, so that Charlie has time to see Eve's choice and replicate it. Tokens on the two timelines remain aligned: each of Eve's tokens spans the last choice by Charlie and the replica of Eve's move.
Now, we show the domain and system rules that can enforce such dynamics, starting with the basic construction of the grid. The first token of Charlie's timeline must be t 0 even , to enforce the base case of the tiling, and the fact that the token starting at time 0 is, in fact, marked as even:
Let us shorthand the second disjunct of the above rule simply as c = n, and as c = 0 the similar disjunct that requires all c i to be zero. Then, we have to enforce the adjacency conditions for the tiling, which must be obeyed by both players. For Charlie, this can be done as follows, for the horizontal relation:
a[x = t even ] → c = n ∨
Note how in the rules above, the last column and the last row are detected using, respectively, the value of the counter and the presence of the final tile. For Eve, the rules that enforce the adjacency relations are similar, but with two differences. First of all, the rules are triggered by tokens on the y variable, which implies that they do not check the consistency of Charlie's choices. This is because a wrong choice by Charlie has to make him lose, rather than make the run inadmissible. The second difference is that, by means of additional disjuncts, Eve is allowed to play a token with value y = ⊥ in place of any other tile. This value, however, is allowed only when no other viable choice is available, by means of a rule such as:
All the rules encoding the behaviour of Eve are written in such a way that they do not hold after ⊥ is played. Since ⊥ ∉ V x , when Eve plays it, Charlie cannot replicate her move, hence violating his rules and losing the game. In this way, the inability of Eve to progress in the tiling is turned into a lost game by Charlie instead of an inadmissible run, in accordance with the semantics of the tiling game. Finally, we can state the final goal of Charlie, namely, that of placing the final tile as the top tile of the current row:
It can be easily checked that the number and size of all the rules described above is polynomial in the size of the tiling problem, and that a winning strategy for Charlie effectively corresponds to a winning strategy for Constructor , hence concluding the proof.
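The alignment of the two timelines described in the proof can be illustrated with a short Python sketch. It assumes, consistently with the durations given above, that Charlie's i-th free choice starts at time 2i as an even token, that Eve's i-th token starts at the same time and lasts two time units, and that Charlie's odd replica of Eve's choice starts at time 2i + 1. The representation of tokens as (start time, value) pairs is introduced only for the example.

```python
from typing import List, Tuple


def interleave(constructor_tiles: List[str], saboteur_tiles: List[str]):
    """Build the token sequences of x (Charlie) and y (Eve) from the two players' choices."""
    x_tokens: List[Tuple[int, Tuple[str, str]]] = []   # unit-duration tokens
    y_tokens: List[Tuple[int, str]] = []               # tokens lasting two time units
    for i, (c, s) in enumerate(zip(constructor_tiles, saboteur_tiles)):
        start = 2 * i
        x_tokens.append((start, (c, "even")))      # Charlie's free choice
        y_tokens.append((start, s))                # Eve's choice, spans [start, start + 2)
        x_tokens.append((start + 1, (s, "odd")))   # Charlie's replica, one step later
    return x_tokens, y_tokens


def row_major(x_tokens: List[Tuple[int, Tuple[str, str]]]) -> List[str]:
    """Tiles of the partially tiled rectangle, as read off the timeline of x."""
    return [tile for _, (tile, _) in x_tokens]


x_tokens, y_tokens = interleave(["t0", "c"], ["a", "b"])
print(x_tokens)
print(y_tokens)
print(row_major(x_tokens))
```

Reading the timeline of x in order yields the tiles in the order in which they were played, alternating between the two players, which is what the adjacency rules on x need to inspect.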

Conclusions and future work
In this paper, we introduced uncertainty in the recent body of work devoted to the investigation of formal properties of timeline-based planning problems [23,30,31,32]. Rather than studying the complexity of the problem of finding dynamically controllable flexible plans (a research direction that remains worth exploring), we took a more proactive approach, analysing some issues of the current approach based on flexible plans, and proposing a more general game-theoretic formulation of the problem.
We generalised timeline-based planning problems with uncertainty by defining a novel concept of timeline-based game, where the controller tries to execute some tasks as dictated by a timeline-based model, independently of the choices of the environment. In comparing this approach to the state-of-the-art one, we showed that the existence of winning strategies for timeline-based games is strictly more general than the existence of dynamically controllable flexible plans: the latter implies the former, but there are some problems that, when stated as games, have easy winning strategies but do not admit dynamically controllable flexible plans. Then, we analysed the computational complexity of checking whether a winning strategy exists for a given timeline-based planning game, proving that the problem is 2EXPTIME-complete.
This work opens the way for further interesting developments. First of all, the problem of how to efficiently synthesize a controller implementing a winning strategy for a given game is still open, as is the question of the quality of the synthesized strategy. Work in this direction may exploit existing machinery from the field of reactive synthesis of logical specifications, by means of the logical encoding of timeline-based problems given by Della Monica et al. [23].
Then, timeline-based games may be extended to multi-agent scenarios, where multiple players are involved, each with its own objectives and constraints, all playing in the surrounding environment. Strategies may be synthesized for single players or for coalitions, sharing some objectives while pursuing also individual goals. This setting could be further extended to distributed games, where players do not share a single clock, and communicate via message passing. Variants with partial observability are also an interesting direction. Having framed the problem in terms of model checking of ATL * formulas will allow us to extend our work to other settings while exploring the full potential of such logics and of the framework of concurrent game structures.
Finally, extending the modelling language to cope with much-needed features such as representation and handling of resources might be fundamental to handle complex real-world scenarios. On the other hand, given the high computational complexity of solving the games, the pursuit of easier fragments, such as with bounded durations or bounded horizon of the game, is an important step towards the application of this approach.