Sections
- Definition of Classical Planning - develops an expressive yet carefully constrained language for representing planning problems
- Algorithms for Planning as State-Space Search - forward and backward search algorithms while using heuristics that can be derived from the structure of the representation
- Planning Graphs - the data-structure planning graph can make the search for a plan more efficient
- Other Classical Planning Approaches - discuss other approaches to planning
Introduction
this chapter covers fully observable, deterministic, static environments with single agents
chapters 11 and 17 cover partially observable, stochastic, dynamic environments with multiple agents
- planning combines search and logic
- planning is an exercise of controlling combinatorial explosion
Definition of Classical Planning
The problem-solving agent of Chapter 3 can find sequences of actions that result in a goal state. But it deals with atomic representations of states and thus needs good domain-specific heuristics to perform well. The hybrid propositional logical agent of Chapter 7 can find plans without domain-specific heuristics because it uses domain-independent heuristics based on the logical structure of the problem. But it relies on ground (variable-free) propositional inference, which means that it may be swamped when there are many actions and states. For example, in the wumpus world, the simple action of moving a step forward had to be repeated for all four agent orientations, T time steps, and n2current locations
in response to this, planning researchers have settled on a factored representation (where states of the world is represented by a collection of variables).
Planning Domain Definition Language (PDDL)
allows us to express all 4Tn2 actions with one action schema
can describe the 4 components of a search problem:
-
- initial state
- actions available for each state
- results of each action
- goal test
sections:
- state representation
- action representation
- states & actions - putting it together
- example - the blocks world
- complexity of classical planning
State Representation
state is represented as either a:
- conjunction of fluents, which can be manipulated by logical inference
- set of fluents, which can be manipulated with set operations
- in either cases, the fluents must be grounded, functionless atoms (e.g. Poor ∧ Unknown)
- for example, the following fluents are not allowed:
-
- At(x,y) - because it’s not grounded
- ¬Poor - because it’s a negation
- At(Father(Fred), Sydney) - because it uses a function symbol
Action Representation
actions are represented by a set of action schemas that implicit define the ACTIONS(s) and RESULT(s,a) functions needed to do a problem-solving search.
Action Schema
a set of ground (variable-free) actions can be represented by a single action schema
an action schema is a most general unifier (MGU) of the set of grounded actions
an action schema is a lifted representation (lifts the level of reasoning from propositional logic to a restricted subset of first-order logic)
for example, an action schema for flying a plane from one location to another:
// action schema
Action(Fly(p,from,to),
PRECONDITION: At(p,from) ∧ Plane(p) ∧ Airport(from) ∧ Airport(to)
EFFECT: ¬At(p,from) ∧ At(p,to))
an action schema consists of:
- action name - a list of all variables used in the schema
- precondition - conjunction of literals (positive or negated atomic sentences)
- effect - conjunction of literals (positive or negative atomic sentences)
an action schema can have a mix of variables and/or values/objects
we are free to choose whatever values/objects we want to instantiate the variables of the action schema. For example, here is one ground action that results from substituting values/objects for all the variables of the action schema:
// grounded action
Action(Fly(P1,SFO,JFK),
PRECONDITION: At(P1,SFO) ∧ Plane(P1) ∧ Airport(SFO) ∧ Airport(JFK)
EFFECT: ¬At(P1,SFO) ∧ At(P1,JFK))
States & Actions - Putting it Together
an action a is applicable in state s if the preconditions are satisfied by s
if an action a has v variables, then, in a domain with k values/objects, it takes O(vk) time in the worst case to find the applicable grounded actions of a
sometimes we want to propositionalize a PDDL problem (replace each action schema with its set of ground actions) and then use a propositional solver such as SATPLAN to find a solution. However, this is impractical when v and k are large
The result of executing action a in state s is defined as a state s’ which is represented by:
- the set of fluents formed by starting with s
- removing the fluents that appear as negative literals in the action’s effects (what we call the delete list or DEL(a))
- adding the fluents that are positive literals in the action’s effects (what we call the add list or ADD(a))
- RESULT(s,a) = (s − DEL(a)) ∪ ADD(a)
A set of action schemas serves as a definition of a planning domain. A specific problem within the domain is defined with the addition of an initial state and a goal
- the initial state is a conjunction of ground atoms. (as with all states, the closed-world assumption is used, which means that any atoms that are not mentioned are false.)
- the goal is just like a precondition: a conjunction of literals (positive or negative) that may contain variables, such as At (p, SFO ) ∧ Plane (p). Any variables are treated as existentially quantified, so this goal is to have any plane at SFO.
The problem is solved when we can find a sequence of actions that end in a state s that entails the goal g, for example:
- the state Rich ∧ Famous ∧ Miserable entails the goal Rich ∧ Famous
- the state Plane(Plane1) ∧ At(Plane1,SFO) entails the goal At(p,SFO) ∧ Plane(p)
Example - The Blocks World
---cognitive-computing---machine-intelligence/ai---textbooks/ai---a-modern-approach---summaries/ai-chapter-10---classical-planning/screen-shot-2019-04-12-at-7.33.00-pm.png)
The Complexity of Classical Planning
2 types of decision problems:
- PlanSAT - is the question whether there exist any plan that solves a planning problem
- Bounded PlanSAT - asks whether there is a solution of length k or less (could be used to find an optimal plan)
Algorithms for Planning as State-Space Search
the previous section discussed about the declarative representation of a planning action schemas. this representation allows us to search forward and backward:
In the forward search we choose actions that are applicable—those actions that could be the next step in the plan.
In backward search we want actions that are relevant—those actions that could be the last step in a plan leading up to the current goal state
neither forward nor backward search is efficient without good heuristic function, therefore we go over:
- AI Chapter 10 - Classical Planning
- relaxing problem
- subgoal independence
Forward/Progression State-Space Search
reasons forward state-space search is inefficient:
- forward search is prone to exploring irrelevant actions
- planning problems often have large state spaces
relies on accurate domain-specific heuristic, to make it feasible
Backward/Regression Relevant-States Search
in regression search or relevant-states search we start at the goal and apply the actions in reverse until we find a sequence of steps that reaches the initial state. we only consider actions that are relevant to the goal (or current state)
if a domain can be expressed in PDDL, then we can do regression search on it. Given a ground goal description g and a ground action a, the regression from g over a gives us a state description g’defined by
g’= (g − ADD(a)) ∪ Precond(a)
That is, the effects that were added by action a need not have been true before, and also the preconditions must have held before, or else the action could not have been executed
To get the full advantage of backward search, we need to deal with partially uninstantiated actions and states, not just ground ones. For example, suppose the goal is to deliver a specific piece of cargo to SFO: At(C2,SFO). That suggests the action Unload(C2,p,SFO):
Action(Unload(C2,p',SFO),
PRECOND: In(C2,p') ∧ At(p',SFO) ∧ Cargo(C2) ∧ Plane(p') ∧ Airport(SFO)
EFFECT: At(C2,SFO) ∧ ¬In(C2,p')
Figure for Both Forward and Backward Search
---cognitive-computing---machine-intelligence/ai---textbooks/ai---a-modern-approach---summaries/ai-chapter-10---classical-planning/screen-shot-2019-04-12-at-7.40.13-pm.png)
Heuristics For Planning
neither forward nor backward search is efficient without good heuristic function
a heuristic function h(s) estimates the distance from a state s to the goal and that if we can derive an admissible heuristic for this distance—one that does not overestimate—then we can use A∗search to find optimal solutions. An admissible heuristic can be derived by defining a relaxed problem that is easier to solve. The exact cost of a solution to this easier problem then becomes the heuristic for the original problem
Planning uses a factored representation for states and action schemas (where states and actions of the world is represented by a collection of variables). That makes it possible to define good domain-independent heuristics and for programs to automatically apply a good domain-independent heuristic for a given problem
Relaxed Problem
Think of a search problem as a graph where the nodes are states and the edges are actions. The problem is to find a path connecting the initial state to a goal state.
2 ways to relax this problem to make it easier:
Relaxing Actions
- ignore-preconditions heuristic - relax/drops all preconditions from actions
- ignore-selected-preconditions heuristic - consider the sliding-block puzzle (8-puzzle or 15-puzzle) from Section 3.2. We could encode this as a planning problem involving tiles with a single schema Slide:
- Action(Slide(t, s1, s2),
PRECONDITION: On(t,s1) ∧ Tile(t) ∧ Blank(s2) ∧ Adjacent(s1,s2)
EFFECT: On(t,s2) ∧ Blank(s1) ∧ ¬On(t,s1) ∧ ¬Blank(s2))
if we remove the preconditions Blank(s2) ∧ Adjacent(s1,s2) then any tile can move in one action to any space and we get the number-of-misplaced-tiles heuristic. If we remove Blank(s2) then we get the Manhattan-distance heuristic - ignore-delete-lists heuristic - assuming all goals and preconditions contain only positive literals, the length of the solution will serve as a good heuristic. For this to work we remove the delete lists from all actions (i.e. removing all negative literals from effects)
- set-cover problem using relaxed actions - first we relax the actions by removing all preconditions and all effects except those that are literals in the goal. Then, we count the minimum number of actions required such that the union of those actions’ effects satisfies the goal
Relaxing States
- state abstraction (grouping multiple nodes together) - forming an abstraction of the state space that has fewer states, and thus is easier to search
- decreases the number of states, by a many-to-one mapping from states in the ground representation of the problem to the abstract representation
- ignore some fluents - a form of state abstraction. for example, consider an air cargo problem with 10 airports, 50 planes, and 200 pieces of cargo. Each plane can be at one of 10 airports and each package can be either in one of the planes or unloaded at one of the airports. So there are 5010× 20050+10≈ 10155states. Now consider a particular problem in that domain in which it happens that all the packages are at just 5 of the airports, and all packages at a given airport have the same destination. Then a useful abstraction of the problem is to drop all the At fluents except for the ones involving one plane and one package at each of the 5 airports. Now there are only 510× 55+10≈ 1017states
Key Idea in Defining Heuristics
key idea: is decomposition of goal into subgoals which contains 3 main steps:
- divide problem into subproblems
- solve each subproblem independently
- combine the subproblems
(Max Estimate vs Sum Estimate) Heuristics for Subgoals
Suppose the goal is a set of fluents G, which we divide into disjoint subsets G1, …, Gn. We then find plans P1, …, Pnthat solve the respective subgoals. What is an estimate of the cost of the plan for achieving all of G? We can think of each Cost(Pi) as a heuristic estimate:
- max estimate - we know that if we combine estimates by taking their maximum value, we always get an admissible heuristic. So maxiCOST(Pi) is admissible, and sometimes it could be that P1serendipitously achieves all the Gi. But in most cases, in practice the estimate is too low
- sum estimate - could we sum the costs instead? For many problems that is a reasonable estimate, but it is not admissible. The best case is when we can determine that Giand Gjare independent (subgoal independence assumption). If the effects of Pileave all the preconditions and goals of Pjunchanged, then the estimate COST(Pi) + COST(Pj) is admissible, and more accurate than the max estimate
Subgoal Independence Assumption
- an assumption that the cost of solving a conjunction of subgoals is approximated by the sum of the costs of solving each subgoal independently.
- the assumption can be optimistic or pessimistic:
- It is optimistic when there are negative interactions between the sub-plans for each subgoal—for example, when an action in one subplan deletes a goal achieved by another subplan
- It is pessimistic, and therefore inadmissible, when sub-plans contain redundant actions—for instance, two actions that could be replaced by a single action in the merged plan
Planning Graphs
all heuristics suggested can suffer from inaccuracies
a planning graph data-structure can be used to give better heuristic estimates
a planning problem asks “can we reach state G from state S0”. Suppose we are given a tree of all possible actions from the initial state to successor states, and their successors, and so on. If we indexed this tree appropriately, we could answer the planning question “can we reach state G from state S0” immediately, just by looking it up. Of course, the tree is of exponential size, so this approach is impractical. A planning graph is polynomial-size approximation to this planning problem tree. The planning graph can’t answer definitively whether G is reachable from S0, but it can estimate how many steps it takes to reach G. The estimate is always correct when it reports the goal is not reachable, and it never overestimates the number of steps, so it is an admissible heuristic
a planning graph is a directed graph organized into levels:
- first a level S0for the initial state, consisting of nodes representing each fluent that holds in S0
- then a level A0consisting of nodes for each ground action that might be applicable in S0
- then alternating levels Sifollowed by Ai until we reach a termination condition
Roughly speaking, Si contains all the literals that could hold at time i, depending on the actions executed at preceding time steps. If it is possible that either P or ¬P could hold, then both will be represented in Si. Also roughly speaking, Aicontains all the actions that could have their preconditions satisfied at time i. We say “roughly speaking” because the planning graph records only a restricted subset of the possible negative interactions among actions; therefore, a literal might show up at level Sjwhen actually it could not be true until a later level, if at all. (A literal will never show up too late.) Despite the possible error, the level j at which a literal first appears is a good estimate of how difficult it is to achieve the literal from the initial state
Planning graphs work only for propositional planning problems—ones with no variables
A Planning Problem & Its Planning Graph
---cognitive-computing---machine-intelligence/ai---textbooks/ai---a-modern-approach---summaries/ai-chapter-10---classical-planning/screen-shot-2019-04-12-at-8.27.30-pm.png)
Figure 10.7 shows a simple planning problem
Figure 10.8 shows its planning graph
- rectangular boxes represents non-persistence actions
- a real action at level Aiis connected to its preconditions at Siand its effects at Si+1.
- small square boxes represents (persistence actions or no-op) actions
- a literal appears because an action caused it, but we also want to say that a literal can persist if no action negates it
- gray lines indicates (mutual exclusion or mutex) links
- e.g. Eat(Cake) is mutually exclusive with the persistence of either Have(Cake) or ¬Eaten (Cake)
- e.g. in level A0 of figure 10.8, the non-persistence action, Eat(Cake), are connected to 2 persistence actions via mutex links
we alternate between state level Siand action level Aiuntil the graph has leveled off - where two consecutive levels (Sj-1 and Sj) are identical (e.g. the graph in Figure 10.8 levels off at S2)
Defining Mutex Links for Actions and Literals
mutex relation holds between two actions if any of the following three conditions holds:
- inconsistent effects
- one action negates an effect of the other
- for example, Eat(Cake) and the persistence of Have(Cake) have inconsistent effects because they disagree on the effect Have(Cake)
- interference
- one of the effects of one action is the negation of a precondition of the other
- for example Eat(Cake) interferes with the persistence of Have(Cake) by negating its precondition
- competing needs
- one of the preconditions of one action is mutually exclusive with a precondition of the other
- for example, Bake(Cake) and Eat(Cake) are mutex because they compete on the value of the Have(Cake) precondition
mutex relation holds between two literals at the same level if:
- one is the negation of the other
- inconsistent support
- each possible pair of actions that could achieve the two literals is mutually exclusive
- for example, Have(Cake) and Eaten(Cake) are mutex in S1because the only way of achieving Have(Cake), the persistence action, is mutex with the only way of achieving Eaten (Cake), namely Eat (Cake). In S2the two literals are not mutex, because there are new ways of achieving them, such as Bake(Cake) and the persistence of Eaten(Cake), that are not mutex
Planning Graphs - Heuristic Estimation
if any goal literal fails to appear in the final level of the graph, then the problem is unsolvable
level cost of goal literal gi- level at which gifirst appears in the planning graph constructed from initial state s (e.g. In Figure 10.8, Have(Cake) has level cost 0 and Eaten(Cake) has level cost 1)
Serial Planning Graph
- a modification of the original planning graph
- serial planning graph produces better level costs estimates than original planning graph (because ordinary planning graphs allow several actions at each level and the heuristic counts just the level and not the number of actions)
- therefore in a serial planning graph we enforce at most one action can occur at any given time step; this is done by adding mutex links between every pair of non-persistence actions
3 Heuristic Estimates of the Cost of a Conjunction of Goals:
- max-level heuristic
- maximum level cost of any of the goals
- admissible
- not necessarily accurate
- sum-level heuristic
- sum of the level costs of the goals
- can be inadmissible
- works well in practice for problems that are largely decomposable
- set-level heuristic
- level at which all the literals in the conjunctive goal appear in the planning graph without any pair of them being mutually exclusive
- admissible
- dominates the max-level heuristic
- works extremely well on tasks in which there is a good deal of interaction among sub-plans
- e.g. this heuristic gives the correct values of 2 for our original problem in Figure 10.8 and infinity for the problem without action Bake(Cake)
The GRAPHPLAN algorithm
The GRAPHPLAN algorithm (Figure 10.9) repeatedly adds a level to a planning graph with EXPAND-GRAPH. Once all the goals show up as non-mutex in the graph, GRAPHPLAN calls EXTRACT-SOLUTION to search for a plan that solves the problem. If that fails, it expands another level and tries again, terminating with failure when there is no reason to go on
---cognitive-computing---machine-intelligence/ai---textbooks/ai---a-modern-approach---summaries/ai-chapter-10---classical-planning/screen-shot-2019-04-14-at-3.01.39-pm.png)
if all the literals from the goal are present in St, and none of them is mutex with any other. That means that a solution might exist
We can formulate EXTRACT-SOLUTION as a:
- Boolean Constraint Satisfaction Problem (CSP), where:
- variables are the actions at each level
- values for each variable are in or out of the plan
- constraints are the mutexes and the need to satisfy each goal and precondition
- Backward Search Problem, where each state in the search contains a pointer to a level in the planning graph and a set of unsatisfied goals, where:
In the case where EXTRACT-SOLUTION fails to find a solution for a set of goals at a given level, we record the (level,goals) pair as a no-good. Whenever EXTRACT-SOLUTION is called again with the same level and goals, we can find the recorded no-good and immediately return failure rather than searching again
As soon as the graph itself and the no-goods have both leveled off, with no solution found, we can terminate with failure because there is no possibility of a subsequent change that could add a solution
Other Classical Planning Approaches
Currently the most popular and effective approaches to fully automated planning are:
- Translating to a Boolean satisfiability (SAT) problem
- Forward state-space search with carefully crafted heuristics (Section Algorithms for Planning as State-Space Search)
- Search using a planning graph (Section Planning Graphs)
3 other influential approaches:
- planning as first-order logical deduction
- Boolean Satisfiability
- Situational Calculus
- planning as constraint satisfaction
- planning as plan refinement of partially ordered plans