Planning Methods: State-Space Search & Goal Stack

The following points highlight the two main planning methods used to solve AI problems. The methods are: 1. Planning with State-Space Search 2. Goal Stack Planning.

Method # 1. Planning with State-Space Search:

The most straight forward approach is to use state-space search. Because the descriptions of actions in a planning problem specify both preconditions and effects, it is possible to search in either direction: forward from the initial state or backward from the goal, as shown in Fig. 8.5. We can also use the explicit action and goal representations to derive effective heuristics automatically.

i. Forward State-Space Search:

ADVERTISEMENTS:

Planning with forward state-space search is similar to the problem-solving approach. It is sometimes called progression planning, because it moves in the forward direction.

We start with the problem’s initial state, considering sequences of actions until we reach a goal state.

The formulation of planning problem as state-space search problems is as follows:

i. The initial state of the search is the initial state from the planning problem. In general each state will be set of positive ground literals; literals not appearing are false.

ADVERTISEMENTS:

ii. The actions which are applicable to a state are all those whose preconditions are satisfied. The successor state resulting from an action is generated by adding the positive effect literals and deleting the negative effect literals.

iii. The goal test checks whether the state satisfies the goal of the planning problem.

iv. The step cost of each action is typically 1. Although it would be easy to allow different costs for different actions, this was seldom done by STRIPS planners.

Since function symbols are not present, the state space of a planning problem is finite and therefore, any graph search algorithm such as A * will be a complete planning algorithm.

ADVERTISEMENTS:

From the early days of planning research it is known that forward state-space search is too inefficient to be practical. Mainly, this is because of a big branching factor since forward search does not address only relevant actions, (all applicable actions are considered). Consider for example, an air cargo problem with 10 airports, where each airport has 5 planes and 20 pieces of cargo.

The goal is to move all the cargo at airport A to airport B. There is a simple solution to the problem: load the 20 pieces of cargo into one of the planes at A, fly the plane to B, and unload the cargo. But finding the solution can be difficult because the average branching factor is huge: each of the 50 planes can fly to 9 other airports, and each of the 200 packages can be either unloaded (if it is loaded), or loaded into any plane at its airport (if it is unloaded).

On average, let’s say there are about 1000 possible actions, so the search tree up to the depth of the obvious solution has about 1000 nodes. It is thus clear that a very accurate heuristic will be needed to make this kind of search efficient.

ii. Backward State-Space Search:

ADVERTISEMENTS:

Backward search can be difficult to implement when the goal states are described by a set of constraints which are not listed explicitly. In particular, it is not always obvious how to generate a description of the possible predecessors of the set of goal states. The STRIPS representation makes this quite easy because sets of states can be described by the literals which must be true in those states.

The main advantage of backward search is that it allows us to consider only relevant actions. An action is relevant to a conjunctive goal if it achieves one of the conjuncts of the goal. For example, the goal in our 10-airport air cargo problem is to have 20 pieces of cargo at airport B, or more precisely.

At (C₁B) ∧ At (C₂ B)……………….. At (C₂₀, B)

Now consider the conjunct At (C₁, B). Working backwards, we can seek those actions which have this as an effect,

ADVERTISEMENTS:

There is only one:

Unload (C₁p, B),

where plane p is unspecified.

We may note that there are many irrelevant actions which can also lead to a goal state. For example, we can fly an empty plane from Mumbai to Chennai; this action reaches a goal state from a predecessor state in which the plane is at Mumbai and all the goal conjuncts are satisfied. A backward search which allows irrelevant actions will still be complete, but it will be much less efficient. If a solution exists, it should be found by a backward search which allows only relevant action.

This restriction to relevant actions only means that backward search often has a much lower branching factor than forward search. For example, our air cargo problem has about 1000 actions leading forward from the initial state, but only 20 actions working backward from the goal. Hence backward search is more efficient than forward searching.

Searching backwards is also called regression planning. The principal question in regression planning is: what are the states from which applying a given action leads to the goal? Computing the description of these states is called regressing the goal through the action. To see how does it work, once again consider the air cargo example. We have the goal

At (C₁ B) ˄ At (C₂ B) ˄…. ˄ At (C₂₀, B)

The relevant action UNLOAD (C₁ ‘ p, B) achieves the first conjunct. The action will work only if its preconditions are satisfied. Therefore, any predecessor state must include these preconditions: In (C₁, p) ˄ At (p, B) as sub-goals. Moreover, the sub-goal At (C_1, B) should not be true in the predecessor state which will no doubt be a goal but not relevant one (justify).

Thus, the predecessor description is:

In(C₁, p) ˄ At(p, B) ˄ At(C₂, B) ˄……………………˄ At (C₂₀,B)

In addition to insisting that actions achieve some desired literal, we must insist that the actions do not undo any desired literals. An action which satisfies this restriction is called consistent. For example, the action load (C₂, p) would not be consistent with the current goal, because it would negate the literal At (C₂, B) (verify).

Given definitions of relevance and consistency, we can now describe the general process of constructing predecessors for backward search. Given a goal description G, let A be an action which is relevant and consistent.

The corresponding predecessor is constructed as follows:

I. Any positive effects of A which appear in G are deleted.

II. Each precondition literal of A is added, unless it already appears.

Any of the standard search algorithms can be used to carry out the search. Termination occurs when a predecessor description is generated which is satisfied by the initial state of the planning problem. In the first-order logic, satisfaction might require a substitution for variables in the predecessor description. For example, the predecessor description in the preceding paragraph is satisfied by the initial state.

In (C_1, P₁₂) ˄ At (P₁₂, B) ˄ At (C₂, B) ˄ ……………………….˄ At (C₂₀, B)

with substitution (P/P₁₂). The substitution must be applied to the action leading from the state to the goal, producing the solution

[Unload (C_1,P₁₂, B)]

iii. Heuristics for State-Space Search:

It turns out that neither forward nor backward search is efficient without a good heuristic function. Let us recall that in a search technique a heuristic function estimates the distance from a state to the goal. In STRIPS planning, the cost of each action is 1, so the distance is the number of actions.

The basic idea is to look at the effects of the actions and to guess how many actions are needed to achieve all the goals. Finding the exact number is NP hard, but it is possible to find reasonable estimates, most of the time without too much computation. We might also be able to derive an admissible heuristic- (one which does not overestimate). This could be achieved with A * search.

There are two approaches which can be tried. The first is to derive relaxed problem from the given problem specification. The optimal solution cost for the relaxed problem — which is very easy to solve-gives an admissible heuristic for the admissible original problem. A problem with fewer restrictions on the actions is called a relaxed problem.

The cost of an optimal solution to a relax, problem is an admissible heuristic for the original problem. The heuristic is admissible because the optimal solution in the original problem, is by definition, also a solution in the relaxed problem and therefore must be at least as expensive as the optimal solution in the relaxed problem. The second approach is to pretend that a pure divide-and- conquer algorithm will work.

This is called the sub-goal independence assumption: the cost of solving a conjunction of sub-goals in approximated by the sum of the costs of solving each sub-goal independently. The sub-goal independence assumption can be optimistic or pessimistic.

It is optimistic when there are negative interactions between the sub-plans for each sub-goal-for example, when an action in one sub-plan deletes a goal achieved by another sub-plan. It is pessimistic, and therefore inadmissible, when sub-plans contain redundant actions-for instance, two actions that could be replaced by a single action in the merged plan.

So let us consider how to derive relaxed planning problems. Since explicit representations of preconditions and effects are available (compared with ordinary search space where the successor states are not known) the process will work by modifying those representations.

The simplest idea is to relax the problem by removing all preconditions from the actions. Then every action will always be applicable, and any literal can be achieved in one step, if there is an applicable action (goal is impossible if action is not applicable).

It implies that the number of steps required to solve a conjunction of goals is the number of unsatisfied goals – almost but not quite, because:

(1) There may be two actions each which of deletes the goal literal achieved by the other, and

(2) Some actions may achieve multiple goals.

If we combine our relaxed problem with the sub-goal independence assumption, both of these issues are assumed and the resulting heuristic is exactly the number of unsatisfied goals.

In many cases, a more accurate heuristic is obtained by considering at least the positive interactions arising from actions which achieve multiple goals. First, we relax the problem further by removing negative effects. Then, we count the minimum number of actions required such that the union of those actions’ positive effects satisfies the goal.

For example, consider

Goal (A ˄ B ˄ C)

Action (X Effect: A A P)

Action (Y, Effect: B A C A Q)

Action (Z, Effect: B A P A Q)

The minimal set cover of goal {A, B, C} is given by actions {X, Y}, so the set cover heuristic returns a cost of 2. This improves on the sub-goal independence assumption, which gives a heuristic value of 3. There is one minor irritation: the set cover problem is NP hard. A simple greedy-set cover algorithm is guaranteed to return a value which is within a factor of log n of the true minimum value, where n is the number of literals in the goal, and usually works much better than this. Unfortunately, the greedy algorithm loses the guarantee of admissibility for the heuristic.

It is also possible to generate relaxed problem by removing negative effects without removing preconditions. That is, if an action has the effect A ˄ ¬ B in the original problem, it will have the effect A in the relaxed problem. This means that we need not worry about negative interactions between sub-plans, because no action can delete the literals achieved by another action.

The solution cost of the resulting relaxed problem gives what is called the empty-delete-list heuristic. This heuristic is quite accurate, but computing it involves actually running a (simple) planning algorithm. In practice, the search in the relaxed problem is often fast enough that the cost is worthwhile.

The heuristics described here can be used in either the progression or the regression direction. At present, progression planners using the empty-delete-list heuristic hold the lead. But it is likely to change as new heuristics and new search techniques are being explored. Since planning is exponentially hard, no algorithm will be efficient for all problems, but mostly practical problems can be solved with the heuristic method.

Method # 2. Goal Stack Planning:

This was perhaps the first method used to solve the problems in which goal interacted and was the approach used by STRIPS. The planner used a single stack which contains both goals and operators which are proposed to satisfy those goals.

It also depends on a data base which describes the current situation and a set of operators described as PRECONDITION, ADD, and DELETE lists. Let us illustrate the working of this method with the help of an example of blocks world, shown in Fig. 8.6. At the start of solution, the goal stack is simply.

ON(C, A)

ON(B, D)

ONTABLE(A)

ONTABLE(D)

ONTABLE(C)

But we want to separate this problem into four sub-problems, one for each component of the original goal. Two of the sub-problems ON TABLE (A) and ON TABLE (D) are already true in the initial state. So we need to work on the remaining two. There are two goal stacks depending on the order of tackling the sub problems.

where OTDA is the abbreviation for ONTABLE (A) ˄ ONTABLE (D). Single line below the operator represents the goal.

Let us recapitulate the process of finding the goal in STRIPS. In each succeeding step of planner, the top goal on the stack is pursued. When a sequence of operators which satisfied the goal is found that sequence is applied to the state description, yielding a new description. Next the goal which is then at the top of the stack is explored and an attempt is made to satisfy it, starting from the situation which was produced as a result of satisfying the first goal.

This process continues until the goal stack is empty. Towards the last check, the original goal is compared to the final state derived from the application of the chosen operators. If any components of the goal are not satisfied, may be they were satisfied at one time but went to unsatisfying during the course of a subsequent step, then those unresolved parts of the goal are reinserted onto the stack and the process resumed.

Let us continue with the example and explore the first alternative. Since ON(C, A) does not hold in the current (initial) stale we check for the operators which could cause it is be true — only one operator STACK(C, A) out of the four defined in block-world example (Fig. 8.3), could cause this.

Hence ON(C, A) is replaced by STACK(C, A); yielding:

STACK(C, A)

ON (B, D)

ON (C, A) ON (C, A) ˄ ON (B, D) ˄ OTAD

But in order to apply STACK, (C, A) its preconditions must hold, which now become sub-goals. The new compound sub-goal CLEAR (A) ˄ HOLDING(C) must be broken into components and decide on the order. Help is sought from some heuristic.

HOLDING(x) is very easy to achieve: put down some thing else and then pickup the desired object. In order to do anything else, the robot will need to use the arm. So if we achieve HOLDING first and then try to do something else the robot will have to use arm. So if we achieve HOLDING first and then try to do something else, will imply that HOLDING is no longer true towards the end.

So the heuristic used is:

If HOLDING is one of several goals to be achieved at once, it should be tackled last, the other sub goal, CLEAR (A) should be tackled first.

So the new goal stack becomes:

CLEAR (A)

HOLDING(C)

CLEAR (A) ∧ HOLDING(C)

STACK(C, A)

ON (B, D) ˄ ON (C, A) ˄ ON (B, D) ˄ OTAD.

This kind of heuristic information could be contained in the precondition list itself by stating the predicates in the order in which they should be achieved.

Next, is CLEAR (A) true or not, is checked. It is not. The only operator UNSTACK (B, A) makes it true.

Its preconditions form, the sub-goals, so the new goal stack becomes:

ON (B, A)

CLEAR (B)

ARMEMPTY

ON (B, A) ˄ CLEAR (B) ˄ ARMEMPTY

UNSTACK (B, A)

HOLDING(C)

CLEAR (A) ˄ HOLDING(C)

STACK (C, A)

ON (B, D)

ON (C, A) ˄ ON (B, D) ˄ OTAD.

Now on comparing the top element of the goal stack ON (B, A) to the block world problem initial state it is found is be satisfied. So it is popped off and the next goal CLEAR (B) considered. It is also already true (How, it is left as an exercise). So this goal can also be popped from the stack. The third pre-condition for UNSTACK (B, A) – ARMEMPTY also holds good; hence can be popped off the stack.

The next element of the stack is the combined goal representing all of the preconditions for the UNSTACK (B, A). It is also satisfied, so it can also be popped of the stack. Now the top element of the stack is the operator UNSTACK (B, A).

Since its preconditions are satisfied so it can be applied to produce a new world model from which the rest of problem solving process can continue. This is done by using ADD and DELETE lists specified for UNSTACK. We also note that UNSTACK (B, A) is the first operator of the proposed solution sequence.

Now the data base corresponding to blocks world model is:

ONTABLE (A) ˄ ONTABLE(C) ˄ ONTABLE (D) ˄ HOLDING (B) ˄ CLEAR (A).

The goal stack now is:

HOLDING(C)

CLEAR (A) ˄ HOLDING(C)

STACK(C, A)

ON (B, D)

ON(C, A) ˄ ON (B, D) ˄ OTAD

The next goal to be satisfied is HOLDING(C) and this is made true by two operators PICK UP(C) and UNSTACK(c, x), where x could be any block from which the block c could be un-stacked.

Using those two operators the two branches of the search tree are:

(1) (2) In stack (2) a variable x, appears at three places. Though, any block could be substituted for x but it is important that the same block be matched to x at all these places. So it is important that each time a variable is introduced into the goal stack, it be given a name distinct from any other variables already in the stack. For this purpose, whenever a candidate object is chosen to match a variable the bindings must be recorded so that any future bindings of the same variable be done to the same object.

Now the question is which of the two alternatives be selected?

Alternative 1 is better than the alternative 2, because block C is not on anything so pickup(c) is better than un-stacking it; because in order to un-stack it has first to be stacked with some block. Of course this could be done but this would be a fruitless effort. But how could a program know that? By chance if alternative 2 is tried then in order to satisfy ON(c, x) (pre condition of operator UNSTACK(X, Y) we would have to STACK C onto some block x.

The goal stack would then become:

CLEAR (x)

HOLDING(c)

CLEAR(x) ˄ HOLDING(c)

STACK(c, x)

CLEAR(c)

ARMEMPTY

ON(c, x) ˄ CLEAR(c) ˄ ARMEMPTY

UNSTACK(c, x)

CLEAR (A) ˄ HOLDING(c)

STACK(C, A)

ON (B, D)

ON(C, A) ˄ ON(B, D) ˄ OTAD.

One of the pre-conditions of STACK is HOLDING(c). This is what we were trying to achieve by applying UNSTACK, which required us to apply STACK so that the condition ON(c, x) would be satisfied, that is we are back to our original goal. In addition, there are other goals (such as ARMEMPTY). So with type of planning this alternative is unproductive, so need be terminated. If however block C had been on another block in the current state, ON(c, x) would have been satisfied immediately with no need to do a STACK and this move would have led to a good solution.

So, let us now consider the alternative 1. This uses PICKUP to get the arm holding C. The top element on the goal stack is ONTABLE(C), which is already satisfied so it is popped off. The next element is CLEAR(C), which is also satisfied, so also popped off. The third precondition of PICKUP(C), ARMEMPTY is not satisfied because the arm is picking up B, and HOLDING (B) holds good.

This condition becomes true through the two operators STACK (B, x) and PUTDOWN (B) that is B can be either put on table or on another block. Which alternative should be selected? Since the goal state contains ON (B, D), B can be put on D, so we choose to apply STACK (B, D); by binding D to x.

Now the goal stack becomes:

CLEAR (D)

HOLDING (B)

CLEAR (D) ˄ HOLDING (B)

STACK (B, D)

ONTABLE(C) ˄ CLEAR(C) ˄ ARMEMPTY PICKUP(C)

CLEAR (A) ˄ HOLDING(C)

STACK (C, A)

ON (B, D)

ON(C, A) ˄ ON (B, D) ˄ OTAD.

CLEAR (D) and HOLDING (B) are both true; the operation STACK (B, D) can be performed, producing the model.

ONTABLE (A) ˄ ONTABLE(C) ˄ ONTABLE (D) ˄ ON (B, D) ˄ ARMEMPTY.

All of the pre-conditions for PICKUP(C) are now satisfied, so it too can be executed. Since all the pre-conditions for STACK(C, A) are true, so it also can be executed.

Now, consider the second part of the original goal, ON (B, D). But this has already been satisfied by the operations which were used to satisfy the first sub-goal. The reader should ascertain for himself; that ON (B, D) can be popped off the goal stack.

We then check the combined goal, the last step towards finding the solution:

ON(C, A) ˄ ON(B, D) ˄ ONTABLE(A) ˄ ONTABLE(D)

to make sure that all the four parts hold good, so the problem is solved.

The answer by the planner can be (the order of the operators will be):

i. UNSTACK (B, A)

ii. STACK (B, D)

iii. PICKUP(C)

iv. STACK(C, A)

Planning Methods: State-Space Search & Goal Stack | Artificial Intelligence

Method # 1. Planning with State-Space Search:

Method # 2. Goal Stack Planning: