In this article we will discuss: 1. Rules or Production Rules for Procedural Knowledge Representation 2. Inference Making of Procedural Knowledge Representation 3. Non-Deductive Inference Method 4. Illustration of Forward/Backward Chaining 5. Advantages of Production Systems 6. Problems with Production Rules 7. Applicability of Production Rules.

A production system consists of three components: the data base, the rules and the interpreter.

The data base or the working memory represents all the knowledge of the system at any given moment. It can be thought of as a simple data base of facts which are true of the domain at that time. The contents of data base change as facts are added or deleted according to the application of the rules.

Rules or Production Rules for Procedural Knowledge Representation:

These are operators, as will be seen shortly, which are applied to the knowledge in the data base and change the state of the production system, usually by changing the contents of the data base. Production rules are also sometimes called condition-action rules: if the condition of a rule is true according to the data base at that moment, the action associated with the rule is performed.

Production rules are usually unordered, in the sense that the sequence in which rules are applied depends on the current state of the data base. A rule whose condition matches the state of the data base will be selected. If more than one rule matches, a conflict-resolution strategy decides which rule fires.

Rules provide a formal way of representing recommendations, directives, or strategies; they are often appropriate when domain knowledge results from empirical associations developed through years of experience of solving problems in an area.

Rules are expressed as IF-THEN statements, as shown below:

1. If the pH of the spill is less than 6,

Then the spill material is an acid.

2. If the spill material is an acid

And the spill smells like vinegar,

Then the spill material is acetic acid.

3. If a flammable liquid was spilled,

Then call the fire department.
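As a sketch (not from any particular expert-system shell), the three rules above can be encoded as condition-action pairs, with facts and conclusions written as plain proposition strings. A rule matches when every proposition in its IF part is present in the fact base:

```python
# Hypothetical encoding of the three spill rules as (IF-conditions, THEN-fact).
SPILL_RULES = [
    ({"pH of spill < 6"}, "spill material is an acid"),
    ({"spill material is an acid", "spill smells like vinegar"},
     "spill material is acetic acid"),
    ({"a flammable liquid was spilled"}, "call the fire department"),
]

def applicable(rules, facts):
    """Return the THEN parts of every rule whose IF part matches the facts."""
    return [then for cond, then in rules if cond <= facts]
```

For example, `applicable(SPILL_RULES, {"pH of spill < 6"})` returns only the conclusion of rule 1, since the IF parts of rules 2 and 3 are not satisfied.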

These are rules which might exist in a crisis management problem (as happened recently in the Mumbai floods) for containing oil and chemical spills. Rules are sometimes written with arrows (→) to indicate the IF and THEN portions of the rules.

For example, rule 1 in this notation would look like:

pH of the spill is less than 6 → the spill material is an acid

When the IF portion of a rule is satisfied by the facts in the data base, the action specified by the THEN portion is performed. When this happens the rule is said to fire or execute. A rule interpreter (Fig. 6.7) compares the IF portions of rules with facts in the data base and executes the rule whose IF portion matches the facts.

 

The rule's action part may modify the set of facts in the knowledge base, adding the THEN portion of the fired rule to the data base as a new fact, as shown in Fig. 6.8.

The new facts added to the data (knowledge) base can themselves be used for forming matches with the IF portion of rules, as illustrated in Fig. 6.9.

The action taken when the rule fires may directly affect the real world, as shown in Fig. 6.10.

This matching of rule IF portions to the facts in the data base can produce what are called inference chains. The inference chain formed from the successive execution of rules 1 and 2 is shown in Fig. 6.11. The inference chain indicates how the system used the rules to infer the identity of the spill material. An expert system's inference chains can be displayed to the user to help explain how the system reached its conclusions.
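A minimal rule interpreter along these lines might repeatedly match IF portions against the data base, fire the first matching rule, and record the inference chain as it goes. The following is a sketch under the same string-proposition encoding as before; the rule format is illustrative, not that of any particular shell:

```python
def run_interpreter(rules, facts):
    """Fire rules until quiescence; return the final facts and the
    inference chain as a list of (rule_number, inferred_fact) pairs."""
    facts = set(facts)
    chain = []
    while True:
        for i, (cond, concl) in enumerate(rules, start=1):
            if cond <= facts and concl not in facts:
                facts.add(concl)        # add the THEN part as a new fact
                chain.append((i, concl))
                break                   # rescan the rules from the top
        else:
            return facts, chain         # no rule fired this cycle

# Rules 1 and 2 of the spill example:
rules = [
    ({"pH of spill < 6"}, "spill material is an acid"),
    ({"spill material is an acid", "spill smells like vinegar"},
     "spill material is acetic acid"),
]
facts, chain = run_interpreter(
    rules, {"pH of spill < 6", "spill smells like vinegar"})
# chain records rule 1 firing, then rule 2, mirroring Fig. 6.11
```

Displaying `chain` to the user is exactly the kind of explanation trace the text describes.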

This type of inference process is found in the dialogues of the expert system PROSPECTOR.

Production rules can also be written with object-attribute-value (OAV) triplets; e.g., person-age-value may be one such triplet. A production rule can then be represented as:

If (person age above-18) and (person wife nil) and (person sex male)

THEN (person eligible for marriage)

Here person is a variable; substituting constants, the production rule in OAV form becomes:

If (Ram age 19) and (Ram wife nil) and (Ram sex male)

THEN (Ram eligible for marriage)
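A sketch of how such an OAV rule might be matched, with the person variable bound to a concrete object (the names and the Python encoding here are illustrative assumptions, with nil rendered as `None`):

```python
def eligible_for_marriage(triplets, person):
    """Test the IF part (age above 18, wife nil, sex male) against the
    OAV triplets recorded for one object."""
    attrs = {attr: val for (obj, attr, val) in triplets if obj == person}
    return (attrs.get("age", 0) > 18
            and attrs.get("wife", "unknown") is None   # nil -> None
            and attrs.get("sex") == "male")

facts = [("Ram", "age", 19), ("Ram", "wife", None), ("Ram", "sex", "male")]
# eligible_for_marriage(facts, "Ram") -> True
```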

Inference Making of Procedural Knowledge Representation:

There are two important ways in which rules can be used in a rule based system to draw inference; one is called forward chaining and the other backward chaining. This spill material example just presented above used forward chaining. Fig. 6.12, shows in more detail how forward chaining works for a simple set of rules.

The rules in this example use letters as facts (propositions/premises) in the data base to explain the situations or concepts.

For example:

Let us study how these rules work. We shall assume that each time the set of rules is tested against the data base, only the first (top-most) rule which matches is executed. That is why in Fig. 6.12 the rule (A → D) is executed only once even though it matches the data base every time.

The first rule which fires is (A → D) because A is already in the data base. As a consequence of that rule, the existence of D is inferred and D is placed in the data base. That causes the second rule, C and D → F, to fire, and as a consequence F is inferred and placed in the data base. This in turn causes the third rule, F and B → Z, to fire, placing Z in the data base.
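The firing sequence just described can be sketched as a loop that always fires the first (top-most) matching rule and then rescans:

```python
# The three letter rules of Fig. 6.12: A -> D, C and D -> F, F and B -> Z.
LETTER_RULES = [({"A"}, "D"), ({"C", "D"}, "F"), ({"F", "B"}, "Z")]

def forward_chain(rules, facts):
    """Fire the first matching rule, add its conclusion, and rescan,
    until no rule can add anything new."""
    facts = set(facts)
    fired = []
    while True:
        for cond, concl in rules:
            if cond <= facts and concl not in facts:
                facts.add(concl)
                fired.append(concl)
                break               # rescan from the top
        else:
            return facts, fired     # quiescence: nothing new to infer

# Starting from A, B and C, the rules fire in the order D, F, Z:
# forward_chain(LETTER_RULES, {"A", "B", "C"})[1] == ["D", "F", "Z"]
```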

This technique is called forward chaining because the search for new information seems to be proceeding in the direction of the arrow separating the left and right- hand sides of the rules. The system which uses information on the left-hand side to derive information on the right is shown in Fig. 6.13. The inference chain produced by the example in Fig. 6.12, is shown in Fig. 6.13.

Suppose we had used Fig. 6.13 (the system) with the express goal of determining whether or not situation Z existed. We might think that it worked quite well, quickly finding that Z did exist. Unfortunately, this is an oversimplification. In reality the process of drawing inferences is not that simple; a real expert system would not have just three rules, it would have hundreds or even thousands of them.

If we used a system that large just to find out about Z, many rules would be fired which had nothing to do with Z. A large number of inference chains and situations could be derived in the process which were valid but unrelated to Z. So if our goal is to infer one particular fact Z, forward chaining could waste both time and money.

In such a situation backward chaining might be more cost-effective. With this inference technique the system starts with what it wants to prove, e.g., that situation Z exists, and executes only those rules which are relevant to establishing it. Fig. 6.14 shows how backward chaining would work using the rules from the above crisis management example.

 

The inference chain or inference net, formed by the connections between evidence and hypotheses (shown here only by connecting rules), is used in generating possible inferences from the rules. The evidence in an expert system (say PROSPECTOR) can be any logical combination of pieces of evidence, e.g.,

E1 and E2 and E3

E1 or E2

E1 and (E2 or E3).

On the other hand, a hypothesis (H) is always a single concept. A hypothesis such as H2 can be used in the IF portion of a rule to suggest or imply other hypotheses, as shown below:

H2 → H1 (LS, LN)

H2 → H1 (LS, LN) indicates that hypothesis H2 suggests hypothesis H1, weighted by the factors LS and LN: LS indicates how encouraging it is to our belief in the hypothesis to find the evidence present, while LN indicates how discouraging it is to find the evidence absent.
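In PROSPECTOR-style systems these factors act as multipliers on the odds of the hypothesis; the sketch below shows that usual odds-update form (the numeric values in the comments are illustrative only, not taken from PROSPECTOR):

```python
def update_odds(prior_odds, ls, ln, evidence_present):
    """O(H|E) = LS * O(H) if the evidence is present;
    O(H|not E) = LN * O(H) if it is absent."""
    return (ls if evidence_present else ln) * prior_odds

def odds_to_probability(odds):
    """Convert odds back to a probability: p = O / (1 + O)."""
    return odds / (1.0 + odds)

# With prior odds 1.0 (probability 0.5), LS = 5 and LN = 0.2:
# finding the evidence present raises the odds to 5.0,
# finding it absent lowers them to 0.2.
```

A large LS thus encourages belief in the hypothesis when the evidence is found, and a small LN (below 1) discourages it when the evidence is absent, matching the informal description above.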

In step 1 the system is told to establish (if it can) that situation Z exists. It first checks the data base for Z and, when that fails, searches for rules which conclude Z, that is, have Z on the right side of the arrow. It finds the rule F and B → Z, and decides that it must establish F and B in order to conclude Z.

In step 2 the system tries to establish F, first checking the data base and then finding a rule which concludes F. From the rule C and D → F, the system decides that C and D must be established to conclude F.

In steps 3 through 5 the system finds C in the data base but decides it must establish A before it can conclude D. It then finds A in the data base.

In steps 6 through 8 the system executes the third rule to establish D, then executes the second rule to establish F, and finally executes the first rule to establish the original goal, Z. The inference chain created here is identical to the one created by forward chaining (verifying this is left as an exercise to the students); the difference between the two approaches lies in the order in which data and rules are searched.
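Steps 1 through 8 can be sketched as a recursive procedure: to prove a goal, find it in the data base, or else find a rule concluding it and prove each of that rule's conditions first. (This sketch assumes an acyclic rule set and does no loop checking.)

```python
# The same letter rules: A -> D, C and D -> F, F and B -> Z.
LETTER_RULES = [({"A"}, "D"), ({"C", "D"}, "F"), ({"F", "B"}, "Z")]

def backward_chain(goal, rules, facts, chain):
    """Return True if goal can be established from facts via rules;
    append each rule conclusion to chain in execution order."""
    if goal in facts:               # ground fact: nothing to prove
        return True
    for cond, concl in rules:
        if concl == goal and all(
                backward_chain(c, rules, facts, chain) for c in cond):
            chain.append(concl)     # record the rule firing
            return True
    return False

chain = []
backward_chain("Z", LETTER_RULES, {"A", "B", "C"}, chain)
# chain == ["D", "F", "Z"]: the same inference chain as forward chaining,
# but built goal-first rather than data-first
```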

The inference chain formed through backward chaining is shown in Fig. 6.15.

Non-Deductive Inference Method for Procedural Knowledge Representation:

We consider three non-deductive forms of inferencing. These are not valid forms of inferencing, but they are nevertheless very important. We use all three methods often in everyday activities where we draw conclusions and make decisions. The three methods we consider here are abduction, induction, and analogical inference.

1. Abductive Inference:

Abductive inference is based on the use of known causal knowledge to explain or justify a (possibly invalid) conclusion. Given the truth of proposition Q and the implication P → Q, conclude P. For example, people who have had too much to drink tend to stagger when they walk. Therefore, it is not unreasonable to conclude that a person who is staggering is drunk, even though this may be an incorrect conclusion. People may stagger when they walk for other reasons, including dizziness from twirling in circles or from some physical problem.

We may represent abductive inference with the following description, where the c written over the implication arrow is meant to indicate a possible causal relationship.

Abductive inference is useful when known causal relations are likely and deductive inferencing is not possible for lack of facts.

2. Inductive Inference:

Inductive inferencing is based on the assumption that a recurring pattern, observed for some event or entity, implies that the pattern is true for all entities in the class. Given instances P(a1), P(a2), …, P(ak), conclude ∀x P(x). More generally, given P(a1) → Q(b1), P(a2) → Q(b2), …, P(ak) → Q(bk), conclude ∀x, y P(x) → Q(y).

We often make this form of generalization after observing only a few instances of a situation. It is known as the inductive leap. For example, after seeing a few white swans, we incorrectly infer that all swans are white (a type of Australian swan is black), or we conclude that all Irishmen are stubborn after discussions with only a few.

We can represent inductive inference using the following description:

Inductive inference, of course, is not a valid form of inference, since it is not usually the case that all objects of a class can be verified as having a particular property. Even so, this is an important and commonly used form of inference.

3. Analogical Inference:

Analogical inference is a form of experiential inference. Situations or entities which are alike in some respects tend to be similar in other respects. Thus, when we find that a situation (object) A is related in certain ways to B, and A′ is similar to A in some context, we conclude that there is some B′ which bears a similar relation to A′ in this context. For example, to solve a problem with three equations in three unknowns, we try to extend the methods we know for solving two equations in two unknowns.

Analogical inference appears to be based on the use of a combination of three other methods of inference: abductive, deductive and inductive. We depict this form of inference with the following description, where the r above the implication symbol means "is related to".

Analogical inference, like abductive and inductive inference, is a useful but invalid form of commonsense inference.

Illustration of Forward/Backward Chaining in Procedural Knowledge Representation:

The simplest form of a rule-based production system consists of three parts, a knowledge base (KB) consisting of a set of rules (as few as 50 or as many as several thousand rules may be required in an expert system), a working memory, and a rule interpreter or inference engine.

The interpreter inspects the LHS of each rule in the KB until one is found which matches the contents of working memory. This causes the rule to be activated or to “fire” in which case the contents of working memory are replaced by the RHS of the rule. The process continues by scanning the next rules in sequence or restarting at the beginning of the knowledge base.

Backward chaining and forward chaining are strategies used to specify how rules contained in the knowledge base are to be executed.

The two mechanisms can easily be understood by the following example:

Consider the following rules:

1. If Weather is Sunny

AND distance <= 20 miles

THEN transportation is bicycle.

2. IF transportation is bicycle

THEN no passenger insurance is considered.

3. IF no passenger insurance is considered.

THEN transportation insurance cost = 0.

Backward Chaining:

Suppose that we want to establish the goal "transportation insurance cost = 0", assuming that we only know:

(i) The weather is sunny, and

(ii) Distance is ≤ 20 miles.

The backward chaining method works backward from the conclusion:

Therefore, it is true that “transportation is Bicycle”. Therefore, it is true that: “No passenger insurance is considered”. Therefore, it is true that, the “Transportation insurance cost = 0”.

We have tried to establish all the facts needed to reach the goal. This reasoning method is called backward chaining. In general, backward chaining is applied when a goal or a hypothesis is chosen as the starting point for problem solving. Backward chaining is also known as goal-directed, top-down or consequent-driven reasoning.

We now give examples of:

Forward Chaining:

(The same data as above is assumed.) The forward chaining mechanism goes forward from antecedents to the conclusions they generate. The goal is the same as above: suppose we want to prove that "transportation insurance cost = 0", assuming we know that the weather is sunny and the distance is 15 miles.

Thus, from "weather is sunny" and "distance <= 20 miles" we have established that "transportation is bicycle"; from that we have established that "no passenger insurance is considered"; and finally from that we have established that "transportation insurance cost = 0", which is our goal.
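That data-driven run can be sketched with the same kind of loop used earlier, the three transportation rules encoded as proposition strings:

```python
TRANSPORT_RULES = [
    ({"weather is sunny", "distance <= 20 miles"},
     "transportation is bicycle"),
    ({"transportation is bicycle"},
     "no passenger insurance is considered"),
    ({"no passenger insurance is considered"},
     "transportation insurance cost = 0"),
]

def forward_chain(rules, facts):
    """Repeatedly add every conclusion whose conditions hold, until
    nothing new can be inferred."""
    facts = set(facts)
    while True:
        new = [concl for cond, concl in rules
               if cond <= facts and concl not in facts]
        if not new:
            return facts
        facts.update(new)

result = forward_chain(
    TRANSPORT_RULES, {"weather is sunny", "distance <= 20 miles"})
# the goal "transportation insurance cost = 0" is now in result
```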

Forward chaining is also known as data-driven, bottom-up, or antecedent-driven inferencing. It is best used to solve problems in which data is the starting point for problem solving.

A combination of forward chaining and backward chaining can be used to reach a solution quickly when dealing with large search spaces and complex problems. The combination of bottom-up and top-down search reaches results more quickly.

The 8-puzzle and the Water Jug problem, where the goal is known in advance, are examples of forward chaining or forward reasoning. We have to look for the states through which the goal can be reached. Domain-specific knowledge can be deployed by expanding states from the known starting state; the generation of states from their predecessor states may be continued till the goal is reached.

Diagnosis of a patient's disease by a doctor is an example of backward chaining: in diagnosing a fever, symptoms and data such as blood pressure and blood and urine tests are used to reason back to the conclusion. Another example of backward chaining is finding the way home from an unknown place; here the neighbouring states of the goal are better known than the neighbouring states of the initial state.

Factors which influence the question of whether it is better to reason forward or backward:

(a) Are there more possible start states or goal states? We would like to move from the smaller set of states to the larger set, as it is then easier to find a goal.

(b) In which direction is the branching factor (the average number of nodes generated per step) greater? We would like to move in the direction with the lower branching factor.

(c) Will the program be asked to justify its reasoning process to a user? If so, it is important to proceed in the direction which corresponds more closely with the way the user will think. Justification is essential in expert systems.

(d) What kind of event is going to trigger a problem solving episode? If it is the arrival of new fact, forward reasoning makes sense. If it is a query to which response is desired, backward reasoning is more natural.

If the number of nodes at each step grows exponentially with the number of steps, then we can search in both directions, forward from the start state and backward from the goal state simultaneously, until the two paths meet somewhere in between. Sometimes the two searches may pass each other, resulting in more work than it would have taken for one of them, on its own, to have finished.
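The bidirectional idea can be sketched as two breadth-first frontiers expanded alternately until they intersect. Here `neighbours` is a hypothetical successor function supplied by the problem; the line-of-states example at the end is purely illustrative:

```python
from collections import deque

def bidirectional_meet(start, goal, neighbours):
    """Expand BFS frontiers from start and goal alternately;
    return a state where the two searches meet, or None."""
    if start == goal:
        return start
    seen_f, seen_b = {start}, {goal}
    front_f, front_b = deque([start]), deque([goal])
    while front_f and front_b:
        for frontier, seen, other in ((front_f, seen_f, seen_b),
                                      (front_b, seen_b, seen_f)):
            for _ in range(len(frontier)):      # one full level per turn
                node = frontier.popleft()
                for nxt in neighbours(node):
                    if nxt in other:            # the searches meet here
                        return nxt
                    if nxt not in seen:
                        seen.add(nxt)
                        frontier.append(nxt)
    return None

# Hypothetical example: states 0..10 in a line, moves of +1 / -1.
line = lambda n: [m for m in (n - 1, n + 1) if 0 <= m <= 10]
meet = bidirectional_meet(0, 10, line)   # some state near the middle
```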

Advantages of Production Systems in Procedural Knowledge Representation:

Some of the features of production systems which make them particularly an appealing form of knowledge representation for expert systems include:

1. Expressiveness and Intuitiveness:

Experience in working with human experts indicates heavy repetition of the theme, "Well, in the case of so-and-so I usually do such-and-such." This theme maps quite naturally into the IF… THEN format of production rules. Production rules essentially tell us what to do in a given situation. Since many expert systems are organised in terms of advice on what to do, this property of production systems is particularly natural for representing knowledge.

2. Simplicity:

The uniform structure of the IF… THEN syntax in rule-based systems provides an attractive simplicity for the representation of knowledge. This feature improves the readability of the production rules and the communication between various parts of a single program. Production rules, by their very syntax, tend to be self-documenting.

3. Modularity and Modifiability:

By their very nature, production rules encode discrete pieces of information which are generally unrelated to other production rules unless there is an explicit production rule relating them. Information can be treated as a collection of independent facts which may be added to or deleted from the system with essentially no deleterious side effects.

This modular feature of production systems allows for the incremental improvement and fine tuning of production systems with no degradation of performance. A simple, skeletal system can be rapidly developed and gradually fleshed out as more information becomes available.

4. Knowledge Intensive:

The three-part structure of production systems (rule interpreter, knowledge base, working memory) provides a very effective separation of the knowledge base from the rule interpreter or inference engine. Thus, the inference engine can be general purpose and work equally effectively, in principle, on various knowledge bases.

The knowledge base composed of production rules, in turn, is essentially “pure knowledge” since it need contain no control or programming information. Since each production rule is equivalent to a concise and unambiguous English sentence, the problem of semantics is solved by the very structure of the representation.

Problems with Production Rules of Procedural Knowledge Representation:

While many features of production systems make them desirable for representing and reasoning with knowledge of real-world objects and situations, they are not immune from the problems plaguing other AI systems.

Among these problems are:

1. Opacity:

Although the individual production rules may be models of clarity, the combined operation and effect of the control program may be relatively opaque. That is, production systems may make it difficult to see the forest (the control strategy) because of the trees (the interaction between the rule interpreter and individual production rules). Again, this difficulty is rooted in the lack of hierarchy, which must be queried often to follow the operation of the analysis.

2. Inefficiency:

Sometimes several of the rules become active during execution. A more intelligent control strategy would reduce this problem. Since production rules are basically “democratic” in their structure and contribution to the system, there are serious difficulties in creating any hierarchy among the rules. This implies exhaustive search through all the production rules for each cycle of the control program.

3. Inability to Learn:

Simple rule-based systems do not have the ability to automatically modify or add to the rule base. These characteristics would be essential for any learning system. Human experts know when to “break the rules” in exceptional cases. Unless there is some provision for adding and modifying rules based on experience, the production system cannot learn.

4. Conflict Resolution:

In an ideal production system, all rules continually monitor the condition of the global database and fire instantaneously when their IF condition is true. In this picture, the set of production rules act like a set of demons discussed under frames. In practice, however, the control structure cycles through the set of rules checking to see which rules have their conditions satisfied, i.e., which are applicable.

Since the firing of one rule may change the activation of other rules, the control structure allows only one rule to fire per cycle. If more than one rule is activated in a given cycle, the control structure must determine which rule to fire from this conflict set of active rules. This selection is called conflict resolution.

A number of methods for conflict resolution have been proposed including:

1. Rank the rules in a list according to priority and fire the first activated rule i.e., the rule with the highest priority. This, combined with redundancy avoidance, was the strategy used in many expert systems. Its virtue lies in its simplicity, and by ordering the rules in the approximate order of their firing frequency, this can be made a relatively efficient strategy.

2. From the conflict set, fire the rule with the strictest condition. This is known as the longest-matching strategy. Its advantage is that the discrimination power of a strict condition is greater than that of a more general condition. A rule with a strict condition effectively "injects more knowledge into the database" when it fires.

3. Fire the most recently used rule of the conflict set. Its advantage, interpreted in terms of search for a solution, is that it represents a depth-first search which follows the path of greatest activity in generating new knowledge in the database.

4. Fire the rule from the conflict set with the most recently used variable. This strategy is possible as long as firing the rule did not contribute information. Its advantage is similar to that of the previous strategy.

5. Fire the rule most recently added to the set of rules. This is a strategy possible only for dynamic knowledge base systems in which the production rules themselves may be added, deleted, or modified during execution. This again would provide greater efficiency by enhancing depth-first search. Such systems, however, represent a much higher level of abstraction and complexity.

6. Compute an execution-time priority and fire the rule with the highest priority. This priority may be some function of the above rankings and is similar to the strategy underlying a state evaluation function.

7. Simply fire all applicable rules of the conflict set. This strategy is equivalent to treating rules as demons and can lead to problems. For instance, firing the first applicable rule may change the condition on the second active rule from TRUE to FALSE.

If the second rule fires anyway, it is acting on erroneous information. If, through feedback of the new status of the database, it does not fire, the strategy reduces to a cyclic firing of applicable rules which represents a departure from the stated strategy.

The intelligent design of conflict resolution strategies is one of the current areas of research in AI. The particular choice of strategy affects both the sensitivity of the production system (the ability to respond quickly to changes in the database) and its stability (the ability to carry out long sequences of actions).
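Two of the simpler strategies above, rank ordering (1) and longest matching (2), can be sketched as selection functions over the conflict set (the data shapes here are illustrative assumptions):

```python
def by_priority(conflict_set):
    """conflict_set: list of (priority, rule); the rule with the
    highest priority fires."""
    return max(conflict_set, key=lambda pr: pr[0])[1]

def by_longest_match(conflict_set):
    """conflict_set: list of (condition_set, action); the rule with
    the strictest (largest) condition set fires."""
    return max(conflict_set, key=lambda ca: len(ca[0]))[1]
```

For example, `by_priority([(1, "r1"), (3, "r3"), (2, "r2")])` selects `"r3"`, and `by_longest_match([({"a"}, "x"), ({"a", "b"}, "y")])` selects `"y"`, the rule whose firing injects more knowledge into the data base.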

Applicability of Production Rules in Procedural Knowledge Representation:

The domains to which production rules are well suited may be characterised as follows:

1. Domains in which the structure of the knowledge resembles the structure of production rules. Clinical medicine is frequently cited as an example of such a knowledge domain consisting of many independent facts with no underlying theoretical structure such as that supporting domains like physics and mathematics. However, there is growing concern that the independence and modularity of rules may be illusory since it is difficult to predict the effect of adding extra rules on program behaviour.

2. Domains in which the actions required are relatively independent of other actions and are thereby naturally represented by the THEN parts of independent production rules. A typical application with this characteristic is a medical patient-monitoring system, as opposed to systems with dependent sub-processes such as mechanical or electromagnetic finite element analysis programs.

3. Domains in which the knowledge itself is distinct from the application to which it will be put. In such systems, for example, a taxonomy for tree identification can be developed quite independently of its potential uses. This is in contrast to systems in which the process and the knowledge are inextricably related, such as sailing, skiing, or dating.

Several other features have been incorporated into various production systems, all of which improve the performance of the system and in general enhance its intelligence. One of the most important of these features is the justification or explanation capability of an expert system.

This feature is particularly important in medicine where medical experts require detailed justification for any diagnosis, whether it issues from natural or artificial intelligence. This ability to trace the line of reasoning leading to a given conclusion is relatively easy to incorporate in the operation of production systems.

Another feature which has been built into certain production systems is that of knowledge acquisition or, in common jargon, learning. Systems with this feature can actually modify or add to their set of production rules based on past experience. By any reasonable definition, this qualifies such production systems (or expert systems) as examples of artificial or machine learning.

A final feature present in many production systems is the capability of dealing correctly with inexact knowledge and probability. Problems involving uncertainties or probabilistic reasoning are generally classified in the domain of fuzzy logic or fuzzy reasoning. This is a basic problem facing most areas of human endeavour: there may be no clean yes or no answer in a given situation, only a set of possible solutions with varying likelihoods of being correct.

This is particularly true in such areas as medicine, in which both human and machine diagnoses must be phrased in terms of probabilities. The expert system MYCIN illustrates how production systems can incorporate uncertainty within the statement of a production rule and correctly propagate such uncertainties through to the final diagnosis.