In this article we will discuss: 1. Introduction to Reasoning with Uncertain Knowledge 2. Types of Reasoning (Revisited) 3. Ways of Dealing with Uncertain Knowledge.

Introduction to Reasoning with Uncertain Knowledge:

Reasoning probably brings to mind logic puzzles, but it is something we do every day of our lives. Reasoning in AI is the process by which we use the knowledge we have to draw conclusions or infer something new about a domain of interest. It is a necessary part of what we call “intelligence”. Without the ability to reason we are doing little more than a lookup when we use information.

In fact, this is the difference between a standard database system and a knowledge base. Both hold information which can be accessed in various ways, but the database, unlike the knowledge base in an expert system, has no reasoning facilities and can therefore answer only limited, specific questions.

What are the types of reasoning we come across? How do we know what to expect when we go on a train journey? What do we think when our friend is annoyed with us? How do we know what will happen if our car has a flat battery? Whether we are aware of it or not, we will use a number of different methods of reasoning, depending on the problem we are considering and the information that we have before us.

The three everyday situations mentioned above illustrate three key types of reasoning which we use. In the first case we know what to expect on a train journey because of our experience of numerous other train journeys. We infer that the new journey will share common features with those earlier journeys.

This first kind of reasoning is called induction, which can be summarised as generalisation from seen cases to infer information about unseen cases. We use it frequently in learning about the world around us. For example, every crow we have seen is black; therefore we infer that all crows are black. If we think about it, such reasoning is unreliable: we can never prove our inferences to be true, we can only prove them to be false. Take the crow again.

To prove that all crows are black we would have to confirm that all crows which exist, have existed or will ever exist are black. This is obviously not possible. However, to disprove the statement, all we need do is produce a single crow which is white or pink.

So at best we can amass evidence to support our belief that all crows are black. In spite of its unreliability, inductive reasoning is very useful and is the basis of much of our learning; it is used particularly in machine learning.

The second example we considered was working out why a friend is annoyed with us, in other words trying to find an explanation for our friend’s behaviour. It may be that this particular friend is a stickler for punctuality and we are a few minutes late to our rendezvous. We may therefore infer that our friend’s anger is caused by our being late.

This is abduction, the process of reasoning back from something to the state or event which caused it. Of course this too is unreliable; it may be that our friend is angry for some other reason (perhaps we had promised to telephone him before coming but failed to do so). Abduction can be used in cases where the knowledge is incomplete; it provides a “best guess” given the available evidence.
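As a minimal sketch (the rule names below are invented for this illustration), abduction can be mechanised as reasoning backwards from an observed effect to its possible causes:

```python
# Toy causal rules of the form (cause, effect); invented for illustration.
RULES = [
    ("friend_is_stickler_and_we_are_late", "friend_is_annoyed"),
    ("we_forgot_to_telephone", "friend_is_annoyed"),
]

def abduce(observation):
    """Reason backwards from an effect to every cause which could explain it."""
    return [cause for cause, effect in RULES if effect == observation]

print(abduce("friend_is_annoyed"))
# ['friend_is_stickler_and_we_are_late', 'we_forgot_to_telephone']
# Abduction is unreliable: any of these (or something else entirely)
# may be the actual cause; it is only a "best guess".
```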

The third problem is usually solved by deduction: we have knowledge about cars such as “if the battery is flat the headlights won’t work”; we know the battery is flat so we can infer that the lights won’t work. This is the reasoning of standard logic.

Indeed, we would express our car problem in terms of logic given that:

a = the battery is flat and b = the lights won’t work, and the axioms:

∀x: a(x) → b(x)

a (my car)

we can deduce b (my car): the lights of my car won’t work.
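As a minimal sketch, this deduction can be carried out mechanically; the tuple-based representation below is an assumption for illustration, not standard logic-programming syntax:

```python
# Rule: for all x, battery_flat(x) -> lights_fail(x)
rules = [("battery_flat", "lights_fail")]

# Known fact: battery_flat(my_car)
facts = {("battery_flat", "my_car")}

# Modus ponens: whenever a fact matches a rule's antecedent,
# add the rule's consequent for the same individual.
for antecedent, consequent in rules:
    for predicate, individual in list(facts):
        if predicate == antecedent:
            facts.add((consequent, individual))

print(facts)
# {('battery_flat', 'my_car'), ('lights_fail', 'my_car')}
```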

However, we cannot deduce the converse: if we know b we cannot deduce a, that is, we cannot conclude that the battery of my car is flat from the fact that the lights of my car won’t work. This is not permitted in standard logic. If the lights don’t work we may use abduction to derive this explanation, but it could be wrong; there may be another explanation for the light failure (for example, a bulb may have blown or the battery connections may be loose).

Deduction is probably the most familiar form of explicit reasoning. It can be defined as the process of deriving the logically necessary conclusion from the initial premises.

For example:

Elephants are bigger than dogs

Dogs are bigger than mice

Therefore

Elephants are bigger than mice

However, it should be noted that deduction is concerned with logical validity, not actual truth.

Consider the following example; given the facts, can we reach the conclusion by deduction?

Some dogs are greyhounds

The greyhounds run fast

Therefore

Some dogs run fast.

The answer is no. We cannot make this deduction because we do not know that all greyhounds are dogs: the greyhounds which run fast may be precisely those which are not dogs. This is of course nonsensical in terms of what we know (or, more accurately, have induced) about the real world, but it is perfectly valid given the premises. We should therefore be cautious: deduction does not always correspond to natural human (common-sense) reasoning.

Types of Reasoning (Revisited):

Reasoning comes in different flavours and can progress in one of two directions: forward to the goal or backward from the goal. Both are used in AI in different circumstances.

Forward reasoning (also referred to as forward chaining, data-driven reasoning, bottom-up or antecedent-driven reasoning) begins with the known facts and attempts to move towards the desired goal, applying rules deductively to the data. Backward reasoning (backward chaining, goal-driven reasoning, top-down, consequent-driven or hypothesis-driven reasoning) begins with the goal and sets up sub-goals which must be solved in order to solve the main goal.
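As a rough illustration, the two directions can be contrasted over the same propositional rules. The toy rule base and function names in the following sketch are assumptions made for this example, not part of any established library:

```python
# Toy rule base: each rule is (set of premises, conclusion).
RULES = [
    ({"battery_flat"}, "lights_fail"),
    ({"lights_fail", "night"}, "cannot_drive"),
]

def forward_chain(facts):
    """Data-driven: repeatedly fire rules whose premises are all known."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in RULES:
            if premises <= facts and conclusion not in facts:
                facts.add(conclusion)
                changed = True
    return facts

def backward_chain(goal, facts):
    """Goal-driven: reduce the goal to sub-goals until known facts remain.
    Assumes the rule base is acyclic, so the recursion terminates."""
    if goal in facts:
        return True
    return any(all(backward_chain(p, facts) for p in premises)
               for premises, conclusion in RULES if conclusion == goal)

print(forward_chain({"battery_flat", "night"}))
# {'battery_flat', 'night', 'lights_fail', 'cannot_drive'}
print(backward_chain("cannot_drive", {"battery_flat", "night"}))  # True
```

Note the contrast: forward chaining derives everything the facts support, whereas backward chaining explores only what the goal requires.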

Imagine that you hear that a man bearing your family name died intestate a hundred years ago and that solicitors are looking for descendants. There are two ways in which you could determine whether you are related to the dead man. The first method is to follow your own family tree back from yourself to see if he appears.

The second method is to trace his family tree to see if it includes you. The first is an example of forward reasoning and the second of backward reasoning. In order to decide which method to use, we need to consider the number of start and goal states (move from the smaller set of states towards the larger, since the more states there are, the easier it is to find one) and the number of possibilities which need to be considered at each stage (the fewer the better).

In the above example, there is one start state and one goal state. However, if we use forward reasoning there will be only two possibilities to consider at each step (each person has two parents), whereas with backward reasoning there may be many more (even today the average number of children is 2.4, and at the beginning of the last century it was far higher, at least in India).

In general backward reasoning is most applicable in situations where a goal or hypothesis can be easily generated (for example, in mathematics or medicine), and where data must be acquired by the problem solver (for example, a doctor asking for vital signs and other information in order to prove or disprove a hypothesis).

If there are more possible outcomes than initial states, a backward-chaining process will produce the fastest search with the shortest path. Forward reasoning, on the other hand, is useful where most of the data is given in the problem statement but the goal is unknown, or where there is a large number of possible goals.

For example, a system which analyses geological data in order to determine which minerals are present falls into this category. Forward reasoning is better where there are more initial states than goal states.

A control strategy can even combine forward and backward chaining. In that case, a start is made at an initial state as well as at the goal state, and the search works in both directions. In some cases the search path and the computation time can be shortened by taking such a bidirectional approach. The search terminates when the forward and backward paths intersect at some node, as in the sketch below.
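A minimal sketch of such a bidirectional search is given below, under the assumption that states can be expanded from both ends; the toy graph is invented for illustration:

```python
from collections import deque

# Undirected toy graph, invented for illustration.
EDGES = [("start", "a"), ("start", "b"), ("a", "c"),
         ("b", "c"), ("c", "d"), ("d", "goal")]
NEIGHBOURS = {}
for u, v in EDGES:
    NEIGHBOURS.setdefault(u, set()).add(v)
    NEIGHBOURS.setdefault(v, set()).add(u)

def bidirectional_search(start, goal):
    seen_fwd, seen_bwd = {start}, {goal}
    frontier_fwd, frontier_bwd = deque([start]), deque([goal])
    while frontier_fwd and frontier_bwd:
        # Expand one node on each side, then test for intersection.
        for frontier, seen in ((frontier_fwd, seen_fwd),
                               (frontier_bwd, seen_bwd)):
            node = frontier.popleft()
            for nxt in NEIGHBOURS.get(node, set()):
                if nxt not in seen:
                    seen.add(nxt)
                    frontier.append(nxt)
        if seen_fwd & seen_bwd:
            return seen_fwd & seen_bwd   # the meeting node(s)
    return set()

print(bidirectional_search("start", "goal"))  # e.g. {'c'}
```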

Ways of Dealing with Uncertain Knowledge:

We looked at knowledge and considered how different knowledge representation schemes allow us to reason. Recall, for example, that standard or classical logics allow us to infer new information from the facts and rules which we have. Such reasoning is useful in that it allows us to store and utilise information efficiently (we do not have to store everything).

However, such reasoning assumes that the knowledge available is complete (or can be inferred) and correct and that it is consistent. Knowledge added to such systems never makes previous knowledge invalid. Each new piece of information simply adds to the knowledge.

This is monotonic reasoning. Monotonic reasoning can be useful for complex knowledge bases, since it is not necessary to check consistency when adding knowledge (as is required in expert systems) or to store information relating to the truth of knowledge. It therefore saves time and storage.

However, if knowledge is incomplete or changing, an alternative reasoning system is required. There are a number of ways of dealing with uncertainty.

We shall consider two of them briefly:

1. Dempster-Shafer theory

2. Fuzzy reasoning.

1. Dempster-Shafer Theory:

Dempster-Shafer theory deals with the distinction between uncertainty and ignorance. Rather than computing the probability of a proposition, it computes the probability that the evidence supports the proposition. This measure of belief is called a belief function, Bel(X).

In the Bayesian network technique, the degree of belief assigned to a proposition given the evidence is a point value, whereas in D-S theory we consider sets of propositions and to each set we assign an interval [Belief, Plausibility] in which the degree of belief must lie. We use a probability density function m, defined for all subsets of an exhaustive universe Θ of mutually exclusive hypotheses (called the frame of discernment). Belief (written Bel) measures the strength of the evidence in favour of a set of propositions. It ranges from 0 (no supporting evidence) to 1 (certainty).

Plausibility (Pl) is given by:

Pl (S) = 1 − Bel (∼S)

where S is a set of propositions. In particular, if we have certain evidence in favour of ∼S, then Bel (∼S) is 1, so Pl (S) = 1 − 1 = 0, and Bel (S) (which can never exceed Pl (S)) is also zero.

The belief-plausibility interval defined above measures not only our level of belief in some set of propositions but also the amount of information we have. This can be illustrated with the help of an example.

Suppose a person of doubtful character (a shady character) approaches you with a bet of Rs. 500 that heads will appear on the next flip of his coin. You suspect that the coin may not be fair, but you have no evidence on the fairness or unfairness of the coin. Then, as per D-S theory, with no evidence whatsoever (on the coin being fair or unfair):

Bel (Heads) = 0

Bel (∼Heads) = 0

That is, with no evidence at all, D-S theory assigns no belief either way: ignorance is represented explicitly.

Now suppose you consult an expert on coins and he asserts that with 90% certainty the coin is fair, so you become 90% sure that P (Heads) = 0.5.

Now D-S theory gives Bel (Heads) = 0.9 × 0.5 = 0.45

and similarly Bel (∼Heads) = 0.45

There is still a 10-point gap which is not accounted for by the evidence. Dempster’s rule shows how to combine evidence to give a new value for belief, and Shafer’s work extends this into a complete computational model.

Since D-S theory deals not with point probabilities but with probability intervals, the width of the interval might be an aid in deciding when we need to acquire more evidence.

In the present example, before acquiring the expert’s testimony the probability interval for Heads is [0, 1]; this narrows to [0.45, 0.55] after the expert’s testimony about the fairness of the coin is received.
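As an illustration, the following sketch recomputes the coin’s belief interval from the mass assignment induced by the expert’s testimony; the helper names bel and pl are invented for this example:

```python
# The frame of discernment is {Heads, Tails}. After the expert's 90%
# testimony that the coin is fair, mass 0.9 * 0.5 = 0.45 goes to each
# outcome and the remaining 0.10 stays on the whole frame (ignorance).
THETA = frozenset({"Heads", "Tails"})

m = {
    frozenset({"Heads"}): 0.45,
    frozenset({"Tails"}): 0.45,
    THETA: 0.10,
}

def bel(S):
    """Belief: total mass of the subsets wholly contained in S."""
    return sum(v for A, v in m.items() if A <= S)

def pl(S):
    """Plausibility: Pl(S) = 1 - Bel(~S)."""
    return 1 - bel(THETA - S)

heads = frozenset({"Heads"})
print(round(bel(heads), 2), round(pl(heads), 2))  # 0.45 0.55
# i.e. the degree of belief in Heads lies in the interval [0.45, 0.55]
```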

However, there are no clear guidelines for deciding this, and no clear meaning for the width of an interval. For example, knowing whether the coin is fair would have a significant impact on the belief that it will come up heads, and detecting an asymmetric weight would have an impact on the belief that the coin is fair.

Consider another example, a diagnosis problem, as a case of an exhaustive universe of mutually exclusive hypotheses. This universe is the frame of discernment, written Θ. It may consist of the set {Alg, Flu, Col, Pne} of hypotheses, where:

Alg – Allergy

Flu – Flu (influenza)

Col – Cold

Pne – Pneumonia

The probability density function m is defined not just for the elements of Θ but for all subsets of it. m(p) is the amount of belief currently assigned to the subset p of Θ; when there is no prior evidence, all of this belief rests on Θ itself, i.e. m(Θ) is 1.0.

But when it becomes known through evidence (at a level of 0.6) that the correct diagnosis is in the set {Flu, Col, Pne}, then m is updated as:

m ({Flu, Col, Pne}) = 0.6

m (Θ) = 0.4

That is, belief 0.6 is assigned to the set of diagnoses {Flu, Col, Pne}, and the remainder of the belief still resides in the larger set Θ.

Thus, in order to be able to use m, belief and plausibility in D-S theory, we define functions which enable us to combine m’s from multiple sources of evidence.

Our goal is to attach some measure of belief, m, to the various subsets Z of Θ; m is sometimes called the probability density function for subsets of Θ. Realistically, not all evidence is directly supportive of individual elements of Θ. In fact, evidence most often supports different subsets Z of Θ.

In addition, since the elements of Θ are assumed to be mutually exclusive, evidence in favour of some may have an effect on our belief in the others. In a purely Bayesian system we would address both of these situations by listing all the combinations of conditional probabilities. In D-S theory we handle these interactions by directly manipulating the sets of hypotheses.

There are 2^n subsets of Θ. We must assign m so that the sum of all the m values assigned to the subsets of Θ is 1. Although dealing with 2^n values may appear intractable, it usually turns out that many of the subsets never need to be considered, because they have no significance in the problem domain; their associated value of m is zero.
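As a sketch of how such a combination might be computed (an illustrative implementation of Dempster’s rule of combination, not a library API; the second body of evidence m2 is invented for the example), consider the diagnosis frame again:

```python
from itertools import product

THETA = frozenset({"Alg", "Flu", "Col", "Pne"})

# Two mass functions from independent pieces of evidence.
m1 = {frozenset({"Flu", "Col", "Pne"}): 0.6, THETA: 0.4}  # evidence above
m2 = {frozenset({"Flu", "Pne"}): 0.7, THETA: 0.3}         # invented evidence

def combine(m1, m2):
    """Dempster's rule: intersect focal sets, renormalise away conflict."""
    raw, conflict = {}, 0.0
    for (A, v1), (B, v2) in product(m1.items(), m2.items()):
        C = A & B
        if C:
            raw[C] = raw.get(C, 0.0) + v1 * v2
        else:
            conflict += v1 * v2        # mass lost to the empty set
    k = 1.0 - conflict                 # normalisation constant
    return {C: v / k for C, v in raw.items()}

for S, v in combine(m1, m2).items():
    print(sorted(S), round(v, 2))
# ['Flu', 'Pne'] 0.7
# ['Col', 'Flu', 'Pne'] 0.18
# ['Alg', 'Col', 'Flu', 'Pne'] 0.12
```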

D-S theory is an example of an algebra supporting the use of subjective probabilities in reasoning, as compared with the objective probabilities of Bayes. In subjective probability theory, we build a reasoning algebra, often by relaxing some of the constraints of Bayes. It is sometimes felt that subjective probabilities better reflect human expert reasoning.

So we conclude our treatment of uncertain reasoning by saying that D-S theory allows us to combine:

(i) Multiple sources of evidence for a single hypothesis.

(ii) Multiple sources of evidence for different hypotheses.

2. Fuzzy Reasoning:

Probabilistic reasoning and reasoning with certainty factors deal with uncertainty using principles from probability to extend the scope of standard logics. An alternative approach is to change the properties of the logic itself. Fuzzy sets and fuzzy logic do just that.

In classical set theory an item, say a, is either a member of a set A or it is not. So a meal at a restaurant is either expensive or not expensive, and a value must be provided to delimit set membership. Clearly, however, this is not the way we think in real life. While some sets are clearly defined (a piece of fruit is either an orange or not an orange), others are not (qualities such as size, speed and price are relative).

Fuzzy set theory extends classical set theory to accommodate the notion of degree of set membership. Each item is associated with a value between 0 and 1, where 0 indicates that it is not a member of the set and 1 that it is definitely a member. Values in between indicate a degree of partial membership.

For example, although in classical logic we may agree with the inclusion of both the Honda and the Maruti in the set fast(car), we may wish to indicate that one is faster than the other. This is possible in fuzzy set theory.

Here the value in brackets is a degree of set membership. Fuzzy logic is similar in that it attaches a measure of truth to facts: a predicate P is given a value between 0 and 1 (as in fuzzy sets).

So the predicate fast can be represented as:

fast (Honda 1.5) = 0.9

Standard logic operators such as AND, OR and NOT still apply, but are interpreted in terms of membership values: AND is taken as the minimum of the operands’ values, OR as the maximum, and NOT as the complement (1 minus the value).
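A minimal sketch of these interpretations in Python, with invented membership values for the two cars mentioned earlier:

```python
# Fuzzy membership values (illustrative assumptions, not measured data).
fast = {"Honda": 0.9, "Maruti": 0.7}
expensive = {"Honda": 0.6, "Maruti": 0.3}

def f_and(a, b): return min(a, b)  # fuzzy AND: minimum of the values
def f_or(a, b):  return max(a, b)  # fuzzy OR: maximum of the values
def f_not(a):    return 1 - a      # fuzzy NOT: complement

car = "Honda"
print(f_and(fast[car], expensive[car]))  # fast AND expensive -> 0.6
print(f_or(fast[car], expensive[car]))   # fast OR expensive  -> 0.9
print(f_not(fast[car]))                  # NOT fast -> approx. 0.1
```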