Structure of Intelligence -- Copyright Springer-Verlag © 1993
In mathematical logic, deduction is analyzed as a thing in itself, as an entity entirely independent from other mental processes. This point of view has led to dozens of beautiful ideas: Godel's Incompleteness Theorem, the theory of logical types, model theory, and so on. But its limitations are too often overlooked. Over the last century, mathematical logic has made tremendous progress in the resolution of technical questions regarding specific deductive systems; and it has led to several significant insights into the general properties of deductive systems. But it has said next to nothing about the practice of deduction. There is a huge distance between mathematical logic and the practice of logic, and mathematical logic seems to have essentially lost interest in closing this gap.
Let us consider, first of all, deduction in mathematics. What exactly is it that mathematicians do? Yes, they prove theorems -- that is, they deduce the consequences of certain axioms. But this is a highly incomplete description of their activity. One might just as well describe their work as detecting analogies between abstract structures. This process is just as universal to mathematical practice as the deduction of consequences of axioms. The two are inseparable. No one proves a theorem by randomly selecting a sequence of steps. And very little theorem proving is done by logically deducing sequences of steps. Generally, theorems are proved by intuitively selecting steps based on analogy to proofs one has done in the past. Some of this analogy is highly specific -- e.g. proving one existence theorem for partial differential equations by the same technique as another. And some of it is extremely generalized -- what is known as "mathematical maturity"; the ability, gleaned through years of studious analogical reasoning, to know "how to approach" a proof. Both specific and general analogy are absolutely indispensable to mathematical research.
Uninteresting mathematical research often makes use of overly specific analogies -- the theorems seem too similar to things that have already been done; frequently they merely generalize familiar results to new domains. Brilliant research, on the other hand, makes use of far subtler analogies regarding general strategies of proof. Only the most tremendously, idiosyncratically original piece of work does not display numerous analogies with past work at every juncture. Occasionally this does occur -- e.g. with Galois's work on the unsolvability in radicals of the quintic. But it is very much the exception.
It might be argued that whereas analogy is important to mathematics, deduction from axioms is the defining quality of mathematics; that deduction is inherently more essential. But this does not stand up to the evidence. Even in Galois's work, there is obviously some evidence of analogical reasoning, say on the level of the individual steps of his proof. Although his overall proof strategy appears completely unrelated to what came before, the actual steps are not, taken individually, all that different from individual steps of past proofs. Analogical reasoning is ubiquitous, in the intricate details of even the most ingeniously original mathematical research.
And we must not forget Cauchy, one of the great mathematicians despite his often sloppy treatment of logical deduction. Cauchy originated a remarkable number of theorems, but many of his proofs were intuitive arguments, not deductions of the consequences of axioms. It is not that his proofs were explicitly more analogical than deductive -- they followed consistent deductive lines of thought. But they did not proceed by rigorously deducing the consequences of some set of axioms; rather they appealed frequently to the intuition of the reader. And this intuition, or so I claim, is largely analogical in nature.
It is clear that both deduction and analogy are ubiquitous in mathematics, and both are present to highly varying degrees in the work of various mathematicians. It could be protested that Cauchy's proofs were not really mathematical -- but then again, this judgment may be nothing more than a reflection of the dominance of mathematical logic during the last century. Now we say that they are not mathematical because they don't fit into the framework of mathematical logic, in which mathematics is defined as the step-by-step deduction of the consequences of axioms. But they look mathematical to anyone not schooled in the dogma of mathematical logic.
In sum: it is futile to try to separate the process of deduction of the consequences of axioms from the process of analogy with respect to abstract structures. This is true even in mathematics, which is the most blatantly deductive of all human endeavors. How much more true is it in everyday thought?
Let S be any set, and let I={I1, I2, ..., In} be a subset of S, called the set of assumptions. Let SN denote the Cartesian product SxSxSx...xS, taken N times. And let T={T1,T2,...,Tm} be a set of transformations; that is, a set of functions each of which maps some subset of SN into some subset of S. For instance, if S were a set of propositions, one might have T1(x,y) = x and y.
Let us now define the set D(I,T) of all elements of S which are derivable from the assumptions I via the transformations T. First of all, it is clear that I should be a subset of D(I,T). Let us call the elements of I the depth-zero elements of D(I,T). Next, what about elements of the form x=Ti(A1,...,Am), for some i, where each Ak=Ij for some j? Obviously, these elements are simple transformations of the assumptions; they should be elements of D(I,T) as well. Let us call these the depth-one elements of D(I,T). Similarly, we may define an element x of S to be a depth-n element of D(I,T) if x=Ti(A1,...,Am), for some i, where each of the Ak is a depth-p element of D(I,T), for some p<n. Finally, D(I,T) may then be defined as the set of all x which are depth-n elements of D(I,T) for some n.
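The depth-by-depth construction above can be sketched in code. This is a minimal sketch, assuming S is a set of strings and each transformation is a function of fixed arity; the names (derivable, max_depth, and so on) are illustrative, not from the text.

```python
from itertools import product

def derivable(assumptions, transformations, max_depth):
    """Return all elements of D(I,T) of depth at most max_depth.

    `transformations` is a list of (arity, function) pairs; a function
    may return None to signal that it does not apply to its arguments.
    """
    derived = set(assumptions)               # the depth-zero elements
    for _ in range(max_depth):
        new = set()
        for arity, t in transformations:
            for args in product(derived, repeat=arity):
                result = t(*args)
                if result is not None and result not in derived:
                    new.add(result)
        if not new:                          # closure reached early
            break
        derived |= new
    return derived

# Example: a single transformation T1(x, y) = "x and y"
conj = (2, lambda x, y: f"({x} and {y})")
# yields "p", "q" plus the four depth-one conjunctions
print(sorted(derivable({"p", "q"}, [conj], 1)))
```

Of course, D(I,T) itself is in general infinite; any actual reasoner can only ever construct a depth-bounded fragment of it, which is why the sketch takes an explicit bound.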
Deductive reasoning is nothing more or less than the construction of elements of D(I,T), given I and T. If the T are the rules of logic and the I are some set of propositions about the world, then D(I,T) is the set of all propositions which are logically equivalent to some subset of I. In this case deduction is a matter of finding the logical consequences of I, which are presumably a small subset of the total set S of all propositions.
Contemporary mathematical logic is not the only conceivable deductive system. In fact, I suggest that any deductive system which relies centrally upon Boolean algebra, without significant external constraints, is not even qualified for the purpose of general mental deduction. Boolean algebra is very useful for many purposes, such as mathematical deduction. I agree that it probably plays an important role in mental process. But it has at least one highly undesirable property: if any two of the propositions in I contradict each other, then D(I,T) is the entire set S of all propositions. From one contradiction, everything is derivable.
The proof of this is very simple. Assume both A and -A. Then, surely A implies A+B. But from A+B and -A, one may conclude B. This works for any B. For instance, assume A="It is true that my mother loves me". Then -A="It is not true that my mother loves me". Boolean logic implies that anyone who holds A and -A -- anyone who has contradictory feelings about his mother's affection -- also, implicitly, holds that 2+2=5. For from "It is true that my mother loves me" he may deduce "Either it is true that my mother loves me, or else 2+2=5." And from "Either it is true that my mother loves me, or else 2+2=5" and "It is not true that my mother loves me," he may deduce "2+2=5."
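The two inference steps in this argument -- disjunction introduction (from A, infer A+B) and disjunctive syllogism (from A+B and -A, infer B) -- can be traced mechanically. Here is a toy sketch, with propositions represented as strings and tagged tuples; the representation is an assumption made purely for illustration.

```python
def explode(a, b):
    """Derive an arbitrary b from the contradictory pair a and ('not', a)."""
    beliefs = {a, ("not", a)}            # the contradiction: A and -A
    beliefs.add(("or", a, b))            # disjunction introduction: from A, infer A + B
    # disjunctive syllogism: from A + B and -A, infer B
    if ("or", a, b) in beliefs and ("not", a) in beliefs:
        beliefs.add(b)
    return b in beliefs

print(explode("it is true that my mother loves me", "2+2=5"))  # True
```

The choice of b was never constrained: the same two steps derive any proposition whatsoever, which is exactly the explosion the text describes.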
So: Boolean logic is fine for mathematics, but common sense tells us that human minds contain numerous contradictions. Does a human mind really use a deductive system that implies everything? It appears that somehow we keep our contradictions under control. For example, a person may contradict himself regarding abortion rights or the honesty of his wife or the ultimate meaning of life -- and yet, when he thinks about theoretical physics or parking his car, he may reason deductively to one particular conclusion, finding any contradictory conclusion ridiculous.
It might be that, although we do use the "contradiction-sensitive" deduction system of standard mathematical logic, we carefully distinguish deductions in one sphere from deductions in another. That is, for example, it might be that we have separate deductive systems for dealing with physics, car parking, domestic relations, philosophy, etc. -- so that we never, in practice, reason "A implies A+B", unless A and B are closely related. If this were the case, a contradiction in one realm would destroy only reasoning in that realm. So if we contradicted ourselves when thinking about the meaning of life, then this might give us the ability to deduce any statement whatsoever about other philosophical issues -- but not about physics or everyday life.
In his Ph.D. dissertation, da Costa (1984) conceived the idea of a paraconsistent logic, one in which a single contradiction in I does not imply that D(I,T)=S. Others have extended this idea in various ways. Most recently, Avron (1990) has constructed a paraconsistent logic which incorporates the "relevance logic" discussed in the previous paragraph: propositions are divided into classes, and the inference from A to A+B is allowed only when A and B are in the same class.
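The class restriction can be illustrated with a toy sketch. This is not Avron's actual formalism -- merely a minimal illustration, with hypothetical names and data, of how confining disjunction introduction to a single class keeps a contradiction from spreading.

```python
def class_of(prop, classes):
    """Return the name of the class containing prop."""
    return next(name for name, members in classes.items() if prop in members)

def can_infer_disjunction(a, b, classes):
    """Allow the inference from A to A+B only when A and B share a class."""
    return class_of(a, classes) == class_of(b, classes)

classes = {
    "philosophy": {"life has meaning", "life has no meaning"},
    "physics": {"F = ma"},
}
# A philosophical contradiction can spread within philosophy...
print(can_infer_disjunction("life has meaning", "life has no meaning", classes))  # True
# ...but it cannot be used to explode statements about physics:
print(can_infer_disjunction("life has meaning", "F = ma", classes))  # False
```

Since the disjunction A+B is the pivot of the explosion argument, blocking its formation across class boundaries blocks the derivation of arbitrary cross-class conclusions.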
I suggest that Boolean logic is indeed adequate for the purpose of common-sense deduction. My defense of this position comes in two parts. First, I believe that Avron is basically right in saying that contradictions are almost always localized. To be precise, I hypothesize that a mind does not tend to form the disjunction A+B unless d#[St(A∪v)-St(v), St(B∪w)-St(w)] is small for some (v,w).
I do not think it is justified to partition propositions into disjoint sets and claim that each entity is relevant only to those entities in the same set as it. This yields an elegant formal system, but of course in any categorization there will be borderline cases, and it is unacceptable to simply ignore them. My approach is to define relevance not by a partition into classes but rather using the theory of structure. What the formulation of the previous paragraph says is that two completely unrelated entities will only rarely be combined in one logical formula.
However, there is always the possibility that, by a fluke, two completely unrelated entities will be combined in some formula, say A+B. In this case a contradiction could spread from one context to another. I suspect that this is an actual danger to thought processes, although certainly a rare one. It is tempting to speculate that this is one possible route to insanity: a person could start out contradicting themselves only in one context, and gradually sink into insanity by contradicting themselves in more and more different contexts.
This brings us to the second part of the argument in favor of Boolean logic. What happens when contradictions do arise? If a contradiction arises in a highly specific context, does it remain there forever, thus invalidating all future reasoning in that context? I suspect that this is possible. But, as will be elaborated in later chapters, I suggest that this is rendered unlikely by the overall architecture of the mind. It is an error to suppose that the mind has only one center for logical deduction. For all we know, there may be tens of thousands of different deductive systems operating in different parts of the brain, sometimes perhaps more than one devoted to the same specialized context. And perhaps then, as Edelman (1987) has proposed in the context of perception and motor control, those systems which fail to perform a useful function will eventually be destroyed and replaced. If a deductive system has the habit of generating arbitrary propositions, it will not be of much use and will not last. This idea is related to the automata networks discussed in the final chapter.
One thing which is absolutely clear from all this is the following: if the mind does use Boolean logic, and it does harbor the occasional contradiction, then the fact that it does not generate arbitrary statements has nothing to do with deductive logic. This is one important sense in which deduction is dependent upon general structure of the mind, and hence implicitly on other forms of logic such as analogy and induction.
When deduction is formulated in the abstract, in terms of assumptions and transformations, it is immediately apparent that deductive reasoning is incapable of standing on its own. In isolation, it is useless. For why would there be intrinsic value in determining which x lie in D(I,T)? Who cares? The usefulness of deduction presupposes several things, none of them trivial:
1. the elements of I must be accepted to possess some type of validity.
2. it must be assumed that, if the elements of I are valid in this sense, then the elements of D(I,T) are also valid in this sense.
3. it must be the case that certain of the elements of D(I,T) are important in some sense.
The first requirement is the most straightforward. In mathematical logic, the criterion of validity is truth. But this concept is troublesome, and it is not necessary for deduction. Psychologically speaking, validity could just as well mean plausibility.
The second requirement is more substantial. After all, how is it to be known that the elements of D(I,T) will possess the desired properties? This is a big problem in mathematical logic. Using predicate calculus, one can demonstrate that if I is a set of true propositions, every statement derivable from I according to the rules of Boolean algebra is also true. But Boolean algebra is a very weak deductive system; it is certainly not adequate for mathematics. For nontrivial mathematics, one requires the predicate calculus. And no one knows how to prove that, if I is a set of true propositions, every statement derivable from I according to the rules of predicate calculus is true.
Godel proved that one can never demonstrate the consistency of any sufficiently powerful, consistent formal system within that formal system. This means, essentially, that if validity is defined as truth then the second requirement given above can never be verified by deduction.
To be more precise: if validity is defined as truth, let us say T is consistent if it is the case that whenever all the elements of I are true, all the elements of D(I,T) are true. Obviously, in this case consistency corresponds to the second requirement given above. Godel showed that one can never prove T is consistent using T. Then, given a deductive system (I,T), how can one deductively demonstrate that T is consistent? -- i.e. that the second requirement given above is fulfilled? One cannot do so using T, so one must do so in some other deductive system, with a system of transformations T1. But if one uses T1 to make such a demonstration, how can one know if T1 is consistent? If T1 is inconsistent, then the demonstration means nothing, because in an inconsistent system one can prove anything whatsoever. In order to prove T1 is consistent, one must invoke some T2. But in order to prove T2 is consistent, one must invoke some T3. Et cetera. The result is that, if validity is defined as truth, one can never use deduction to prove that the results of a given set of transformations are valid.
Yet we believe in mathematics -- why? By induction, by analogy, by intuition. We believe in it because, at bottom, it feels right. It's never led us wrong before, says induction. It worked in all these other, similar, cases, so it should work here -- says analogy. Even if validity is defined as truth, a recourse to induction and analogy is ultimately inevitable.
If validity is defined, say, as plausibility, then the situation is even worse. Clearly, any true statement is plausible, so that it's at least as hard to justify plausible reasoning as it is to justify "certain" reasoning. And, furthermore, the very concept of "plausibility" refers to induction and analogy. In sum, I contend that, in general and in specific cases, deduction is only justifiable by recourse to induction and analogy.
ANALOGY GUIDES DEDUCTION
Finally, let us consider the third requirement for the usefulness of deduction: certain of the elements of D(I,T) must be somehow important. Otherwise deduction would simply consist of the haphazard generation of elements of D(I,T). This is not the case. In mathematics or in everyday life, one wants to deduce things which are useful, beautiful, interesting, etc. This gives rise to the question: how does one know how to find the important elements of D(I,T)?
It seems clear that this is a matter of analogical reasoning. For instance, suppose one has a particular entity x in mind, and one wants to know whether x is an element of D(I,T). Or suppose one has a particular property P in mind, and one wants to find an element x of D(I,T) which has this property. How does one proceed? To an extent, by intuition -- which is to say, to an extent, one does not consciously know how one proceeds. But insofar as one makes conscious decisions, one proceeds by considering what has worked in the past, when dealing with an entity x or a property P which is similar to the one under consideration.
For example, when studying mathematics, one is given as exercises proofs which go very much like the proofs one has seen in class or in the textbook. This way one knows how to go about doing the proofs; one can proceed by seeing what was done in similar cases. After one has mastered this sort of exercise, one goes on to proofs which are less strictly analogous to the proofs in the book -- because one has grasped the subtler patterns among the various proofs; one has seen, in general, what needs to be done to prove a certain type of theorem.
Above I argued that deduction is only justifiable by analogy. Here the point is that deduction is impotent without analogy: that in order to use deduction to work toward any practical goal, one must be guided by analogy. Otherwise one would have no idea how to go about constructing a given proof.
This is, I suggest, exactly the problem with automatic theorem provers. There are computer programs that can prove simple theorems by searching through D(I,T) according to a variety of strategies. But until these programs implement some form of sophisticated analogy -- systematically using similar strategies to solve similar problems -- they will never proceed beyond the most elementary level.
USEFUL DEDUCTIVE SYSTEMS
Another consequence of this point of view is that only certain deductive systems are of any use: only those systems about which it is possible to reason by analogy. To be precise, let x and y be two elements of D(I,T), and let GI,T(x) and GI,T(y) denote the sets of all proofs in (I,T) of x and of y respectively.
Definition 8.1: Let (I,T) be any deductive system, and take a>0.
Let U equal the minimum, over all v, of the sum a|v| + B, where B is the average, over all pairs (x,y) such that x and y are both in D(I,T), of the correlation coefficient between d#[St(x∪v)-St(v),St(y∪v)-St(v)] and dI[GI,T(x),GI,T(y)]. Then (I,T) is useful to degree U.
The relative distance dI[GI,T(x),GI,T(y)] is a measure of how hard it is to get a proof of x out of a proof of y, or a proof of y out of a proof of x. If v were assumed to be the empty set, then |d#[St(x∪v)-St(v),St(y∪v)-St(v)] - dI[GI,T(x),GI,T(y)]| would reduce to |d#(x,y) - dI[GI,T(x),GI,T(y)]|. The usefulness U would be a measure of how true it is that structurally similar theorems have similar proofs.
But in order for a system to be useful, it need not be the case that structurally similar theorems have similar proofs. It need only be the case that there is some system for determining, given any theorem x, which theorems y are reasonably likely to have similar proofs. This system for determining is v. In the metaphor introduced above in the section on contextual analogy, v is a codebook. A deductive system is useful if there is some codebook v so that, if one decodes x and y using v, the similarity of the resulting messages is reasonably likely to be close to the similarity of the proofs of x and y.
The constant a measures how much the complexity of the codebook v figures into the usefulness of the system. Clearly, it should count to some degree: if v is excessively complex then it will not be much use as a codebook. Also, if v is excessively complex then it is extremely unlikely that a user of the system will ever determine v.
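Definition 8.1 can be turned into a rough numerical sketch, assuming one is handed a structural distance sd(x,y,v), a proof distance pd(x,y), a complexity measure, and a finite list of candidate codebooks; every function and name here is a stand-in, and the quantity is computed exactly as literally stated in the definition.

```python
from itertools import combinations

def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs) ** 0.5
    vy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (vx * vy) if vx and vy else 0.0

def usefulness(theorems, sd, pd, codebooks, complexity, a=1.0):
    """Compute U = min over v of a*|v| + B, where B is the correlation
    between structural distances (relative to v) and proof distances."""
    pairs = list(combinations(theorems, 2))
    best = float("inf")
    for v in codebooks:
        s = [sd(x, y, v) for x, y in pairs]   # stand-in for d#[St(x u v)-St(v), ...]
        p = [pd(x, y) for x, y in pairs]      # stand-in for dI[GI,T(x), GI,T(y)]
        best = min(best, a * complexity(v) + pearson(s, p))
    return best
```

In the real definition the minimum ranges over all possible codebooks v, not a finite list, so this sketch only bounds U from above for the codebooks actually supplied.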
Mathematically speaking, the usefulness of traditional deductive systems such as Boolean algebra and predicate calculus is unknown. This is not the sort of question that mathematical logic has traditionally asked. Judging by the practical success of both systems, it might seem that their usefulness is fairly high. But it should be remembered that certain parts of D(I,T) might have a much higher usefulness than others. Perhaps predicate calculus as a whole is not highly useful, but only those parts which correspond to mathematics as we know it.
It should also be remembered that, in reality, one must work with dS rather than d#, and also with a subjective estimate of the complexity measure | |. Hence, in this sense, the subjective usefulness of a deductive system may vary according to who is doing the deducing. For instance, if a certain codebook v is very complicated to me, then a deductive system which uses it will seem relatively useless to me; whereas to someone who experiences the same codebook as simple, the system may be extremely useful.
DEDUCTION, MEMORY, INDUCTION
If the task of intelligence is essentially inductive, where does deduction fit in? One way to approach this question is to consider a deductive system as a form of memory. Deduction may then be understood as an extremely effective form of data compaction. Instead of storing tens of thousands of different constructions, one stores a simple deductive system that generates tens of thousands of possible constructions. To see if a given entity X is in this "memory" or not, one determines whether or not X may be derived from the axioms of the system. And, with a combination of deduction and analogy, one can determine whether the "memory" contains anything possessing certain specified properties.
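The memory metaphor can be made concrete with a deliberately trivial sketch: instead of storing every even number up to some bound, one stores a single axiom and a single transformation, and answers membership queries by derivation. All names here are illustrative.

```python
def in_memory(x, bound=1000):
    """Is x in the 'memory' generated by the axiom 0 and the rule n -> n + 2?"""
    n = 0                       # the single axiom
    while n <= bound:
        if n == x:
            return True         # x is derivable, hence 'stored'
        n += 2                  # the single transformation
    return False

print(in_memory(64))   # True: derivable from the axiom
print(in_memory(63))   # False: not derivable
```

The compression is drastic: two rules stand in for hundreds of stored facts, which is the point of the data-compaction view.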
Of course, a deductive system is not formed to serve strictly as a memory. One does not construct a deductive system whose theorems are precisely those pieces of information that one wants to store. Deductive systems are generative. They give rise to new constructions, by combining things in unforeseeable ways. Therefore, in order to use a deductive system, one must have faith in the axioms and the rules of transformation -- faith that they will not generate nonsense, at least not too often.
How is this faith to be obtained? Either it must be "programmed in", or it must be arrived at inductively. AI theorists tend to implicitly assume that predicate calculus is inherent to intelligence, that it is hard-wired into every brain. This is certainly a tempting proposition. After all, it is difficult to see how an organism could induce a powerful deductive system in the short period of time allotted to it. It is not hard to show that, given a sufficiently large set of statements X, one may always construct a deductive system which yields these statements as theorems and which is a pattern in X. But it seems unlikely that such a complex, abstract pattern could be recognized very often. What the AI theorists implicitly suggest is that, over a long period of time, those organisms which did recognize the pattern of deduction had a greater survival rate; and thus we have evolved to deduce.
This point of view is not contradicted by the fact that, in our everyday reasoning, we do not adhere very closely to any known deductive system. For instance, in certain situations many people will judge "X and Y" to be more likely than "X". If told that "Joe smokes marijuana", a significant proportion of people would rate "Joe has long hair and works in a bank" as more likely than "Joe works in a bank". It is true that these people are not effectively applying Boolean logic in their thought about the everyday world. But this does not imply that their minds are not, on some deeper level, using logical deduction. I suspect that Boolean logic plays a role in "common sense" reasoning as in deeper intuition, but that this role is not dominant: deduction is mixed up with analogy, induction and other processes.
To summarize: recognizing that deductive systems are useful for data compaction and form generation is one thing; exalting deduction over all other forms of thought is quite another. There is no reason to assume that deduction is a "better", "more accurate" or "truer" mode of reasoning than induction or analogy; and there is no reason to believe, as many AI theorists do, that deduction is the core process of thought. Furthermore, it seems very unlikely that deduction can operate in a general context without recourse to analogy. However, because deduction is so effective in the context of the other mental processes, it may well be that deduction is essential to intelligence.