Chaotic Logic -- Copyright Plenum Press © 1994

Back to Chaotic Logic Table of Contents

Chapter Seven

SELF-GENERATING SYSTEMS

    In his recent book Self-Modifying Systems in Biology and Cognitive Science (1991), George Kampis has outlined a new approach to the dynamics of complex systems. The key idea is that the Church-Turing thesis applies only to simple systems. Complex biological and psychological systems, Kampis proposes, must be modeled as nonprogrammable, self-referential systems called "component-systems."

    In this chapter I will approach Kampis's component-systems with an appreciative but critical eye. And this critique will be followed by the construction of an alternative model of self-referential dynamics which I call "self-generating systems" theory. Self-generating systems were devised independently of component-systems, and the two classes of systems have their differences. But on a philosophical level, both formal notions are getting at the same essential idea. Both concepts are aimed at describing systems that, in some sense, construct themselves. As I will show in later chapters, this is an idea of the utmost importance to the study of complex psychological dynamics.

7.1 COMPONENT-SYSTEMS

    A component-system, as defined by Kampis, consists of a collection of components, each of which can act on other components to produce new components. More precisely,

An abstract component-system can be defined by the following properties:

a) - there is a finite set of non-dividable and permanent building blocks, drawn from a given pool

b) - there is an open-ended variety of the different types of admissible components, built up from the building blocks according to some composition rule (which may be explicit or implicit)

c) - the components of the system are assembled and disassembled by the processes of the system such that every admissible component is also realizable. (p.199)

    For illustrative purposes, Kampis suggests that the reader visualize the "non-dividable building blocks" as LEGO blocks, and the "admissible constructions" as different possible structures buildable out of LEGO blocks. One must merely imagine that each LEGO structure contains some appropriate means for acting on other LEGO structures to produce new LEGO structures.

    The main biological example of a component system is a "molecular soup" full of organic molecules acting on one another to form new molecules. Psychologically, on the other hand, one is supposed to think of ideas acting on each other to produce new ideas. The central thesis of Self-Modifying Systems is that biological and psychological systems, being component-systems, are fundamentally uncomputable. This thesis combines two distinct claims:

     Claim 1: Formal component-systems display uncomputable behavior.

     Claim 2: Formal component-systems are good models for biological and psychological systems.

    The first claim is a mathematical result, which Kampis calls his "Main Theorem." The second claim, on the other hand, is obviously a scientific hypothesis.

    In this section I will explore these two claims in some detail. This exploration will lead us to a new class of systems called self-generating systems -- a class of systems which is different from, but overlapping with, Kampis's class of component-systems. The contrast between self-generating systems and component-systems will shed a great deal of light on the fundamental issues of system theory.

7.1.1. Quantum and Stochastic Computation

    Before pursuing Kampis's main thesis in any more detail, I will first explore the meaning of the term "computable." My attitude toward computation has been influenced immensely by David Deutsch's (1985) work on quantum computation. Deutsch has demonstratedmathematically that any system modelable by the equations of quantum physics can be simulated to within arbitrary accuracy by a "quantum computer". A quantum computer is different from an ordinary Turing machine. However, it cannot compute any functions besides those which an ordinary Turing machine can compute.     Deutsch's "Quantum Church-Turing Thesis" states that every physically realizable algorithm can be represented as a program for a quantum computer. In fact, this is not really a thesis but a theorem. In this respect it is far more impressive than the ordinary Church-Turing Thesis.

    There is also another Church-Turing Thesis, intermediate between the standard one and the quantum version. One may define a stochastic computer as a Turing machine which is capable of doing "random coin tosses." Then the Stochastic Church-Turing Thesis states that every algorithm can be represented as a program for a stochastic computer. Deutsch has shown that stochastic computation is a less general model than quantum computation; and I will make use of this result now and again in the following.

    Kampis's proof of the uncomputability of component-systems -- Claim 1 above -- says nothing about quantum or stochastic computers. It speaks only of Turing machine computation. In the following I will argue that this omission is important -- that Kampis's component-systems, although they are not Turing computable, may sometimes be computable by stochastic computers. Because stochastic computation is a less general model than quantum computation, this implies that at least some component-systems are explicable in terms of quantum physics.

    One necessary requirement of any theory of complex systems is agreement with microscopic physics. Those component-systems which are not quantum computable, are in contradiction to the principles of physics. What this means is that, in a physical sense, the class of component-systems is too broad.

    Actually, there is a hole in this argument -- a tiny hole, but one which must be duly noted. "Agreement with microscopic physics" is not strictly synonymous with "agreement with quantum physics." In his best-seller The Emperor's New Mind, Roger Penrose briefly discusses Deutsch's theorem, but he dismisses it on the grounds that quantum mechanics will soon be replaced by a unified theory of quantum gravity. The unified theory of quantum gravity, Penrose conjectures, will imply that computable systems are fundamentally uncomputable in the strongsense of being able to compute non-Turing-computable functions.

    The weak point of Penrose's argument, however, is that none of the existing approaches to quantum gravity show any promise of implying uncomputability. For instance, string theory (Green et al, 1987) is similar to quantum theory in its general mathematical form -- it depends on the "quantization" of a classical domain using the Feynman path summation formula. So, if some form of string theory is correct, then it would seem that there is no hope for Penrose's idea.

7.1.2. Kampis's "Main Theorem"

    We have broken down Kampis's central thesis into two claims. The first of these, the "Main Theorem" of Self-Modifying Systems, is as follows:

     Main Theorem. In a component-system it is not possible to know the names and the encoding (the meaning) of the names before the system produces the respective components.... The behaviour of component-systems is fully uncomputable and unpredictable because the produced new observables are different from the earlier ones.

The basic idea here is that the temporal sequence of states of a component-system is in general an uncomputable sequence. Since no Turing machine program can generate an uncomputable sequence, component-systems must be uncomputable.

    It seems to me that the key point is clause (c) of the definition, which says that "every admissible component is realizable." Suppose one assumes that the set of all admissible components is uncomputable, and that the dynamics of a component-system are capable of leading to any admissible component. Then it follows logically that the dynamics of a component-system cannot be specified by any program. For, if one assumed the opposite, one would obtain a contradiction -- one would have an uncomputable set of entities obtainable from a computer program.

    Let us go back to the LEGO metaphor. It would be easy to build a computable LEGO universe following Kampis's instructions. For the set of all LEGO structures is countable, and may therefore be mapped into the set of binary sequences, in a one-to-one manner. And each binary sequence may be represented as a Turingmachine program, i.e. as a map from binary sequences to binary sequences. Therefore, using Turing machines, each LEGO structure could be interpreted as a function acting on other LEGO structures. The only problem with this arrangement is that it does not satisfy clause (c) of the definition of component-system. Not every LEGO structure is realizable by our dynamics. Only some computable subset of LEGO structures is realizable.

    But now -- and here is where my thinking differs from Kampis's -- suppose one adds a random element to one's Turing machine. Suppose each component of the Turing machine is susceptible to errors! Then, in fact, every possible LEGO structure becomes realizable! Structures may have negligibly small probability, but never zero probability! This is an example of a component-system which is computable by a stochastic Turing machine.

    Deutsch has shown that quantum Turing machines are more general than stochastic Turing machines. So what I have shown is that component-systems are perfectly realizable in terms of the equations of quantum mechanics. This implies that there is absolutely no problem with the statement that "molecular soups" or brains are component-systems. But there is a problem with the statement that these systems are "fundamentally uncomputable." The Turing model of computation is not in general physically adequate. But quantum computation, something quite similar to Turing computation, is always physically adequate, at least so far as our present knowledge of physics goes. And something that is quantum computable is not, in a philosophical sense, fundamentally uncomputable.

7.1.3. Self-Constructing Robots

    To put these ideas in sharper focus, let us now turn to a metaphor which Kampis introduces around the middle of the book: the self-constructing robot. This idea is a natural extrapolation of modern industrial technology.

    Right now, in Japan, there are robotized factories -- factories in which routine assembly-line tasks are carried out by robots rather than people. These are not humanoid robots like C3PO in Star Wars. They look like what they are: sophisticated factory tools. But their capabilities are astounding -- they combine the spatial common sense of a human worker with the speed and precision of a calculator. In fact, it is not unlikelythat, somewhere in Japan, there are detailed plans for a factory in which robots are used to build more robots.

    And, of course, the industrial use of robots is not restricted to manufacturing. It is well within contemporary technology to use robots for repair. It is not yet profitable to use robots to repair robots, but this is because of simple technical problems, not fundamental engineering obstacles.

    The point of all this is: if a robot can repair other robots, why not itself? And if a robot can repair itself, why not reconstruct itself, even when it is not broken? It is not too far beyond current technology to build a robot that reconstructs itself. There is no reason not to build a robot whose software (brain) tells it how to reconstruct its hardware (body, including brain). Such a self-constructing robot would embody an enchanting sort of loop: self constructing new self, which constructs new self, which.....

    But finally, suppose that someone builds a self-constructing robotw which is, however, imperfect. It sometimes makes slight random errors; its arms don't always move quite exactly the way they are supposed to. Then in classic chaotic form, as time goes by, these slight random errors can be expected to build up into large errors. One has a fundamentally unpredictable sequence of machines. There is no telling exactly what the robot will make of itself, given say fifty years time.

    To me, a self-constructing robot with errors seems like a wonderfully creative thing! But Kampis's argument is precisely the opposite. In one particularly striking passage, Kampis characterizes a component-system as "a strange computer in which also the software is identified with the hardware." Elaborating, he declares that

a component system is a computer which, when executing its operations (software) builds a new hardware.... [W]e have a computer that re-wires itself in a hardware-software interplay: the hardware defines the software and the software defines new hardware. Then the circle starts again. (p. 223)

    To me, this sounds exactly like the self-reconstructing robot which I have just finished talking about. But Kampis has something else up his sleeve. He does not believe in the Church-Turing thesis. He believes that a "component-system" is nonprogrammable, inthe sense that no algorithm, no set of rules, can completely describe its behavior. He follows up the previous quotation with a warning to the computationally-inclined:

[A] sceptical reader could say: that's not a big deal. With current-day industrial robot technology this should be possible. Robots are automata; they are computers. They can assemble other robots, maybe even themselves. They have a complete behavior algorithm. So, by analogy, component-systems, too, can have one.

    But this is not as easy a matter as it sounds. In a robot the whole software is ready-made and completely defined from the beginning on, and is stored in an accessible form; in a component-system, according to the above story, the "algorithm" is nowhere stored completely; software and hardware define each other without any of them being complete or independent.

The paradigm case of a component-system, according to Kampis, is a "soup" of organic cells. Each cell acts on each other cell, thus creating other cells, and there is no distinction between software and hardware.

    But let us consider, once again, our imperfect self-constructing robot. This robot is programmed to modify its own hardware, but it is susceptible to random error. Then it is quite possibly true that no computer program can predict the behavior of the robot. For the collection of all possible times and places for random error is very large, and the collection of sets of times and places for random error is even larger. To predict the behavior of the robot, a computer program would have to predict what would happen to the robot given each possible set of random errors. But, for any program of finite length, there is some set of random errors which cannot be compressed into any program of that length.

    The moral of the story is that, in the case of the self-reconstructing robot, stochastic computation does what Turing computation does not. It gives the potential for true flexibility; for self-referential creation of the fundamentally, indisputably new. While component-systems cannot be Turing computable, they can be stochastically computable. This observation casts a revealing light on the distinction between component-systems and Turing machines.

7.1.4. Creativity

    To put the same point another way, I respectfully accuse Kampis of having an overly mystical notion of creativity. He complains that computer programs can never create anything beyond what has been put into them -- a very old argument. This is true in the same sense that mathematical theorems are never original creations -- they are all contained in the basic axioms of mathematics. But, even if one accepts this strict notion of creativity, it still does not follow that stochastic computers are noncreative.

    In general, a stochastic computer has a range of output that is incredibly wide, often uncomputable. In fact, one may very easily construct a stochastic computer that has the capability to construct anything whatsoever, just by chance. So stochastic self-reconstructing computers suffer from no lack of potential creativity. How much of this creativity is actualized depends on the intricate interaction of the deterministic and stochastic components.

    This brings us to a basic principle of systems theory: the essence of creativity is the interplay between rules and randomness. This ancient concept, which received its modern form in the work of Ross Ashby (1954), is one of the most humanly meaningful implications of the computer revolution. It is humbling to realize that even the most marvelous works of the greatest geniuses -- Einstein's General Theory of Relativity, Goethe's Faust, Beethoven's Fifth Symphony -- were produced by a complex combination of random chance with strict, deterministic rules. Kampis does not wish us to accept this. But if one is to accept that modern physical science applies to neural processes, then one must, I suggest, accept the equation of creativity with quantum computation.

    As an afterthought, it is worth briefly questioning the role that random chance plays here. From the point of view of any one computer -- be it Turing, quantum or stochastic -- there are certain deterministic sequences of events that are fundamentally indistinguishable from random sequences of events. These are sequences whose algorithmic complexity exceeds the algorithmic complexity of the computer who is doing the distinguishing. Gregory Chaitin (1974, 1987) has shown that this statement is essentially a form of Godel's Incompleteness Theorem.

    So, from any one subjective point of view, there is no way of telling if some perceived entity is stochastically computed, or just plain Turing computed. Now, although component-systems are not Turing computable, we have seen that they can be stochasticallycomputable. It follows that, from any one subjective point of view, a component-system might as well be Turing computable! To any particular entity, "random" just means "too complex for me to understand."

7.2    SELF-GENERATING SYSTEMS

    Nowhere in Self-Modifying Systems does Kampis give an adequate formal definition of "component-system." To my mind, this is the only sizeable flaw in an otherwise outstanding book. This omission is particularly crucial in that it makes it difficult to mount conceptual attacks against Kampis's nonprogrammability thesis.

    Kampis gives a fairly good reason for this significant omission:

[C]oncepts of formal dynamics do not fit well to component-systems.... [W]hen we consider component-systems as systems which produce components from components from components, we may, by the same token, think of transformations producing directly other transformations: ft: ft --> ft'.

    There is a formal problem with this idea. From a mathematical-logical point of view no mathematical function can belong to its own domain or range. However, the functions that describe component-systems try to do exactly this, if we take them literally. (p.212)

But in fact, two hundred pages later, Kampis admits that this complaint is not strictly accurate. He refers to Lofgren's (1968) demonstration that the existence of functions belonging to their own domain or range is independent of the ordinary axioms of set theory.

    As it happens, Lofgren is not the only researcher to point out the existence of set theories in which functions can belong to their own domain or range. From my point of view, Paul Aczel's AFA axiom provides a much more elegant approach to constructing such unusual functions. So, let us digress for a few paragraphs to describe the AFA axiom.

7.2.1. Hypersets

    In mathematics, one defines complex concepts in terms of simpler concepts. But this process must bottom out somewhere -- there must be something that is sosimple that there is nothing simpler in terms of which to define it. In modern mathematics, this elementary concept is usually taken to be the "set." The term "set" may be defined intuitively, as a collection of entities -- but of course, this is circular, since what is a "collection" if not a "set"?

    Mathematicians take this intuitive definition of a set, and then postulate certain simple rules for dealing with sets. All the complex constructions of modern mathematics can be expressed in terms of sets, and a great many of the theorems of modern mathematics can be proved by using the rules of set theory. However, in the 1930's Kurt Godel showed that, given any particular list of rules for manipulating sets, there are some mathematical theorems that can be expressed in the language of sets, but cannot obtained by using the rules on the list. Essentially this is because each list of rules has a certain finite amount of "algorithmic complexity", and cannot be used to prove theorems that possess an algorithmic complexity in excess of this amount.

    At first, mathematicians were very loose about what qualified as a set. They dealt with finite sets like {1,2,3}, infinite sets like (1,2,3,...}, the set of all fractions, or the set of all numbers on the number line; and also with abstract sets far more esoteric than these. But since the concept of set never caused them any trouble, they had no motivation to fiddle with it.

    But then, around the turn of the century, Bertrand Russell noted a problem. He said, consider the set containing all sets that do not contain themselves. And he asked: does this set in fact contain itself? The trouble is, what if it does contain itself? Then it is not a set that does not contain itself -- so it cannot be an element of itself, it cannot contain itself. But, on the other hand, if it does not contain itself then it must be an element of itself, since it is the set of all sets that do not contain themselves. A serious problem!

    Incidentally, for years this "Russell Paradox" has been formulated as follows: "There is a town in which the barber shaves all men who do not shave themselves. Who shaves the barber?" However, someone has wittily pointed out that this version is less potent than the original. The solution is: the barber is a woman!

    In order to avoid Russell's Paradox, a rule was added onto the original axioms of set theory: no set can contain itself as an element. More generally, there can be no "descending chain of membership" -- no set can Acan contain as an element a set which contains A as an element, and so on.

    Many mathematicians were uncomfortable with this rule, since it was not "natural", it was simply appended onto the list of rules in order to avoid contradiction. And this discomfort became especially acute after Godel came out with his Incompleteness Theorem. For Russell's Paradox is essentially a variation on the old Paradox of Epiminides the Cretan: "This sentence is false." But Godel's Theorem is an even cleverer variation on this same ancient paradox. Godel showed that, by implementing a clever scheme of coding, one can use any mathematical system to form the proposition X = "This proposition cannot be proved true or false within this mathematical system." If X can be proved true, then it must be false, in which case it is true, so it cannot be proved true. But if X can be proved false, then it must be false, so it has been proved true, and has therefore not been proved false.

    Godel, with his ingenious Incompleteness proof, showed that self-reference cannot be banned from mathematics anyway, no matter how hard you try. This made Russell's elaborate theory of Types seem even more excessive than it had before. But still, since mathematicians never seemed to have any use for sets that contained themselves as elements, they simply accepted the axiom and went on doing mathematics.

    However, while working with various models of complex systems such as ecosystems and brains, I found that I did have a need for sets that contained themselves as elements. I spent a long time trying to concoct ways of avoiding this problem. But then, while sightseeing in Cambridge and browsing through the MIT Bookstore, I came across a little book by Paul Aczel, entitled Non-Well-Founded Sets. This book describes a research programme in mathematical logic, active since the late 1960's, aimed at constructing a consistent set theory involving sets that can contain themselves as elements.

    The most easily applicable result of this intriguing research programme is the concept of a hyperset. A labeled graph is defined as any collection of dots with a symbol drawn next to each dot, and arrows drawn between the dots. Aczel's "AFA Axiom" implies that every finite graph corresponds to some set. For instance the graph

            Subjectivism

        A     Objectivism

            Mysticism

corresponds to the set A = {objectivism, subjectivism, mysticism}. And the graph

                    Subjectivism

        A             Objectivism

                    Mysticism

                Botulism

                Aneurism

corresponds to the set A = {{objectivism, subjectivism, mysticism}, botulism, aneurism}.

    But what of the graph

        A    ?

This graph corresponds to the set A whose only element is A -- the set A = {A}. This is a "non-well-founded" set.

    And, similarly, the graph

        A        botulism

                    objectivism

                    subjectivism

corresponds to the set A = { A, botulism, {objectivism, subjectivism, A} }.

    To the mathematically indoctrinated mind, all this is incredibly liberating! Just as the paradoxes of quantum physics free the mind from objectivism, so hyperset theory frees the mind from the stifling preconception that, if A contains B, B cannot in turncontain A. Common sense tells us that, if mind is a part of physical reality, then physical reality cannot possibly be a part of mind. And common sense tells us that, if you are a part of my subjective reality, then I cannot possibly be a part of your subjective reality. But hyperset theory tells us that in this case common sense is wrong.

    According to Godel's Theorem, once can never mathematically prove that a complicated mathematical theory is consistent, devoid of self-contradictions. But Aczel has shown that, if there are contradictions in hyperset theory, then there are also contradictions in plain ordinary mathematics, the kind that every scientist uses to make calculations. This is as good a consistency result as one could hope for. One may confidently say: there are mathematical objects that contain one another as elements.

    In the following pages, I will not need to use any of the technical mathematics of hyperset theory. However, I will find it convenient to talk about sets that contain one another in a "circular" way. Hyperset theory ensures that this is okay, that I am not contradicting myself any more than a physicist is when he deals with algebraic or differential equations.

    This section began, if you recall, with Kampis's observation that "from a mathematical-logical point of view no mathematical function can belong to its own domain or range. However, the functions that describe component-systems try to do exactly this, if one takes them literally." It is easy to see that with hypersets, mathematical logic has transcended the limitation to which Kampis is referring. One particular type of hyperset is the hyperfunction -- the function which is contained in its domain or range.

    A hyperfunction maps hyperfunctions and/or other entities into hyperfunctions and/or other entities. Because it is a "function," it is not allowed to map any one thing into two different things. To deal with the more general case, I will introduce the term hyperrelation. A hyperrelation maps hyperrelations and/or other entities into hyperrelations and/or other entities; and it may map one thing into as many other things as it likes.

    These odd constructions, hyperrelations, are the first step on the path toward a cognitive equation. For they give us a straightforward way to talk about components that truly transform one another.

7.2.1.1. A More Formal Treatment (*)

    Recall that, in order to avoid Russell's Paradox, a rule called the Axiom of Foundation was added onto the original axioms of set theory: no set can contain itself as an element. More generally, one is not allowed to have an infinite descending sequence

    ... an+1 an ... a1 b

A set b which contains no such sequence is well-founded. All the traditional sets of mathematics -- the sets involved in geometry, calculus, topology, etc. -- are well-founded. But, for example, S = {S} is not well-founded, because it leads to the infinite descending sequence

    ... S S ... S S

And { 1, { 1, { 1, ... is not well-founded, even though it has the natural "solution"

    x = {1,x}

Many mathematicians were uncomfortable with this rule, since it was not "natural", it was simply appended onto the list of rules in order to avoid contradiction. But since they never had any use for sets that contained themselves as elements, they simply accepted the axiom and went on doing mathematics.

    Paul Aczel (1988) was one of the few who decided to do something about his discomfort. He constructed the "non-well-founded sets" which, following Jon Barwise and Larry Moss (1991), I have called hypersets. According to Aczel's approach, the path to hypersets begins with graphs. A digraph (G,E) consists of a set G of entities called "nodes," and a set E of ordered pairs of nodes, these pairs being called edges. The most common examples of graphs are finite graphs, as in Figure 1; however, the concept of an infinite graph presents no difficulties. If (n,m) is an element of E, I will write n-->m, and call m the child of n, and n the parent of m. Fix a set A of tags. Then a tagged digraph (G,E,t) is a digraph together with a function t that assigns a tag drawn from A to each childless node of G.

    Next, define an accessible pointed graph (apg) (G,E,t,p) to consist of a tagged digraph together with a distinguished node p which has the property that every node can be reached by some finite path from p. And define a decoration of an apg as a set-valued function d with domain G, satisfying

    d(n) = t(n)

if n is childless and

    d(n) = {d(m):n-->m}

otherwise. That is, a decoration assigns to each childless node its tag, and to each parent node n those nodes m which are its children.

    Finally, let us say that an apg pictures a set b if there is a decoration d of the graph so that d(p)=b; that is, so that b is the set which decorates the distinguished node. This permits us to state Aczel's Anti-Foundation Axiom (AFA), which characterizes hypersets:

     Every apg pictures a unique set.

According to this definition, all the sets of standard set theory are still sets. But there are other sets too. Anything which is a set according to this definition, but not the classical definition is a hyperset. For example, consider again the following graph:

        A

There is only one node, so it must be the distinguished node. What is the unique set pictured by this apg? It must be the set A = {A}! However, according to the ordinary axioms of set theory, no set can contain itself.

    Hypersets can contain themselves. One might at first think that this would lead to contradictions, but Aczel has shown that if ordinary set theory is consistent, so is set theory augmented by AFA. In addition, he has proved a very useful result called the Solution Lemma. Roughly speaking, the Solution Lemma states that every system of equations in indeterminates x,y,z,..., say

    x=a(x,y,...)

    y=b(x,y,...)

    ...,

has a unique hyperset solution.

    For a deep mathematical treatment of hypersets, the reader is referred to Aczel's (1988) original monograph on the subject. However, the clearest discussion of the fundamentals of hyperset theory which I have found is ina delightful little book by Jon Barwise and John Etchemendy entitled The Liar (1988). This book contains a more rigorous statement of the Solution Lemma.

    As an example, let us construct a function which maps itself into itself -- a function so that f = f(f). If f takes no other arguments besides itself, then by the standard definition f = { (f,f) } = { { f, {f} } }. f is then the solution of the system of equations

w = {f}

x = {f,w}

f = {x}

Graphically, one has

            w        f

                    x        

More generally, if f=f(y) and if is only defined on y, f is the solution of

w={f}

x={y,w}

f={x}

And it is clear that, in a similar manner, one may cast any system of expressions of the form

f1(f1,...,fn,x1,x2,...) = fi(1)

...                                      (*)    

fn(f1,...,fn,x1,x2,...) = fi(n)

in the form required by the Solution Lemma and thus obtain a hyperset solution.

    Getting back to Self-Modifying Systems, I suggest that component-systems should be conceptualized in terms of systems of hyperfunctional equations of the form (*) given above; and, more generally, hyperrelational equations defined in a similar way.

7.2.1.2. Fuzzy Hypersets (*)

    Although fuzzy sets are now commonplace in artificial intelligence, so far as I know fuzzy hypersets have never before been discussed. Fortunately, therewould appear to be no particular problems involved with this useful idea: the basic mathematics of fuzzy hypersets, at least as far as I have worked it out, is completely straightforward.

    The simplest example of a fuzzy hyperset is the set x defined by:

    dx(x) = c,

    dx(y)=0 for y not equal to x.

Here, if c=0, one has an ordinary well-founded set, namely the empty set. If c=1, one has the set x={x}. Otherwise, one has something inbetween the empty set and x={x}.

     Each fuzzy hyperset is characterized by a fuzzy apg, which is exactly like an apg except that each link of the graph has a certain number in [0,1] associated with it. The Fuzzy AFA then states that each fuzzy apg corresponds to a unique fuzzy set. It is easy to see that the natural analogue of the Solution Lemma holds for fuzzy hypersets. And, of course, the consistency of fuzzy hypersets with the axioms of set theory (besides the axiom of reducibility) follows trivially from the fact that each fuzzy hyperset is, in fact, a hyperset under AFA.

7.2.2. Self-Generating Systems

    Kampis's examples of component-systems are both relevant and elegant: mobile interattracting LEGO blocks, enzyme systems, self-modifying robots,.... However, for reasons given above, I do not find his formal definition of component-system entirely satisfactory. Thus I think it is worthwhile to define a closely related type of system called a "self-generating system."

    A self-generating system, at each time, consists of a collection of components which are modelable as "finitely given hyperrelations" -- meaning that they are defined by their actions on a finite number of different possible components. Each component may be thought of as having a certain degree of membership in the system, with the constraint that the total degrees of all the components should be finite.

    Self-generating dynamics is defined as a two-stage process. First, universal action: each component acts on each other component with a certain probability, yielding different new components with different probabilities. Then, transformation: these resultant components are transformed in some way, yielding a new collection ofcomponents. The results of transformation are then fed back into the first step, and used as fodder for universal action.

    The transformation rule may be stochastic -- for instance, it may make errors. It may change f into w by mistake 4% of the time. Or, on the other hand, it may simply make a random addition or deletion to the definition of each component a certain percentage of the time. Given f which does not act on g at all, it may randomly define the action of f on g, or it may define the action of f on g by some complex formula combining deterministic and stochastic elements. By use of randomness, the transformation rule could generate dynamics that are fundamentally unpredictable, in the sense of being non-Turing-computable.

    The connection between self-generating systems and component-systems is quite simple. Not every component-system is a self-generating system; but I propose that every physically useful component-system is actually a self-generating system. Note that, since the definition of self-generating systems is phrased in terms of hyper systems, it is perfectly natural for the number of components to shift in the course of evolution.

    Now, the converse is not true: not every physically useful self-generating system is a component-system. The reason is that the self-generating dynamical equation is capable of describing totally deterministic processes. Component-systems can only be obtained if the function R is allowed to contain random elements, although rather good simulations of component-systems can be obtained using chaotic or "pseudorandom" deterministic functions. Thus the class of component-systems and the class of self-generating systems possess a nontrivial intersection. I propose that real complex systems lie in this intersection.

7.2.2.1. A More Formal Treatment (*)

    Define an hyperrelation to be finitely given if its associated apg is finite, and the labels of its associated apg are all encodable as finite binary sequences. Line up the set of all finitely given hyperrelations in some arbitrary order: f1, f2, f3,.... Given this ordering, an hypersystem may be defined as an infinite vector C = (p1,...,pn,...), where the pi are nonnegative and the sum p1 + p2 + ... + pn + ... is finite. (In functional-analytic lingo, the set of hypersystems is therefore isomorphic to the space l1+).

    The entry pi is to be interpreted as the degree with which fi belongs to the hypersystem represented by C. For instance, in the context of enzyme systems, pi would denote the concentration of the enzyme fi in the solution in question. In the deterministic case, an important but special situation, all the pi are assumed to be 1 or 0.

    Also, I will use the notation xit to denote "inanimate objects": entities which can be acted on, but cannot act. I will not refer to these very often in the following, but they may be useful in certain applications.

    To each hypersystem Systemt, associate a set of "action products"

        A[Systemt] = {fit(fjt), fit(fjt,xkt)}

Here the indices i and j run over all hyperrelations that have a nonzero probability of membership in Systemt.

    Next, define a "filtering operation" Yt which determines, based on the degrees of the elements in Systemt, the degrees of the elements in A[Systemt]. The only restriction is that, if either f and g have degree zero in Systemt, then Yt cannot assign f(g) or f(g,x) a nonzero probability. In the deterministic case, Yt is only a formality, for it may be assumed that all probabilities are either 1 or 0. But in the most extreme fuzzy case, it may come about that A is the formality, leaving Yt to do most of the work.

    Finally, one may collapse these two operations into one composite operation R[Systemt] = Yt(A[Systemt]) -- the "Raw Potentiality" operation. Then, where T is some stochastically computable function mapping hypersystems to hypersystems, one may define a self-generating system as an iteration

    Systemt+1 = T( R[Systemt] ),      (**)

Here the hypersystem System1 is considered to be given a priori, yielding a dynamic iteration.

7.2.3. Hypersets and Physical Reality

    One would like to think of component-systems and self-generating systems as models of real physical complex systems. At first sight, however, there seems to be a serious obstacle in the way of this interpretation. Hyperrelations are peculiar set-theoretic objects. Formally, in Aczel's construction they are defined asequivalence classes of sets of ordinary sets (this construction is somewhat analogous to the construction of real numbers as equivalence classes of Cauchy sequences of rationals). This places them in a cardinality class far, far above the countable computable sets, and also far, far above the Hilbert- space-defined sets which quantum mechanics associates with physical reality.

    However, interpreted properly, a finite system of finitely given hyperrelations does not violate the stochastic Church-Turing thesis, since any such system of equations can be simulated on a stochastic computer. A stochastic computer can never actually contain hyperrelations, but if they are finitely given, it can simulate their behavior easily enough. After all, manipulations with finitely given hyperrelations are merely manipulations with finite graphs!

    Physically, what does this mean? While quantum physics does not permit the existence of physical hypersets, it does permit physical events that are effectively modeled as finite labeled graphs. Now, suppose that the interactions of some of these physical events can be modeled as interactions between finite labeled graphs, and that these graph interactions are usefully describable using self-generating systems or component-systems. Then these mathematical systems are emergent patterns in physical reality. No contradiction. No problem. Physical reality can simulate component-systems; or, to put it another way, the reality of component-systems can be understood as a "virtual reality" running on the hardware of quantum reality.

    In sum, I suggest that Kampis's picture of complex system behavior is fundamentally right. Complex systems consist of components that act on one another to create new components. Thus they effectively violate the hierarchy of logical types; they contain emergent patterns which are usefully modelable in terms of stochastic systems of hyperrelations.

    But, on the other hand, as already mentioned, Kampis's theory of creativity is flawed. The creativity of a complex system is due both to the unfolding of the rules implicit in its components, and to the mutation of these rules by random error. The opposition which Kampis has set up, between computation and component-systems, is in my opinion a false one. The difference between simple systems and complex systems is not that the former are computable, but that the latter contain emergent structures which are modelable in terms of stochastic hyperfunctional iterations (self-generating systems). The concept of self-generating systems makes this point in a very clear way.

7.2.3.1. A Binary Model of Hypersets (*)

    To make this conclusion more concrete, let me construct a specific computational scenario which gives rise to hypersets as a natural model. Suppose one has a finite collection of computable relations f, g, h,..., each of which maps binary sequences into binary sequences. Then one may represent each relation by a finite code sequence, e.g.

sf = 010100101111010010...01

sg = 010111101001001010...10

....

And one may define the action f(g) as the result of the following two-step process:

    1) letting the program f act upon the sequence sg, producing a new sequence sfg.

    2) Decoding sfg into a program, by selecting the first (in alphabetic order) from among all programs h for which the Hamming distance d(sh,sfg) is at its absolute minimum. This "first closest" program is taken as f(g).

    The relations f, g and h are computable relations -- but there is aboslutely nothing wrong with thinking of them as hyperrelations, acting directly on each other. The whole system may be elegantly modeled as a system of hyperrelations, without ever referring to the underlying bit-string manipulations. This requires no new information, only a shift in point of view.     I will refer to this system for deriving hyperrelations from computable relations as the basic computational model.

7.3 MAGICIANS AND ANTIMAGICIANS

Coauthored with Harold Bowman

    Our treatment of self-generating systems has up to this point been purely formal. However, one may also describe self-generating systems in much a less mathematical way, in the guise of self-referential sentences (Hofstadter, 1985). For example, suppose that the complete formal definitions of the hyperrelations f, g and h are given by:

f(f) = g, f(g) = h, f(h) = f, f(g,h) = g

g(g) = g, g(f) = h, g(f,h) = f

h(f,g) = h, h(g,h) = g, h(h) = f

This looks terrible. To make it a little prettier, let us rename "f," "g," and "h" as "Fanny," "Geronimo" and "Hattie." Then one may represent this same collection of definitions as follows:

This sentence, which is named Fanny, turns itself into Geronimo, it turns     Geronimo into Hattie, it turns Hattie into itself, and it turns the pair 'Geronimo and Hattie' into Geronimo.

This sentence, which is named Geronimo, turns itself into itself, it turns Fanny     into Hattie, and it turns the pair 'Fanny and Hattie' into Fanny.

This sentence, which is named Hattie, turns itself into Fanny, it turns the pair     'Geronimo and Hattie' into Geronimo, and it turns the pair 'Fanny and Geronimo' into Hattie.

This says the same thing as the preceding group of mathematical definitions, but it is a little more colorful. The best way to visualize the situation is to think of Fanny, Geronimo and Hattie as a group of three over-active magicians. Each one has a spell to turn each one into someone. For instance, Fanny has a spell to turn herself into Geronimo; she has a spell to keep Hattie the same as she was, and the has a spell to turn the combined group Geronimo/Hattie into Geronimo only.

    Recall that a self-generating system is a stochastically computable rule for evolving populations of finitely given hyperrelations. This is a very general definition; it leaves a lot of freedom. For starters, therefore, let us consider the simplest possible situation. Given a collection of hyperrelations {f,g,h,...}, one can form a vast variety of "compounds" of the form f(g), f(g,h), h(g), and so forth. One of the very simplest self-generating systems says: "given a collection of hyperrelations, replace it with the collection of all compounds which one can form from it."

    For instance, one may think about self-generating systems in the context of our earlier Fanny/Hattie/Geronimo example. Each magician had spells for changing magicians (or group of magicians) into othermagicians. So the simplest self-generating system involving these magicians consists of

    1) each magician applying all the spells she knows, thus creating a new collection of magicians

    2) the new collection of magicians then applying all the spells they know

    3) etc.

    It is easy to see that, in the case of our simple example, if one starts with all the magicians present, then all the magicians are so proficient that the group of three magicians will persist forever. If one starts with just Fanny and Geronimo, they will immediately produce Hattie, and the same will be true. Similarly, if one starts with just Fanny and Hattie, they will immediately create Geronimo. But on the other hand, if one starts with Geronimo and Hattie, the two of them will never be able to produce Fanny. Or if one starts with just Fanny, she will immediately turn herself into Geronimo, who will then perpetuate herself forever....

    A group of somewhat less proficient magicians is provided by the following set of rules:

f(f) = g

f(g) = f

g(g) = g

g(f) = h

h(f) = g

If one takes

{f,g}                time = 1

this rule produces the collection {f(f),f(g),g(f),g(g)} = {g,f,g,h} = {f,g,h}, or

{f,g,h}            time = 2

And iterated once again, the rule produces {f(f),f(g),f(h), g(f),g(g),g(h),h(f)}, or

{f,g,h}            time = 3

In this particular case, after two steps, our dynamical rule has reached a fixed point. No matter how many times one keeps iterating, one will keep on obtaining {f,g,h}.

    In "magician-language", what one has here is

This sentence, whose name is Fanny, turns itself into Geronimo and turns     Geronimo into Fanny.

This sentence, whose name is Geronimo, turns itself into itself and turns     Fanny into Hattie

This sentence, whose name is Hattie, turns Fanny into Geronimo

Starting from Fanny and Geronimo, as above, one will immediately get all three magicians.

    More generally, a self-generating system is simply a rule which determines a "range" collection of hyperrelations from the compounds formed by another "domain" collection of hyperrelations. In these simple examples, I have taken collection of compounds itself as the range collection. But this is not the only way to do things. As an example, one could consider the rule: "given a collection of hyperrelations, replace it with two hyperrelations randomly drawn from the collection of all compounds which one can form from it." The reader may determine for herself the possible evolutionary courses which our collection {f,g} may take under this rule.

7.3.1. Antimagicians

    Both of the "magician" examples discussed above were of the simplest kind. All possible compounds were generated and kept. However, this type of self-generating system does not appear to be capable of generating particularly complex behaviors. To get the full range of dynamical behaviors, one must provide some way for compounds to eliminate one another. For instance, in addition to our three faithful magicians Fanny, Geronimo and Hattie, one may introduce three antimagicians, called anti-Fanny, anti-Geronimo and anti-Hattie. And one may modify the rules of our game accordingly. At each time step, the following three processes are executed:

first, all magicians cast all their spells

second, all magicians whose anti-magicians have been created are eliminated

third, all anti-magicians are eliminated

    For instance, suppose one has

This sentence, named Fanny, turns itself into Geronimo, turns Geronimo into     Hattie, and turns Hattie into anti-Fanny

This sentence, named Geronimo, turns itself into Fanny, turns the pair 'Fanny     and Geronimo' into Hattie, and turns Hattie into anti-Hattie

This sentence, named Hattie, turns itself into Hattie, turns Fanny into Hattie,     and turns Geronimo into Fanny

Without anti-magicians, Hattie would be self-perpetuating. But the anti-magicians change all that. Suppose one starts with

Geronimo and Hattie     time 1

Then this evolves into the interim population

Fanny, Hattie, anti-Hattie

The magician and its anti-magician self-destruct, leaving

Fanny                 time 2

Then Fanny creates Geronimo, who creates Fanny, who creates Geronimo, and so on ad infinitum...

Geronimo             time 3

Fanny                 time 4

Geronimo             time 5

Fanny                 time 6

Geronimo             time 7

Fanny                 time 8

...                 ...

    On the other hand, suppose one starts with

Fanny, Geronimo         time 1

Then one has

Geronimo, Fanny, Hattie     time 2

and from this one gets the interim population

Fanny, anti-Fanny, Geronimo,

Hattie, anti-Hattie

resulting in

Geronimo             time 3

and yielding the same attracting cycle as before:

Fanny                 time 4

Geronimo             time 5

Fanny                 time 6

Geronimo             time 5

Fanny                 time 6

...                 ...

    This sort of behavior, with Hattie and Fanny appearing and then disappearing, is much much easier to set up with anti-magicians than without then. I will show a little later exactly how much more computational power is yielded by the introduction of anti-magicians.

    To get even more interesting behavior, one must stochasticize the dynamics. For example, one could replace the three processes of our "antimagician" iteration with the following three processes, to be executed at each time step:

first, each spell of each magician is cast with a certain fixed

    probability (call it p)

second, all magicians whose anti-magicians have been created are eliminated

third, all anti-magicians are eliminated

This creates an unpredictable iteration: if one runs it several times, one may obtain many different results, because there is no telling which spells will be chosen. The reader is encouraged to explore the consequences of "stochasticizing" our Fanny/Geronimo/ Hattie example.

7.3.2. Self-Generation and Computation

    So self-generating systems can be considered as models of real systems. But what kind of behavior can they model? In fact it is not too hard to prove that they can model any kind of behavior at all. Harold Bowman and I have constructed a very simple argument which shows that self-generating systems are capable of universal computation. This means that any possible behavior can be mimicked by some self-generating system.

    Specifically, as hinted above, it turns out that simple systems of the Fanny/Geronimo/Hattie variety are not enough. One needs to introduce anti-magicians as well. But if one does this, then one very easily obtains a recipe for constructing a self-generating system tosimulate any given computer. The basic idea is that systems with anti-magicians give one the ability to express the two fundamental operations of conjunction (AND) and negation (NOT). Since all computers can be built of AND and NOT gates, it follows (with a little work) that this type of system is a universal computer.

    The AND gate is easy; it can be done without anti-magicians. Let's say one wants Fanny to create Josie only if both Geronimo and Hattie are present. Then one needs merely say

This sentence, named Fanny, turns the pair 'Geronimo and Hattie' into Josie

By simply not specifying Fanny to turn Geronimo individually or Hattie individually into Josie, one makes Fanny into an AND function.

    Of course, in the context of the whole system, it is possible that Geronimo or someone else will turn Geronimo or Hattie into Josie -- but if one wants Fanny to be a good AND, one must design one's system to prevent this from happening at the same time that Fanny is operating as an AND.

    Incidentally, it is worth noting that

This sentence, named Fanny, turns Fanny into Fanny, turns the pair 'Geronimo and     Hattie' into Josie, and turns Hattie into Geronimo

serves the purpose of executing the AND operation as well. Extra spells are allowed, so long as they do not interfere.

    To get NOT, on the other hand, one must proceed as follows. Let's say one wants Fanny to create Hattie only if Geronimo is not present. Then one needs to specify

This sentence, named Fanny, turns itself into Hattie, and turns Geronimo into anti-    Hattie

If Geronimo is not present, then Fanny produces Hattie, so Hattie is introduced into the next population (assuming no one else is out there producing anti-Hatties). But if Geronimo is present, then Fanny still acts on itself to produce Hattie, but it also acts on Geronimo to produce anti-Hattie. The two cancel out, and one is left with no Hattie (assuming no one else is out there producing Hatties).

7.3.3. Imperfectly Mixed Computation

    The biggest lesson of the computer revolution is that by piecing ANDs and NOTs together one can do just about anything. The reasoning of the past few paragraphs, elaborated appropriately, leads to the consequent conclusion that self-generating systems (in particular "antimagician" systems") can do just about anything.

    But this is just the tip of the iceberg. The next question is, what about stochastic "antimagician" systems? What if, at each stage, only a certain percentage of possible compounds are formed? This is the case, for example, in real chemical solutions: not every conceivable compounds forms at every moment. In chemical parlance, deterministic self-generating systems correspond to infinitely "well-mixed" solutions, whereas stochastic self-generating systems correspond to the more realistic case.

    It turns out that, even in the stochastic case, it is possible to construct "antimagician" systems which carry out universal computation -- to within any specified level of accuracy short of perfection. The trick is an appropriate use of redundancy. For instance, if instead of just doing NOT with one hyperfunction, one does it simultaneously with a sufficiently huge number of hyperrelations, one is bound to get it right an arbitrarily high percentage of the time.

    What this result shows is that one can build a viable computer in which each connection between components has only a certain probability of existing. One may build a computer out of "components" that circulate around each other, sometimes combining with one another to produce new components, sometimes not. Computation can self-organize from an imperfectly-mixed-up substrate.

    Applied to biological and psychological systems, this conclusion would seem to have profound consequences. The dual network model views the mind as a collection of processes, interacting with one another and constantly creating new processes. The ideas of this sections suggest that these general "processes" may perhaps be fruitfully modeled as interlocking self-referential statements -- as simple statements about how other processes, and they themselves, are to be transformed. This is an intriguing insight, and an important step on the path to the "cognitive equation" of Chapter Eight.

7.4. ARRAY COMPONENT-SYSTEMS (*)

    In Section 7.3 I gave a simple "reductionist" model of hyperset dynamics -- the "basic computational model." In this section I will briefly digress to describe a more interesting elaboration of the same fundamental concept. Instead of mapping functions into sequences in an arbitrary way, I will demonstrate how one might elegantly systematize the coding and decoding of functions and sequences.

7.4.1. Array Operations

    Let us begin with the concept of a rational array. An n-dimensional rational array may be defined inductively as a finite sequence of (n-1)-dimensional rational arrays, where a 1-dimensional rational array is just a finite sequence of rational numbers. Our eight basic operations will be operations on rational arrays.

    The most relevant examples are one, two and three-dimensional arrays; however, it is quite possible to envision psychological uses for arrays of higher dimension. For instance, spacetime is four-dimensional, and a five-dimensional array could therefore represent a scalar field over spacetime, an eight-dimensional array a vector field over spacetime, etc.

    Each rational array A comes equipped with a natural coordinate system, so that each rational stored in A has a unique coordinate vector (a1,...,an), ai a nonnegative rational. This coordinate system imposes a natural alphabetic order on the elements of A, which one may extend to subsets of A by defining subset B to come before subset C if B - C contains a point which comes before any point in C - B in the alphabetic ordering.

    Human sensory inputs can be expressed very naturally in terms of rational arrays. For instance, light on the retina forms a two-dimensional array; and sound waves on the eardrum form a one-dimensional array. Muscle movements can also be easily expressed in terms of rational arrays: when one has different amounts of stimulus sent to different points, the different points can be envisioned as elements of a three-dimensional rational array. And, to take a biological example, the interactions between proteins can also be effectively expressed in this way: the surface of a protein is just a rational array.

    In general, any continuous field can be approximated to within arbitrary accuracy by an rational array. For example, if one wants to approximate a field on the positive hyperoctant of Rn with 5 digits of accuracy, one may divide the positive hyperoctant of Rn up into alatticework of cubes of side 10-6, and construct an n-dimensional rational array containing one element for each cube.

    Of course, given sufficiently complex codings, one can dispense with the whole formalism of rational arrays and consider only binary sequences. But here I am not thinking in terms of abstract algorithmic information, I am rather thinking in terms of concrete information processing systems, for which "sufficiently complex codings" can present formidable practical obstacles.

7.4.1.1. Pointwise Operations

    The first four of our nine operations are addition, multiplication, negation, and maximization, which are defined pointwise. More explicitly, let A and B be two rational arrays. Then the sum of A and B is A+B, the product of A and B is written AB, the maximum of A and B is written A^B, and the negation of A is written -A. If a has coordinates (a1,...,an) in A, and a' has coordinates (a1,...,an) in B, then,

a+a' has coordinates (a1,...,an) in A + B,

aa' has coordinates (a1,...,an) AB,

max(a,a') has coordinates (a1,...,an) in A^B, and

-a has coordinates (a1,...,an) in -A.

if A and B are of different sizes, then (a1,...,an) is assumed to exist in A + B, AB and A^B only if it exists in both A and B.

    It is worth noting that this collection of operations is redundant in two ways. One the one hand, by combining negation and maximization one can generate any Boolean function, and thus any computable function, including addition and multiplication. Secondly, by combining multiplication and addition, one can generate any polynomial, and hence approximate any continuous function, including the maximum function, to within arbitrary accuracy. However, our goal here is not to give a minimal set of operations; it is to give an exhaustive set of basic operations.

7.4.1.2. Combinatory Operations

    Addition, multiplication, negation and maximization are all pointwise operations. Now I will introduce two operations that act on whole arrays rather than on an entry-by-entry basis.

    First, the cut-and-paste operator is a ternary operation which may be written C(A,B,S), where A and B are general rational arrays and S is a sequence of nonnegative integers. The expression C(A,B,S) is to be read: paste B into A, placing the first entry in B into position S in A.

    More explicitly, what this means is as follows. Suppose S = (s1,...,sk). Then if a has coordinate (s1 - r1,...,sk - rk) in A, where the rj are all nonnegative integers, then a has the same coordinate in C(A,B,S). But if b has coordinate (s1 + r1 - 1,...,sk + rk -1) in B, where the rj are all nonnegative integers, then b has the same coordinate in C(A,B,S).

    For instance, C( (1,2,3,4,5,6,7), (9,9,9,9,9,9,9,9), 5) =

(1,2,3,4,9,9,9,9). The number 5 indicates that the elements 5-8 of the sequence B = (9,9,9,9,9,9,9,9) are pasted onto the elements 1-4 of the sequence A = (1,2,3,4,5,6,7).

    Cut-and-paste also permits us to build higher-dimensional arrays out of lower-dimensional ones. For example, one has

C( (1,2), (3,4), (2,1) ) = 1 2

                    3 4

The coordinate (2,1) gives the point at which the array (3,4) is "pasted" onto the array (1,2).

    In general, as the name suggests, cut-and-paste permits us to form new arrays by combining parts of different old arrays.

    Next, the reduce operation allows one to take part of an array and consider it as an array in itself. This is of obvious utility as an adjunct to the cut-and-paste operation: it allows one to paste in parts of arrays rather than just whole arrays. The simplest way to define the reduce operation is as R(A,S,T), where A is an arbitrary rational array and S and R are lists of nonnegative integers, with the property that the i'th entry in S never exceeds the i'th entry in S. Write S = (s1,...,sn), R = (t1,...,tn). Then R(A,S,T) is the array composed of all elements of A whose coordinates lie "between" the arrays S and T. Explicitly, if a has coordinate (a1,...,an) in R(A,S,T), this means that a has coordinate (a1+s1-1,...,an+sn-1) in A, and ai + si < ti.

    Finally, substitution is a ternary operation which may be denoted by S(A,B,C), to be read: substitute A for B, everywhere B appears in C. The meaning of this isobvious in simple cases, for instance S(6,(3,4),(1,2,3,4,3,4,5,3,4)) =

(1,2,6,6,5,6).

    In general, there is an ambiguity here: what if two appearances of B overlap in C? However, this can be resolved by the following rule: if there are two appearances of B in C, and it is not possible to substitute A for both of them, substitute A for that appearance of B which occurs first in C.

    Note that substitution is a special instance of cut-and-paste. However, it is a very important instance. A large percentage of the patterns that we recognize are repetitions. For instance, the whole Behaviorist school of psychology is based on the recognition of repeated stimulus-response associations! As we shall see, repetitions can be easily expressed in terms of substitution.

    A special case of substitution is "change of notation." For instance, S(2,3,A) is the operation of replacing every 3 in A with a 2. The inclusion of substitution as a basic operation guarantees that no specific notation or "encoding" is essential to human thought.

7.4.1.3. Random Generation

    Our eighth operation, random generation, is the simplest of all. It may be defined as R(A), where A is a one-dimensional integer array. Its function is to create a random array of dimensions given by A, whose entries each have an equal probability of being either zero or one. This operation could be simulated fairly well in terms of the other operations, by using standard pseudo-randomness techniques; but for theoretical purposes I prefer to introduce true stochasticity.

    The choice of a 50% chance of a 0 or 1 in each entry is purely a matter of convention. Using the operation R(A) together with the previous seven operations, one may construct arrays so that each entry a has a different probability pa, and one may choose the pa's to be arbitrarily close to any number in [0,1].

7.4.1.4. Decoding

    Finally, let us consider the operation of decoding. This is in a way the most fundamental operation of all. Let us assign each one of our fundamental operations a code number, according to the following arbitrary scheme:

addition = 1

multiplication = 2

negation = 3

maximization = 4

substitution = 5

cut-and-paste = 6

decoding = 7

random generation = 8

Next, let us arbitrarily assign the number 9 to stand for "open parenthese," and the number 10 to stand for "close parenthese." Finally, the integers greater than 10 will be understood to denote variables. Since the substitution operator is fundamental, the specifics of the encoding do not matter; they can always be renamed.

    Given this encoding, many integers sequences may be understood as sequences of operations.     This prepares us to define the decoding operator. Where A is a sequence (a one-dimensional integer array), and B1,...,Bk are arbitrary rational arrays, D[A,B1,...,Bk] is the array obtained by applying the sequence of operations encoded in A to the arrays B1,...,Bk, where the variable 'j' in A is taken to refer to the array Bj+10.

    For sake of generality, two notational conventions will be required. Not every sequence A yields a well-defined sequence of operations; but if the operation D is given a sequence A which does not yield a well-defined sequence of operations, it will be understood to give output consisting of the 0-dimensional array "0". Also, if A contains m>k variable names, D[A,B1,...,Bk] may be defined by D[A,B1,...,Bk,0,...,0], where each 0 represents an array of appropriate size and dimension containing all zero entries.

    Two simple examples are:

D[(1,9,9),(1,2,3)] = (1,2,3) + (1,2,3) = (2,4,6)

and

D[(1,9,2,10,9),(1,2,3),(4,2,1)] = (1,2,3) + (4,2,1)(1,2,3) = (5,6,6)

7.4.2. Array Component Systems

    Now, using these nine operations, I will give an example of a new kind of self-generating system -- a type of system called an array component system, or ACS. Let us begin with a list of arrays Vi of the form

    Vi = (Ai, Bi1,...,Bim(i,t)),

    t = 0,1,2,...; i = 1,2,...,N(t)

where the Ai are code sequences, and the Bij are rational arrays. Each such array Vi may be associated with a hyperfunction

    H(Vi) = fi,

defined by the equation

    fi(fjs) = H( D[Ai,Bi1,...,Bjm(j,t),Ajs, Bj1s,...,Bjm(j,t)s] )

The fi are the components.

    In other words, each component is associated with a code sequence, and a bunch of arrays. Applying one component to another means applying the code sequence of the first component to the list of arrays consisting of the first component's arrays, the second component's code sequence, and the second component's arrays. According to the conventions delineated above, it is possible for each component to act on other components which contain different numbers of arrays. "Missing" arrays are simply treated as zero arrays, and the control sequences Ai may potentially contain conditional expressions indicating how to deal with zero arrays.

    ACS's illustrate in a very concrete way how hyperrelations may emerge as natural models of systems which are, in themselves, quite well-founded and computable. There is nothing in any way mysterious about the Vi, nor about the idea that the various Vi can act on one another. But in order to express this idea mathematically, one cannot use ordinary functions; one needs to use hyperrelations. To put it another way: in order to express patterns relating to the interaction of Vi, the only efficient course is to use hyperrelations.

7.4.3. Immune Systems as ACS's

    Let us briefly consider a simple example, which will be taken up more thoroughly in Chapter Ten: immunodynamics.

    The immune system is complicated as well as complex, and it contains many different kinds of cells. But the simplest mathematical models deal only with B-cells, and that is what I will do here. Let us begin with the approach of de Boer et al (1990), in which each antibody type in the immune system is associated with an integer sequence of length N. To be realistic, of course,antibodies should be modeled as three-dimensional rational arrays, since they are three-dimensional objects; but for the points I am making here, it is immaterial whether antibodies are associated with 1-D or 3-D arrays.

    In the 1-D model, one may think of a B-cell as a pair Vi = (A,Bi), where Bi is an integer sequence, and A is an integer code sequence. The code sequence specifies what happens when one forms fi(fj), or in other words when one forms

    A(Bi,A,Bj).

    Specifically, what happens most of the time when fi(fj) is formed is nothing. But if the conditions are right, the effects can be drastic. Define the raw match between two B-cells Vi and Vj as the maximum number of consecutive bits in which the corresponding sequences Bi and Bj are different. And define the match between two sequences as max{0, raw match(Bi,Bj) - T}, where T is some given threshold. In terms of component-systems, then, one may think of the dynamics of the immune system as specifying how a B-cell Vi acts on those B-cells Vj for which match(Bi,Bj) is large, thus causing the creation of new antibodies.

    There is a great deal of biological subtlety involved here. In the crudest formal model, however, what happens is as follows: if fi(fj) is formed and match(Bi,Bj) is large, then with a certain probability, the cell Vj is killed by the cell Vi (in fact, this killing takes place indirectly, via antibodies; but I do not need to consider these details here). But when the proportion of B-cells with shape Bj that are killed falls within a certain critical range, then cells of this shape are stimulated to reproduce. New B-cells are created.

    Some of these new cells are identical to Vj, and some are new types Vk, which have have shape sequences similar, but not identical to Bj. This is somatic mutation. It is the creation of new B-cells by certain old B-cells acting on other old B-cells. There is randomness in the process because there is no deterministic way of telling exactly which new types of B-cells will be created.

    This B-cells-only model is an extreme oversimplication. But the more accurate models are similar in spirit. Cells in the immune system act on one another, thus stimulating one another to produce new cells. Sometimes these new cells are copies of old cells, but sometimes they are structurally novel. Thisis a simple example of a component-system; and I have indicated in a rough way how it can be modeled using systems of stochastically computable hypersets.