From Complexity to Creativity -- Copyright Plenum Press, 1997


Part III. Mathematical Structures in the Mind





    Language is a highly complex system. Modern linguistics, however, pays little if any explicit heed to this complexity. It reflects the complexity of language in its labyrinthine theorizing, but it does not attempt to come to grips with this complexity in any concrete way. Rather than focusing on abstract, emergent structures, it revels in the intricate, subtly patterned details.

    In this chapter I will explore the parallels between language and other complex systems, in the specific context of sentence production. I will use a general mathematical model called L-systems to give a detailed psycholinguistic model of sentence production. The resulting model of sentence production has many relations with previous models, but also has its own unique characteristics, and is particularly interesting in the context of early childhood language. It fits in very naturally with the psynet model and the symbolic dynamics approach to complexity.

    L-systems were introduced by Aristid Lindenmayer to model the geometry and dynamics of biological growth (see Lindenmayer, 1978). Recent work in biology and computer graphics (Prusinkiewicz and Hanan, 1989) has demonstrated their remarkable potential for the generation of complex forms, especially the forms of herbaceous plants. But although L-systems were originally inspired by the theory of formal grammar, the possibility of their direct relevance to linguistic phenomena has not been considered.

    In the model given here, the production of a sentence is viewed as the iteration of an L-system. The L-system governs the process by which mental structures are progressively turned into sentences through a series of expansions and substitutions. At each stage of the iteration process, the need for accurate depiction of mental structures is balanced against the need for global simplicity and brevity, a type of "global iteration control" which is similar to that required for L-system modeling of woody plants (Prusinkiewicz and Hanan, 1989).


    The basic idea of an L-system is to construct complex objects by successively replacing parts of a simple object using a set of rewrite rules. The rewriting is carried out recursively, so that structures written into place by a rewrite rule are subsequently expanded by further application of rewrite rules. The crucial difference between L-systems and the Chomsky grammars used in linguistics is the method of applying rewrite rules. In Chomsky grammars, rules are applied sequentially, one after the other, while in L-systems they are applied in parallel, simultaneously replacing all parts of the object to which they apply. This difference has computation-theoretic implications -- there are languages which can be generated by context-free L-systems but not by context-free Chomsky grammars. And it also, we will argue, has psycholinguistic implications in the context of sentence production.

    Before presenting formal details we will give a simple example. Suppose one is dealing with strings composed of "X"s and "Y"s, and one has an L-system containing the following two rewrite rules:

X -> Y

Y -> XY

Finally, suppose one's initial axiom is "X." Then the process of L-system iteration yields the following derivation tree:

Time        System State

0        X

1        Y

2        XY

3        YXY

4        XYYXY

5        YXYXYYXY



In going from time t to time t+1, all elements of the system at time t are replaced at once.
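This parallel replacement is easy to simulate. The following minimal sketch (my own illustration, not from the text) reproduces the derivation above; elements with no explicit rule fall back to the identity rule.

```python
def iterate(axiom, rules, steps):
    """Apply all rewrite rules in parallel, once per time step."""
    states = [axiom]
    for _ in range(steps):
        # Every element is replaced simultaneously; elements with no
        # explicit rule obey the identity rule X -> X.
        states.append("".join(rules.get(ch, ch) for ch in states[-1]))
    return states

rules = {"X": "Y", "Y": "XY"}
for t, state in enumerate(iterate("X", rules, 5)):
    print(t, state)   # 0 X, 1 Y, 2 XY, 3 YXY, 4 XYYXY, 5 YXYXYYXY
```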

    This example points out a terminological conflict -- in the L-system literature, the system elements are referred to as letters, and the system states themselves, the strings, are referred to as words; but in our application to sentence production, the system states will be sentences and the individual system elements will be "words" in the usual sense. To avoid confusion we will eschew the usual L-system terminology and speak either of "words" and "sentences" or of "system elements" and "system states."

    More formally, let S be a set called the "collection of system elements," and let S* be the set of all system states constructible from arrangements of elements of S. In typical L-system applications S* is, more specifically, the set of all finite, nonempty sequences formed from elements of S. A context-free L-system is defined as an ordered triple <S,w,P>, where w is an element of S* called the axiom (X in the above example) and P, a subset of S x S*, is the set of rewrite rules. In accordance with the conventions of set theory a rewrite rule should be written (X,m), where X is a system element and m is a system state, but the notation X -> m is more intuitive and, following much of the literature, we will use it here. In this formulation X is the predecessor and m is the successor. If no rewrite rule is explicitly given for some specific predecessor X, it is assumed that the identity rule X -> X applies to that system element.

    An L-system is deterministic if each system element a serves as the predecessor in only one rewrite rule. Stochastic L-systems allow system elements to appear in any number of rules, but each rule is associated with a certain probability. For instance, resuming our earlier example, one might have the rule set

X -> Y with probability .5

X -> YX with probability .5

Y -> XY with probability .8

Y -> Y with probability .2

This collection of rules will yield a different derivation every time. Each time an X comes up in the derivation, a random choice must be made whether to replace it with Y or with YX. And each time a Y comes up, a choice must be made whether to replace it with XY or to leave it alone.
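A stochastic step can be sketched as follows; the rule-table format (lists of successor/probability pairs) is my own convention, not from the text.

```python
import random

def stochastic_step(state, rules, rng=random):
    """One parallel step of a stochastic L-system.

    rules maps each element to a list of (successor, probability)
    pairs; elements with no entry obey the identity rule.
    """
    out = []
    for ch in state:
        options = rules.get(ch, [(ch, 1.0)])
        r, acc = rng.random(), 0.0
        for successor, p in options:
            acc += p
            if r < acc:
                out.append(successor)
                break
        else:
            out.append(options[-1][0])   # guard against float round-off
    return "".join(out)

rules = {
    "X": [("Y", 0.5), ("YX", 0.5)],
    "Y": [("XY", 0.8), ("Y", 0.2)],
}
state = "X"
for _ in range(4):
    state = stochastic_step(state, rules)   # a different derivation each run
```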

    Next, a context-sensitive L-system has a more general set of rewrite rules. For instance, one may have rules such as

X<X>X -> X

Y<X>X -> Y

where the brackets indicate that it is the central X which is being rewritten. According to these rules, if the central X is surrounded by two other X's, it will be left alone, but if it is preceded by a Y and followed by an X, it will be replaced by a Y.
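One parallel step of such a context-sensitive system can be sketched like this; the (left, center, right) key format and the '#' boundary marker are my own conventions.

```python
def context_step(state, rules):
    """One parallel step of a context-sensitive L-system over strings.

    rules maps (left, center, right) triples to successor strings; the
    state is padded with '#' so boundary elements have a context, and
    elements with no matching rule are left unchanged.
    """
    padded = "#" + state + "#"
    out = []
    for i in range(1, len(padded) - 1):
        key = (padded[i - 1], padded[i], padded[i + 1])
        out.append(rules.get(key, padded[i]))   # identity rule by default
    return "".join(out)

rules = {
    ("X", "X", "X"): "X",   # X<X>X -> X : keep an X flanked by two X's
    ("Y", "X", "X"): "Y",   # Y<X>X -> Y : rewrite an X after a Y, before an X
}
print(context_step("YXX", rules))   # -> YYX
```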

Turtle Graphics

    To apply L-systems to an applied problem, one must figure out a suitable interpretation of the strings or other system states involved. In the case of sentence production the interpretation is obvious, for, after all, L-systems are a modification of Chomsky grammars; but things are not so clear in the case of biological development, where many different interpretations have been proposed and used to advantage.

    For the purpose of constructing computer simulations of plant development, the turtle graphics interpretation has been found to be particularly useful. In this interpretation each element of a string is taken to correspond to a command for an imaginary "turtle" which crawls around the computer screen. The simplest useful command set is:

F    Move forward a step of length d

+    Turn right by angle a

When the turtle moves forward it draws a line, and hence from a sequence of F's and +'s a picture is produced. But other commands are also convenient, and the list of commands quickly multiplies, especially when one considers three-dimensional growth processes. Much of the recent work involves parametrized L-systems, with system elements such as F(d) and +(a), which include real number or real vector parameters.
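A minimal turtle interpreter (my own sketch, assuming the command set above plus the left-turn command - introduced below) can be written as follows; it returns the points the turtle visits rather than drawing to a screen.

```python
import math

def turtle_path(commands, step=1.0, angle_deg=90.0):
    """Interpret an L-system string as turtle commands.

    F = move forward a step, + = turn right by angle_deg,
    - = turn left; any other symbol is simply ignored,
    as in the text's interpretation.
    """
    x, y, heading = 0.0, 0.0, 0.0
    points = [(x, y)]
    for ch in commands:
        if ch == "F":
            x += step * math.cos(math.radians(heading))
            y += step * math.sin(math.radians(heading))
            points.append((x, y))
        elif ch == "+":
            heading -= angle_deg
        elif ch == "-":
            heading += angle_deg
    return points

square = turtle_path("F+F+F+F")   # traces a unit square back to the origin
```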

    Using only very simple rewrite rules one may generate a variety of complicated forms. For instance, the rules



generate the fractal picture shown in Figure 5a. The command

-    Turn left by angle a

is used in addition to F and +, and the angle parameter is set at

a = 60 degrees

To produce this picture the L-system was iterated 4 times beginning from the axiom


The string resulting from this iteration was used to control a simulated "turtle": each time the turtle encountered an F it moved one unit forward, each time it encountered a + or - it performed an appropriate rotation, and each time it encountered an X or a Y it ignored the symbol and proceeded to the next meaningful control symbol.

    Two additional control symbols which are extremely useful for modeling plant growth are the left and right brackets, defined as follows:

[    Push the current state of the turtle onto a pushdown stack

]    Pop a state from the stack and make it the current state of the turtle

This adds an extra level of complexity to the form generation process, but it does not violate the underlying L-system model; it is merely a complicated interpretation. The brackets permit easy modeling of branching structure. The turtle can draw part of a branch, push the current state onto the stack, then follow a sequence of commands for drawing a sub-branch, then pop back the previous state and continue drawing the original branch. Figure 5b displays examples of plant-like structures generated from bracketed L-systems. Figure 6 gives a more realistic-looking example, in which leaves generated by B-spline patches are arranged according to the path of a three-dimensional turtle.
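The bracketed interpretation extends the basic turtle in an obvious way; the following sketch (again my own, with an illustrative branching string) returns the line segments drawn, one per F.

```python
import math

def branching_turtle(commands, step=1.0, angle_deg=25.0):
    """Turtle interpretation with the bracket commands from the text:
    '[' pushes the turtle state onto a stack, ']' pops it, so branches
    can be drawn and then abandoned.  Returns ((x0, y0), (x1, y1))
    segments, one per F command.
    """
    x, y, heading = 0.0, 0.0, 90.0   # start pointing up the page
    stack, segments = [], []
    for ch in commands:
        if ch == "F":
            nx = x + step * math.cos(math.radians(heading))
            ny = y + step * math.sin(math.radians(heading))
            segments.append(((x, y), (nx, ny)))
            x, y = nx, ny
        elif ch == "+":
            heading -= angle_deg
        elif ch == "-":
            heading += angle_deg
        elif ch == "[":
            stack.append((x, y, heading))
        elif ch == "]":
            x, y, heading = stack.pop()
    return segments

# A trunk with a left and a right sub-branch, as in plant-like figures.
segs = branching_turtle("F[-F]F[+F]F")
```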

    These models produce high-quality computer graphics, but they are not merely clever tricks; their efficiency and accuracy is due to their relatively accurate mimicry of the underlying biological growth processes. Plant growth is one of many biological phenomena which are governed by the process of repeated concurrent substitution that lies at the essence of the L-system model.

    The approach we have described is quite effective for modeling the development of herbaceous, or non-woody, plants. Woody plants display the same fundamental branching structure, but they are more complex. First of all, there is secondary growth, which is responsible for the gradual increase of branch diameter with time. And secondly, while for herbaceous plants genetic factors are almost the sole determinant of development, for woody plants environmental factors are also important. Competition between branches and competition between trees are both significant factors. Because of these competition factors, a substitution cannot be applied without paying heed to the whole structure of the tree at the current time. The success of a substitution depends on the "space" left available by the other branches. The course of a branch may change based on the positions of the other branches, meaning, in the turtle graphics interpretation, that the turtle can change position based on collisions with its previous path.

    Sentence production displays the same kind of complexity as the modeling of woody plants. The development of a sentence is not entirely internal; it can be influenced by the simultaneous development of other sentences (i.e. the sentences that will follow it). And there will often be competition between different parts of a sentence, each one vying to be further developed than the others. These complexities give a unique flavor to the L-system modeling of sentence production; but they are not an obstacle in the way of such modeling.


    Now let us turn to the main topic of the chapter, the application of L-systems to psycholinguistic modeling. Much less attention has been paid to the problem of language production than to the related problem of language acquisition (Anderson, 1983; Chomsky, 1975). Nevertheless, as many theorists have noted (Anisfeld, 1984; Butterworth, 1980; Jackendoff, 1987), no one has yet formulated an entirely satisfactory theory of sentence production. The concept of L-system, I will argue, is an important step on the way to such a theory. Sentence production is well understood as a process of messy, parallel, biological-fractal-style development.

    Before presenting the L-system approach to sentence production, however, we must first deal with a few preliminary issues, regarding the logical form of language and, especially, the relation between syntax and semantics in the language production process. These are troublesome, controversial issues, but they cannot be avoided entirely.

    Perhaps the first detailed psychological model of language production was that of the German neurologist Arnold Pick (1931). Pick gave six stages constituting a "path from thought to speech":

1) Thought formulation, in which an undifferentiated thought is divided into a sequence of topics or "thought pattern," which is a "preparation for a predicative arrangement ... of actions and objects."

2) Pattern of accentuation or emphasis

3) Sentence pattern

4) Word-finding, in which the main content words are found

5) Grammatization -- adjustments based on syntactic roles of content words, and insertion of function words

6) Transmission of information to the motor apparatus

    Pick's sequential model, suitably elaborated, explains much of the data on speech production, especially regarding aphasia and paraphasia. And it can also be read between the lines of many more recent multileveled models of speech production (Butterworth, 1980). For instance, Jackendoff's "logical theory" of language production rests on the following assumption:

    The initial step in language production is presumably the formulation of a semantic structure -- an intended meaning. The final step is a sound wave, which is a physical consequence of motions in the vocal tract. The job of the computational mind in production is therefore to map from a semantic structure to a sequence of motor instructions in the vocal tract. Again, the logical organization of language requires that this mapping be accomplished in stages, mapping from semantic structure to syntax, thence to phonology, thence to motor information.

Here we have three stages instead of six, and a rhetoric of computation, but the basic model is not terribly dissimilar.

    Jackendoff implicitly connects Pick's stages of sentence production with the standard ideas from transformational grammar theory (Radford, 1988). Chomsky's "deep structure" corresponds to the emphasis pattern and sentence pattern of Pick's steps 2 and 3, whereas Chomsky's "surface structure" is the result of Pick's step 5. Grammatical transformations take a deep structure, a primary, abstract sentence form, and transform it into a fully fleshed-out sentence, which is then to be acted on by phonological and motor systems.

    From this sequentialist perspective, the L-system model of sentence production to be presented here might be viewed as an explanation of precisely how grammatical transformations are applied to change a deep structure into a surface structure. In other words, they explain the transition from Pick's steps 3 and 4 to step 5; and they are internal to Jackendoff's "syntactic component" of the production system. I will call this the purely grammatical interpretation of the L-system model.

    On the other hand, one may also take a broader view of the process of sentence production. It seems plain that there is a substantial amount of overlap between Pick's steps 1-5. The "two" processes of idea generation and sentence production are not really disjoint. In formulating an idea we go part way toward producing a sentence, and in producing a sentence we do some work on formulating the underlying idea. In fact, there is reason to believe that the process of producing sentences is inseparable from the process of formulating thoughts. Nearly two decades ago, Banks (1977) described several experimental results supporting the view that

    [S]entences are not produced in discrete stages. Idea generation is not separable from sentence construction.... In typical speech production situations, sentences and ideas are produced simultaneously in an abstract code corresponding to speech. That is to say, our response mode determines the symbolic code used in thinking and influences the direction and structure of thought. Thought is possible in several modes, but one of the most common is speech.... [I]dea generation and sentence production represent the same functional process.

This view ties in with the anthropological theories of Whorf (1949) and others, according to which thought is guided and structured by language.

    The L-system model to be given here is to a large extent independent of the thought/language question. Under the "purely grammatical" interpretation, it may be viewed as a model of Pick's stages 2-5, i.e. as an unraveling of transformational syntax into the time dimension. But, under what I call the grammatical-cognitive interpretation, it may also be taken more generally, as a model of both thought formulation and sentence production, of all five of Pick's initial stages considered as partially concurrent processes. In order to encompass both possible interpretations, I will keep the formulation as general as possible. In the remainder of this section I will first explore the grammatical-cognitive interpretation in a little more detail, and then present a general model of language which is independent of the choice of interpretation.

Models of Conceptual Structure

    The "grammatical-cognitive" view of sentence production is closely related to Jackendoff's (1990) theory of conceptual semantics. According to conceptual semantics, ideas are governed by a grammatical structure not so dissimilar from that which governs sentences. For instance, the sentence

1) Sue hit Fred with the stick

would be "conceptually diagrammed" as follows:


[conceptual diagram omitted in this text: Jackendoff's two-tier structure for (1), with a thematic tier encoding motion and location and an action tier containing ACT(SUE,FRED)]
The idea here is that semantic roles fall into two different tiers: one "thematic," dealing with motion and location; the other "active," dealing with agent and patient relations.

    This semantic diagram is very different from the standard syntactic diagram of the sentence, which looks something like


                 S
               /   \
             NP     VP
             |     /  \
             |    V'     PP
             |   /  \      \
             |  V    NP     \
             |  |    |       \
            Sue hit  Fred   with the stick

The problem with Jackendoff's theory is specifying the connection between syntactic structures and semantic structures. Some simple correspondences are obvious, e.g. agents go with subjects, and patients go with objects. But beyond this level, things are much more difficult.

    Bouchard (1991) provides an alternative which is much less problematic. Bouchard's theory is based on an assumption called the universal bracketing schema, which states that "two elements may be combined into a projection of one of the two, subject to proper combinatory interpretation." The basic operation of conceptual semantics is thus taken to be projection. For instance, the sentence "Sue hit Fred with the stick" is diagrammed as follows:

[[[x [CAUSE [x [GO [TO y]]]]] [with z]] [Time AT t] [Place AT p]]

where x = SUE, y = FRED, z = STICK

Each set of brackets groups together two entities, which are being jointly projected into one member of the pair. Bouchard's "relative theta alignment hypothesis" states that rank in this conceptual structure corresponds with rank in syntactic structure. The highest argument in the conceptual structure is linked to the subject position, the most deeply embedded argument is linked to the direct object position, and so on.
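The depth-to-rank linking can be illustrated with a small sketch. Everything here is hypothetical scaffolding of my own: the nested-list encoding of the conceptual structure and the explicit argument set are invented stand-ins, not Bouchard's notation.

```python
# Arguments of the example structure (hypothetical, for illustration).
ARGS = {"SUE", "FRED", "STICK"}

def argument_depths(structure, depth=0, found=None):
    """Walk a nested-list conceptual structure, recording the bracket
    depth at which each argument first appears."""
    if found is None:
        found = {}
    for part in structure:
        if isinstance(part, list):
            argument_depths(part, depth + 1, found)
        elif part in ARGS:
            found.setdefault(part, depth)
    return found

# [[x [CAUSE [x [GO [TO y]]]]] [with z]], with x = SUE, y = FRED, z = STICK
concept = [["SUE", ["CAUSE", ["SUE", ["GO", ["TO", "FRED"]]]]], ["with", "STICK"]]
depths = argument_depths(concept)
# Shallowest argument links to the subject position; the most deeply
# embedded argument links to the direct object position.
ranked = sorted(depths, key=depths.get)
```

Running this, SUE comes out shallowest (subject) and FRED deepest (direct object), matching the alignment described above.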

    The details of Jackendoff's, Bouchard's and other formal semantic theories are complicated and need not concern us here -- I will approach conceptual structure from a slightly different perspective. The key point, for now, is that the grammatical-cognitive connection is not a vague philosophical idea; it is a concrete hypothesis which has a great deal of linguistic support, and which has been explored in the context of many different concrete examples.

A General Axiomatic Model of Language

    In accordance with the above remarks, I will assume a very general and abstract model of linguistic and psycholinguistic structure. This model emanates directly from the psynet model, which views mental entities as a loosely-geometrically-organized "pool," freely interacting with each other, subject to a definite but fluctuating geometry.

    Given a finite set W, called the collection of "words," I will make seven assumptions. The first five assumptions should be fairly noncontroversial. The final two assumptions are only necessary for the grammatical-cognitive interpretation of the model, and not for the purely grammatical interpretation; they may perhaps be more contentious.

    We will assume:

    1) that there is a collection of linguistic categories, inclusive of all words in the language.

    2) that there is a finite collection of "basic linguistic forms," each of which is made up of a number of linguistic categories, arranged in a certain way. These forms may be represented as ordered strings (i1,...,in) where ik denotes category Ck. Standard examples are N V and N V N, where N = noun and V = verb.

    3) that there is a finite collection of rewrite rules fi, which allow one to substitute certain arrangements of categories for other arrangements of categories (e.g. N -> Adj N allows one to replace a noun with an adjective followed by a noun).

    4) that there is a space M of "mental structures" which arrangements of words are intended to depict. Examples from human experience would be pictures, memories, sounds, people, etc. In computer models these might be graphs, bitmaps, or any other kind of data structure.

    5) that there is a metric, implicit or explicit, by which sentences may be compared to elements of M, to determine the relative accuracy of depiction.

    6) that, referring back to (2), there is a collection of functions {gi, i=1,...,r} mapping M into M, so that when the arguments of fi are close to the arguments of gi, the value returned by gi is close to the value returned by fi (where "closeness" is measured by the metric guaranteed in Assumption 5).

    7) that there exist in the mind mixed "linguistic forms" which are partly composed of words and partly composed of other mental structures. These mixed structures are constructed by a combination of the linguistic rules fi and the semantic rules gi.

    Assumptions 6 and 7, as stated above, are needed only for the grammatical-cognitive interpretation of the model. They are a formalization of the idea that sentence production and idea generation are two aspects of the same fundamental process. Because they are principles and not specific diagramming methods, they are less restrictive than the theories of Jackendoff and Bouchard, discussed above. However, they express the same fundamental intuitions as the diagramming methods of these theorists.

    Assumption 6 is a version of Frege's principle of compositionality which has been called the principle of continuous compositionality (Goertzel, 1994). Frege argued that, when we compose a complex linguistic form from simple ones using certain rules, the meaning of the complex form is composed from the meaning of the simple ones using related rules. Assumption 6 may be interpreted as a reformulation of the principle of compositionality in terms of a pragmatic account of "meaning," according to which the meaning of a sentence S is the collection of mental structures which are close to S under the metric guaranteed by assumption 5.

    Assumption 6 is necessary if one is going to postulate a continual "switching back and forth" between linguistic structures and other mental structures. If one adopts Bouchard's model of conceptual structure then Assumption 6 immediately follows: the relative theta alignment hypothesis implies a close alignment of conceptual and syntactic structure. On the other hand, Jackendoff would seem to like Assumption 6 to hold for his conceptual grammar, but it is not clear whether it does (my suspicion is that it does not).

    Finally, Assumption 7 follows from the ideas of Banks, as quoted above; it also fits in nicely with Bouchard's model, in which semantic entities and syntactic entities are often roughly interchangeable. Indeed, phenomenologically speaking, this assumption is almost obvious: one may easily work out one part of thought in verbal form while leaving the other part temporarily vague, not explicitly linguistic.


    In this section, I will model sentence production as an iterative process which involves three key subprocesses:

1) expanding linguistic forms into more precise linguistic forms, using L-system rewrite rules

2) substituting words for other mental structures, and

3) seeking at all times to minimize sentence length and complexity

Subprocesses 1 and 2 represent simple magician dynamics; subprocess 3 is a constraint on magician dynamics which is posed by the fact that we are dealing with pattern/process magicians. Long, complex sentences are unlikely to be patterns in situations; short sentences are more likely to be.

    The model is a complicated recursive process, and is not well depicted in words; thus I will present it in a kind of "pseudocode" as a collection of three interlinked procedures:

    sentence produce(mental structure T)

            First, choose a basic linguistic form with which to express the structure T. This form need not be filled out with words; it may be filled out with mental structures. It is an "emphasis pattern" and "sentence pattern"; a "deep structure."

            The basis for choosing a form is as follows: one chooses the form F which, after carrying out the process A = wordify(expand(F)), yields the sentence giving the best depiction of the structure T (as measured by the "metric" guaranteed by Assumption 5).

            The process returns the sentence A produced when, in the "choice" phase, F was given to wordify(expand()) as an argument.

    sentence wordify(linguistic form F)

            Any words in linguistic form F are left alone. Any mental structures T in linguistic form F are replaced by words or phrases which are judged to match them. These words or phrases may be expanded by expand to improve their match to T; but the chance of calling on expand decreases sharply with the length and complexity of the sentence obtained.

    linguistic form expand(linguistic form F)

            This process acts on all the components of F, in parallel, using:

                a) various rewrite rules fi and gi.

                b) where appropriate, the process wordify

        The choice of a) or b), and the choice of transformation rules within a), is partly pseudo-random and largely dependent on analogy with previously produced sentences.

            For each rewrite rule fi or gi that one applies, one obtains a new linguistic form G. The acceptability of the result G is judged by the following criterion: Does the greater accuracy of depiction obtained by going from F to G outweigh the increased complexity obtained by going from F to G?

            In determining the accuracy of depiction and complexity obtained in going from F to G, one is allowed to apply expand again to G; but the chance of doing this decreases sharply with the length and complexity of the sentence involved.        
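As a toy caricature of this loop (entirely my own stand-in: the grammar, the length-decay probability, and the scoring "metric" are invented, with target sentence length standing in for accuracy of depiction):

```python
import random

# Invented toy grammar: the rewrite rules fi of the model.
RULES = {"N": [["Det", "N"], ["Adj", "N"]], "V": [["Adv", "V"]]}

def expand(form, rng):
    """Rewrite all components in parallel; the chance of expanding a
    component decays as the form grows (the global length control)."""
    out = []
    for sym in form:
        options = RULES.get(sym)
        if options and rng.random() < 4.0 / (4.0 + len(form)):
            out.extend(rng.choice(options))
        else:
            out.append(sym)
    return out

def produce(target_len, tries=20, seed=0):
    """Crude 'produce': run several candidate derivations from the basic
    form N V N and keep the one whose length best matches target_len,
    standing in for the accuracy-versus-brevity trade-off."""
    rng = random.Random(seed)
    best, best_score = None, float("inf")
    for _ in range(tries):
        form = ["N", "V", "N"]
        for _ in range(3):            # a few parallel iterations
            form = expand(form, rng)
        score = abs(len(form) - target_len)
        if score < best_score:
            best, best_score = form, score
    return best

frame = produce(target_len=6)   # a category frame, not yet wordified
```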

    Notice that the recursivity of expand and wordify could easily lead to an endless loop (and, in computational models, a stack overflow!) were it not for the stipulated sharp decrease in the probability of recursion as sentence length increases. Most sentences are not all that long or complex. An unusually high degree of elaboration can be seen, for example, in the sentences of Marcel Proust's Remembrance of Things Past, which sometimes contain dozens of different clauses. But this kind of elaboration is both difficult to produce and difficult to understand. As a rough estimate, one might say that, in the process of producing a typical sentence, none of these processes will be called more than 5-20 times.

    The purely linguistic interpretation of the model assumes that the process produce will be given a well-worked out idea to begin with, so that no further use of the conceptual rewrite rules gi will be required. The grammatical-cognitive interpretation assumes that the argument of produce is vague, so that applications of the gi must coexist with applications of the fi, producing grammatical forms and clarifying ideas at the same time. More and more evidence is emerging in favor of the interpenetration of thought and language, a trend which favors the grammatical-cognitive interpretation. However, as stated above, the L-system model itself is independent of this issue.

    I have presented this model of production as a set of procedures, but this communicative device should not be taken to imply that the psychological processes involved are necessarily executed by rule-based procedures in some high-level language. They could just as well be executed by neural networks or any other kind of dynamical system.

    For instance, it is particularly easy to see how these linguistic processes could be carried out by a magician system. One need only postulate "wordification" magicians for transforming mental structures into words or phrases, and "transformation" magicians for transforming words into phrases. The above sentence production algorithm then becomes a process of applying elements of a magician population to one's basic linguistic form, while retaining at each stage copies of previous versions of one's linguistic form, in case current magician activities prove counterproductive. This interpretation fits in nicely with the psynet model. The focus here, however, is on the language production process itself and not its implementation as an autopoietic magician system.

A Simple Example

    To better get across the flavor of the model, in this section I will give a very simple "thought-experiment" regarding a particular sentence. I will show how, using the L-system model, this sentence might be produced. Consider the sentence:

2) Little tiny Zeb kisses the very big pine tree

    The first stage of the process of producing this sentence might be a mental structure expressible in terms of the basic linguistic form


and loosely describable as

(this baby here)(is touching with his mouth)(this tree here)

The entities in parentheses denote mental structures rather than linguistic structures.

    Now, how is this basic form expanded into a sentence? Each of the three elements is transformed, at the same time, by calls to wordify and expand. Wordification turns the thoughts into words or phrases, and expansion turns the words or phrases into longer phrases.

    For instance, one possible path toward production might be immediate wordification of all components. The wordify process might thus transform the form into the pre-sentence

Zeb kisses tree

Expansion rules then follow: first, perhaps,

N -> Det N

N -> Adj N

lead to

Little Zeb kisses the tree

The expansion process goes on: the rules

Adj -> Adj Adj

N -> Adj N

are applied to yield

Little tiny Zeb kisses the big tree

and finally

Adj -> Adv Adj

is applied, along with one more application of N -> Adj N (yielding "pine tree"), giving the target sentence.
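One way to trace this derivation concretely is as a sequence of targeted substitutions; in the following sketch the indices and word choices are hand-picked by me to match the walkthrough, whereas the model itself chooses them stochastically.

```python
def apply_at(words, i, replacement):
    """Rewrite the single word at index i into a list of words."""
    return words[:i] + replacement + words[i + 1:]

s = "Zeb kisses tree".split()
s = apply_at(s, 2, ["the", "tree"])       # N -> Det N
s = apply_at(s, 0, ["Little", "Zeb"])     # N -> Adj N
s = apply_at(s, 0, ["Little", "tiny"])    # Adj -> Adj Adj
s = apply_at(s, 5, ["big", "tree"])       # N -> Adj N
s = apply_at(s, 5, ["very", "big"])       # Adj -> Adv Adj
s = apply_at(s, 7, ["pine", "tree"])      # N -> Adj N
print(" ".join(s))   # -> Little tiny Zeb kisses the very big pine tree
```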

    An account such as this ignores the "trial and error" aspect of sentence production. Generally "dead-end" possibilities will be considered, say

V -> Adv V

leading to

Little Zeb really kisses the tree

But these alternatives are rejected because they do not add sufficient detail. The addition of really, it may be judged, does not make the sentence all that much more similar to the original mental form -- it does not add enough similarity to compensate for the added complexity of the sentence.

    This example account of development follows the pure grammatical interpretation of the L-system model, in which expansion takes place only on the linguistic level. From the perspective of the grammatical-cognitive interpretation, however, this is only one among many possibilities. There is no need for the whole thing to be wordified at once. It may be that, instead, the development process follows a sequence such as

(this baby here)(is touching with his mouth) big tree

(this baby here)(is touching with his mouth) the big tree

Zeb (is touching with his mouth) the tree

Tiny Zeb (is touching with his mouth) the big tree

Tiny Zeb kisses the very big tree

Little tiny Zeb kisses the very big pine tree

Or, on the other hand, actual ramification may occur on the level of conceptual rather than syntactic structure:

(this baby here)(is touching with his mouth) big tree

(this little baby here)(is touching with his mouth) the big tree

(this really very very little baby here)

                (is touching with his mouth) the tree

Little tiny Zeb (is touching with his mouth) the big tree

Little tiny Zeb kisses the very big tree

Little tiny Zeb kisses the very big pine tree

    Under either interpretation, this is a very messy model: the whole sentence keeps expanding out in all directions at once, at each step considering many possible expansions, accepting some and rejecting others. But this kind of haphazard, statistically structured growth is precisely the kind of growth that one sees in nature. It is exactly the kind of growth that one would expect to see in the productions of the language centers of the brain -- for the brain is, after all, a biological system!

Bracketed L-Systems, Reflexives, and Sentences

    The L-system expansion process is completely parallel; it makes no reference to the linear order in which the sentence is to be understood by the listener/reader. In some cases, however, this linear order is crucial for sentence understanding. Consider, for example:

3) the city's destruction by itself

3a) * itself's destruction by the city

The only difference here is that, in the legal sentence (3), the antecedent precedes the pronoun, while in the illegal sentence (3a), the antecedent follows the pronoun. This reflects the general rule of English grammar which states that reflexives cannot have independent reference, but must take their reference from an antecedent which is compatible (e.g. in case, gender and number). When producing a sentence such as this, the substitution rules involved must be order-dependent in a complex way, just as the rule N -> Adj N is order-dependent in a simple way.

    The phenomenon of reflexives may be very nicely understood in terms of bracketed L-systems. The idea is that the word "itself" is itself a "close-bracket" command, while the word "city" is followed by an implicit "open-bracket" command, so that (3) might be rewritten as

the city['s destruction by ]

Upon encountering the "[" the listener pushes the current state, city's, onto the stack, and upon encountering the next "]" the listener pops this off the stack and inserts it in the appropriate position.

    The trick, of course, is that the "[" is not explicitly there, so that what one really has is

the city's destruction by ]

where the listener must determine the position of the "[" based on syntactic rules and semantic intuition. The listener must also determine precisely what "state" should be pushed onto the stack: generally this is the coherent phrase immediately following the "[", but the identification of phrase boundaries is not always obvious.
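Assuming explicit brackets and single-word antecedent phrases (both simplifications -- real listeners must infer the "[" position and the phrase boundaries), the push/pop account can be sketched as:

```python
def resolve(tokens):
    """Expand each ']' into the antecedent pushed at the matching '['."""
    stack, out, push_next = [], [], False
    for tok in tokens:
        if tok == "[":
            push_next = True         # the next phrase is the antecedent
        elif tok == "]":
            out.append(stack.pop())  # reflexive: pop the antecedent back out
        else:
            if push_next:
                stack.append(tok)
                push_next = False
            out.append(tok)
    return " ".join(out)

# Example (3): the pushed state "city's" resolves the reflexive.
print(resolve(["the", "[", "city's", "destruction", "by", "]"]))
# → the city's destruction by city's

# Example (3a): the ']' pops an empty stack -- no antecedent exists.
try:
    resolve(["]", "destruction", "by", "the", "city"])
except IndexError:
    print("illegal: no matching antecedent")
```

The IndexError in the second case is the formal counterpart of the illegality of (3a): a close-bracket with no possible preceding open-bracket.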

    The illegal sentence (3a), on the other hand, is characterized by a close-bracket which does not match any possible preceding open-bracket positions. In

]'s destruction by the city

there is simply no preceding position into which the "[" might be placed! If this phrase were embedded in a larger sentence there would be candidate positions for the "[" but, in all probability, none of these would be satisfactory, and the sentence would still be illegal.

    A more complex construction such as

4) the city's terrible sewer system's destruction of itself's broadcasting of itself throughout the region through propagation of a terrible odor

can also be cast in a similar way, i.e.

the city's [ terrible sewer system['s destruction of ]'s broadcasting of ] throughout the region through propagation of a terrible odor

But this is difficult to understand, reflecting no doubt the small stack size of human short term memory.

    This view of reflexives is closely related to Kayne's (1981) proposal that a sentence must define an unambiguous path through semantic space. The bracketed L-system model makes this "unambiguous path" proposal more concrete by connecting it with stack automata. The use of bracketed L-systems here is somewhat different from their use in computer graphics models of plant growth -- but the underlying interpretation in terms of a stack is the same, and so the parallel is genuine. In each case the brackets are a formal way of indicating something which the relevant biological system does without any explicit, formal instructions.


    The L-system model is a complex one, and detailed empirical validation will be a lengthy process. Some evidence supporting the model, however, may be found in the patterns of early childhood language. Young children use simpler transformation rules and have a lower "complexity ceiling," factors which result in much shorter and simpler utterances, in which the underlying production dynamics are more apparent.

    The beginning of early childhood grammar is the two-word sentence. One attempt to explain two-word utterances is based on the grammatical formula

S -> NP VP

VP -> V NP

But, as pointed out by Anisfeld (1984), this formula is psychologically problematic. According to this formula, the V and the NP would seem to be tied more closely together than the NP and VP; but this is not what the experiments reveal. In fact the NP and the V are the more closely associated pair.

    This "problem" is resolved immediately by the L-systemmodel. The transformational grammar approach suggests the iteration

0    NP VP

1    NP V NP

driven by the substitution rule VP -> V NP. The L-system approach, on the other hand, suggests that the mental structure playing the role of the generation 0 VP is tentatively represented by a verb V, and the mental structure playing the role of the generation 0 NP is tentatively represented by a noun N. Then the verb is expanded into a verb phrase. Thus the iteration is better represented

0    N V

1    N V N

driven by the rule V -> V N. The reason N V is a more natural combination is that it occurs at an earlier step in the derivation process.
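This derivation can be sketched with a one-step parallel rewriting function, which also displays the characteristic L-system property that every symbol is rewritten simultaneously. The rule table is a hypothetical stand-in for the child's substitution rules.

```python
def step(symbols, rules):
    """One parallel L-system step: every symbol is rewritten at once."""
    out = []
    for s in symbols:
        out.extend(rules.get(s, [s]))  # identity if no rule applies
    return out

rules = {"V": ["V", "N"]}  # hypothetical child rule V -> V N
g0 = ["N", "V"]
g1 = step(g0, rules)
print(g0, "->", g1)  # → ['N', 'V'] -> ['N', 'V', 'N']
```

Because N V already exists as a unit at generation 0, the model predicts exactly the tighter N-V association that the experiments reveal.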

    More concrete evidence for the L-system view of child language is given by the phenomenon of replacement utterances. Braine (1971) observed that, in recordings he made of the speech of children 24-30 months old, 30-40% of the utterances produced were "replacement utterances," or sequences of more and more complex utterances spoken one after another in response to the same situation. The following are typical examples:

Stevie byebye car.

Mommy take Stevie byebye car.

Stevie soldier up.

Make Stevie soldier up, Mommy.

Car on machine.

Big car on machine.

Stand up.

Cat stand up.

Cat stand up table.

According to Anisfeld (1984), the children are unable to construct the final, expanded forms all at once, so they resort to a process of gradual construction. The initial utterance represents the most important information that the child wants to transmit -- usually a predicate but occasionally a subject. Later utterances add on more details.

    Viewed from the perspective of the L-system model, these "replacement sequences" become a kind of window on the process of sentence production. What these children are doing, the model suggests, is vocalizing a process that we all go through when producing a sentence. The utterances in a replacement sequence all express the same underlying mental structure. But, as the speaker proceeds through the sequence, the constraint on sentence length and complexity gradually becomes more relaxed, so that the iterative process of sentence production is cut off at a later and later point.

    For instance, the final example given above, "Cat stand up table," has a form that can be roughly given as

0    V

1    N V

2    N V N

(it is assumed that "stand up," to the child, is a single predicate unit). When the initial utterance "Stand up" is produced the need for brevity is so great that additional transformations are summarily rejected. The next time around, a number of possible transformations suggest themselves, such as V -> N V or V -> V N. Each one has a certain probability based on its usefulness for describing the situation as balanced against the complexity it adds to the sentence. The transformation V -> N V is selected. At the next stage, then, further transformations are suggested, say perhaps N -> Adj N or V -> Adv V, in addition to the two which arose at the previous stage. One of these is selected, according to the stochastic, parallel progressive substitution process, and then, since the length and complexity threshold has been reached, the sentence is produced.
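The cutoff process described above can be sketched as a stochastic rewriting loop with a length ceiling. The particular rules, and the use of a uniform random choice rather than usefulness-weighted probabilities, are simplifying assumptions.

```python
import random

def produce(start, rules, ceiling, rng):
    """Expand symbols at random until the length ceiling is reached."""
    symbols = list(start)
    while len(symbols) < ceiling:
        # Gather every (position, expansion) candidate currently applicable.
        candidates = [(i, rhs) for i, s in enumerate(symbols)
                      for lhs, rhs in rules if lhs == s]
        if not candidates:
            break
        i, rhs = rng.choice(candidates)  # uniform stand-in for weighted choice
        symbols = symbols[:i] + rhs + symbols[i + 1:]
    return symbols

rules = [("V", ["N", "V"]), ("V", ["V", "N"]), ("N", ["Adj", "N"])]
print(produce(["V"], rules, ceiling=1, rng=random.Random(0)))  # → ['V']
print(produce(["V"], rules, ceiling=4, rng=random.Random(0)))  # e.g. N V N-like forms
```

Raising the ceiling across successive runs mimics a replacement sequence: the same start symbol is cut off at a later and later point, yielding longer and longer utterances.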

    One final comment is in order. I have used the familiar categories N and V for these examples, but the L-system model is not restricted to these familiar categories. Braine (1976) has argued that the transformational approach to childhood grammar is misguided. Instead of using general grammatical rules, he suggests, young children produce utterances using an overlapping collection of narrowly constrained special-case rules. Evidence for this view is provided by the failure of linguists to provide a grammatical account of beginning speakers' two-word utterances. These ideas may contradict some of the narrower interpretations of transformational grammar, but are unproblematic for the L-system approach. An overlapping collection of special-case rules can be perfectly well captured by a nondeterministic, stochastic L-system (which is, broadly speaking, a kind of formal grammar). Children may well begin with special-case substitution rules, and gradually abstract more general rules involving categories like N and V. The question of whether there is a grammar is separate from the question of the specificity of the substitution rules.