AI Buddha versus AI Big Brother,
Voluntary Joyous Growth,
the Global Brain Singularity Steward Mindplex,
and Other Issues of Transhumanist Ethical Philosophy
This essay is relatively brief, but its theme is extremely large: how to manage the development of technology and society, in the near to mid-term future, in such a way as to maximize the odds of a positive long-term future for the universe.
My conclusions are uncertain, but bold. I believe that the era of humanity as the “Kings of the Earth” is almost inevitably coming to an end. Unless we bomb or otherwise destroy ourselves back into the Stone Age or into oblivion, we are going to be sharing our region of the universe with powerful AI minds of one form or another. Potentially depending on decisions we make in the near or moderately near future, this may or may not lead to a fundamental alteration in the nature of conscious experience in our neck of the woods: a Transcension. And the dangers to humanity may be significant – an issue that must be very carefully considered.
I conclude that there are two strong options going forward, which I associate with the catch-phrases “AI Buddha” and “AI Big Brother.” More verbosely, these correspond to the alternatives of
· Creating an AI based on some variant of the principle of “Voluntary Joyous Growth,” and allowing it to repeatedly self-modify and become vastly superintelligent, having a potentially huge impact on the universe and posing dangers to the human race that must be carefully studied and managed
· Creating an AI dictator with stability as a main goal, to rule the human race, ensuring peace and prosperity and guaranteeing that no human creates overly advanced, “dangerous” technologies
Not surprisingly, I have a tentative preference for the Voluntary Joyous Growth scenario, but I believe that much more research (mostly, research with “primitive” AI’s that are nevertheless much more advanced than any AI’s we currently possess) is needed to fully understand the risks and rewards of each option.
My analysis is based on a few key assumptions. Chiefly, I assume that:
· The broad and rapid advance of human science and technology will continue
· Once human science and technology have advanced adequately, “radical futurist” technologies such as artificial general intelligence (AGI), molecular nanotechnology (MNT), pharmacological human life extension and genetic engineering of wildly novel organisms will become feasible
I recognize that these assumptions are not incontrovertibly true. There could be as-yet-unknown physical limits preventing the development of the radical futurist technologies; or, as I already noted, the human race could knock itself back to the Stone Age or oblivion or some other nontechnological condition. However, I think these assumptions are highly likely to be true; and they’re the premise for much (though not all) of the discussion to follow.
These assumptions are related to the notion of the “Singularity,” as introduced by Vernor Vinge in the 1980’s and more thoroughly developed by a host of recent futurist thinkers. To the reader who is unfamiliar with this breed of futurist thinking, I recommend the following works as prerequisites for the present discussion:
· Damien Broderick’s book The Spike
However, the points I’ll discuss here don’t necessarily require a Singularity as defined by these thinkers; they merely require something weaker that – borrowing a word from Damien Broderick’s novel of that name, and from some of John Smart’s writings -- I call a Transcension. A Singularity is a particular kind of Transcension, but not the only kind.
The basic idea of the Singularity is that, at some point, the advance of technology will become (from a human perspective) essentially infinitely rapid, thus bringing a fundamental change in the nature of life and mind. A key aspect of the Singularity concept is technological acceleration. Historical analysis suggests that the rate of technological increase is itself increasing – new developments come faster and faster all the time. At some point this increase will come so fast that we don’t even have time to understand how to use the N’th radical new development before the (N+1)’th radical new development has come. Eventually technological progress will lead to the creation of powerful AI’s, and these AI’s, rather than humans, will be carrying out the bulk of technology development – thus allowing new innovations to emerge at superhuman pace. At this point, when dramatic new technologies and new ways of thinking develop daily or hourly, so fast that humans literally can’t keep up, the technological Singularity will be upon us.
Another aspect of the Singularity idea is psychological: the Singularity is envisioned as a radical transition in the nature of experience, not just technology.
When civilization and language and rational thought emerged, the nature of human experience changed radically. Or, to put it another way, the “human experience” as we now know it emerged from the experience of proto-human animals.
But there is no good reason to believe that the emergence of the modern human mind is the end state of the evolution of psyche. Indeed, the rub is this: While evolution might take millions of years to generate another psychological sea change as dramatic as the emergence of modern humanity, technology may do the job much more expediently. The technological Singularity can be expected to induce rapid and dramatic change in the nature of life, mind and experience.
That’s Singularity; what about Transcension? The basic idea of the Transcension is that at some point, the advance of technology will bring about a fundamental change in the nature of life and mind. The difference is that a Transcension can occur even if there is no exponential or superexponential growth in technology. It could occur, eventually, even with a linear or logarithmic advance in technology. In fact, I think that a Singularity scenario is extremely likely; but the points I’m going to make here are mostly valid for any Transcension, no matter how fast it occurs. Perhaps the biggest difference between the Transcension and Singularity concepts is that, if the Singularity idea is correct, then the Singularity is near and we’d better start worrying about it fast; whereas if a Transcension is going to occur 10,000 years from now, there’s no particular need for us to fuss about it at the moment.
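The contrast between a fast and a slow Transcension can be illustrated with a toy calculation. The following is only a sketch: the three capability curves, their coefficients, and the threshold value are hypothetical numbers chosen for illustration, not estimates of anything real. It shows that a fixed capability threshold is eventually crossed under linear, exponential and logarithmic growth alike, but at very different times.

```python
import math


def years_to_threshold(growth, threshold, max_years=1_000_000):
    """Return the first whole year t at which growth(t) >= threshold, or None."""
    for t in range(max_years + 1):
        if growth(t) >= threshold:
            return t
    return None


# Hypothetical capability curves in arbitrary units, all equal to 1 at t = 0.
def linear(t):
    return 1.0 + t / 10


def exponential(t):
    return 1.05 ** t


def logarithmic(t):
    return 1.0 + math.log1p(t)


# Hypothetical level of technology at which a Transcension becomes possible.
THRESHOLD = 10.0

for name, curve in [("exponential", exponential),
                    ("linear", linear),
                    ("logarithmic", logarithmic)]:
    print(name, years_to_threshold(curve, THRESHOLD))
```

With these made-up numbers, all three curves reach the threshold, with the exponential curve first and the logarithmic curve last by a wide margin. The urgency depends on the curve, while the eventual crossing does not.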
The term “Singularity” tends to place an emphasis on the rapidity of change that is induced by exponentially or hyperexponentially accelerating advances in technology. And indeed, the suddenness or otherwise of the coming change is a very important practical point. However, the technologies involved – exciting as they are -- should be viewed mainly as enablers. The key point is that we may soon be experiencing a profoundly substantial change in the “order of being”. The point is that the way we experience the world, the way we human animals live life and conduct social affairs, is not the end state of mind-in-the-universe, but only an intermediate state on the way to something else. And the Transcension to this “something else” may well occur sooner rather than later.
But what is this something else? This is where things get interesting. One might contend that, even if we are on the verge of something far beyond our current ways of thinking, living and experiencing, our limited and old-fashioned human brains really don’t stand much chance of envisioning this new order of things in any detail. On the other hand, it seems, it would be foolish to not even try.
In fact, it seems quite possible that actions we take now may play a major role in shaping the nature of this nebulous-state-to-come, this post-Transcension, post-human order of being. One of the (many) great unknown questions of the Transcension is: how much effect does the way in which the Transcension is reached, have on the nature of mind and reality afterwards? There are many possibilities, e.g.
1. there are many qualitatively different post-Transcension states, and our choices now impact which path is taken
2. no matter what we do now, mind and reality will settle into the same basic post-Transcension attractor
3. a human-achieved Transcension will merely serve to project humans into a domain of being already occupied by plenty of other minds that have already transcended. The specifics of how humans approach the Transcension are not going to have any significant impact on this already-existent domain.
At this point, I have no idea how to assess the probabilities of these various options.
In the latter two options, the only ethical question is whether the post-Transcension state-of-being will be better than the states that would likely exist without a Transcension. If yes, then we should work to bring about the Transcension – and once this is done, reality will take its course. If no, then we should work to avoid Transcension.
In the first option, the ethical choices are trickier, because some plausible post-Transcension states may be better than the states that would likely exist without a Transcension, whereas others may be worse. We then have to choose not only whether to seek or avoid Transcension, but whether to seek or avoid particular kinds of Transcension. In this case, it’s meaningful to analyze what we can do now to increase the probability of a positive Transcension outcome.
Of course, serious discussion of any of these options can’t begin until we define what a “positive” Transcension outcome really means.
The following sections of the essay deal mainly with two obvious issues that come out of the above train of thought:
· What is a “positive outcome”? That is, what is an appropriate ethical or meta-ethical standard by which to judge the positivity or otherwise of a hypothetical post-Transcension scenario? A number of alternative, closely related approaches are presented here, mostly centered around an abstract notion I call the Principle of Voluntary Joyous Growth
· If Option 1 above holds, how can we encourage a positive outcome? Here my focus is on artificial general intelligence technology, which I believe will be the primary driver behind the Transcension (because it will be making the other inventions). I will argue that, in addition to teaching AGI’s ethical behavior, it is important to embody ethical principles in the very cognitive architecture of one’s AGI systems. (Specific ideas in this direction will be presented, and discussed in the context of the Novamente AI system.)
What is a good Transcension? Some people would say that the only good Transcension is a non-Transcension. These people think that using technology to radically alter the nature of mind and being is a violation of the natural order of things. But even among radical techno-futurists and others who believe that Transcension, in principle, may be a good thing, there is nothing close to agreement on what it means for a post-Transcension world to be a “good” one.
For Eliezer Yudkowsky, the preservation of “humaneness” is of primary importance. He goes even further than most Singularity believers, asserting that the most likely path is a “hard takeoff” in which a self-modifying AI program moves from near-human to superhuman intelligence within hours or minutes – instant Singularity! With this in mind, he prioritizes the creation of “Friendly AI’s” – artificial intelligence programs with “normative altruism” (related to “humaneness”) as a prominent feature of their internal “shaper networks” (a “shaper network” being a network of “causal nodes” inside an AI system, used to help produce that AI system’s “supergoals”). He discusses extensively strategies one may take to design and teach AI’s that are Friendly in this sense. The creation of Friendly AI, he proposes, is the path most likely to lead to a humane post-Singularity world.
On the other hand, Ray Kurzweil seems to downplay the radical nature of the Singularity – leading up to, but not quite drawing, the conclusion that the nature of mind and being will be totally altered by the advent of technologies like AGI and MNT. At times he seems to think of the post-Singularity world as being a lot like our current world, but with funkier technology around; with AI minds to talk to and the absence of pesky problems like death, disease, poverty and madness. And clearly he sees this vision as a good one; he’s quite concerned to encourage ordinary non-techno-futurist people not to be afraid of the beckoning changes.
Damien Broderick’s novel Transcension presents a more ethically nuanced perspective. In his envisioned future, a superhuman AI rules over an Earth containing several different subregions, including
(When I read the book I for some reason assumed these humans were probably uploads unknowingly living on a simulated Earth; but when I showed Broderick an earlier version of this essay that mentioned this impression, he pointed out to me that the book clearly states the people are real bodies on the real Earth. I guess I have a serious case of simulation-on-the-brain!) Anyhow, at the end of the novel the Transcension occurs – an event in which the ruling superhuman AI mind decides that maintaining human lives isn’t consistent with its other goals. It wants to move on to a different order of being, and in preparation it uploads all humans from Earth into digital form, so it can more easily guarantee their safety and help with their development. (“Transcension” in the sense that I’m using it in this essay is a bit broader than the event in Broderick’s novel; in my terminology, his Transcension event is part of the overall Transcension in his fictional universe.)
Not all techno-futurists are as concerned with the future of human life or humane-ness. For example, the poster Metaqualia, in a series of emails on Yudkowsky’s SL4 email list, has argued for alternate positions, such as:
Clearly, given that we humans can’t agree on what’s good and valuable in the current human realm of life, it would be foolish to expect us to agree on what’s good and valuable in the post-Transcension world. But nevertheless, it seems it would be equally foolish to ignore the issue completely. It seems important to ask: What are the values that we would like to see guide the development of the universe post-Transcension?
This poses a challenge in terms of ethical theory, because for a value-system to apply beyond the scope of human mind and society, it has to be very abstract indeed – and yet there’s no use in a value-system so abstract that it doesn’t actually say anything. Thinking about the post-Transcension universe pushes one to develop ethical value-systems that are both extremely general and reasonably clear.
There may be many different value-systems of this nature; here I will discuss several of them, and their interrelationship:
Each of these is a very general, abstract ethical principle. Specific ethical systems may come to exist, but the quality of an ethical system must be judged relative to the ethical principle it reflects. I will return to this point later.
I note again that I am only considering value-systems that are Transcension-friendly. Of course there are many other value-systems out there in the world today, and most of them would argue that the Transcension as I conceive it is ethically wrong. These value-systems are interesting to discuss from a psychological and cultural perspective, but they are not my concern in this essay.
2.1 Ethics, Rationality and Attractors
It is important to clearly understand the relationship between ethical principles and rationality. Once one has decided upon an ethical principle, one can use rationality to assess specific ethical systems as to how well they support the ethical principle. Below I will present two meta-ethical principles –
But one can’t choose a meta-ethical principle based on rationality alone either. Ultimately the selection and valuation process must bottom out in some kind of nonrational thought.
Reason is about drawing conclusions from premises using appropriate rules, whereas at the most abstract level, ethics is about what premises to begin with. We can push this decision back further and further – reasoning about ethical rules based on ethical systems, and reasoning about ethical systems based on ethical principles – but ultimately we must stop, and acknowledge that we need to make a nonrational choice of premises. I have chosen this stopping-point at the level of “abstract ethical principles” like the ones listed above.
Hume isolated this nonrational bottoming-out in “human nature,” the human version of “animal instinct.” Buddhist thought, on the other hand, associates it with the “higher self,” and the individual self’s recognition of its interpenetration with the rest of the universe and its ultimate nonexistence. My own view is that Buddhism and Hume are both partly right – but that neither has gotten at the essence of the matter. Hume is right that our hard-wired instincts certainly play a large role in such high-level, nonrational choices. And Buddhism is right that subtle patterns connecting the individual with the rest of the universe play a role here.
The crux of the matter, I believe, lies in the dynamical-systems-theory notion of an attractor. An attractor is a pattern that tends to arise in a dynamical system, from a wide variety of different initial conditions. A strict mathematical attractor must persist forever once entered into; but one may also speak of “probabilistic attractors” that are merely very likely to persist, or that may mutate slightly and gradually over time, etc. I think that part of “human nature” consists of peculiarities of the human mind/brain, whereas part of it consists of generic attractors that have appeared in the human psyche – or as emergents among human minds or between human minds and their environments – because they generally tend to pop up in a lot of complex systems in a lot of circumstances.
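For readers unfamiliar with the term, here is a minimal sketch of a strict attractor in the simplest possible setting. It uses a standard textbook example, repeated application of the cosine function, which is an illustration chosen here and is not a model of the psychological or ethical attractors under discussion. Very different starting points all settle onto the same value.

```python
import math


def iterate(f, x0, steps=200):
    """Apply f to x0 repeatedly and return the final state."""
    x = x0
    for _ in range(steps):
        x = f(x)
    return x


# Widely separated starting conditions (arbitrary choices).
starts = [-3.0, -0.5, 0.0, 1.0, 10.0]

# Each trajectory ends at the same fixed point of x = cos(x).
finals = [iterate(math.cos, x0) for x0 in starts]

for x0, xf in zip(starts, finals):
    print(f"start {x0:6.2f} -> final {xf:.6f}")
```

Every run ends near 0.739085, the unique solution of x = cos(x). In the terminology above, that point is the pattern that arises regardless of where the system begins.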
One reason why some meta-ethics appear more convincing than others, then, is that these meta-ethics appear to be attractors: they are “universal attractors,” i.e. principles that arise as patterns in many different complex systems in many different situations. This doesn’t mean that they’re logically correct in the sense of following from some a priori assumption regarding what is good. Rather it means that, in a sense, they follow from the universe. This point will be returned to a little later.
Of course, we are still left with a selection problem, because there may be different universal attractors that contradict each other. Does the more powerful universal attractor win, or is this just a matter of chance, or context-dependent chance, or subtle factors we paltry humans can’t understand? I’ll leave off here and turn to slightly more concrete issues!
Firstly, Cosmic Hedonism refers to the ethical system that values happiness above all. In this perspective, our goal for the post-Transcension universe should be to maximize the total amount of happiness in the cosmos. Of course, the definition of “happiness” poses a serious problem, but if one agrees that Cosmic Hedonism is the right approach, one can impose the understanding of happiness as part of the goal for the post-Transcension period. The goal becomes to understand what happiness is, and then maximize it.
However, even if one had a crisp and final definition of happiness, there would be a problem with Cosmic Hedonism – a problem that I’ve come to informally refer to as the problem of the “universal orgasm.” The question is whether we really want a universe that consists of a single massive wave of universal orgasmic joy. Perhaps we do all want this, in a sense – but what if this means that mind, intelligence, life, humanity and everything else we know becomes utterly nonexistent?
The ethical maxim that I call the Principle of Joyous Growth attempts to circumvent this problem, by adding an additional criterion:
What does “growth” mean? A very general interpretation is: Increase in the amount and complexity of patterns in the universe. The Principle of Joyous Growth rules out the universal orgasm outcome unless it involves a continually increasing amount of pattern in the universe. It rules out a constant, ecstatically happy orgasmic scream.
Of course, maximizing two quantities at once is not always possible, and in practice one must maximize some weighted average of the two. Different weightings of happiness versus growth will lead to different practical outcomes, all lying within the general purview of the conceptual Principle of Joyous Growth.
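The effect of the weighting can be made concrete with a small sketch. Everything here is an assumption made for illustration: the function name `joyous_growth_score`, the idea that happiness and growth can each be scored on a 0 to 1 scale, and the three candidate outcomes with their scores. The essay itself does not specify how either quantity would be measured.

```python
def joyous_growth_score(happiness, growth, w_happiness=0.5):
    """Weighted average of happiness and growth; w_happiness must lie in [0, 1]."""
    if not 0.0 <= w_happiness <= 1.0:
        raise ValueError("w_happiness must be between 0 and 1")
    return w_happiness * happiness + (1.0 - w_happiness) * growth


# Hypothetical outcomes as (happiness, growth) pairs on a 0-to-1 scale.
outcomes = {
    "static bliss": (1.0, 0.0),
    "restless expansion": (0.3, 0.9),
    "balanced": (0.7, 0.6),
}


def best_outcome(candidates, w_happiness):
    """Name of the candidate with the highest weighted score."""
    return max(
        candidates,
        key=lambda name: joyous_growth_score(*candidates[name], w_happiness),
    )


for w in (0.9, 0.5, 0.1):
    print(w, best_outcome(outcomes, w))
```

With these invented scores, a weighting that heavily favors happiness selects "static bliss", an even weighting selects "balanced", and a weighting that heavily favors growth selects "restless expansion". The principle stays the same while the preferred outcome changes.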
The Joyous Growth principle, without further qualification, is definitely not Friendly in the Yudkowskian sense. In fact it is definitively un-Friendly, in the sense that we humans are far from maximally happy -- and in this as well as other ways, we are basically begging to be transcended. A post-Transcension universe operating according to the Principle of Joyous Growth would not be all that likely to involve the continuation of the human race.
An alternative is to add a third criterion, obtaining a Principle of Voluntary Joyous Growth, i.e.
This means adopting as an important value the idea that sentient beings should be allowed to choose their own destiny. For example, they should be allowed to choose unhappiness or stagnation over happiness and growth.
Of course, the notion of “choice” is just as much a can of worms as “happiness.” Daniel Dennett’s recent book Freedom Evolves does an excellent job of sorting through the various issues involved with choice and freedom of will. While I don’t accept Dennett’s reductionist view of consciousness, I find his treatment of free will generally very clear and convincing.
Note that including choice as a variable along with two others implies that ensuring free choice for all beings is not an absolute commandment. Of course, given the extent to which human wills conflict with each other, free choice for all beings is not always possible. Given a case where one being’s will conflicts with another being’s will, the Voluntary Joyous Growth approach is to side with the being whose choice will lead to greater universal happiness and growth.
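The conflict rule just stated can be written down as a tiny decision procedure. This is a sketch under strong assumptions: the `Choice` record, its numeric fields, and the equal default weighting are all hypothetical, and nothing in the essay says how the expected consequences of a choice would actually be estimated.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Choice:
    """A being's preferred option, with hypothetical estimates of its effects."""
    being: str
    expected_happiness: float  # assumed estimate of universal happiness
    expected_growth: float     # assumed estimate of universal growth


def resolve_conflict(a, b, w_happiness=0.5):
    """Side with the choice whose weighted happiness-plus-growth value is higher."""
    def value(c):
        return (w_happiness * c.expected_happiness
                + (1.0 - w_happiness) * c.expected_growth)
    # Ties go to the first argument; the essay gives no tie-breaking rule.
    return a if value(a) >= value(b) else b


alice = Choice("Alice", expected_happiness=0.8, expected_growth=0.2)
bob = Choice("Bob", expected_happiness=0.4, expected_growth=0.9)

print(resolve_conflict(alice, bob).being)
print(resolve_conflict(alice, bob, w_happiness=0.9).being)
```

Under equal weighting the procedure sides with Bob, and under a weighting that strongly favors happiness it sides with Alice. The verdict in a conflict is therefore only as well defined as the weighting behind it.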
Voluntary Joyous Growth is not a simple goal, because it involves three different factors which may contradict each other, and which therefore need to be weighted and moderated. This complexity may be seen as unfortunate – or it may be seen as making the ethical principle into a more subtle, intricate and fascinating attractor of the universe.
2.3 Attractive Compassion
I should note that my goal in positing “Voluntary Joyous Growth” has been to articulate a minimal set of ethical principles. These are certainly not the only qualities that I consider important. For example, I strongly considered including Compassion as an ethical principle, since Compassion is, in a sense, the root of all ethics. However, it occurred to me that Compassion is actually a consequence of choice, growth and freedom. In a universe consisting of beings that respect the free choices of other beings, and that want to promote joy and growth throughout the universe, compassion for other beings is inevitable – because “being good to others” is generally an effective way to induce these others to contribute toward the joy and growth of the universe. Without the inclusion of choice, Joyous Growth is consistent with simply (painlessly) annihilating unhappy or insufficiently productive minds and replacing them with “better” ones; but assigning a value to choice gives a disincentive to dissolve “bad” minds and leads instead to the urge to help these minds grow and be joyful.
This ties in with the notion that compassion itself is a “universal attractor.” Or, a more accurate statement is: A modest level of compassion is a universal attractor. We can see this in the fact that it ensues from the combination of the universal attractors of Joy, Growth and Choice; and we can also see it in the evolution of human society. Most likely, compassion emerged in human beings because, in a small tribe setting, it is often valuable for each individual to be kind to the other individuals in the tribe, so as to keep them alive and healthy. This is the case regardless of whether the tribe members are genetically related to each other; it’s the case purely because, in many situations, the survival probability of an individual is greater if
So if humanity is divided up into tribes – because individual humans can survive better in groups than all alone – then compassion toward tribe members increases individual fitness. Compassion emerges spontaneously via natural selection, in situations where there is a group of minds which each has choice, and which (via growth) have the complexity to cooperate to some extent.
Note that absolute compassion doesn’t emerge from this tribal-evolutionary logic, but a moderate level of compassion does. Similarly, absolute compassion doesn’t come out of Voluntary Joyous Growth – but it seems that a moderate level of compassion does. It seems more likely that “moderate compassion” is a universal attractor than that “absolute compassion” a la Buddha or Mother Teresa is.
Interestingly, it’s harder to see how compassion would evolve among humans living in a large-group society like modern America. In this case, there’s not such a direct incentive for an individual to be kind to others. It may be that a population of rational-actor minds plunked into a large society would never evolve compassion to any significant degree. However, I suspect that without compassion, society would collapse into anarchy – and anarchy would give way to a tribal society … in which compassion would evolve, showing the power of compassion as an attractor once again!
Philip Sutton, on reviewing an earlier version of this essay, pointed out that I had omitted a value that is very important to him: sustenance and preservation of what already exists. On reading his comments, I reflected that I had made this omission because, in fact, this value – which I’ll call Nostalgia – is not all that important to me personally.
I myself am somewhat attached to many things that exist – such as my family and friends, my self, my pets, Jimi Hendrix CD’s, Haruki Murakami novels and piñon pine trees and Saturday mornings in bed and long whacky email conversations, to name just a few – but I don’t consider this kind of attachment a primary value. I think it’s important that I, as a sentient being, have a choice to retain these things if they are important to me and contribute toward my self-perceived happiness. But I don’t see an intrinsic value in maintaining the past – whereas I do see an intrinsic value in growth and development.
However, I don’t see Nostalgia as a destructive or unpleasant value, and nor do I see it as contradictory with growth, joy or choice. The universe is a big place – and quite likely, many parts of it are not terribly important to any sentient being. It may well be possible to preserve the most important patterns that currently exist in the universe, and still use the remainder of the universe to create wonderful new patterns. The values of Growth and Nostalgia only contradict each other in a universe that is “full,” in the sense that every piece of mass-energy is part of some pattern that is nostalgically important to some sentient being. In a full universe one must make a choice, in which case I’ll advocate Growth … but it’s not clear whether such a thing as a full universe will ever exist. It may be that the process of growth will continue to open up ever more horizons for expansion.
2.5 Smigrodzki’s Meta-Ethic
An alternative approach, proposed by Rafal Smigrodzki in a discussion on the SL4 list, is to begin with an even more abstract sort of meta-ethic. Abstract though it is, the Principle of Voluntary Joyous Growth still imposes some specific ethical standards. On the other hand, Smigrodzki proposes a pure meta-ethic with no concrete content. In fact he proposed two different versions, which are subtly and interestingly different.
Smigrodzki’s first formulation was:
Find rules that will be accepted.
This principle arose in a discussion of the analogy between ethics and science, and specifically as an analogue to Karl Popper’s meta-rule for the scientific enterprise:
Find conjectures that have more empirical content than their predecessors.
Popper’s meta-rule specifies nothing about the particular contents of any scientific theory or scientific research programme; it speaks only of what kinds of theories are to be considered scientific. Similarly, Smigrodzki’s meta-rule specifies nothing about what kinds of actions are to be considered ethical; it speaks only of what kinds of rule-systems are to be considered as falling into the class of “ethical rule-systems”: namely, rule-systems that are accepted.
One interesting thing about Smigrodzki’s meta-rule is how close it comes to the Principle of Voluntary Joyous Growth. To see this, consider first that the notion of “be accepted” assumes the existence of volitional minds that are able to accept or reject rules. So to find rules that will be accepted, it’s necessary to first find (or ensure the continued existence of) a community of volitional minds able to accept rules.
Next, observe that one version of the nebulous notion of “happiness” is “the state that a volitional mind is in when it gets to determine enough of its destiny by its own free choice.” This is almost an immediate consequence of the notions of happiness and choice. For, if happiness is what a mind wants, and a mind has enough ability to determine its destiny via free choice, then naturally the mind is going to make choices maximizing its happiness.
So, “Find rules that will be accepted” is arguably just about equivalent to “Create or maintain a community of volitional minds, and find rules that this community will accept (thus making the community happy).”
But then we run up against the problem that not all minds really know what will make them happy. Often minds will accept rules that aren’t really good for them – even by their own standards – out of ignorance, stupidity or self-delusion. To avoid this, one wants the minds to be as smart, knowledgeable and self-aware as possible. So one winds up with a maxim such as: “Create or maintain a community of volitional minds, with an increasing level of knowledge, intelligence and self-awareness, and find rules that this community will accept (thus making the community happy).”
Incidentally, Popper’s meta-rule of science also is susceptible to the “stupidity and self-delusion” clause. In other words, “Find conjectures that have more empirical content than their predecessors” really means “Find conjectures that seem to a particular community of scientists to have more empirical content than their predecessors” – and the meaningfulness of this really depends on how smart and self-aware the community of scientists is. The history of science is full of apparent mistakes in the assessment of “degrees of empirical content.” So Popper’s meta-rule could be revised to read “Find conjectures that have more empirical content than their predecessors, as judged by a community of minds with increasing intelligence and self-awareness.”
The notion of “increasing level of knowledge” can also be refined somewhat. What is knowledge, after all? One way to gauge knowledge is using the philosophy of science. Lakatos’s theory of research programmes suggests that a scientific research programme – a body of scientific theories – is “progressive” (i.e. good) if it meets a number of criteria, including
· suggesting a large number of surprising hypotheses, and
· being reasonably simple.
One interpretation of “increasing level of knowledge” is “association with a series of progressive scientific research programmes.”
Once all these details are put in place, my fleshing-out of Smigrodzki’s meta-rule (which may well make it fleshier than Smigrodzki would desire) becomes an awful lot like the Principle of Voluntary Joyous Growth. We have happiness, we have choice, and we have growth (in the form of growth of intelligence, knowledge and self-awareness). The only real difference from the earlier formulation of the Principle of Voluntary Joyous Growth is the nature of the growth involved: is it in the universe at large, or within the minds in a community that is accepting ethical rules?
After I presented him with this discussion of his meta-ethic, Smigrodzki’s reaction was to create a yet more abstract version of his meta-ethic, which he formulated as
"Formulate rules that make themselves into accepted rules make themselves come true"
(cause the existence of states of the universe, including conscious states, in agreement with goals stated in the rules).
"Formulate rules which, if applied, will as their outcomes have the goals explicitly understood to be inherent in these rules".
I rephrase these as
"Create goals, and rules that, if followed, will lead to the achievement of these goals"
"Create goals, and rules that, if followed, will lead to the achievement of these goals, with as few side-effects as possible."
This formulation is more abstract than – and inclusive of -- his previous proposal, which in this language was basically "Create goal-rule systems that will be accepted." To see the difference quite clearly, consider the "ethical" system:
1) DESTROY ALL LIVING BEINGS BY
a. STUDYING ALL OTHER LIVING BEINGS SCIENTIFICALLY TO DETERMINE HOW TO KILL THEM MOST EFFECTIVELY AND AT LOWEST RISK, AND THEN
b. KILLING THEM
2) FINALLY, KILLING ONESELF
This posits a goal and also some rules for how to achieve the goal. It is rational and consistent. So far as I can tell, it obeys Smigrodzki’s revised, more abstract meta-ethic. However, it seems to fail his former, more concrete meta-ethic, because at least among most of the sentient beings I know, it is unlikely to be accepted. (Now and then various psychopaths have of course accepted this "ethic," and attempted to put it into practice.)
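So, in my view, by further abstracting his meta-ethic, Smigrodzki moved from "Create goal-rule systems that will be accepted" to "Create goals, and rules that, if followed, will lead to the achievement of these goals."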
The difference between these lies in the key role of choice in the former (as hidden in the notion of “acceptance”). This highlights the key role of the notions of choice and will in ethics.
2.6 Joyous Growth Biased Voluntarism
Voluntary Joyous Growth, obviously, has a different relationship to Friendliness than pure Joyous Growth. Voluntary Joyous Growth means that, even if superhuman AI’s determine that joy and growth would be maximized if the mass-energy devoted to humans were deployed in some other way – even so, the choices of individual humans (whether to remain human or let their mass-energy be deployed in some other way) will still be respected and figured into the equation.
One could try to make Voluntary Joyous Growth more explicitly human-friendly by making choice the primary criterion. This is basically what’s achieved by my fleshed-out version of Smigrodzki’s meta-rule. In this version, the #1 ethical meta-principle is to let volitional minds have their choices wherever possible. Only when conflicts arise do the other principles – maximize joy and growth – come into play. This might be called “Joyous Growth Biased Voluntarism.” Joy and growth still may play a very big role here, because quite obviously, conflicts may arise quite frequently between volitional minds coupled in a finite universe. However, one can envision scenarios in which all inter-mind conflicts are removed, so that it’s possible to fulfill choices without considering joy and growth at all.
For instance, what if all the minds in the universe decide they all want to play video games and live in purely automated simulated worlds rather than worlds occupied with other minds? Then living in individual video-game-worlds of their choice may gratify them quite adequately: so they have maximum choice, but no opportunity for any factor besides choice to come into play. In this case minds may, consistently with Joyous Growth Biased Voluntarism, make themselves unhappy and refuse opportunities for growth unto eternity. In my personal judgment, this is a mark against Joyous Growth Biased Voluntarism and in favor of simple Voluntary Joyous Growth with its greater flexibility. I suspect that Voluntary Joyous Growth is much closer to being a powerful attractor in the universe.
2.7 Human Preservationism and Cautious Developmentalism
A more extreme ethical principle, in the vein of Joyous Growth Biased Voluntarism, is what I call Human Preservationism. In this view, the preservation of the human race through the post-Transcension period is paramount. Where this differs from Joyous Growth Biased Voluntarism is that, according to Human Preservationism, even if all humans want to become transhuman and leave human existence behind, they shouldn’t be allowed to.
In fact, I don’t know of any serious transhumanist thinkers who hold this perspective. While many transhumanists value humanity and some personally hope that traditional human culture persists through the Transcension, transhumanists tend to be a freedom-centered bunch, and few would agree with the notion of forcing sentient beings to remain human against their will. But even so, Human Preservationism is a perfectly consistent philosophy of Transcension. There’s nothing inconsistent about wanting vastly superhuman minds and new orders of beings to come into existence, yet still placing an absolute premium on the persistence of the peculiarly human.
A (somewhat more appealing) variation on Human Preservationism is Cautious Developmentalism, a perspective I will discuss a little later on. The abstract principle here is: If things are basically good, keep them that way, and explore changes only very cautiously. In practical terms, the idea here is to preserve human life basically as-is, but to allow very slow and careful research into Transcension technologies, in such a way as to minimize any risk of either a bad Transcension or another bad existential outcome. In the most extreme incarnation of this perspective, the choice of how to approach the Transcension is deferred to future generations, and the problem for the present generation is redefined as figuring out how to set the Cautious Developmentalist course in motion.
2.8 Humane AI
Yudkowsky has proposed that “The important thing is not to be human but to be humane.” Enlarging on this point, he argues that 
Though we might wish to believe that Hitler was an inhuman monster, he was, in fact, a human monster; and Gandhi is noted not for being remarkably human but for being remarkably humane. The attributes of our species are not exempt from ethical examination in virtue of being "natural" or "human". Some human attributes, such as empathy and a sense of fairness, are positive; others, such as a tendency toward tribalism or groupishness, have left deep scars on human history. If there is value in being human, it comes, not from being "normal" or "natural", but from having within us the raw material for humaneness: compassion, a sense of humor, curiosity, the wish to be a better person. Trying to preserve "humanness", rather than cultivating humaneness, would idolize the bad along with the good. One might say that if "human" is what we are, then "humane" is what we, as humans, wish we were. Human nature is not a bad place to start that journey, but we can't fulfill that potential if we reject any progress past the starting point.
In email comments on an earlier draft of this paper, Yudkowsky noted that he felt my summary of his theory didn’t properly do it justice. Conversations following these comments have improved my understanding of his thinking; but even so, I’m not certain I fully “get” his ideas. So, rather than explicitly commenting on Eliezer’s Friendly AI theory here, I will introduce a theory called “Humane AI,” which I believe is somewhat similar to his approach, but may also have some differences. I will present some arguments describing difficulties with Humane AI, which may not all be problems with Friendly AI, in the sense that there may be solutions to these problems within Friendly AI theory that I don’t fully understand.
In Humane AI, one posits as a goal, not simply the development of AI’s that are benevolent to humans, but the development of AI’s that display the qualities of “humaneness,” where “humaneness” is considered roughly according to Yudkowsky’s description above. That is, one proposes “humaneness” as a kind of ethical principle, where the principle is: “Accept an ethical system to the extent that it agrees with the body of patterns known as ‘humaneness’.”
Now, it’s not entirely clear that “humaneness,” in the sense that Yudkowsky proposes, is a well-defined concept. It could be that the specific set of properties called "humaneness" that one gets depends on the specific algorithm used to sum together the wishes of various individuals in the world. If so, then one faces the problem of choosing among the different algorithms. This is a question for a future, more scientific study of human ethics.
The major problem with distinguishing “humaneness” from “human-ness” is to distinguish the "positive" from the "negative" aspects of human nature -- e.g. compassion (viewed as positive) versus tribalism (viewed as negative). The approach hinted at in the above Yudkowsky quote is to use a kind of “consensus” process. For instance, one hopes that most people, on careful consideration and discussion, will agree that tribalism, although humanly universal, isn't good. One defines the extent to which a given ethical system is humane as the average extent to which a human, after careful consideration and discussion, will consider that ethical system a good one. Of course, one runs into serious issues with cultural and individual relativity here.
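The worry that the result depends on the summing algorithm can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the two hypothetical ethical systems, the five invented ratings on a 0 to 10 scale, and the choice of mean and median as the two candidate algorithms. It is not a proposal for how humaneness should actually be measured.

```python
from statistics import mean, median

# Hypothetical data: five people's considered ratings (0-10) of two
# invented ethical systems. The numbers are made up for illustration.
ratings = {
    "system_A": [9, 9, 1, 1, 1],  # loved by a few, rejected by most
    "system_B": [4, 4, 4, 4, 4],  # mildly acceptable to everyone
}

def most_humane(ratings, aggregate):
    """Return the system whose aggregated rating is highest."""
    return max(ratings, key=lambda name: aggregate(ratings[name]))

by_mean = most_humane(ratings, mean)      # averages: A = 4.2, B = 4
by_median = most_humane(ratings, median)  # medians:  A = 1,   B = 4

print("most humane by mean:  ", by_mean)
print("most humane by median:", by_median)
```

With these invented numbers the mean favors system_A while the median favors system_B, so even this tiny example forces a choice among aggregation algorithms before "humaneness" is defined.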
Personally, I'm not so confident that people's "wishes regarding what they were" are generally good ones. (Which is another way of saying: I think my own ethic differs considerably from the mean of humanity's.) For instance, the vast majority of humans would seem to believe that "Belief in God" is a good and important aspect of human nature. Thus, it seems to me, "Belief in God" should be considered humane according to the above definition -- it's part of what we humans are, AND part of what we humans wish we were. But nevertheless, I think that belief in God -- though it has some valuable spiritual intuitions at its core -- is essentially ethically undesirable. Nearly all ethical systems containing this belief have had overwhelming negative aspects, in my view. Thus, I consider it my ethical responsibility to work so that belief in God is not projected beyond the human race into any AGI's we may create. Unless (and I really doubt it) it's shown that the only way to achieve other valuable things is to create an AGI that contains such a belief system. Of course, there are many other examples besides "belief in God" that could be used to illustrate this point.
To get around problems like this, one could try to define humaneness as something like "What humans WOULD wish they were, if they were wiser humans" -- but of course, defining “wiser humans” in this context requires some ethical or meta-ethical standard beyond what humans are or wish they were.
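So, in sum, the difficulties with Humane AI are
· the difficulty of defining “humaneness” precisely, given that the result may depend on how the wishes of different individuals are summed together, and
· the likelihood that the consensus of human wishes includes beliefs that I judge to be ethically undesirable.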
The second point here may seem bizarrely egomaniacal – who am I to judge the vast mass of humanity as being ethically wrong on major points? And yet, it has to be observed that the vast mass of humanity has shifted its ethical beliefs many times over history. At many points in history, the vast mass of humans believed slavery was ethical, for instance. Now, you could argue that if they’d had enough information, and carried out enough discussion and deliberation, they might have decided it was bad. Perhaps this is the case. But to lead the human race through a process of discussion, deliberation and discovery adequate to free it from its collective delusions – this is a very large task. I see no evidence that any existing political institution is up to this task. Perhaps an AGI could carry out this process – but then what is the goal system of this AGI? Do we begin this goal system with the current ethical systems of the human race – as Yudkowsky seems to suggest in the above (“Human nature is not a bad place to start…”)? In that case, does the AGI begin by believing in God and reincarnation, which are beliefs of the vast majority of humans? Or does the AGI begin with some other guiding principle, such as Voluntary Joyous Growth? My hypothesis is that an AGI beginning with Voluntary Joyous Growth as a guiding principle is more likely to help humanity along a path of increasing wisdom and humane-ness than an AGI beginning with current human nature as a guiding principle.
One can posit, as a goal, the creation of a Humane AI that embodies humane-ness as discovered by humanity via interaction with an appropriately guided AGI. However, I’m not sure what this adds, beyond what one gets from creating an AGI that follows the principle of Voluntary Joyous Growth and leaving it to interact with humanity. If the creation of the Humane AI is going to make humans happier, and going to help humans to grow, and going to be something that humans choose, then the Voluntary Joyous Growth based AGI is going to choose it anyway. On the other hand, maybe after humans become wiser, they’ll realize that the creation of an AGI embodying the average of human wishes is not such a great goal anyway. As an alternative, perhaps a host of different AGI’s will be created, embodying different aspects of human nature and humane-ness, and allowed to evolve radically in different directions.
2.9 Ethical Principles, Systems and Rules
My discussion of ethics has lived on a very abstract level so far -- and this has been intentional. I have sought to treat ethics in a manner similar to the philosophy of science. In science we have Popper’s meta-rule, and then we have scientific research programmes, which may be evaluated heuristically as to how well they fulfill Popper’s meta-rule: how good are they at being science? Then, within each research programme, we have a host of specific scientific theories and conjectures, none of which can be evaluated or compared outside the context of the research programmes in which they live. Similarly, in the domain of ethics, we have highly abstract principles like Smigrodzki’s meta-rule or the Principle of Voluntary Joyous Growth – and then, within these, we may have particular ethical rule-systems, which in turn generate specific rules for dealing with specific situations.
My feeling is that the specific ethical rule-systems that promote a given abstract principle in a human context are very unlikely to survive the Transcension. For instance, the standard ethics according to which modern Americans live involves a host of subtle compromises, involving such issues as
and so on and so forth. This complex system of compromises that constitutes our modern American practical ethics is not in itself a powerful attractor. It is largely in accordance with the Principle of Voluntary Joyous Growth – it tries to promote happiness, progress and choice – but I have no doubt that, Transcension or no, in a couple hundred years a rather different network of compromises will be in place. And post-Transcension, the practical manifestations of the Principle of Voluntary Joyous Growth will be very radically different.
(As an aside, it is clearly no coincidence that the Principle of Voluntary Joyous Growth harmonizes better with modern urban American ethics than with the ethics of many other contemporary cultures. More so than, say, Arabia or China or the Mbuti pygmies, American culture is focused on individual choice, progress and hedonism. And so, I’m aware that as a modern American writing about Voluntary Joyous Growth, I’m projecting the nature of my own particular culture onto the transhuman future. On the other hand, it’s not a coincidence that America and relatively culturally similar places are the ones doing most of the work leading toward the Transcension. Perhaps it is sensible that the cultures most directly leading to the Transcension should have the most post-Transcension-friendly philosophies.)
One thing this system-theoretic perspective says is: We can’t judge the modern American ethical system by any one judgment it makes – we can only judge, as a whole, whether it tends to move in accordance with the principle we choose as a standard (e.g. Voluntary Joyous Growth). And similarly, we can’t reasonably ask post-Transcension minds to follow any particular judgment about any particular situation – rather, we can only ask them to follow some ethical system that tends to move in accordance with some general principle we pose as a standard. (And this request is more likely to be fulfilled, a priori, if it constitutes a powerful attractor in the universe at large.)
Thus, I suggest that “Be nice to humans” or “Obey your human masters” are simply too concrete and low-level ethical prescriptions to be expected to survive the Transcension. On the other hand, I suggest that a highly complex and messy network of beliefs like Yudkowsky’s “humane-ness” is insufficiently crisp, elegant and abstract to be expected to survive the Transcension. Perhaps it’s more reasonable to expect highly abstract ethical principles to survive. Perhaps it’s more sensible to focus on ensuring that the Principle of Voluntary Joyous Growth survives the Transcension than to focus on specific ethical rules (which have meaning only within specific ethical systems, which are highly context- and culture-bound) or on the whole complex mess of human ethical intuition. Initially, principles like joy, growth and choice will be grounded in human concepts and feelings -- in aspects of "humane-ness" -- but as the Transcension proceeds they will gain other, related groundings.
In terms of technical AI theory, this contrast between general principles and specific rules relates to the issue of “stability through successive self-modifications” of an AI system. If an AI system is constantly rewriting itself and re-rewriting itself, how likely is it that this or that specific aspect of the system is going to persist over time? One would like for the basic ethical goal-system of the AI to persist through successive rewritings, but it’s not clear how to ensure this, even probabilistically. The properties of AI goal-systems under iterative self-modification are basically unknown and will be seriously explorable only once we have some reasonably intelligent and self-modifiable AI systems at hand to experiment with. However, my strong feeling is that the more abstract the principle, the more likely it is to survive successive self-modification. A highly specific rule like “Don’t eat yellow snow” or “Don’t kill humans” or a big messy habit-network like “humane-ness” is relatively unlikely to survive; a more general principle like Voluntary Joyous Growth is a lot more likely to display the desired temporal continuity. I’m betting that this intuition will be borne out during the exciting period to come when we experiment with these issues on simple self-modifying, somewhat-intelligent AGI systems. And this intuition is followed up by the intuition mentioned above: that, among all the abstract principles out there, the ones that are more closely related to powerful attractors in the universe at large are more likely to occur as attractors in an iteratively self-modifying AGI, and hence more likely to survive through the Transcension.
So, my essential complaint against Yudkowsky’s Friendly AI theory is that – quite apart from ethical issues regarding the wisdom of using mass-energy on humans rather than some other form of existence -- I strongly suspect that it’s impossible to create AGI’s that will progressively radically self-improve and yet retain belief in the “humane-ness” principle. I suspect this principle is just too non-universal to survive the successive radical-self-improvement process and the Transcension. On the other hand, I think a more abstract and universally-attractive principle like Voluntary Joyous Growth might well make it.
Please note that this is very different from the complaint that Friendly AI won’t work because any AI, once it has enough intelligence and power, will simply seize all processing power in the universe for itself. I think this “Megalomaniac AI” scenario is mainly a result of rampant anthropomorphism. In this context it’s interesting to return to the notion of attractors. It may be that the Megalomaniac AI is an attractor, in that once such a beast starts rolling, it’s tough to stop. But the question is, how likely is it that a superhuman AI will start out in the basin of attraction of this particular attractor? My intuition is that the basin of attraction of this attractor is not particularly large. Rather, I think that in order to make a Megalomaniac AI, one would probably need to explicitly program an AI with a lust for power. Then, quite likely, this lust for power would manage to persist through repeated self-modifications – “lust for power” being a robustly simple-yet-abstract principle. On the other hand, if one programs one’s initial AI with an initial state aimed at a different attractor meta-ethic, there probably isn’t much chance of convergence into the megalomaniacal condition.
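The notions of an attractor and the size of its basin can be made concrete with a toy example. The sketch below is purely illustrative and says nothing about real AI systems: the one-dimensional dynamics, the attractor positions (0 and 1), the unstable boundary at 0.8, and the range of starting states are all assumptions chosen so that the two basins have visibly different sizes.

```python
def drift(x):
    """Toy dynamics: stable attractors at 0 and 1, unstable point at 0.8."""
    return -x * (x - 0.8) * (x - 1.0)

def settle(x, dt=0.1, steps=2000):
    """Follow the toy dynamics with simple Euler steps; return the final state."""
    for _ in range(steps):
        x += dt * drift(x)
    return x

# Evenly spaced starting states covering the interval [-0.5, 1.5].
n = 200
starts = [-0.5 + 2.0 * (i + 0.5) / n for i in range(n)]
finals = [settle(x) for x in starts]

# Count how many starting states end up at each attractor.
near_zero = sum(1 for x in finals if abs(x) < 0.01)
near_one = sum(1 for x in finals if abs(x - 1.0) < 0.01)

print("fraction of starts reaching the attractor at 0:", near_zero / n)
print("fraction of starts reaching the attractor at 1:", near_one / n)
```

In this toy system both end states are genuine attractors, yet the one at 1 captures a much smaller share of the starting states than the one at 0. That is the sense in which an attractor can be hard to escape once entered while still being unlikely to be reached from a typical initial condition.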
2.10 Harmony with the Nature of the Universe
This leads us to a point made by Jef Albright on the SL4 list, which is that the philosophy of Growth ties in naturally with the implicit “ethical system” followed by the universe – i.e., the universe grows. In other words, Growth is a kind of universe-scale attractor – once one has a universe devoted to pattern-proliferation-and-expansion, it will likely continue in that mode for quite a while … the newly generated patterns will generate yet more patterns, and so forth. It is also a “universal attractor,” in the sense of an attractor that is common in various dynamical subsystems of the universe.
I think there’s a similar philosophical argument that Voluntary Joyous Growth is also harmonious with the pattern of the universe – i.e. also holds promise as a universal attractor.
Regarding the Voluntary part – the evolution of life shows how powerful wills naturally emerge from the weaker-willed … and then continue to survive due to their powerful wills, and create yet more willed beings.
And if you believe humans have a greater and deeper capacity for joy than rocks or trilobites or pigs, then we can also see in natural evolution a movement toward increasing Joy. Joyful creatures interact with other Joyful creatures and produce yet more Joyful creatures – Joy wants to perpetuate itself.
On the other hand, the Friendly AI principle does not seem to harmonize naturally with the evolutionary nature of the universe at all. Rather, it seems to contradict a key aspect of the nature of the universe -- which is that the old gives way to the new when the time has come for this to occur.
Sure, there’s a certain quixotic nobility in maintaining ethics that contradict nature. After all, in a sense, technology development is all about contradicting nature. But in a deeper sense, I argue, technology development is all about following the nature of the universe – following the universal tendency toward growth and development. Modern technology may be in some ways a violation of biological nature, but it’s a consequence of the same general-evolutionary principle that led to the creation of biological forms out of the nonliving chemical stew of the early Earth. There is a quixotic beauty in contradicting nature -- but an even greater and deeper beauty, perhaps, in contradicting local manifestations of the nature of the universe while according with global ones. In breaking out of local attractor patterns but remaining wonderfully in synch with global ones.
All this suggests an interesting meta-principle for selecting abstract ethical principles, already hinted at above: namely,
All else equal, ethical principles are better if they’re more harmonious with the intrinsic nature of the universe – i.e. with the attractors that guide universal dynamics.
This suggests another possible modification to Smigrodzki’s meta-ethic, namely:
Find rules that will be accepted, and that are relatively harmonious with the attractors that guide universal dynamics.
However, this enhancement may be somewhat redundant, because I believe it’s true that rules, systems and principles that are more harmonious with the attractors that guide universal dynamics will tend to be accepted more broadly and for longer. Or in other words:
Attractors that are common in the universe are also generally attractors for communities of volitional agents.
One thing this discussion brings to mind is Nietzsche’s discussion of “a good death.” Nietzsche pointed out that human deaths are usually pathetic because people don’t know when and how to die. He proposed that a truly mature and powerful mind would choose his time to die and make his death as wonderful and beautiful as his life. Dying a good death is an example of harmonizing with the nature of the universe – “going with the flow”, or following the “watercourse way” to use Alan Watts’ metaphorical rendition of Taoism. Counterbalancing the beauty of the Friendly AI notion with its quixotic quest to preserve humane-ness at all costs in contradiction to the universal pattern of progress, one has the hyperreal Nietzschean beauty of humanity dying a good death – recognizing that its time has come, because it has brilliantly and dangerously obsoleted itself. One might call this form of beauty the “Tao of Speciecide” – the wisdom of a species (or other form of life) recognizing that its existence has reached a natural end and choosing to end itself gracefully. As Nietzsche’s Zarathustra said, “Man is something to be overcome.”
It’s an interesting question whether speciecide contradicts the universal-attractor nature of Compassion. Under the Voluntary Joyous Growth principle, it’s not favored to extinguish beings without their permission. But if a species wants to annihilate itself, because it feels its mass-energy can be used for something better, then it’s perfectly Compassionate to allow it to do so.
Of course, I am being intentionally outrageous here – in my heart I don’t want to see the human race self-annihilate just to fulfill some Nietzschean notion of beauty, or to make room for more intelligent beings, or for any other reason. I have a tremendous affection for us hypercerebrated ape-beings. And as will be emphasized below, the course I propose in practice is a kind of hybrid of Cautious Developmentalism and Voluntary Joyous Growth. I am pursuing this line of discussion mainly to provide a counterbalance to what I see as an overemphasis on “human-friendliness” and human-preservation. Preserving and nurturing and growing humanity is an important point, but not the only point. To understand the Transcension with the maximum clarity our limited human brains allow, we need to think and feel more broadly.
I can see beauty in both of these extremes – Friendly AI and the Tao of Speciecide -- and I am not overwhelmingly attracted to either of them. I don’t know if it’s “best”, in a general sense, that humanity survives or not – though I have a very strong personal bias in favor of humanity’s persistence, and in practical terms I would never act against my own species. I am very strongly motivated to spread choice, growth and joy throughout the universe – and to research ways in which to do this without endangering humanity and what it has become and achieved.
OK – now, let’s get practical. Suppose that
Then we have the question of what we can do to encourage the post-Transcension world to maximally adhere to the Principle of Voluntary Joyous Growth.
My own thinking on this topic has centered on the development of artificial general intelligence. Partly this is because AGI is my own area of research, but mainly it’s because I believe that
Regarding the first point, I think it’s clear that, as soon as AGI comes about, it will radically transform the future development course of all other technologies. Furthermore, these other technologies – if their development initially goes more rapidly than AGI – are likely to rapidly lead to the development of AGI, so that their final development will likely be a matter of AGI-human collaboration. Suppose, for example, that molecular nanotechnology comes about before AGI. One of the many interesting things to do with MNT will be to create extremely powerful hardware to support AGI; and once AGI is built it will lead to vast new developments in MNT, biotechnology, AGI and other areas. Or, suppose that human biological understanding and genetic engineering advance much faster than AGI. Then, with a detailed understanding of the human brain, it should be possible to create software or hardware closely emulating human intelligence – and then improve on human intelligence in this digital form … thus leading to powerful AGI. I have a suspicion that MNT or biotech will lead to AGI capabilities before they will lead to AGI-independent Singularity-launching capabilities … though of course I’m well aware this suspicion could be wrong.
Regarding the second point, I think it’s clear that a molecular assembler or an advanced genetic engineering lab will be profoundly dangerous if left in the hands of (unreliable, highly ethically variant) human beings. Quite possibly, once technology develops far enough, it will become so easy for a moderately intelligent human to destroy all life on Earth that this will actually occur. There are many possible solutions to this problem, for instance:
Of these six possibilities, 4-6 are the ones that I consider to have the highest probability of successful eventuation. I think 3 is also somewhat plausible, and am highly skeptical of 1 and 2.
I think renunciation is highly unlikely given the likely practical benefits that each incremental step of technological advancement is likely to have. Basically, the vast majority of humans aren’t going to want to renounce technologies that they find gratifying. And a small set of renouncers won’t alter the course of technology development.
Potentially, though, radical Luddites could force renunciation via mass civilization-destroying terrorist actions – I view this as far more likely than a voluntary mass renunciation of technology.
Next, raised as I was among Marxists, it’s hard for me to be optimistic about the “perfectibility of humanity” via any means other than uploading or radical neural modification. While social and cultural patterns definitely have a strong impact on each individual mind, it’s equally true that social and cultural patterns are what they are (and are flawed as they’re flawed) because of the intrinsic biological nature of human psychology. Traits like dishonesty, violence, paranoia and narrow-mindedness are part of the human condition and are not going to be eliminated via social engineering or education. So far, social and psychological engineering through pharmacology has been a mixed bag … but as technology advances, it seems clear that the only real hope for improving human nature lies in modifying the genome or the brain, hence physiologically modifying the nature of humanity.
On a purely scientific level, it’s hard to tell whether or not detailed human-brain or human-genome modification is “easier” than creating AGI. Pragmatically, however, it seems clear that these biological improvements would be very difficult to propagate throughout the human race -- due to the fact that so many individuals believe it’s a bad idea, and are unlikely to change their minds. AGI, on the other hand, can be achieved by a small group of individuals, and then have a definitive effect on the world at large, even if most individuals on Earth greet it with confused and ambiguous (or in some cases flatly negative) attitudes.
Finally, technological safeguards may be possible, but it’s hard to be confident in this regard: even if some radical, dangerous technologies can be safeguarded (as nuclear weapons are, currently, by the difficulty of obtaining fissile materials), all it will take is one hard-to-safeguard technology to lead to the end of us all. Certainly, it’s clear that – given the increasing rate of advance of technology and its rapid spread around the globe – the only way the “technological safeguard” route could possibly work would be via a worldwide police state with Big Brother watching everyone. And, the aesthetics and ethics of this kind of social system notwithstanding, it’s not clear to me that even this would be effective. Advanced surveillance and enforcement measures would lead to advanced countermeasures by rebel groups, including sophisticated hacker groups in First World countries as well as terrorists with various agendas (including Luddite agendas). I suppose that the only way to make technological safeguards work would be to:
· Create highly advanced technology, either AGI or MNT or intelligence-enhancing biotech or some combination thereof
· Keep this technology in the hands of a limited class of people, and use this technology to monitor the rest of the world, with the specific goal of preventing the development of any other technology posing existential risks
While this would necessarily involve the sort of universal surveillance associated with the term “Big Brother,” it wouldn’t necessarily entail the fascist control of thoughts and actions depicted in Orwell’s 1984. Rather, all that’s required is the specific control of actions posing significant existential risks to the human race (and any other sentients developed in the meantime). Rather than a “Big Brother”, it may be more useful to think of a “Singularity Steward” – an entity whose goal is to guide humanity and its creations toward their Singularity or other-sort-of-Transcension in a maximally wise way … or guide them away from Singularities and Transcensions if these are judged most-probably negative in ethical valence.
In fact, my suspicion is that the only way to make a Singularity Steward entity actually work would be to supply it with an AGI brain – though not necessarily an AGI brain bent on growth or self-improvement. Rather, one can envision an AGI system programmed with a goal of preserving the human condition roughly as-is, perhaps with local improvements (like decreasing the incidence of disease and starvation, extending life, etc.). This AGI – “AI Big Brother” aka the “Singularity Steward” -- would have to be significantly smarter than humans, at least in some ways. However, it wouldn’t need to be autonomous – in fact, it’s natural for this entity to depend on humans for its survival.
This steward AGI would need to be a wizard at analyzing massive amounts of surveillance data and figuring out who’s plotting against the established order, and who’s engaged in thought processes that might lead to the development and deployment of dangerous technologies. Perhaps, together with human scientists, it would figure out how to scan human brains worldwide in real-time to prevent not only murderous thoughts, but also thoughts regarding the development of molecular assemblers or self-modifying AI’s, or the creation of beings with intelligence competitive with that of the steward itself.
The problem of engineering a Singularity Steward AGI is rather different from the problem of engineering an AI intended to shepherd human minds through the Transcension. In the AI Big Brother case, one doesn’t want the AI to be self-modifying and self-improving – one wants it to remain stable. This is a much easier problem! One needs to make it a bit smarter than humans, but not too much – and one needs to give it a goal system focused on letting itself and humans remain as much the same as possible. The Singularity Steward should want to increase its own intelligence only in the presence of some external threat like an alien invasion.
In extreme cases one can envision a Singularity Steward feeling compelled to act in a fascistic way – for instance, intrusively modifying the brains of rebellious AGI researchers intent on launching the Singularity according to their pet theories. But if the goal is to prevent a dangerous, inadequately-thought-out Singularity, this may be the best option. To keep things exactly the way they are now – with the freedoms that now exist -- is to maintain the possibility of massive destruction as technology develops slightly further. We are not, right now, in a safe and stable sociopsychotechnological configuration by any means. This AI Big Brother option is not terribly appealing to me personally, because it grates too harshly against my values of growth, choice and happiness. However, I respect it as a logical and consistent possibility, which seems plausibly achievable based on an objective analysis of the situation we confront. And I can see that it may well be the best option, if we can’t quickly enough arrive at a confident, fully-fleshed-out theory regarding the likely outcome of iterated self-improvement in AGI systems.
The Singularity Steward idea ties in with the Cautious Developmentalism approach, mentioned earlier. Suppose we create a Singularity Steward – and then allow it to experiment, together with selected human scientists, with Transcension-related technologies. This experimentation must take place very slowly and conservatively, and any move toward the Transcension would (according to the Steward’s hard-wired control code) be made only based on the agreement of the Steward with the vast majority of human beings. Conceivably, this could be the best and safest path toward the Transcension.
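The agreement rule just described can be sketched as a simple gate. This is only an illustration under stated assumptions: the `Proposal` structure, the `may_proceed` function and the 90% threshold are hypothetical, since the text above says only “the vast majority” and does not specify how agreement would be measured.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Proposal:
    """A proposed step toward the Transcension (hypothetical structure)."""
    description: str
    steward_approves: bool   # the Steward's own judgment of the step
    human_votes_for: int     # humans in favor
    human_votes_total: int   # humans consulted


# Assumed threshold: the essay says only "the vast majority".
VAST_MAJORITY = 0.9


def may_proceed(p: Proposal, threshold: float = VAST_MAJORITY) -> bool:
    """Allow a step only if the Steward AND a vast majority of humans agree."""
    if p.human_votes_total <= 0:
        return False  # no mandate if nobody was consulted
    human_share = p.human_votes_for / p.human_votes_total
    return p.steward_approves and human_share >= threshold
```

The point of the sketch is that neither party can act alone: a Steward in favor without the human supermajority is blocked, and so is a human supermajority without the Steward.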
In fact – Orwellian associations notwithstanding -- a Singularity-Steward-dominated society could potentially be a human utopia. Careful development of technology aimed at making human life easier – cheap power and food, effective medical care, and so forth – could enable the complete rearrangement of human society. Perhaps Earth could be covered by a set of small city-states, each one populated by like-minded individuals, living in a style of their choice. Liberated from economic need, and protected by the Steward from assault by nature or other humans, the humans under the Steward’s watch could live far more happily than in any prior human society. Free will, within the restrictions imposed by the Steward, could be refined and exercised copiously, perhaps in the manner of Buddhist “mind control.” And growth could occur spectacularly in non-dangerous directions, such as mathematics, music and art.
This hypothetical future is similar to the one sketched in Jack Williamson’s classic novel The Humanoids, although his humanoids (a robot-swarm version of an “AI Big Brother”) possessed the tragic flaw of valuing human happiness infinitely and human will not at all. While this flaw made Williamson’s novel an interesting one, it’s not intrinsic in the notion of a steward AGI. Rather, it’s quite consistent to imagine a Singularity Steward that values human free will as much as or more than human happiness – and imposes on human choice only when it moves in directions that appear plausibly likely to cause existential risks for humanity.
Of course, there’s one problem with this dream of a Singularity-Steward-powered human utopia: politics. An AGI steward, if it is ever created, is most likely to be created by some particular power bloc in order to aid it in pursuing its particular interests. What are the odds that it would actually be used to create a utopia on Earth? This is hard to estimate! What happens to politics when pre-Transcension but post-contemporary technology drastically decreases the problems of scarcity we have on Earth today?
This means that there are two ways a really workable Singularity Steward could come about:
· By transforming the global cultural and political systems to be more rational and ethically positive
· By a relatively small group of individuals, acting rationally with positive ethical goals, creating the Singularity Steward and putting it into play
This “relatively small group” could for example be an international team of scientists, or a group operating within the United Nations or the government of some existing nation. (Of course, these two paths are not at all mutually exclusive.)
This hypothesized transformation of global cultural and political systems ties in with the notion of the Global Brain, as explored in numerous writings by Valentin Turchin, Francis Heylighen, Peter Russell, the author, and various others. The general idea of the Global Brain is that computing and communication technologies may lead to the creation of a kind of “distributed mind” in which humans and AI minds both participate, but that collectively forms a higher level of intelligence and awareness, going beyond the individual intelligences of the people or AI’s involved in it. I have labeled this kind of distributed mind a “Mindplex” and have spent some effort exploring the possible features of Mindplex psychology. The Global Brain Mindplex, as I envision it, would consist of an AGI system specifically intended to collect together the thoughts of all the people on the globe and synthesize them into grander and more profound emergent thoughts – a kind of animated, superintelligent collective unconscious of the human race. Of course the innate intelligence of the AGI system would add many things not present in any of the human-mind contributors – but then the AGI feeds its ideas back to the mass of humans, who then think new thoughts that are incorporated back into the Global Brain Mindplex mix.
In the late 1990’s I was very excited about the Global Brain Mindplex – but then for a while I lost some of my enthusiasm for it, due to its relative unexcitingness when compared to the possibility of a broader and more overwhelming Transcension. However, I had been overlooking the potential power of the Global Brain Mindplex as a Singularity Steward. In fact, if one wishes to create a Singularity Steward AGI to help guide humanity toward an optimal Transcension, it makes eminent sense that this Steward should harness the collective thought, intuition and feeling power of the human race, in the manner envisioned for the Global Brain. The two visions mesh perfectly well together, yielding the goal of creating a Global Brain Mindplex with a goal of advocating Voluntary Joyous Growth but avoiding a premature human Transcension.
The advent of such a Global Brain Mindplex might well help achieve what has proved impossible via human means alone – the creation of rational and ethically positive social institutions. How to build such a Global Brain Mindplex is another question, however. What it will take is a group of people with a lot of money for computer hardware and software, a vast capability for coordinated creative activity, and genuinely broad-minded positive ethical intentions. Let us hope that such a group emerges.
A significant benefit of the Cautious Developmentalist approach is that it makes the lives of Transcension technology researchers easier and safer.
One may argue that
IF a Transcension of type Y is the best outcome according to Ethical System E
AND the odds of successfully launching a Transcension are a lot higher with the acceptance of a greater number of humans
THEN it is worth exploring whether either
a) a Transcension of type Y is acceptable to the vast majority of humans, or if not whether
b) there is a Transcension of type Y' that is also a very good outcome according to E, but that IS acceptable to a lot more humans
If such a Transcension Y' is found, then it's a lot better to pursue Y' than Y, because the odds of achieving Y' are significantly greater.
For example, if
Y = a Transcension supporting Voluntary Joyous Growth
Y' = a Transcension supporting Voluntary Joyous Growth, but making every possible effort to enable all humans to continue to have the opportunity to live life on Earth as-is, if they wish to
then it may well be that the conditions of the above are met.
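The argument above is, at bottom, an expected-value comparison, and a small worked calculation may make it concrete. This is a minimal sketch under loud assumptions: the numbers are invented purely for illustration, the value of an outcome under Ethical System E is squeezed onto a single scale, and a failed attempt is assumed to be worth zero. None of these figures comes from any actual estimate.

```python
def expected_value(value: float, p_success: float) -> float:
    """Expected worth of pursuing an outcome, assuming failure is worth 0."""
    if not 0.0 <= p_success <= 1.0:
        raise ValueError("p_success must lie in [0, 1]")
    return value * p_success


# Illustrative numbers only (assumptions, not estimates):
# Y is slightly better under E but faces more opposition;
# Y' is slightly less ideal but acceptable to far more people.
y = expected_value(value=1.0, p_success=0.2)
y_prime = expected_value(value=0.9, p_success=0.5)

prefer_y_prime = y_prime > y  # holds whenever the gain in odds outweighs the loss in value
```

With these made-up figures Y' wins comfortably, but the conclusion depends entirely on the inputs: if Y' sacrificed most of the value of Y, or bought only a marginal gain in acceptance, the comparison would go the other way.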
Now, one shouldn’t overestimate the extent to which Y' is acceptable to the vast mass of humans. After all, currently the US government has outlawed hallucinogens and many kinds of stem cell research, and requires government approval for putting chips in one's own brain. Alcor, a company providing cryonic preservation services, has been plagued with lawsuits by transhumanism-unfriendly people. So it's naive to think people won't stand in the way of the Transcension, no matter how inoffensive it’s made.
But Y' is definitely easier to sell than Y, and will create less opposition, thus increasing the odds of achievement. This is a strong argument for embracing a kind of mixture of Voluntary Joyous Growth with Cautious Developmentalism. Even if Voluntary Joyous Growth is one’s goal, the chances of achieving this in practice may be greater if a Cautious Developmentalist approach to this goal is taken – because the odds of success are greater if there is more support among the mass of humanity.
However, this doesn't get around my above-expressed skepticism as to the possibility of guaranteeing that "all humans [will] continue to have the opportunity to live life on Earth as-is, if they wish to." The problem is, I think it is not very easy to make this guarantee about post-Transcension dynamics. If I'm right, then the options come down to,
According to my own personal ethics – which value choice, joy and growth – the most ethically sound course is 2), which supports the free choice of humanity. So, in my view, the best hope is that through a systematic process of education, the majority of humans will come to realization 2) ... that although there are no guarantees in launching a Transcension, the rewards are worth the risks. Then democracy is satisfied and growth is satisfied. There is reason to be optimistic in this regard, since history shows that nearly all technologies are eventually embraced by humanity, often after initial periods of skepticism.
This line of thinking pushes strongly in the direction of the Global Brain Mindplex.
Now let’s set political issues aside and go back to pure Voluntary Joyous Growth. If one wants to launch a positive Transcension using AGI – or create a positive Global Brain Singularity Steward Mindplex – then one needs to know how to create AGI’s that are likely to be ethically positive according to the Principle of Voluntary Joyous Growth. The key, it seems, lies in the combination of two things:
· Explicit ethical instruction: Specific instruction of the AGI in the “foundational ethical principle” in question (e.g. Voluntary Joyous Growth)
· Ethically-guided cognitive architecture: Ensuring that the AGI’s cognitive architecture is structured in a way that implicitly embodies the ethical principle (so that obeying any principle besides the foundational ethical principle would seem profoundly unnatural to the system)
The first of these – explicit ethical instruction – is relatively (and only relatively!) straightforward. In general, this may be done via a combination of explicitly “hardwiring” ethical principles into one’s AI architecture, and teaching one’s AI via experiential interaction. Essentially, the idea is to bring up one’s baby AI to have the desired value systems, by interacting with it, teaching it by example, scolding it when it does badly, and – the only novelty here – spending a decent portion of one’s time studying the internals of one’s baby’s “brain” and modifying them accordingly. A key point is that one cannot viably instruct a baby mind only in highly abstract principles; one must instruct it in one or more specific ethical systems, consistent with one’s abstract principles of choice. No doubt there will be a lot of art and science to instructing AI minds in specific ethical systems or general ethical principles; experimentation will be key here.
The second point – creating a cognitive architecture intrinsically harmonious with ethical principles – is subtler but seems to be possible so long as one’s ethical principles are sufficiently abstract. For instance, a focus on joy, growth and choice comes naturally to some AI designs, including the Novamente design under development by my collaborators and myself. Novamente may be given joy, growth and choice as specific system goals – along with more pragmatic short-term goals – but at least as importantly, it has joy, growth and choice implicitly embedded in its design.
Novamente is a multi-agent design, in which intelligence is achieved by a combination of semi-autonomous agents representing a variety of cognitive processes. Each particular Novamente system consists of a network of semi-autonomous units, each containing a population of agents carrying out cognitive processes and acting on a shared knowledge base.
It’s interesting to note that an emphasis on voluntarism is implicit in the multi-agent architecture, in which mind itself consists of a population of agents, each of which is allowed to make its own choices within the constraints imposed by the overall system. Rather than merely having ideas about the value of choice imposed on the system in an abstract conceptual way, the value of choice is embedded in the cognitive architecture of the AI system.
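To make the structural point concrete, here is a toy sketch of agents choosing their own actions within system-imposed constraints while acting on a shared knowledge base. It is emphatically not the Novamente design: the names `Agent`, `Unit`, `reinforce` and `decay`, and the representation of the knowledge base as a dictionary of weights, are all hypothetical choices made for illustration.

```python
import random
from typing import Callable, Dict, List

KnowledgeBase = Dict[str, float]          # hypothetical: item name -> weight
Action = Callable[[KnowledgeBase], None]  # an action mutates the shared base


def reinforce(kb: KnowledgeBase) -> None:
    """Raise the weight of the currently strongest item."""
    if kb:
        top = max(kb, key=lambda k: kb[k])
        kb[top] += 1.0


def decay(kb: KnowledgeBase) -> None:
    """Shrink every weight slightly."""
    for key in kb:
        kb[key] *= 0.9


class Agent:
    """A semi-autonomous process with its own repertoire of actions."""

    def __init__(self, name: str, actions: List[Action]) -> None:
        self.name = name
        self.actions = actions

    def act(self, kb: KnowledgeBase, allowed: List[Action], rng: random.Random) -> bool:
        """Freely pick one of this agent's actions that the system permits."""
        choices = [a for a in self.actions if a in allowed]
        if not choices:
            return False  # nothing permitted: the agent abstains
        rng.choice(choices)(kb)
        return True


class Unit:
    """A population of agents sharing one knowledge base and one constraint set."""

    def __init__(self, agents: List[Agent], allowed: List[Action]) -> None:
        self.agents = agents
        self.allowed = allowed

    def cycle(self, kb: KnowledgeBase, rng: random.Random) -> int:
        """Let every agent act once; return how many actually acted."""
        return sum(agent.act(kb, self.allowed, rng) for agent in self.agents)
```

The design choice worth noticing is that the system never tells an agent what to do; it only bounds the set of things the agent may choose among, which is the sense in which choice is built into the architecture rather than taught as a concept.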
Not just the Novamente system as a whole, but many of its individual component processes, may be tuned to act so as to maximize joy and growth. For instance, the processes involved with creating new concepts may be rewarded for creating concepts that
· display a great deal of new pattern compared to previously existing concepts (“growth”)
· have the property that thinking about these concepts tends to lead to positive affect.
The same reward structure may be put into other processes, such as probabilistic logical inference (where one may control inference so as to encourage it to derive surprising new relationships, and new relationships that are estimated to correlate with system happiness).
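The reward structure just described might be sketched as follows. This is an illustrative toy, not the mechanism actually used in Novamente: representing a concept as a set of feature labels, measuring “new pattern” as one minus the best Jaccard similarity to existing concepts, and combining the two terms with equal weights are all assumptions made here for the sake of a runnable example.

```python
from typing import FrozenSet, Iterable

Concept = FrozenSet[str]  # hypothetical: a concept as a set of feature labels


def novelty(candidate: Concept, existing: Iterable[Concept]) -> float:
    """'Growth' term: 1 minus the highest Jaccard similarity to any known concept."""
    best = 0.0
    for other in existing:
        union = candidate | other
        if union:
            best = max(best, len(candidate & other) / len(union))
    return 1.0 - best


def concept_reward(
    candidate: Concept,
    existing: Iterable[Concept],
    estimated_affect: float,   # assumed to be supplied elsewhere, in [0, 1]
    w_growth: float = 0.5,     # assumed weights; the essay gives none
    w_joy: float = 0.5,
) -> float:
    """Reward new concepts for novelty ('growth') and positive affect ('joy')."""
    return w_growth * novelty(candidate, existing) + w_joy * estimated_affect
```

A concept identical to one already known earns no growth reward however pleasant it is to think about, while a wholly new concept with neutral affect still earns half of the maximum; how the two terms should really be balanced is exactly the kind of question experimentation would have to settle.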
The result is that, rather than merely having an ethical system artificially placed at the “top” of an AI system, one has one’s abstract ethical principles woven all through the system’s operations, inside the logic of many of its cognitive processes.
Finally there is the issue of information-gathering – does the AI system have the information to really act with the spread of joy, growth and truth throughout the universe as its primary goals? In order to encourage this, I have proposed the creation of a “Universal Mind Simulator” AI which contains sub-units dedicated to studying and simulating the actions of other minds in the universe. Assuming the Novamente AI architecture works as envisioned, it should be quite possible to configure a Novamente AI system in this way (even though universal mind simulation is not a necessary part of the Novamente architecture). Again, rather than just having “respect all the minds in the universe” programmed in or taught as an ethical maxim, the very structure of the AI system is being implicitly oriented toward the respecting of all minds in the universe. Personally, I find this kind of “AI Buddha” vision more appealing than “AI Big Brother” – but I also consider it even more risky.
Note the close relationship – but also the significant difference – between the Global Brain Mindplex design and the Universal Mind Simulator design. The former seeks to merge together the thoughts of various sentients into a superior, emergent whole; the latter seeks to emulate and study the thoughts of sentients as individuals. Obviously there is no contradiction between these two approaches; the two could exist in the same AI architecture – a Universal Brain AI Buddha Mindplex!
As already noted, this notion of ethically-guided cognitive architecture fits in much more naturally with abstract ethical principles like Voluntary Joyous Growth than with more specific ethical rules like “Be nice to humans.” It is almost absurd to think about building a cognitive architecture with “Be nice to humans” implicit in its logic; but abstract concepts like choice, joy and growth can very naturally be embodied in the inner workings of an AI system.
How then do we encourage a positive Transcension? Based on the considerations I’ve reviewed above, there seem to be two plausible options, summarized by the tongue-in-cheek slogan
AI Buddha versus AI Big Brother
Or, less sensationalistically rendered:
AI-Enforced Cautious Developmentalism
AI-Driven Aggressive Transcension Pursuit
My feeling is that the best course is as follows:
1. Research sub-human-level AI and other Transcension technologies as rapidly, intensely and carefully as possible, so as to gather the information needed to make a decision between Cautious Developmentalism and a more aggressively Transcension-focused approach. This needs to be done reasonably fast, because if humans, with our erratic and often self-destructive goal-systems, get to MNT and radical genetic engineering first, profound trouble may well ensue.
2. Present one’s findings to the human race at large, and undertake an educational programme aiming to make as many people as possible comfortable with the ideas involved, so that as many educated intelligent judgments as possible are able to weigh in on the matters at hand
3. If the dangers of self-modifying AGI seem too scary after this research and discussion period (for instance, if we discover that some kind of Evil Megalomaniacal AI seems like a likely attractor of self-modifying superintelligence), then
a. build an AGI Singularity Steward – quite possibly of the Global Brain Mindplex variety -- and try like hell to prevent human political issues from sabotaging the feat
b. proceed very slowly and carefully with Transcension-related research
4. If the dangers of self-modifying AGI seem acceptable as compared to other dangers, then
a. Create AGI’s as fast as possible
b. Teach the AGI’s our ethical system of choice
c. Teach the AGI’s – and perhaps more importantly, embody in the AGI’s cognitive architectures – our abstract ethical/meta-ethical principles of choice
5. In either case: Hope for the best!
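The branch in steps 3 and 4 can be written down as a tiny decision procedure. This is a deliberately crude sketch: reducing “the dangers of self-modifying AGI” and “other dangers” to two comparable numbers is an assumption of the example, as are the names `Path` and `choose_path`; in reality the judgment would emerge from the research and public discussion of steps 1 and 2.

```python
from enum import Enum


class Path(Enum):
    CAUTIOUS_DEVELOPMENTALISM = "build a Singularity Steward and proceed slowly"
    AGGRESSIVE_TRANSCENSION = "create AGIs quickly and instruct them ethically"


def choose_path(agi_risk: float, other_risk: float) -> Path:
    """Pick step 3 or step 4 of the plan from two judged risk levels.

    agi_risk:   judged danger of self-modifying AGI (hypothetical scale)
    other_risk: judged danger of the alternatives, e.g. humans reaching MNT
                or radical genetic engineering first (same scale)
    """
    if agi_risk > other_risk:
        return Path.CAUTIOUS_DEVELOPMENTALISM   # step 3
    return Path.AGGRESSIVE_TRANSCENSION         # step 4
```

Everything interesting lies in how the two inputs are obtained; the code only records that the plan is conditional on them rather than committed in advance to either path.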
This general plan is motivated by principles of growth and choice, but nevertheless, as explicitly stated it’s neutral as regards the precise ethical systems and principles used to guide the development of self-modifying AGI’s. Of course, this is a critical issue, and as discussed above, it’s a matter of both taste and pragmatics. We must choose systems and principles that we feel are “right,” and that we feel have a decent chance of surviving the Transcension to guide post-Transcension reality. The latter issue – which ethical systems and principles have a greater chance of survival – is in part a scientific issue that may be resolved by experimenting with relatively simple self-modifying AI’s. For instance, such experimentation should be able to tentatively confirm or refute my hypothesis that more abstract principles will more easily survive iterated self-modification. But ultimately, even this kind of experimentation will be of limited value, due to the very nature of the Transcension, which is that all prior understandings and expectations are rendered obsolete.
After significant reflection, my own vote is for the Principle of Voluntary Joyous Growth. Of course, I hope that others will come to similar conclusions – and I’ll do my best to convince them… both of the rational point that this sort of principle is relatively likely to survive the Transcension, and of the human point that this principle captures much of what is really good, wonderful and important about human nature. If we leave the universe – or a big portion of it – with a legacy of voluntary joyous growth, this is a lot more important than whether or not the human race as such continues for millions of years. At least, this is the case according to my own value system – a value system that values humanity greatly, but not primarily because humans have two legs, two eyes, two hands, vaginas and penises, biceps and breasts and two cerebral hemispheres full of neurons with combinatory and topographic connections. I have immense affection for human creations like literature, mathematics, music and art; and for human emotions like love and wonder and excitement; and human relationships and cultural institutions … families, couples, rock bands, research teams. But what is most important about humanity is not these often-beautiful particulars, but the joy, the growth and the freedom that these particulars express – in other words, the way humanity expresses principles that are powerful universal attractors. At any rate, these are the human thoughts and feelings that lead me to feel the way I do about the best course toward the transhuman world. Let’s do our best to make the freedom to be human survive the Transcension – but most of all, let’s do our best to make it so that the universal properties and principles that make humanity wonderful survive and flourish in the “post-Transcension universe” … whatever this barely-conceivable hypothetical entity turns out to be….
In spite of my own affection for Voluntary Joyous Growth, however, I have strong inclinations toward both the Joyous Growth Guided Voluntarism and pure Joyous Growth variants as well. (As much as I enjoy enjoying myself, Metaqualia’s eternal orgasm doesn’t appeal to me so much!) I hope that the ethical principle used to guide our approach to the Transcension won’t be chosen by any one person, but rather by the collective wisdom and feeling of a broad group of human beings. Bill Hibbard is an advocate of such decisions being made by an American-style democratic process; I’m not so sure this is the best approach, but I’m also not in favor of a single human being or tiny research team taking such a matter into its own hands. A discussion of the various ways to carry out this kind of decision process would be interesting but would elongate the present discussion too far, and I’ll defer it to another essay.
Obviously, I’m very excited about the possibilities of the Transcension, and I have a certain emotional eagerness to get on with it already. However, I’m also a scientist and well aware of the importance of gathering information and doing careful analysis before making a serious decision. So I’ll end this essay on a less ecstatic note, and emphasize once again the importance of research. I’ve presented above a number of very major issues, which I believe will be elucidated via experimentation with “moderately intelligent,” partially-self-modifying AGI systems. And I’m looking forward very much to participating in this experimentation process – either with a future version of my Novamente AI system, or with someone else’s AGI should they get there first. Experimentation with other technologies such as genetic engineering, neuromodification and molecular nanotechnology will doubtless also be highly instructive.
Many of the ideas in this essay developed via discussions with others, including
· Frequent in-person chats with Izabela Lyon Freire, Moshe Looks, Kevin Cramer; and occasional in-person chats with Lucio Coelho de Souza & Eliezer Yudkowsky
· Discussions on the SL4 and AGI email lists, and in private emails, with a variety of individuals including Eliezer Yudkowsky, Metaqualia, Rafal Smigrodzki, Jef Allbright, Philip Sutton and Michael Vassar
Some of the ideas discussed here developed purely in the privacy of my own teeming brain; and of course the responsibility for any foolishness found here is primarily my own.
 Drexler, K. E. (1992) Nanosystems: Molecular Machinery, Manufacturing, and Computation. New York: John Wiley & Sons.
 Vinge, Vernor (1993). The Coming Technological Singularity: How to Survive in the Post-Human Era. This was published in the Whole Earth Review in 1993, and is available online, e.g. at
 Kurzweil, Ray (2004). The Singularity is Near. To appear.
 Kurzweil, Ray (2000). The Age of Spiritual Machines. Penguin.
 Broderick, Damien (2002). The Spike. St. Martin’s Press.
 Yudkowsky’s essay Staring into the Singularity (http://yudkowsky.net/singularity.html) is something of a classic
 Smart, John (2002). Answering the Fermi Paradox: Exploring the Mechanisms of Universal Transcension; see also his audio CD Understanding the Singularity
 Damien Broderick says that, so far as he knows, the first person to systematically use the word Transcension in the context of transhumanism was Anders Sandberg, whose web page is at www.nada.kth.se/~asa/
 Broderick, Damien (2002). Transcension. Tor Books.
 Yudkowsky’s definition of “humaneness” is technical and complicated, and is related to his discussion of “programmer-independent morality” in Section 3.4.4 of Creating a Friendly AI, see http://www.singinst.org/CFAI/design/structure/friendliness.html
 Yudkowsky asserts that “humaneness” as he intends it is not an ethical principle. His definition of both “ethical principle” and “humaneness” probably both differ from mine in subtle ways.
 For a discussion of the role of attractors in the mind, see my book Chaotic Logic (1994, Plenum Press)
 Dennett, Daniel (2003). Freedom Evolves. Viking.
 I took this particular paraphrase of Popper from Lakatos’s analysis of Popper in The Methodology of Scientific Research Programmes
 See Lakatos, Imre (1980), The Methodology of Scientific Research Programmes, Cambridge University Press; see also my essay Science, Probability and Human Nature which presents a neo-Lakatosian perspective.
 Eliezer Yudkowsky to Nick Bostrom on WTA-Talk list, August 23, 2003
 Watts (1977). Tao: The Watercourse Way. Pantheon Books.
 Joy, Bill (2000). Why The Future Doesn’t Need Us, Wired, April 2000 issue. Online at http://www.wired.com/wired/archive/8.04/joy_pr.html
 See my 2003 essay “Mindplexes” in Dynamical Psychology: http://www.goertzel.org/dynapsyc/2003/mindplex.htm
 See Goertzel et al (2003), “Novamente: An Integrative Design for Artificial General Intelligence”, in the Proceedings of the Workshop on Agents and Cognitive Modeling at IJCAI-03; or, see the documents online at www.agiri.org
 Hibbard, Bill (2003). Superintelligent Machines. Kluwer Academic.