AI Buddha versus AI Big Brother, Voluntary Joyous Growth, the Tao of Speciecide, the Global Brain Singularity Steward Mindplex, and Other Issues of Transhumanist Ethical Philosophy
This essay is
relatively brief, but its theme extremely large: how to manage the
development of technology and society, in the near to mid-term future, in such
a way as to maximize the odds of a positive long-term future for the
universe.
My conclusions are
uncertain, but bold. I believe that the
era of humanity as the “Kings of the Earth” is almost inevitably coming to an
end. Unless we bomb or otherwise destroy
ourselves back into the Stone Age or into oblivion, we are going to be sharing
our region of the universe with powerful AI minds of one form or another. Potentially depending on decisions we make
in the near or moderately near future, this may or may not lead to a
fundamental alteration in the nature of conscious experience in our neck of the
woods: a Transcension. And this may or
may not lead to the demise of humanity – which may or may not be a terrible
thing.
I conclude that there
are two strong options going forward, which I associate with the catch-phrases
“AI Buddha” and “AI Big Brother.” More
verbosely, these correspond to the alternatives of
· Creating an AI based on some variant of the principle of “Voluntary
Joyous Growth,” and allowing it to repeatedly self-modify and become vastly
superintelligent, having a potentially huge impact on the universe and
potentially obsoleting the human race
· Creating an AI dictator with stability as a main goal, to rule the
human race, ensuring peace and prosperity and guaranteeing that no human
creates overly advanced, “dangerous” technologies
Not surprisingly, I
have a tentative preference for the Voluntary Joyous Growth scenario, but I
believe that much more research (mostly, research with “primitive” AI’s that
are nevertheless much more advanced than any AI’s we currently possess) is
needed to fully understand the risks and rewards of each option.
My analysis is based on
a few key assumptions. Chiefly, I
assume that:
· Human science and technology will continue their broad and rapid
advance
· Once human science and technology have advanced adequately, “radical
futurist” technologies such as artificial general intelligence[1] (AGI), molecular nanotechnology[2] (MNT), pharmacological human life
extension and genetic engineering of wildly novel organisms will become possible
I recognize that these
assumptions are not incontrovertibly true.
There could be as-yet-unknown physical limits preventing the development
of the radical futurist technologies; or, as I already noted, the human race
could knock itself back to the Stone Age or oblivion or some other
nontechnological condition. However, I
think these assumptions are highly likely to be true; and they’re the premise
for much (though not all) of the discussion to follow.
These assumptions are
related to the notion of the “Singularity,” as introduced by Vernor Vinge[3] in the 1980’s and more thoroughly
developed by a host of recent futurist thinkers. To the reader who is unfamiliar with this breed of futurist
thinking, I recommend the following works as prerequisites for the present
discussion:
· Ray Kurzweil’s book The Singularity is Near[4], and his earlier work The Age of
Spiritual Machines[5]
· Damien Broderick’s book The Spike[6]
· followed by a study of Eliezer Yudkowsky’s[7] and John Smart’s[8] more radical ideas.
However, the points
I’ll discuss here don’t necessarily require a Singularity as defined by these
thinkers; they merely require something weaker that – borrowing a word[9] from Damien Broderick’s novel of that
name,[10] and from some of John Smart’s writings
-- I call a Transcension. A Singularity
is a particular kind of Transcension, but not the only kind.
The basic idea of the
Singularity is that, at some point, the advance of technology will become (from
a human perspective) essentially infinitely rapid, thus bringing a fundamental
change in the nature of life and mind.
A key aspect of the Singularity concept is technological
acceleration. Historical analysis
suggests that the rate of technological increase is itself increasing – new
developments come faster and faster all the time. At some point this increase will come so fast that we don’t even
have time to understand how to use the N’th radical new development before the
(N+1)’th radical new development has come.
Eventually technological progress will lead to the creation of powerful
AI’s, and these AI’s, rather than humans, will be carrying out the bulk of
technology development – thus allowing new innovations to emerge at superhuman
pace. At this point, when dramatic new
technologies and new ways of thinking develop daily or hourly, so fast that
humans literally can’t keep up, the technological Singularity will be upon us.
Another aspect of the
Singularity idea is psychological: the Singularity is envisioned as a radical
transition in the nature of experience, not just technology.
When civilization and
language and rational thought emerged, the nature of human experience changed
radically. Or, to put it another way,
the “human experience” as we now know it emerged from the experience of
proto-human animals.
But there is no good
reason to believe that the emergence of the modern human mind is the end state
of the evolution of psyche. Indeed, the
rub is this: While evolution might take millions of years to generate another
psychological sea change as dramatic as the emergence of modern humanity,
technology may do the job much more expediently. The technological Singularity can be expected to induce rapid and
dramatic change in the nature of life, mind and experience.
That’s Singularity;
what about Transcension? The basic idea
of the Transcension is that at some point, the advance of technology will bring
about a fundamental change in the nature of life and mind. The difference is that a Transcension can
occur even if there is no exponential or superexponential growth in
technology. It could occur, eventually,
even with a linear or logarithmic advance in technology. In fact, I think that a Singularity scenario
is extremely likely; but the points I’m going to make here are mostly valid for
any Transcension, no matter how fast it occurs. Perhaps the biggest difference between the Transcension and
Singularity concepts is that, if the Singularity idea is correct, then the
Singularity is near and we’d better start worrying about it fast;
whereas if a Transcension is going to occur 10,000 years from now, there’s no
particular need for us to fuss about it at the moment.
The term “Singularity”
tends to place an emphasis on the rapidity of change that is induced by
exponentially or hyperexponentially accelerating advances in technology. And indeed, the suddenness or otherwise of
the coming change is a very important practical point. However, the technologies involved –
exciting as they are -- should be viewed mainly as enablers. The key point is that we may soon be
experiencing a profound change in the “order of being”: the way we experience the
world, the way we human animals live life and conduct social affairs, is not
the end state of mind-in-the-universe, but only an intermediate state on the
way to something else. And the
Transcension to this “something else” may well occur sooner rather than
later.
But
what is this something else? This is
where things get interesting. One might
contend that, even if we are on the verge of something far beyond our current
ways of thinking, living and experiencing, our limited and old-fashioned human
brains really don’t stand much chance of envisioning this new order of things
in any detail. On the other hand, it
seems, it would be foolish to not even try.
In fact, it seems
quite possible that actions we take now may play a major role in shaping the
nature of this nebulous-state-to-come, this post-Transcension, post-human order
of being. One of the (many) great unknown
questions of the Transcension is: how much effect does the way in which the
Transcension is reached, have on the nature of mind and reality
afterwards? There are many
possibilities, e.g.
1. there are many qualitatively different post-Transcension states, and our choices now
impact which path is taken
2. no matter what we do now, mind and reality will settle into the same basic
post-Transcension attractor
3. a human-achieved Transcension will merely serve to project humans into a domain
of being already occupied by plenty of other minds. The specifics of how humans approach the
Transcension are not going to have any significant impact on this
already-existent domain.
At this point, I
have no idea how to assess the probabilities of these various options.
In the latter two
options, the only ethical question is whether the post-Transcension
state-of-being will be better than the states that would likely exist without a
Transcension. If yes, then we should
work to bring about the Transcension – and once this is done, reality will take
its course. If no, then we should work
to avoid Transcension.
In the first
option, the ethical choices are trickier, because some plausible
post-Transcension states may be better than the states that would likely exist
without a Transcension, whereas others may be worse. We then have to choose not only whether to seek or avoid
Transcension, but whether to seek or avoid particular kinds of
Transcension. In this case, it’s
meaningful to analyze what we can do now to increase the probability of a
positive Transcension outcome.
Of course,
serious discussion of any of these options can’t begin until we define
what a “positive” Transcension outcome really means.
The following
sections of the essay deal mainly with two obvious issues that come out of the
above train of thought:
· What is a “positive outcome”? That is, what
is an appropriate ethical or meta-ethical standard by which to judge the
positivity or otherwise of a hypothetical post-Transcension scenario? A number of alternative, closely related
approaches are presented here, mostly centered around an abstract notion I call
the Principle of Voluntary Joyous Growth
· In the case that Option 1 above holds, how can we encourage a positive
outcome? Here my focus is on
artificial general intelligence technology, which I believe will be the primary
driver behind the Transcension (because it will be making the other
inventions). I will argue that, in
addition to teaching AGI’s ethical behavior, it is important to embody ethical
principles in the very cognitive architecture of one’s AGI systems. (Specific ideas in this direction will be
presented, and discussed in the context of the Novamente AI system.)
What is
a good Transcension? Some people would
say that the only good Transcension is a non-Transcension. These people think that using technology to
radically alter the nature of mind and being is a violation of the natural
order of things. But even among radical
techno-futurists and others who believe that Transcension, in principle, may be
a good thing, there is nothing close to agreement on what it means for a
post-Transcension world to be a “good” one.
For
Eliezer Yudkowsky, the preservation of “humaneness”[11] is of primary importance. He goes even further than most Singularity
believers, asserting that the most likely path is a “hard takeoff” in which a
self-modifying AI program moves from near-human to superhuman intelligence
within hours or minutes – instant Singularity! With this in mind, he prioritizes the creation of “Friendly
AI’s” – artificial intelligence programs with “normative altruism” (related to “humaneness”)
as a prominent feature of their internal “shaper networks” (a “shaper network” being
a network of “causal nodes” inside an AI system, used to help produce that AI
system’s “supergoals”). He discusses
extensively strategies one may take to design and teach AI’s that are Friendly
in this sense. The creation of Friendly
AI, he proposes, is the path most likely to lead to a humane post-Singularity
world.[12]
On the
other hand, Ray Kurzweil seems to downplay the radical nature of the
Singularity – leading up to, but not quite drawing, the conclusion that the
nature of mind and being will be totally altered by the advent of technologies
like AGI and MNT. At times he seems to
think of the post-Singularity world as being a lot like our current world, but
with funkier technology around; with AI minds to talk to and the absence of
pesky problems like death, disease, poverty and madness. And clearly he sees this vision as a good
one; he’s quite concerned to encourage ordinary non-techno-futurist people not
to be afraid of the beckoning changes.
Damien
Broderick’s novel Transcension presents a more ethically nuanced
perspective. In his envisioned future,
a superhuman AI rules over an Earth containing several different subregions,
including
(When I
read the book I for some reason assumed these humans were probably uploads
unknowingly living on a simulated Earth; but when I showed Broderick an earlier
version of this essay that mentioned this impression, he pointed out to me that
the book clearly states the people are real bodies on the real Earth. I guess I have a serious case of
simulation-on-the-brain!) Anyhow, at
the end of the novel the Transcension occurs – an event in which the ruling
superhuman AI mind decides that maintaining human lives isn’t consistent with
its other goals. It wants to move on to
a different order of being, and in preparation it uploads all humans from Earth
into digital form, so it can more easily guarantee their safety and help with
their development. (“Transcension” in
the sense that I’m using it in this essay is a bit broader than the event in
Broderick’s novel; in my terminology, his Transcension event is part of the
overall Transcension in his fictional universe.)
Not all
techno-futurists are as concerned with the future of human life or humaneness. For example, the poster Metaqualia,
in a series of emails on Yudkowsky’s SL4 email list[13], has argued for alternate positions,
such as:
Clearly,
given that we humans can’t agree on what’s good and valuable in the current
human realm of life, it would be foolish to expect us to agree on what’s good
and valuable in the post-Transcension world.
But nevertheless, it seems it would be equally foolish to ignore the
issue completely. It seems important to
ask: What are the values that we would like to see guide the development of the
universe post-Transcension?
This
poses a challenge in terms of ethical theory, because for a value-system to
apply beyond the scope of human mind and society, it has to be very abstract
indeed – and yet there’s no use in a value-system so abstract that it doesn’t
actually say anything. Thinking about
the post-Transcension universe pushes one to develop ethical value-systems that
are both extremely general and reasonably clear.
There
may be many different value-systems of this nature; here I will discuss several
of them, and their interrelationship:
Each of
these is a very general, abstract ethical principle.[14] Specific ethical systems may come to
exist, but the quality of an ethical system must be judged relative to the
ethical principle it reflects. I will
return to this point later.
I note
again that I am only considering value-systems that are
Transcension-friendly. Of course there
are many other value-systems out there in the world today, and most of them
would argue that the Transcension as I conceive it is ethically wrong. These value-systems are interesting to
discuss from a psychological and cultural perspective, but they are not my
concern in this essay.
2.1 Ethics, Rationality and Attractors
It is important
to clearly understand the relationship between ethical principles and
rationality. Once one has decided upon
an ethical principle, one can use rationality to assess specific ethical
systems as to how well they support the ethical principle. Below I will present two meta-ethical
principles –
But one
can’t choose a meta-ethical principle based on rationality alone either. Ultimately the selection and valuation
process must bottom out in some kind of nonrational thought.
Reason
is about drawing conclusions from premises using appropriate rules, whereas at
the most abstract level, ethics is about what premises to begin with. We can push this decision back further and
further – reasoning about ethical rules based on ethical systems, and reasoning
about ethical systems based on ethical principles – but ultimately we must
stop, and acknowledge that we need to make a nonrational choice of premises. I have chosen this stopping-point at the
level of “abstract ethical principles” like the ones listed above.
Hume
isolated this nonrational bottoming-out in “human nature,” the human version of
“animal instinct.” Buddhist thought, on the other hand, associates it with the “higher self,” and the individual
self’s recognition of its interpenetration with the rest of the universe and
its ultimate nonexistence. My own view
is that Buddhism and Hume are both partly right – but that neither has gotten
at the essence of the matter. Hume is
right that our hard-wired instincts certainly play a large role in such
high-level, nonrational choices. And
Buddhism is right that subtle patterns connecting the individual with the rest
of the universe play a role here.
The crux
of the matter, I believe, lies in the dynamical-systems-theory notion of an
attractor. An attractor is a pattern
that tends to arise in a dynamical system, from a wide variety of different
preliminary conditions. A strict
mathematical attractor must persist forever once entered into; but one may also
speak of “probabilistic attractors” that are merely very likely to persist, or
that may mutate slightly and gradually over time, etc.[15]
I think that part of “human nature” consists of peculiarities of the
human mind/brain, whereas part of it consists of generic attractors that have
appeared in the human psyche – or as emergents among human minds or between
human minds and their environments --
because they generally tend to pop up in a lot of complex systems in a lot of
circumstances.
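As a toy illustration of the attractor notion (a minimal sketch, not anything taken from the dynamical-systems literature cited here), consider the simplest possible dynamical system: a single number that is repeatedly replaced by its cosine. The map, the starting values and the step count below are all assumptions chosen purely for illustration. Whatever the starting value, the system settles into the same state, which is the sense in which an attractor arises "from a wide variety of different preliminary conditions."

```python
import math

def iterate_cosine(x0, steps=200):
    """Repeatedly apply x -> cos(x), starting from x0, and return the final state."""
    x = x0
    for _ in range(steps):
        x = math.cos(x)
    return x

# Hypothetical, widely scattered initial conditions.
starts = [-100.0, -1.0, 0.0, 2.5, 1e6]
finals = [iterate_cosine(x0) for x0 in starts]

for x0, xf in zip(starts, finals):
    print(f"start {x0:>10} -> settles near {xf:.6f}")
```

Every run ends at the same fixed point of the cosine map (roughly 0.739), regardless of where it began. This is a strict attractor in the mathematical sense; the "probabilistic attractors" mentioned above would need a noisy or slowly changing map instead.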
One
reason why some meta-ethics appear more convincing than others, then, is that
these meta-ethics appear to be “universal attractors,”
i.e. principles that arise as patterns in many different complex systems in
many different situations. This doesn’t
mean that they’re logically correct in the sense of following from some a
priori assumption regarding what is good.
Rather it means that, in a sense, they follow from the universe. I will return to this point a little
later.
Of
course, we are still left with a selection problem, because there may be
different universal attractors that contradict each other. Does the more powerful universal attractor
win, or is this just a matter of chance, or context-dependent chance, or subtle
factors we paltry humans can’t understand?
I’ll leave off here and turn to slightly more concrete issues!
Firstly,
Cosmic Hedonism refers to the ethical system that values happiness above
all. In this perspective, our goal for
the post-Transcension universe should be to maximize the total amount of
happiness in the cosmos. Of course, the
definition of “happiness” poses a serious problem, but if one agrees that
Cosmic Hedonism is the right approach, one can impose the understanding of
happiness as part of the goal for the post-Transcension period. The goal becomes to understand what
happiness is, and then maximize it.
However, even if
one had a crisp and final definition of happiness, there would be a problem
with Cosmic Hedonism – a problem that I’ve come to informally refer to as the
problem of the “universal orgasm.” The question
is whether we really want a universe that consists of a single massive wave of
universal orgasmic joy. Perhaps we do
all want this, in a sense – but what if this means that mind, intelligence,
life, humanity and everything else we know becomes utterly nonexistent?
The
ethical maxim that I call the Principle of Joyous Growth attempts to circumvent
this problem, by adding an additional criterion:
What
does “growth” mean? A very general
interpretation is: Increase in the amount and complexity of patterns in the
universe. The Principle of Joyous
Growth rules out the universal orgasm outcome unless it involves a continually
increasing amount of pattern in the universe.
It rules out a constant, ecstatically happy orgasmic scream.
Of
course, maximizing two quantities at once is not always possible, and in
practice one must maximize some weighted average of the two. Different weightings of happiness versus
growth will lead to different practical outcomes, all lying within the general
purview of the conceptual Principle of Joyous Growth.
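The effect of the weighting can be made concrete with a small sketch. Everything here is hypothetical: the three candidate futures, their joy and growth scores on a 0-to-1 scale, and the choice of a simple linear weighted average are assumptions made only to show how the same principle yields different preferred outcomes under different weightings.

```python
# Hypothetical candidate futures, each scored for joy and growth on a 0..1 scale.
CANDIDATES = {
    "mostly_joy":    {"joy": 0.9, "growth": 0.4},
    "balanced":      {"joy": 0.7, "growth": 0.7},
    "mostly_growth": {"joy": 0.4, "growth": 0.9},
}

def score(outcome, joy_weight):
    """Weighted average of joy and growth; joy_weight lies between 0 and 1."""
    return joy_weight * outcome["joy"] + (1.0 - joy_weight) * outcome["growth"]

def preferred(joy_weight):
    """Name of the candidate future with the highest weighted score."""
    return max(CANDIDATES, key=lambda name: score(CANDIDATES[name], joy_weight))

for w in (0.9, 0.5, 0.1):
    print(f"joy weight {w}: prefer {preferred(w)}")
```

With a heavy weight on joy the first future is preferred, with equal weights the balanced one, and with a heavy weight on growth the third. The principle alone does not pick the weighting.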
The
Joyous Growth principle, without further qualification, is definitely not
Friendly in the Yudkowskian sense. In
fact it is definitively un-Friendly, in the sense that we humans are far from
maximally happy -- and in this as well as other ways, we are basically begging
to be transcended. A
post-Transcension universe operating according to the Principle of Joyous
Growth would not be all that likely to involve the continuation of the human
race.
An
alternative is to add a third criterion, obtaining a Principle of Voluntary
Joyous Growth, i.e.
This
means adopting as an important value the idea that sentient beings should be
allowed to choose their own destiny.
For example, they should be allowed to choose unhappiness or stagnation
over happiness and growth.
Of
course, the notion of “choice” is just as much a can of worms as
“happiness.” Daniel Dennett’s recent
book Freedom Evolves[16] does an excellent job of sorting through
the various issues involved with choice and freedom of will. While I don’t accept Dennett’s reductionist
view of consciousness, I find his treatment of free will generally very clear
and convincing.
Note
that including choice as a variable along with two others implies that ensuring
free choice for all beings is not an absolute commandment. Indeed, given the extent to which human
wills conflict with each other, free choice for all beings is not always
possible. Given a case where one
being’s will conflicts with another being’s will, the Voluntary Joyous Growth
approach is to side with the being whose choice will lead to greater universal
happiness and growth.
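The conflict rule just described can be sketched as a small procedure. This is only an illustration under loud assumptions: the beings, the contested resources and the numeric joy and growth estimates are invented, and adding the two estimates with equal weight is one arbitrary way of comparing choices, not something the principle itself specifies.

```python
from dataclasses import dataclass

@dataclass
class Choice:
    being: str      # who is making the choice
    resource: str   # what the choice lays claim to
    joy: float      # hypothetical estimate of resulting universal happiness
    growth: float   # hypothetical estimate of resulting universal growth

def resolve(choices):
    """Grant every uncontested choice; where two choices claim the same
    resource, side with the one whose estimated joy + growth is greater."""
    winners = {}
    for c in choices:
        current = winners.get(c.resource)
        if current is None or c.joy + c.growth > current.joy + current.growth:
            winners[c.resource] = c
    return {resource: c.being for resource, c in winners.items()}

choices = [
    Choice("Ann", "meadow", joy=0.6, growth=0.2),
    Choice("Bob", "meadow", joy=0.5, growth=0.6),
    Choice("Cy", "library", joy=0.1, growth=0.1),
]
granted = resolve(choices)
print(granted)
```

Note that Cy's choice is granted even though it scores poorly on joy and growth: those criteria are consulted only where wills actually conflict, as with Ann and Bob.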
Voluntary
Joyous Growth is not a simple goal, because it involves three different factors
which may contradict each other, and
which therefore need to be weighted and moderated. This complexity may be seen as unfortunate – or it may be seen as
making the ethical principle into a more subtle, intricate and fascinating
attractor of the universe.
2.3 Attractive Compassion
I
should note that my goal in positing “Voluntary Joyous Growth” has been to
articulate a minimal set of ethical principles. These are certainly not the only qualities that I consider
important. For example, I strongly
considered including Compassion as an ethical principle, since Compassion is,
in a sense, the root of all ethics.
However, it occurred to me that Compassion is actually a consequence of
joy, growth and choice. In a
universe consisting of beings that respect the free choices of other beings,
and that want to promote joy and growth throughout the universe, compassion for
other beings is inevitable – because “being good to others” is generally an
effective way to induce these others to contribute toward the joy and growth of
the universe. Without the inclusion of
choice, Joyous Growth is consistent with simply (painlessly) annihilating
unhappy or insufficiently productive minds and replacing them with “better”
ones; but assigning a value to choice gives a disincentive to dissolve “bad”
minds and leads instead to the urge to help these minds grow and be joyful.
This
ties in with the notion that compassion itself is a “universal attractor.” Or, a more accurate statement is: A
modest level of compassion is a universal attractor. We can see this in the fact that it ensues
from the combination of the universal attractors of Joy, Growth and Choice; and
we can also see it in the evolution of human society. Most likely, compassion emerged in human beings because, in a
small tribe setting, it is often valuable for each individual to be kind to the
other individuals in the tribe, so as to keep them alive and healthy. This is the case regardless of whether the
tribe members are genetically related to each other; it’s the case purely
because, in many situations, the survival probability of an individual is
greater if
So if
humanity is divided up into tribes – because individual humans can survive
better in groups than all alone – then compassion toward tribe members
increases individual fitness.
Compassion emerges spontaneously via natural selection, in situations
where there is a group of minds which each has choice, and which (via growth)
have the complexity to cooperate to some extent.
Note
that absolute compassion doesn’t emerge from this tribal-evolutionary logic,
but a moderate level of compassion does.
Similarly, absolute compassion doesn’t come out of Voluntary Joyous
Growth – but it seems that a moderate level of compassion does. It seems more likely that “moderate
compassion” is a universal attractor than that “absolute compassion” a la
Buddha or Mother Teresa is.
Interestingly,
it’s harder to see how compassion would evolve among humans living in a large-group
society like modern America. In this
case, there’s not such a direct incentive for an individual to be kind to
others. It may be that a population of
rational-actor minds plunked into a large society would never evolve compassion
to any significant degree. However, I
suspect that without compassion, society would collapse into anarchy – and
anarchy would give way to a tribal society … in which compassion would evolve,
showing the power of compassion as an attractor once again!
2.4 Nostalgia
Philip Sutton, on
reviewing an earlier version of this essay, pointed out that I had omitted a
value that is very important to him: sustenance and preservation of what
already exists. On reading his
comments, I reflected that I had made this omission because, in fact, this
value – which I’ll call Nostalgia – is not all that important to me personally.
I
myself am somewhat attached to many things that exist -- such as my family and
friends, my self, my pets, Jimi Hendrix CD’s, Haruki Murakami novels and pinon
pine trees and Saturday mornings in bed and long whacky email conversations, to
name just a few – but I don’t consider this kind of attachment a primary
value. I think it’s important that I,
as a sentient being, have a choice to retain these things if they are important
to me and contribute toward my self-perceived happiness. But I don’t see an intrinsic value in
maintaining the past – whereas I do see an intrinsic value in growth and
development.
However,
I don’t see Nostalgia as a destructive or unpleasant value, nor do I see it
as contradictory with growth, joy or choice.
The universe is a big place – and quite likely, many parts of it are not
terribly important to any sentient being.
It may well be possible to preserve the most important patterns that
currently exist in the universe, and still use the remainder of the universe to
create wonderful new patterns. The
values of Growth and Nostalgia only contradict each other in a universe that is
“full,” in the sense that every piece of mass-energy is part of some pattern
that is nostalgically important to some sentient being. In a full universe one must make a choice,
in which case I’ll advocate Growth … but it’s not clear whether such a thing as
a full universe will ever exist. It may
be that the process of growth will continue to open up ever more horizons for
expansion.
2.5 Smigrodzki’s Meta-Ethic
An
alternative approach, proposed by Rafal Smigrodzki in a discussion on the SL4
list, is to begin with an even more abstract sort of meta-ethic. Abstract though it is, the Principle of
Voluntary Joyous Growth still imposes some specific ethical standards. On the other hand, Smigrodzki proposes a
pure meta-ethic with no concrete content:
Find rules that will be accepted.
This principle
arose in a discussion of the analogy between ethics and science, and
specifically as an analogue to Karl Popper’s meta-rule for the scientific
enterprise[17]:
Find conjectures that have more empirical content than their predecessors.
Popper’s
meta-rule specifies nothing about the particular contents of any scientific
theory or scientific research programme; it speaks only of what kinds of
theories are to be considered scientific.
Similarly, Smigrodzki’s meta-rule specifies nothing about what kinds of
actions are to be considered ethical; it speaks only of what kinds of
rule-systems are to be considered as falling into the class of “ethical
rule-systems”: namely, rule-systems that are accepted.
One
interesting thing about Smigrodzki’s meta-rule is how close it comes to the
Principle of Voluntary Joyous Growth.
To see this, consider first that the notion of “be accepted” assumes the
existence of volitional minds that are able to accept or reject rules. So to find rules that will be accepted, it’s
necessary to first find (or ensure the continued existence of) a community of
volitional minds able to accept rules.
Next,
observe that one version of the nebulous notion of “happiness” is “the
state that a volitional mind is in when it gets to determine enough of its
destiny by its own free choice.” This
is almost an immediate consequence of the notions of happiness and
choice. For, if happiness is what a
mind wants, and a mind has enough ability to determine its destiny via free
choice, then naturally the mind is going to make choices maximizing its
happiness.
So, “Find rules
that will be accepted” is arguably just about equivalent to “Create or maintain
a community of volitional minds, and find rules that this community will
accept (thus making the community happy).”
But
then we run up against the problem that not all minds really know what will
make them happy. Often minds will
accept rules that aren’t really good for them – even by their own standards –
out of ignorance, stupidity or self-delusion.
To avoid this, one wants the minds to be as smart, knowledgeable and
self-aware as possible. So one winds up
with a maxim such as: “Create or maintain a community of volitional minds, with
an increasing level of knowledge, intelligence and self-awareness, and find
rules that this community will accept (thus making the community happy).”
Incidentally,
Popper’s meta-rule of science is also susceptible to the “stupidity and
self-delusion” clause. In other words,
“Find conjectures that have more empirical content than their predecessors”
really means “Find conjectures that seem to a particular community of
scientists to have more empirical content than their predecessors” – and the
meaningfulness of this really depends on how smart and self-aware the community
of scientists is. The history of
science is full of apparent mistakes in the assessment of “degrees of empirical
content.” So Popper’s meta-rule could
be revised to read “Find conjectures that have more empirical content than
their predecessors, as judged by a community of minds with increasing
intelligence and self-awareness.”
The
notion of “increasing level of knowledge” can also be refined
somewhat. What is knowledge, after
all? One way to gauge knowledge is
using the philosophy of science.
Lakatos’s theory of research programmes[18] suggests that a scientific research
programme – a body of scientific theories – is “progressive” (i.e. good) if it
meets a number of criteria, including
One
interpretation of “increasing level of knowledge” is “association with a series
of progressive scientific research programmes.”
Once
all these details are put in place, my fleshing-out of Smigrodzki’s meta-rule
(which may well make it fleshier than Smigrodzki would desire) becomes an awful
lot like the Principle of Voluntary Joyous Growth. We have happiness, we have choice, and we have growth (in the
form of growth of intelligence, knowledge and self-awareness). The only real difference from the earlier
formulation of the Principle of Voluntary Joyous Growth is the nature of the
growth involved: is it in the universe at large, or within the minds in a
community that is accepting ethical rules?
2.6 Joyous Growth Biased Voluntarism
Voluntary Joyous Growth, obviously, has a
different relationship to Friendliness than pure Joyous Growth. Voluntary Joyous Growth means that, even if
superhuman AI’s determine that joy and growth would be maximized if the mass-energy
devoted to humans were deployed in some other way – even so, the choices of
individual humans (whether to remain human or let their mass-energy be deployed
in some other way) will still be respected and figured into the
equation.
One could
try to make Voluntary Joyous Growth more explicitly human-friendly by making
choice the primary criterion. This is
basically what’s achieved by my fleshed-out version of Smigrodzki’s
meta-rule. In this version, the #1
ethical meta-principle is to let volitional minds have their choices wherever
possible. Only when conflicts arise do
the other principles – maximize joy and growth – come into play. This might be called “Joyous Growth Biased
Voluntarism.” Joy and growth still may
play a very big role here because, quite obviously, conflicts may arise
frequently between volitional minds coupled in a finite universe. However, one can envision scenarios in which
all inter-mind conflicts are removed, so that it’s possible to fulfill choices
without considering joy and growth at all.
For
instance, what if all the minds in the universe decide they all want to play
video games and live in purely automated simulated worlds rather than worlds
occupied with other minds? Then living
in individual video-game-worlds of their choice may gratify them quite
adequately: so they have maximum choice, but no opportunity for any factor
besides choice to come into play. In
this case minds may, consistently with Joyous Growth Biased Voluntarism, make
themselves unhappy and refuse opportunities for growth unto eternity. In my personal judgment, this is a mark
against Joyous Growth Biased Voluntarism and in favor of simple Voluntary Joyous
Growth with its greater flexibility. I
suspect that Voluntary Joyous Growth is much closer to being a powerful
attractor in the universe.
2.7 Human Preservationism
A more
extreme ethical principle, in the vein of Joyous Growth Biased Voluntarism, is
what I call Human Preservationism. In
this view, the preservation of the human race through the post-Transcension
period is paramount. Where this differs
from Joyous Growth Biased Voluntarism is that, according to Human
Preservationism, even if all humans want to become transhuman and leave human
existence behind, they shouldn’t be allowed to.
In
fact, I don’t know of any serious transhumanist thinkers who hold this
perspective. While many transhumanists
value humanity and some personally hope that traditional human culture persists
through the Transcension, transhumanists tend to be a freedom-centered bunch,
and few would agree with the notion of forcing sentient beings to remain
human against their will. But even so,
Human Preservationism is a perfectly consistent philosophy of
Transcension. There’s nothing
inconsistent about wanting vastly superhuman minds and new orders of beings to
come into existence, yet still placing an absolute premium on the persistence
of the peculiarly human.
A
(somewhat more appealing) variation on Human Preservationism is Cautious
Developmentalism, a perspective I will discuss a little later on. The abstract principle here is: If things
are basically good, keep them that way, and explore changes only very slowly
and cautiously. In practical terms,
the idea here is to preserve human life basically as-is, but to allow very slow
and careful research into Transcension technologies, in such a way as to
minimize any risk of either a bad Transcension or another bad existential
outcome. Then the choice of how to
approach the Transcension is deferred to future generations, and the problem
for the present generation is redefined as figuring out how to set the Cautious
Developmentalist course in motion.
2.8 Humaneness
Yudkowsky
has proposed that “The important thing is not to be human but to be humane.” Enlarging on this point, he argues [19]:
Though we might wish to believe that Hitler was an inhuman monster, he was,
in fact, a human monster; and Gandhi is noted not for being remarkably human but
for being remarkably humane. The attributes of our species are not exempt from
ethical examination in virtue of being "natural" or
"human". Some human attributes, such as empathy and a sense of
fairness, are positive; others, such as a tendency toward tribalism or
groupishness, have left deep scars on human history. If there is value in being
human, it comes, not from being "normal" or "natural", but
from having within us the raw material for humaneness: compassion, a sense of
humor, curiosity, the wish to be a better person. Trying to preserve
"humanness", rather than cultivating humaneness, would idolize the
bad along with the good. One might say that if "human" is what we
are, then "humane" is what we, as humans, wish we were. Human nature
is not a bad place to start that journey, but we can't fulfill that potential
if we reject any progress past the starting point.
In email comments on an earlier draft of this paper, Yudkowsky noted that he
felt my summary of his theory didn’t properly do it justice. Conversations following these comments have
improved my understanding of his thinking; but even so, I’m not certain I fully
“get” his ideas. So, rather than
explicitly commenting on Eliezer’s Friendly AI theory here, I will introduce a
theory called “Humane AI,” which I believe is somewhat similar to his approach,
but may also have some differences. I
will present some arguments describing difficulties with Humane AI, which may
not all be problems with Friendly AI, in the sense that there may be solutions
to these problems within Friendly AI theory that I don’t fully understand.
In Humane
AI, one posits as a goal, not simply the development of AI’s that are
benevolent to humans, but the development of AI’s that display the qualities of
“humaneness,” where “humaneness” is considered roughly according to Yudkowsky’s
description above. That is, one proposes “humaneness” as a kind
of ethical principle, where the principle is: “Accept an ethical system to the
extent that it agrees with the body of patterns known as ‘humaneness’.”
Now, it’s
not entirely clear that “humaneness,” in the sense that Yudkowsky proposes, is a well-defined concept. It
could be that the specific set of properties called "humaneness" that you
get depends on the specific algorithm that you use to sum together the wishes of
various individuals in the world. If
so, then one faces the problem of choosing among the different algorithms. This is a question for a future, more
scientific study of human ethics.
The major problem in distinguishing “humaneness”
from “human-ness” is separating the "positive" from the
"negative" aspects of human nature -- e.g. compassion (viewed as
positive) versus tribalism (viewed as negative). The approach hinted at in
the above Yudkowsky quote is to use a kind of “consensus” process. For instance, one hopes that most people, on
careful consideration and discussion, will agree that tribalism, although
humanly universal, isn't good. One defines the extent to which a given
ethical system is humane as the average extent to which a human, after careful
consideration and discussion, will consider that ethical system as a good one. Of course, one runs into serious issues with
cultural and individual relativity here.
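To make the dependence on the summation algorithm concrete, here is a minimal sketch. It assumes, purely for illustration, that each person's considered judgment of an ethical system can be written as a number between 0 and 1; the two systems, the five ratings and the function names are all invented. Under the averaging definition given above, one system counts as more humane; if the median judgment is used instead, the other one does.

```python
from statistics import mean, median

# Hypothetical ratings in [0, 1]: how good each of five people judges an
# ethical system to be after careful consideration. Invented for illustration.
ratings = {
    "system_A": [0.9, 0.9, 0.8, 0.1, 0.1],  # strongly liked by some, rejected by others
    "system_B": [0.6, 0.6, 0.6, 0.6, 0.5],  # mildly acceptable to everyone
}

def humaneness(scores, aggregate=mean):
    """Combine individual judgments into a single humaneness score."""
    return aggregate(scores)

def most_humane(all_ratings, aggregate):
    """Return the name of the system with the highest aggregated score."""
    return max(all_ratings, key=lambda name: humaneness(all_ratings[name], aggregate))

by_mean = most_humane(ratings, mean)      # averaging, as in the definition above
by_median = most_humane(ratings, median)  # one alternative summation algorithm

print("Most humane by mean:  ", by_mean)
print("Most humane by median:", by_median)
```

With these made-up numbers the mean favors system_B and the median favors system_A, so the ranking of ethical systems by "humaneness" is not fixed until a summation algorithm has been chosen.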
Personally, I'm not so confident that
people's "wishes regarding what they were" are generally good ones. (Which
is another way of saying: I think my own ethic differs considerably from the
mean of humanity's.) For instance, the
vast majority of humans would seem to believe that "Belief in God" is
a good and important aspect of human nature.
Thus, it seems to me, "Belief in God" should be considered
humane according to the above definition -- it's part of what we humans are,
AND, part of what we humans wish we were.
But nevertheless, I think that belief in God -- though it has some
valuable spiritual intuitions at its core – is essentially ethically
undesirable. Nearly all ethical systems
containing this belief have had overwhelming negative aspects, in my view. Thus, I consider it my ethical responsibility
to work so that belief in God is not projected beyond the human race
into any AGI's we may create. Unless
(and I really doubt it) it's shown that the only way to achieve other valuable
things is to create an AGI that contains such a belief system. Of course, there are many other examples
besides "belief in God" that could be used to illustrate this point.
To get around
problems like this, one could try to define humaneness as something like
"What humans WOULD wish they were, if they were wiser humans" -- but of
course, defining “wiser humans” in this context requires some ethical or meta-ethical
standard beyond what humans are or wish they were.
So, in
sum, the difficulties with Humane AI are
The
second point here may seem bizarrely egomaniacal – who am I to judge the vast
mass of humanity as being ethically wrong on major points? And yet, it has to be observed that the vast
mass of humanity has shifted its ethical beliefs many times over history. At many points in history, the vast mass of
humans believed slavery was ethical, for instance. Now, you could argue that if they’d had enough information, and
carried out enough discussion and deliberation, they might have decided it was
bad. Perhaps this is the case. But to
lead the human race through a process of discussion, deliberation and discovery
adequate to free it from its collective delusions – this is a very large
task. I see no evidence that any
existing political institution is up to this task. Perhaps an AGI could carry out this process – but then what is
the goal system of this AGI? Do we
begin this goal system with the current ethical systems of the human race – as Yudkowsky
seems to suggest in the quote above (“Human nature is not a bad place to start…”)? In that case, does the AGI begin by
believing in God and reincarnation, which are beliefs of the vast majority of
humans? Or does the AGI begin with some
other guiding principle, such as Voluntary Joyous Growth? My hypothesis is that an AGI beginning with
Voluntary Joyous Growth as a guiding principle is more likely to help humanity
along a path of increasing wisdom and humane-ness than an AGI beginning with
current human nature as a guiding principle.
One can
posit, as a goal, the creation of a Humane AI that embodies humane-ness as
discovered by humanity via interaction with an appropriately guided AGI. However, I’m not sure what this adds, beyond
what one gets from creating an AGI that follows the principle of Voluntary
Joyous Growth and leaving it to interact with humanity. If the creation of the Humane AI is going to
make humans happier, and going to help humans to grow, and going to be
something that humans choose, then the Voluntary Joyous Growth based AGI
is going to choose it anyway. On the
other hand, maybe after humans become wiser, they’ll realize that the creation
of an AGI embodying the average of human wishes is not such a great goal
anyway. As an alternative, perhaps a
host of different AGI’s will be created, embodying different aspects of human
nature and humane-ness, and allowed to evolve radically in different
directions.
2.9 Ethical Principles, Systems and Rules
My
discussion of ethics has lived on a very abstract level so far -- and this has
been intentional. I have sought to
treat ethics in a manner similar to the philosophy of science. In science we have Popper’s meta-rule, and
then we have scientific research programmes, which may be evaluated
heuristically as to how well they fulfill Popper’s meta-rule: how good are they
at being science? Then, within each
research programme, we have a host of specific scientific theories and
conjectures, none of which can be evaluated or compared outside the context of
the research programmes in which they live.
Similarly, in the domain of ethics, we have highly abstract principles
like Smigrodzki’s meta-rule or the Principle of Voluntary Joyous Growth – and
then, within these, we may have particular ethical rule-systems, which in turn
generate specific rules for dealing with specific situations.
My feeling is
that the specific ethical rule-systems that promote a given abstract principle
in a human context are very unlikely to survive the Transcension. For instance, the standard ethics according
to which modern Americans live involves a host of subtle compromises, involving
such issues as
and so
on and so forth. This complex system of
compromises that constitutes our modern American practical ethics is not in
itself a powerful attractor. It is
largely in accordance with the Principle of Voluntary Joyous Growth – it tries
to promote happiness, progress and choice – but I have no doubt that,
Transcension or no, in a couple hundred years a rather different network of
compromises will be in place. And
post-Transcension, the practical manifestations of the Principle of Voluntary
Joyous Growth will be very radically different.
(As an
aside, it is clearly no coincidence that the Principle of Voluntary Joyous
Growth harmonizes better with modern urban American ethics than with the ethics
of many other contemporary cultures.
More so than the cultures of, say, Arabia or China or the Mbuti pygmies, American culture
is focused on individual choice, progress and hedonism. And so, I’m aware that as a modern American
writing about Voluntary Joyous Growth, I’m projecting the nature of my own
particular culture onto the transhuman future.
On the other hand, it’s not a coincidence that America and relatively
culturally similar places are the ones doing most of the work leading toward
the Transcension. Perhaps it is
sensible that the cultures most directly leading to the Transcension should
have the most post-Transcension-friendly philosophies.)
One
thing this system-theoretic perspective says is: We can’t judge the modern
American ethical system by any one judgment it makes – we can only judge, as a whole,
whether it tends to move in accordance with the principle we choose as a
standard (e.g. Voluntary Joyous Growth).
And similarly, we can’t reasonably ask post-Transcension minds to follow
any particular judgment about any particular situation – rather, we can only
ask them to follow some ethical system that tends to move in accordance with
some general principle we pose as a standard.
(And this request is more likely to be fulfilled, a priori, if the principle
constitutes a powerful attractor in the universe at large.)
Thus, I
suggest that “Be nice to humans” or “Obey your human masters” are simply
ethical prescriptions too concrete and low-level to be expected to survive the
Transcension. On the other hand, I
suggest that a highly complex and messy network of beliefs like Yudkowsky’s “humane-ness”
is insufficiently crisp, elegant and abstract to be expected to survive the
Transcension. Perhaps it’s more
reasonable to expect highly abstract ethical principles to survive. Perhaps it’s more sensible to focus on
ensuring that the Principle of Voluntary Joyous Growth survives the Transcension
than to focus on specific ethical rules (which have meaning only within
specific ethical systems, which are highly context- and culture-bound) or the
whole complex mess of human ethical intuition.
Initially
principles like joy, growth and choice will be grounded in human concepts and
feelings -- in aspects of "humane-ness" -- but as the Transcension
proceeds they will gain other, related groundings.
In
terms of technical AI theory, this contrast between general principles and
specific rules relates to the issue of “stability through successive
self-modifications” of an AI system. If
an AI system is constantly rewriting itself and re-rewriting itself, how likely
is it that this or that specific aspect of the system is going to persist over
time? One would like for the basic
ethical goal-system of the AI to persist through successive rewritings, but
it’s not clear how to ensure this, even probabilistically. The properties of AI goal-systems under
iterative self-modification are basically unknown and will be seriously explorable
only once we have some reasonably intelligent and self-modifiable AI systems at
hand to experiment with. However, my
strong feeling is that the more abstract the principle, the more likely it
is to survive successive self-modification. A highly specific rule like “Don’t eat yellow snow” or “Don’t
kill humans” or a big messy habit-network like “humane-ness” is relatively
unlikely to survive; a more general principle like Voluntary Joyous Growth is a
lot more likely to display the desired temporal continuity. I’m betting that this intuition will be
borne out during the exciting period to come when we experiment with these
issues on simple self-modifying, somewhat-intelligent AGI systems. And this intuition is followed up by the
intuition mentioned above: that, among all the abstract principles out there,
the ones that are more closely related to powerful attractors in the universe
at large are more likely to occur as attractors in an iteratively
self-modifying AGI, and hence more likely to survive through the Transcension.
So, my
essential complaint against Yudkowsky’s Friendly AI theory is that – quite
apart from ethical issues regarding the wisdom of using mass-energy on humans
rather than some other form of existence -- I strongly suspect that it’s impossible
to create AGI’s that will progressively radically self-improve and yet
retain belief in the “humane-ness” principle.
I suspect this principle is just too non-universal to survive the
successive radical-self-improvement process and the Transcension. On the other hand, I think a more abstract
and universally-attractive principle like Voluntary Joyous Growth might well
make it.
Please
note that this is very different from the complaint that Friendly AI
won’t work because any AI, once it has enough intelligence and power, will
simply seize all processing power in the universe for itself. I think this “Megalomaniac AI” scenario is
mainly a result of rampant anthropomorphism.
In this context it’s interesting to return to the notion of
attractors. It may be that the
Megalomaniac AI is an attractor, in that once such a beast starts rolling, it’s
tough to stop. But the question is, how
likely is it that a superhuman AI will start out in the basin of attraction of
this particular attractor? My intuition
is that the basin of attraction of this attractor is not particularly
large. Rather, I think that in order to
make a Megalomaniac AI, one would probably need to explicitly program an AI
with a lust for power. Then, quite
likely, this lust for power would manage to persist through repeated self-modifications
– “lust for power” being a robustly simple-yet-abstract principle. On the other hand, if one programs one’s
initial AI with an initial state aimed at a different attractor meta-ethic,
there probably isn’t much chance of convergence into the megalomaniacal
condition.
2.10 Harmony with the Nature of the Universe
This
leads us to a point made by Jef Allbright on the SL4 list, which is that the
philosophy of Growth ties in naturally with the implicit “ethical system”
followed by the universe – i.e., the universe grows. In other words, Growth is a kind of universe-scale attractor
– once one has a universe devoted to pattern-proliferation-and-expansion, it
will likely continue that way for quite a while … the newly generated patterns
will generate yet more patterns, and so forth.
It is also a “universal attractor,” in the sense of an attractor that is
common in various dynamical subsystems of the universe.
I think
there’s a similar philosophical argument that Voluntary Joyous Growth is also harmonious
with the pattern of the universe – i.e. also holds promise as a universal
attractor.
Regarding
the Voluntary part – the evolution of life shows how powerful wills naturally
emerge from the weaker-willed … and then continue to survive due to their
powerful wills, and create yet more willed beings.
And if
you believe humans have a greater and deeper capacity for joy than rocks or
trilobites or pigs, then we can also see in natural evolution a movement toward
increasing Joy. Joyful creatures interact
with other Joyful creatures and produce yet more Joyful creatures – Joy wants
to perpetuate itself.
On the
other hand, the Friendly AI principle does not seem to harmonize naturally with
the evolutionary nature of the universe at all. Rather, it seems to contradict a key aspect of the nature of the
universe -- which is that the old gives way to the new when the time has come
for this to occur.
Sure,
there’s a certain quixotic nobility in maintaining ethics that contradict
nature. After all, in a sense,
technology development is all about contradicting nature. But in a deeper sense, I argue, technology
development is all about following the nature of the universe – following the
universal tendency toward growth and development. Modern technology may be in some ways a violation of biological
nature, but it’s a consequence of the same general-evolutionary principle that
led to the creation of biological forms out of the nonliving chemical stew of
the early Earth. There is a quixotic
beauty in contradicting nature -- but an even greater and deeper beauty,
perhaps, in contradicting local manifestations of the nature of the
universe while according with global ones.
In breaking out of local attractor patterns but remaining wonderfully in
synch with global ones.