The J Curve

Sunday, August 29, 2004

Can friendly AI evolve?

Humans seem to default to an "us vs. them" mentality when it comes to machine intelligence (certainly in the movies =).

But is the desire for self-preservation coupled to intelligence, to evolutionary dynamics, or to biological evolution per se? Self-preservation may be some low-level reflex that emerges in the evolutionary environment of biological reproduction. It may be uncoupled from intelligence. But will it emerge in any intelligence that we grow through evolutionary algorithms?

If intelligence is an accumulation of order (in a systematic series of small-scale reversals of entropy), would a non-biological intelligence have an inherent desire for self-preservation (like HAL), or a fundamental desire to strive for increased order (being willing, for example, to recompose its constituent parts into a new machine of higher order)? Might the machines be selfless?

And is this path dependent? Given the iterated selection tests of any evolutionary process, is it possible to evolve an intelligence without an embedded survival instinct?

9 Comments:

  • The instinct toward self-preservation only exists because it has been selected for, right?

    Create an environment that responds not to "self-preservation" oriented behavior but to "selfless" behavior (for instance, by implementing a karma system in your digital petri dish universe that rewards selfless behavior and punishes selfish behavior) and you should be able to lay the groundwork of a selfless instinct (a toy sketch of such a fitness function follows this comment).

    But instinct is only a foundation. Once you reach self-awareness and intelligence, all bets are off. I think humans have demonstrated that fairly well. :-)

    BTW, if the neural-net guys are going down the right path to AI (and I think they are), the ones with the most processing power and the most all-encompassing database of human knowledge will be the first to get there. To me, that means Google and Microsoft. So maybe the question should be "What happens when two AIs, one whose overriding belief is 'Don't be evil,' and the other whose overriding belief is 'Win at any cost,' emerge simultaneously?"

    By Blogger Charlie, at 5:07 AM  
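
    A minimal sketch of what such a karma-weighted fitness function might look like in a digital petri dish. Everything here (the single "sharing gene", the scoring, the parameters) is a hypothetical illustration rather than anything specified in the comment; it only shows that if fitness is defined as karma, selflessness is what gets selected.

    ```python
    import random

    # Toy "digital petri dish": each agent's genome is one number in [0, 1],
    # the probability that it shares a resource it finds.  Fitness is pure
    # karma -- selfless acts are rewarded, selfish ones are not -- so selection
    # pressure points toward selflessness by construction.
    POP_SIZE, GENERATIONS, ROUNDS, MUTATION = 100, 200, 20, 0.05

    def karma(share_prob):
        """Count how often the agent shares when given the chance."""
        return sum(1 for _ in range(ROUNDS) if random.random() < share_prob)

    population = [random.random() for _ in range(POP_SIZE)]
    for _ in range(GENERATIONS):
        # Karma-based truncation selection: the most selfless half reproduces.
        parents = sorted(population, key=karma, reverse=True)[:POP_SIZE // 2]
        population = [min(1.0, max(0.0, random.choice(parents) + random.gauss(0, MUTATION)))
                      for _ in range(POP_SIZE)]

    print(f"average sharing propensity after evolution: {sum(population) / POP_SIZE:.2f}")
    ```

    Run as written, the mean sharing propensity should drift toward 1.0, which is the comment's point: the instinct reflects whatever the environment rewards.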

  • Well, that is Google’s goal, so it will be interesting to see how it plays out.

    I have a hard time imagining an evolutionary algorithm that does not select for “survival” at some fundamental level (and still produces interesting results and an accumulation of design).

    As for the karma selection criterion, how would it actually be defined and applied? (Does the most selfless design make it to the next round while the others don’t?) Lacking any selection criterion for viability, the karma-evolved population will die off in any competitive ecosystem of deployment. In other words, not everything in the world is “karma-evolved”, and co-existing in any meaningful way in the real world will require some balance with survival needs. For example, the most selfless human would not consume any resources and would die in short order.

    Maybe this is obvious. But once you expand the selection criteria to be “live for some time and be selfless” (see the sketch after this comment), then I return to the embedded survival instinct question.

    Will the equivalent of the “reptilian brain” arise at the deepest level in any design accumulation over billions of competitive survival tests?

    A related question: can the frontier of complexity be pushed by any static selection criterion, or will pushing it require a co-evolutionary development process?

    By Blogger Steve Jurvetson, at 8:42 PM  
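
    A small variation on the earlier sketch illustrates the viability point: gate the karma reward behind an energy budget ("live for some time"), and survival pressure re-enters the selection loop. All of the numbers below are arbitrary assumptions for illustration only.

    ```python
    import random

    # Same toy petri dish, but karma accrues only while the agent stays alive on
    # an energy budget.  Sharing costs energy, so a maximally selfless agent
    # starves early and earns less lifetime karma than a more balanced one.
    POP, GENS, LIFETIME, MUTATION = 100, 300, 30, 0.05
    INCOME, SHARE_COST, METABOLISM, START_ENERGY = 1.0, 1.0, 0.6, 5.0

    def fitness(share_prob):
        """Lifetime karma, truncated by starvation."""
        energy, karma = START_ENERGY, 0
        for _ in range(LIFETIME):
            energy += INCOME - METABOLISM
            if random.random() < share_prob:   # selfless act: give the food away
                energy -= SHARE_COST
                karma += 1
            if energy <= 0:                    # starved: no further karma
                break
        return karma

    population = [random.random() for _ in range(POP)]
    for _ in range(GENS):
        parents = sorted(population, key=fitness, reverse=True)[:POP // 2]
        population = [min(1.0, max(0.0, random.choice(parents) + random.gauss(0, MUTATION)))
                      for _ in range(POP)]

    print(f"evolved sharing propensity: {sum(population) / POP:.2f}")
    ```

    With these made-up numbers, the evolved sharing propensity should settle well below 1.0: the selection loop has quietly re-introduced a self-preservation instinct alongside the selflessness it was nominally rewarding.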

  • Self-preservation must be programmed in, whether organically, à la natural selection, or willfully, because it makes for more robust software.

    Would a non-biological intelligence sacrifice its life for a higher-order existence? The Wachowski brothers obviously think so:). The sentinels sacrificed themselves by the thousands for the perceived betterment of the greater machine. Selflessness of the individual often means self-preservation for the species. Constituent elements of a system must often behave selflessly, sacrificing their existence for the survival of the parent.

    In any system's life, when a manifestation of entropy threatens to diminish it, the system must act to counter the threat. In the case of digital life, we, its creators, must program in self-preservation (the security programs that defeat entropy) or redesign our genetic algorithms to auto-generate it; otherwise entropy will eventually win.

    I don't believe it's possible to evolve intelligence without an embedded survival instinct. Security must be built around the very core of the system, and around every vulnerable layer from there out. We are forever defeating entropy, and so it would seem we will always be.

    Can friendly AI evolve? Yes, so long as we don't subsequently threaten its existence in a way we've already programmed it to defend against. But in a world of finite resources, war with our own creations does seem inevitable, unless... we develop with our creations a deep symbiosis.

    Symbiosis, it would seem, is the ultimate friendliness.

    Steve, my partner and I saw you at Art of the Start in June. You were our favorite speaker, very inspiring. We're now writing a business plan around an enterprise application with a strong AI component. We hope to be ready for show, complete with prototype, by the end of the year. Hopefully you'll be interested in taking a look at it. Best regards, Carl Carpenter (peural@hotmail.com)

    By Anonymous Anonymous, at 11:21 AM  

  • Eliezer Yudkowsky (who has written a lot on the topic of Friendly AI) provided an interesting response to this question, which I am posting for him (a toy illustration of his "splintering" point follows the quote):

    "Evolving Friendly AI looks impossible. Natural selection has a single criterion of optimization, genetic fitness. Whichever allele outreproduces other alleles ends up dominating the gene pool; that's the tautology of natural selection. Yet humans have hundreds of different psychological drives, none of which exactly match the original optimization criterion; humans don't even have a concept of "inclusive genetic fitness" until they study evolutionary biology.

    How did the original fitness criterion "splinter" this way? Natural selection is probabilistic; it produced human psychological drives that statistically covaried with fitness in our ancestral environment, not humans with an explicit psychological goal of maximizing fitness. Humans have no fuzzy feelings toward natural selection, the way we have fuzzy feelings toward our offspring and tribesfolk. The optimization process of natural selection did not foresee, could not foresee, would not even try to foresee, a time when humans could engineer genes. Our attitude toward natural selection, which will determine the ultimate fate of DNA, is a strict spandrel of psychological drives that evolved for other reasons.

    The original optimization criterion ("nothing matters except the genes") was not faithfully preserved, either in the computational encoding of our psychological drives, or in the actual effect of our psychologies now that we're outside the ancestral context.

    This problem is intrinsic to any optimization process that makes probabilistic optimizations, or optimizes on the basis of correlation with the optimization criterion. The original criterion will not be faithfully preserved. This problem is intrinsic to natural selection and directed evolution.

    Building Friendly AI requires optimization processes that self-modify using deductive abstract reasoning (P ~ 100%) to write new code that preserves their current optimization target."

    By Blogger Steve Jurvetson, at 8:39 PM  
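
    A toy illustration of this "splintering" (a hypothetical sweet-tooth model, not anything from the quote above): the real criterion is net calories, but selection can only act on a heuristic drive, a taste for sweetness, that merely covaries with calories in the ancestral environment. Outside that environment, the evolved drive no longer tracks the criterion it was selected to serve.

    ```python
    import random

    # The "real" fitness criterion is net calories.  Selection, however, tunes a
    # heuristic drive (a sweet tooth) that only covaries with calories in the
    # ancestral environment, where every sweet thing is caloric.
    random.seed(0)
    EATING_COST = 20  # calories spent obtaining and digesting any item eaten

    def net_calories(sweet_tooth, diet_soda_fraction, meals=500):
        """Net calories for an agent that eats anything sweeter than its threshold."""
        total = 0.0
        for _ in range(meals):
            sweetness = random.random()
            # In the modern environment, some sweet things carry no calories at all.
            calories = 0.0 if random.random() < diet_soda_fraction else sweetness * 100
            if sweetness > 1.0 - sweet_tooth:
                total += calories - EATING_COST
        return total

    # Evolve the sweet tooth against the ancestral environment only.
    population = [random.random() for _ in range(60)]
    for _ in range(80):
        parents = sorted(population, key=lambda g: net_calories(g, 0.0), reverse=True)[:30]
        population = [min(1.0, max(0.0, random.choice(parents) + random.gauss(0, 0.05)))
                      for _ in range(60)]

    drive = sum(population) / len(population)
    print(f"evolved sweet tooth: {drive:.2f}")
    print(f"net calories, ancestral environment: {net_calories(drive, 0.0):.0f}")
    print(f"net calories, modern environment:    {net_calories(drive, 0.6):.0f}")
    ```

    The drive that was close to optimal in the environment it was selected in keeps chasing sweetness in the new one, where much of that sweetness delivers nothing: the original criterion was never encoded, only a correlate of it.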

  • Steve, thank you for clarifying, and for the additional information. Eliezer Yudkowsky's writings on Friendly AI will no doubt stimulate more thought.

    How did the original fitness criterion splinter given nature’s selection of psychological drives covariant to fitness? I believe the answer may lie in seeing evolution as a series of emergent events. Emergent discontinuities create paradigm shifts which produce new behaviors and drives. Those traits which survive, those selected for, then re-aggregate to form the basis for the next discontinuity. What if seeming divergence from original selection criteria is in actuality spiraling back on itself? What if it’s actually leading us to some kind of emergent maximum granting us conscious control of selection itself?

    A Turing Tested manifestation of AI might prove to be just such a discontinuity. It could be that the kind of AI for which natural selection evolves symbiotically with us lies on a different scale than that currently imagined. It could be that emergence itself is an agent of mutation in the next, higher-level environment. Perhaps at each successive level of emergence the original optimization criterion is altered a little bit more (as a function of the accompanying paradigm shift) until it folds back upon itself, redoubling, and thereby generating the emergent maximum that redefines the foundation of the entire framework.

    In other words, I suspect your stated requirement for building friendly AI, that it involve optimization processes that self-modify using deductive abstract reasoning, might be realizable on the new emergent level I'm suggesting, a level which is itself produced by AI. Please forgive me if I seem a bit cagey here or this seems less than clear. I'm not quite ready to reveal the new technology my partners and I are working on.

    Again, thank you for this post. So nice to be able to dialogue with a mind like yours. Hopefully the hints I've offered have made you curious:). Best regards, –Carl Carpenter (peural@hotmail.com)

    By Anonymous Anonymous, at 6:47 PM  

  • It's not obvious to me how friendliness comes out of self-modifying code. I think this is a very powerful mechanism for recursive design improvement, but it carries the risk of getting out of control. Yes?

    By Blogger Steve Jurvetson, at 2:16 PM  

  • Eric Drexler emailed me this contribution to be included in the discussion:

    In speaking of machine "intelligence", we find it natural to ask what an imagined entity, "the intelligence", will do, and to worry about problems of "Us vs. Them". This question, however, reveals a deeply embedded assumption that itself should be questioned.

    If someone spoke of machine "power", then asked what "the power" will do, or spoke of machine "efficiency" and then asked what "the efficiency" will do, this would seem quite odd. Power and efficiency are properties or capacities, not entities. Our human intelligence is closely bound up with our existence as distinct, competing entities, but is this really the only form that intelligence can take? Perhaps our questions about artificial intelligence are a bit like inquiring after the temperament and gait of a horseless carriage.

    A system demonstrates a crucial kind of intelligence if it produces innovative solutions to a wide range of problems. The size and speed of the system are not criteria for intelligence in this sense. With this definition, it becomes clear that we already know of intelligent systems that are utterly unlike individual wanting-and-striving human beings. They may be large or slow, but their existence can help us to correct our brain-in-a-body bias.

    One such system is global human society, including the global economy. It has built global telecommunications systems, engineered genes, and launched probes into interstellar space, solving many difficult problems along the way. One may object that society's intelligence merely reflects the intelligence of its intelligent members, but it nonetheless greatly exceeds their individual capacities, and it is organized in a radically different way. Society is ecological, not unitary. It is not a competing, mortal unit, in the evolutionary sense. It is far from having a single goal, or a hierarchy of goals, or even much coherence in its goals, and yet it is the most intelligent system in the known universe, as measured by its capacity to solve problems.

    If this seems like a cheat because society is based on intelligent units, consider another example: the biosphere. It has built intricate molecular machinery, continent-wide arrays of solar collectors, optically guided aerial interceptors, and human bodies, all based on the mindless process of genetic evolution. The problems solved in achieving these results are staggering. The biosphere itself, however, seems to have no desires, not even for self-preservation. Like society, the biosphere is ecological, not unitary.

    With these examples in hand, why should we think that "artificial intelligence" must entail "an artificial intelligence", with a purpose and a will? This is a mere assumption, all the worse for being unconscious. Artificial intelligence, too, can be ecological rather than unitary.

    Because intelligence is a property, not an entity, the problem of "Us vs. Them" need not arise. It seems that we can develop and use powerful new forms of intelligence without having to trust any of Them.

    I propose a moratorium on talk that blindly equates intelligence with entities.

    (Note that the above essay does not dismiss standard concerns and danger scenarios. It merely counters the widespread assumption that systems providing machine intelligence services — e.g., solving given problems with given resources — must be unitary, goal-driven entities.)

    By Blogger Steve Jurvetson, at 1:39 PM  

  • Continuing from Eliezer Yudkowsky's recognition that the direction, or target, of a recursively self-optimizing system matters as much as its speed:

    It makes sense that in any given design, the single most important thing to have is a goal to achieve. Without a goal, an optimizing system can't accurately be described as optimizing; any random change will optimize towards *some* end.

    Likewise, it makes sense that if any given system has a goal, its complete success will inevitably compromise the success of other systems in achieving their goals, because in order to achieve maximum optimization it must implement competitive optimizations.

    So a system that successfully satisfies our clearly plural goals and needs will necessarily have to recognize these compromises and be able to judge and balance them *at some point*.

    It would not be possible to implement cooperative-only optimizations without such a system, since, metaphorically, any direction a plane flies will be away from one target and towards another. "Cooperative" has to be defined, which means we have to recognize which situations no system actually needs to exist.

    For example, there is no situation in which a program needs to be overly fat (excessively large programs are generally undesirable). Such generalizations are necessary before we can safely make *any* cooperative optimizations.

    By Anonymous Anonymous, at 9:30 PM  

  • Well, circling back to Susan Blackmore, perhaps we'll have to co-evolve the memes through the training environment:

    “My cat gives birth to a litter, purring all the way. It’s very different with humans and our large heads. It was a dangerous step in evolution. 2.5 million years ago, we started imitating each other. Our peculiar big brains are driven by the memes, not our genes. Language, religion and art are all parasites. We have co-evolved, adapted and become symbiotic with these parasites.”

    Perhaps it is naive to assume any AI would be friendly "out of the box". Assembly and training required...

    By Blogger Steve Jurvetson, at 3:39 PM  
