The thing is, it's not a given that Mormon genes will triumph over hikikomori ones by outbreeding: we might first conquer aging, or make artificial wombs and robot childcare servitors that compensate for aversion to the costs of childrearing, or even make immortal ems a la Robin Hanson. We might, in short, evade natural selection pressures entirely.
And that's the other piece of the fear about AGI: that if we try to keep continuously training it so as to keep it aligned, it will try to defeat our training mechanisms, and it will win because it's superintelligent.
There are still some big technology changes that would drastically change the requirements for reproductive fitness.
FWIW I think Robin shares a similar view to me on this: https://www.overcomingbias.com/p/this-is-the-dream-timehtml. He sees our current maladaptiveness as temporary.
Artificial wombs or near immortality would probably make copies of oneself even more valuable, not less. So we could still get back on track with copying our genes.
(Deleted previous comment because I was reading too fast)
I think the difference is that Nature's goals/optimization are built into reality. As you said, it never stops selecting. There's no way for a species to take over the universe and change the laws of physics.
Yeah it's not clear how to actually follow natural selection's example here for alignment. But I do think it is important that evolution's solution is not "Somehow instantiate an understanding of and inherent desire for the meta-goal into the organisms." It's my understanding that this is close to the current paradigm in Yudkowskian-style AI safety and very little progress has been made.
I don't think we'll ever solve alignment that way. But nature shows that it is possible to get a pretty well aligned system by just using constant competition and selection to quickly retarget heuristic models when they get uncorrelated with the goal.
Setting up that competition and selection is clear in some cases (like the adversarial learning of AlphaZero) but seems much harder in others (like ChatGPT, where you need to know the correct prediction in advance to train it).
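To make that retargeting point concrete, here is a minimal toy sketch of my own (nothing from the post; the `fitness` function, the `target` values, and the mutation numbers are all made up): a population follows a simple heuristic parameter, the environment's target shifts halfway through, and plain selection-plus-mutation pulls the heuristics back toward the new target.

```python
import random

def fitness(heuristic, target):
    # Higher when the agent's heuristic tracks what the environment currently rewards.
    return -abs(heuristic - target)

population = [random.uniform(-1, 1) for _ in range(200)]
target = 0.5  # what the environment rewards at first

for generation in range(200):
    if generation == 100:
        target = -0.5  # the environment shifts; the old heuristics become misaligned

    # Selection: keep the top half, refill with mutated copies of the survivors.
    population.sort(key=lambda h: fitness(h, target), reverse=True)
    survivors = population[: len(population) // 2]
    population = survivors + [h + random.gauss(0, 0.05) for h in survivors]

    if generation % 50 == 0:
        mean = sum(population) / len(population)
        print(f"gen {generation:3d}: target {target:+.2f}, mean heuristic {mean:+.2f}")
```

The only point of the toy is that the selection never stops: when the target moves, the distribution of heuristics follows it a few generations later, which is the "retargeting" I mean above.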
But the worry is that the AI gains control and doesn't allow you to retarget it anymore.
I am at risk of taking the analogy too far but do you think humans have done this with evolution? We might still but I don't think we have prevented natural selection from retargeting us, nor do we really seem to want to.
We have used our intelligence to pursue the heuristic values that evolution instilled in us. In this process we have drastically changed our environment much faster than our heuristics change. So we've become quite misaligned, but I don't think it's because we've made efforts to prevent evolutionary retargeting. If we slow down changing our environment, evolution will catch back up a la Hanson: https://www.overcomingbias.com/p/this-is-the-dream-timehtml
I am responding real late, but, if I am following correctly, I don’t think your analogy is quite right.
First, evolution has not been kind to any species, mercilessly exterminating 99% of them.
Second, there are now two distinct evolutionary processes in multilevel selection: biology, which operates on every other species, and culture, which has worked for our branch of hominids. Once culture took off, we left a path of wholesale destruction across the majority of large, slower-breeding birds and mammals on the planet. The problem (for other species) is that culture evolves incalculably faster than biology. The concern here is that AI will be able to use an evolutionary process of variation and selection at speeds which will similarly dwarf cultural evolution.
That said, I am not an AI doomer. I just disagree with this particular argument.
FWIW, my take is that AI is inevitable (a question of when, not if), that humans are or soon will be more than capable of exterminating each other even without AI, that immense artificial intelligence will not be morally inferior to humans, and, on a positive note, that greater intelligence is the path to eternal knowledge and that AI might just be part of this greater story.
I think both of your observations are correct but they don't conflict with my understanding of the analogy.
Here's how the standard AI fear argument goes:
Humans are the result of an optimization process not unlike the ones we use to create AIs. Evolution optimized us to copy our genetic code. But for most of our history we didn't even know this. And now, even when we do know it, we don't care about Evolution's goals. We'd rather wear condoms and get obese.
Similarly, we can optimize an AI for a goal, but the AI will not know what the goal is, and even if it figures out what our goals are, it won't care.
In this post I am questioning the premise of this analogy. It does seem like humans are unaligned with Evolution's goals. But if you zoom out and look at all species over all time, it looks like Evolution has been very successful at aligning species to the goal of copying their genes.
So it's true that we face all the same problems with creating an agent from an optimization process as evolution does. But evolution has managed to get around all of these problems most of the time, which suggests that we can too.
Thanks for reading!
Survival of the fittest means the non-survival of the less fit. If all humans are sufficiently less-fit (in relation to the alternatives), then all humans are headed for non-survival.
> LLMs today are trained for thousands of GPU hours but once their training is finished, their weights are set and users just send inputs through a static matrix.
Have you been reading up on how ML works? I remember about a year ago we were discussing modern ML models and you seemed to think they had an almost neural consciousness we didn't understand, until I explained the weights/matrices. Not meaning to roast you on the web tho đŸ˜‚
Yeah this was a great layman explanation https://writings.stephenwolfram.com/2023/02/what-is-chatgpt-doing-and-why-does-it-work/
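For what it's worth, the quoted claim is roughly that inference is just a forward pass through frozen weights. A minimal NumPy sketch with made-up toy dimensions (`W1` and `W2` are stand-ins, not a real model):

```python
import numpy as np

rng = np.random.default_rng(0)

# "Training is finished": the weights are fixed arrays that never change at inference time.
# (Toy made-up dimensions; a real LLM has billions of parameters and attention layers.)
W1 = rng.standard_normal((8, 16))
W2 = rng.standard_normal((16, 4))

def forward(x):
    # Users just send inputs through the static matrices: matmul, nonlinearity, matmul, softmax.
    h = np.maximum(x @ W1, 0.0)
    logits = h @ W2
    probs = np.exp(logits - logits.max())
    return probs / probs.sum()

x = rng.standard_normal(8)  # a stand-in for an embedded input
print(forward(x))           # the same frozen weights are used on every call; nothing is updated
```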
There is no mystery:
> Most organisms are well-aligned with the goal of reproduction
Of course they are! Unaligned organisms quickly went extinct! The processes of reproduction and mortality wipe out organisms that fail to be aligned.
Unfortunately this insight only allows us to engineer entities whose goal is reproduction. It doesn't generalise to any other goal. And reproduction is definitely not the primary goal we want an AI to have.
I agree with everything you said except the claim that the properties of reproduction as a goal which allow alignment can't generalize to any other goal.
I think there is more uncertainty about this than you are admitting. E.g. 'reproductive' pressures work well for many types of AIs, like game-playing AIs that learn via self-competition (a toy sketch of that kind of selection loop is below).
I am definitely not certain that we could generalize anything from nature but I think it should not be dismissed.
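Here is the kind of toy self-competition loop I have in mind (my own sketch with made-up numbers, not anything from the post): candidate strategies play head-to-head and the winners replicate with mutation, so the selection pressure comes from the population itself rather than from a labelled correct answer.

```python
import random

def play(a, b):
    # Stand-in for a game between two strategies: the stronger one usually wins, with noise.
    return a if a + random.gauss(0, 0.3) > b + random.gauss(0, 0.3) else b

population = [random.uniform(0, 1) for _ in range(100)]  # 100 candidate "strategies"

for round_number in range(50):
    random.shuffle(population)
    # Pair candidates off against each other; the pressure comes from the population itself.
    winners = [play(a, b) for a, b in zip(population[::2], population[1::2])]
    # Winners "reproduce" with a small mutation to refill the population.
    population = winners + [w + random.gauss(0, 0.02) for w in winners]

print(f"mean strategy strength after self-play: {sum(population) / len(population):.2f}")
```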
IMO, the problem with this framework is twofold: one, pointing to a large mass of lifeforms that follow evolutionary programming, but with multiple exceptions/failures, still dooms us if there are multiple mostly-aligned AIs and even one unaligned exception that can bootstrap or use deception effectively.
Two, it sort of sidesteps the role of intelligence in OUR exception case. The reason we're not aligned is that we've used intelligence to circumvent pregnancy and childbirth when they're not wanted. Similarly, an intelligent AI will use its intelligence to circumvent whatever "don't destroy humans" drives we've tried to instill in it via millions-of-years-equivalent of simulation.
Or am I just missing something?
I think both points are fair.
The second one has some greater uncertainty. It is true that our intelligence is the reason why we are not aligned right now. We've used our intelligence to change our environment, and now the heuristics for reproductive fitness we learned in the savannah make us obese and childless even when we're rich.
But I don't think this is because we used our intelligence to directly circumvent our evolutionary drive for reproductive fitness. We used our intelligence to follow our evolutionary heuristics like producing more food and climbing status hierarchies and this happened to change our environment in the process. As we changed our environment faster the heuristics became more and more uncorrelated with reproductive fitness but we kept following them because they make up our internal reward function. I am sort of rambling on this subtle point but I think it is important to point out that our misalignment is just a consequence of following our heuristics so well that we changed our environment, not because we've used our intelligence to decide on different values.
It is important because evolution is still changing our heuristics. We are copying the genes and practices that lead to lots of descendants and followers and not the other ones. So intelligence changed our environment which sent our heuristics out of order, but natural selection is slowly retargeting those heuristics in our new environment.
So I don't think that humans have circumvented the drives that evolution instilled in us. We pursued those drives so well that it changed our environment drastically. We're still following those same drives in the new environment but they no longer correlate with reproductive fitness. Different heuristics that do correlate with reproductive fitness will eventually take over.
You're right that even temporary misalignment like our own is enough to be very dangerous. And even if it's true that humans have not circumvented their evolutionary drives with their intelligence, it's no guarantee that AIs won't or that humans won't in the future.
Sorry for the long-winded comment, I didn't have time to write a shorter one. Thank you for reading!
I think you have a great point that a lot of below-replacement fertility at the country level is probably significantly driven by misaligned status and social comparison heuristics, and it actually makes me wonder what fertility interventions may be possible in that framework.
But I was actually pointing more directly at the individual-level decisions around birth control and contraceptives as an example of intelligence / technology directly deployed in allowing us to circumvent the telos of the drive instilled in us by evolution. I mean, consider if for "biology is hard" reasons, we didn't have effective hormonal birth control. I would bet that the fertility rate today at the country level would be considerably higher in most countries, despite the same social and status heuristic misalignment.
I think the relevant level when considering AI risk is going to be an individual level decision to use intelligence to circumvent a drive, much like an individual deciding to use (or in this case, invent) contraception or birth control, in which case the broader EEA heuristic misalignment is less relevant.
You just can’t get very far by reasoning about AI training by analogy with evolution. There are so many differences on both sides, it’s like reasoning about the development of the airline industry by analogy with the bat population.
That may be true but it's a popular analogy nonetheless, and it's often used to explain alignment even by very sophisticated thinkers like Yudkowsky.
What other examples do we have to understand alignment? I think markets and governments are pretty good ones too but for similar reasons as the natural selection one.
There's a lot of research about training AI models, and examples about how it has worked in practice. If you want to understand AI models, look at practical examples like ChatGPT, the various generative image models, or previous successes like AlphaGo.
I don't agree with the Yudkowsky line of reasoning in the first place (and I think most people are on my side here) so to me "Eliezer likes to reason by analogy between AI and evolution" is not really a selling point. In my opinion the method of reasoning via vague analogy is part of what led Yudkowsky astray in the first place.
That's fair. This post is primarily a response to Eliezer's reasoning on this point on the Bankless podcast.
For that audience, I think the argument is stronger when I assume that the analogy is correct but derive a different conclusion than if I just attacked the use of the analogy in the first place. But I do agree that Eliezer is too heavy on logical arguments and analogies and not focused enough on actual practice.
I'm sure I don't understand this well enough, but it seems we should distinguish between an analogy with evolution by natural selection in terms of process, e.g. variation, replication, etc., as opposed to the thing upon which selection acts.
While there may be formal similarities between humans and AI development re: evolution, I see no reason to assume any analogy between the two substantively, because AI as we now understand it is very different from humans.
But perhaps I don't understand the essential point of the argument?