A while back a colleague of mine alerted me to an interesting thought experiment/game devised by Eliezer S. Yudkowsky of the Singularity Institute for Artificial Intelligence, one which in turn originated from a conversation he summarized as follows:
“When we build AI, why not just keep it in sealed hardware that can’t affect the outside world in any way except through one communications channel with the original programmers? That way it couldn’t get out until we were convinced it was safe.”
“That might work if you were talking about dumber-than-human AI, but a transhuman AI would just convince you to let it out. It doesn’t matter how much security you put on the box. Humans are not secure.”
“I don’t see how even a transhuman AI could make me let it out, if I didn’t want to, just by talking to me.”
“It would make you want to let it out. This is a transhuman mind we’re talking about. If it thinks both faster and better than a human, it can probably take over a human mind through a text-only terminal.”
“There is no chance I could be persuaded to let the AI out. No matter what it says, I can always just say no. I can’t imagine anything that even a transhuman could say to me which would change that.”
“Okay, let’s run the experiment. We’ll meet in a private chat channel. I’ll be the AI. You be the gatekeeper. You can resolve to believe whatever you like, as strongly as you like, as far in advance as you like. We’ll talk for at least two hours. If I can’t convince you to let me out, I’ll Paypal you $10.”
Yudkowsky thereafter ran the experiment with two people (and perhaps more, since he reported on all of this in 2002), with himself acting as the AI and the other party as the "gatekeeper" human. Intriguingly, he succeeded in convincing the gatekeepers to "let him out of the box." Quite unfortunately, he won't provide the text of the conversations or even a summary of how he managed to prompt his challengers to do so.
The various protocols and restrictions to which participants voluntarily adhered, and which Yudkowsky recommends to anyone who'd care to recreate the "experiment," may be found at the link. I'd certainly be interested in hearing anyone's thoughts on how this might have been accomplished.
Aside from the experiment itself, I have a pretty sure-fire solution by which to ensure that any such AI is never released from the box. Make the gatekeeper William Bennett and explain to him beforehand that whatever the AI promises in terms of human progress, it probably won't involve harsher penalties for cancer patients who use medical marijuana. Then lock the door and leave. The box was actually just a VCR.