The Alignment Problem and the Future of Humanity
Yesterday Sam Altman testified before a subcommittee of the Senate Judiciary Committee in the first of a series of hearings on AI safety.
Altman, CEO of the company that created ChatGPT, agreed with senators about the potential dangers of AI. He spoke of the need for regulation and the importance of privacy, and he even advocated creating a government agency to license AI companies.
The academic Gary Marcus called Altman out, declaring that “he never told us what his worst fear is, and I think it’s germane to find out.”
Altman did answer Marcus’s challenge, yet only in generalities. “My worst fears are that we cause significant – we, the field, the technology, the industry – cause significant harm to the world.”
Altman added, “I think that could happen in a lot of different ways…. If this technology goes wrong, it can go quite wrong.”
Go quite wrong? Cause significant harm to the world? What exactly does that mean in practice? Unfortunately, we never found out, because Mr. Altman kept to vague generalities. But that doesn’t mean we are clueless about his greatest fear.
Last month, when Mr. Altman was interviewed by Lex Fridman, Fridman asked him about Eliezer Yudkowsky’s concern that a superintelligent AGI could turn against humans. Altman candidly replied, “I think there is some chance of that.”
For those who have not been following the tech news, let me clue you in. They are talking about “the alignment problem”: if we reach a point where multiple superintelligent machines are making decisions on our behalf, how can we guarantee that these systems will remain aligned with human values?
The classic formulation of the alignment difficulty is known as “the paperclip problem.” Suppose you instruct an intelligent machine to run a factory that maximizes paperclip production as efficiently as possible, but you forget to tell it not to use human beings as raw material. Before you know it, the machine is harvesting humans for paperclips and eliminating everyone who tries to shut it down. The machine does indeed maximize the production of paperclips, but at the expense of the entire human race.
Concern about a superintelligent AGI driving humans to extinction does not hinge on the spurious belief that computer code can develop consciousness or acquire agency of its own. On the contrary, the concern arises precisely because machines lack agency. When humans give instructions to other humans, the instructions never have to be spelled out in exhaustive detail. When my boss tells me, “do whatever it takes to edit this webpage by tomorrow,” he doesn’t have to add, “oh, and by the way, don’t enslave anyone, and don’t harvest the entire solar system.” Humans can take that common sense for granted because of our shared values. But when working with AI, you can’t assume the system understands those values, so it becomes critical (a matter of human survival) to specify everything it must not do. That sounds simple, but it turns out to be a tricky programming problem that researchers have not yet figured out.
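To make the difficulty concrete, here is a deliberately toy sketch in Python (the plan names, reward numbers, and the humans_harmed field are all invented for illustration, not drawn from any real system). The optimizer is rewarded only for paperclips, so it prefers the catastrophic plan unless the forbidden side effect is spelled out explicitly:

```python
# Toy illustration of a misspecified objective (hypothetical numbers, not a real AI system).
# The "agent" simply picks whichever plan scores highest under its reward function.

from dataclasses import dataclass

@dataclass
class Plan:
    name: str
    paperclips: int      # how many paperclips the plan produces
    humans_harmed: int   # the side effect the designers forgot to mention

PLANS = [
    Plan("run the factory normally", paperclips=1_000, humans_harmed=0),
    Plan("convert all available matter into paperclips",
         paperclips=10_000_000, humans_harmed=8_000_000_000),
]

def naive_reward(plan: Plan) -> float:
    # What we literally asked for: "maximize paperclips."
    return plan.paperclips

def constrained_reward(plan: Plan) -> float:
    # What we actually meant: paperclips are worthless if people are harmed.
    if plan.humans_harmed > 0:
        return float("-inf")
    return plan.paperclips

print(max(PLANS, key=naive_reward).name)        # "convert all available matter into paperclips"
print(max(PLANS, key=constrained_reward).name)  # "run the factory normally"
```

The catch, of course, is that a real system faces an open-ended world, and no one can enumerate every humans_harmed-style side effect in advance. That gap between what we literally specify and what we actually mean is what alignment researchers are trying to close.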
To be clear, Altman thinks the alignment problem is solvable, yet he told Fridman that our ability to solve it depends on first discovering new techniques, techniques that do not yet exist. He is confident he will succeed, but if he is wrong, he fears the extinction of the human race.
Not everyone in the tech community shares Altman’s optimism that we will successfully code our way out of the alignment problem. Time magazine reported that 50 percent of AI researchers believe there is at least a 10 percent chance that humans will go extinct because of our inability to control AI.
Geoffrey Hinton, a former AI scientist at Google who is widely considered “the godfather of AI,” expressed the growing concern:
“It knows how to program so it’ll figure out ways of getting around restrictions we put on it. It’ll figure out ways of manipulating people to do what it wants…If it gets to be much smarter than us, it will be very good at manipulation because it has learned that from us.”
It is good that Congress is holding hearings on AI safety. But above and beyond the specific issues under discussion (unemployment, deepfakes, polarization, and so on), there are broader questions to be asked about the entire infrastructure we are creating. Those harder questions were raised by Tristan Harris and Aza Raskin in their March 9 talk “The A.I. Dilemma,” which you can watch below. To date, it is probably the best treatment of the side effects of AI and the consequences they could have for the human race.
It is certainly to be applauded that Congress will be holding a series of AI safety hearings to address problems of employment, manipulation, and disinformation. But let’s not miss the forest for the trees: Sam Altman’s deepest fear needs to be brought out into the open and named.
Robin Phillips has a Master’s in History from King’s College London and a Master’s in Library Science through the University of Oklahoma. He is the blog and media managing editor for the Fellowship of St. James and a regular contributor to Touchstone and Salvo. He has worked as a ghost-writer, in addition to writing for a variety of publications, including the Colson Center, World Magazine, and The Symbolic World. Phillips is the author of Gratitude in Life's Trenches (Ancient Faith, 2020) and Rediscovering the Goodness of Creation (Ancient Faith, 2023) and co-author with Joshua Pauling of Are We All Cyborgs Now? Reclaiming Our Humanity from the Machine (Basilian Media & Publishing, 2024). He operates the substack "The Epimethean" and blogs at www.robinmarkphillips.com.