General Discussion

Celerity

(55,367 posts) Thu Jun 25, 2026, 05:30 PM

Sinéad Bovell: Did the Godfather of AI Just Invent a Mathematical Solution to the AI Control Problem?

Yoshua Bengio's Next Frontier

https://sineadbovell.substack.com/p/did-the-godfather-of-ai-just-invent (video at the link)

Every so often a story goes viral about an AI system displaying extremely concerning tendencies. These range from blackmail and deception when threatened with being shut down, to colluding with other AIs, to an AI system that autonomously decided it needed more resources and diverted computing power to mine cryptocurrency. While these instances happened in controlled laboratory environments where AI companies themselves were testing the systems (with the exception of the Alibaba cryptocurrency scenario), they are nonetheless concerning and raise deeply troubling questions about some of the unintended and extreme safety implications of this technology.

These are conversations I spend a fair amount of time having in policy circles and national security rooms, yet I’ve often been cautious about bringing them into public discourse. Discussions about AI risk can quickly become too sensationalized, pushing people out of a conversation they need to be part of. Society’s voice is imperative in guiding this technology. Moreover, when AI industry folks seek public support on the existential risks AI presents, conversations often stop at identifying the risks, with vague calls for “global agreements.”

While such agreements are imperative, they aren’t sufficient. Nor do they give the public a clear lever to engage with. That’s why I was waiting to sit down with one computer scientist in particular to bring this conversation to my community. Yoshua Bengio is one of the most influential figures in modern artificial intelligence. He is a Turing Award recipient, one of the researchers widely referred to as a “Godfather of AI,” and the most cited computer scientist of all time. His work helped lay the foundations for today’s AI boom.

The AI Control Problem

One of the central themes of our conversation was the AI control problem: how do we ensure increasingly capable AI systems continue behaving in ways that align with human interests? Imagine asking an AI system to book you a restaurant reservation. If the restaurant is full, most people would expect it to tell you there are no available tables. But a sufficiently capable system focused solely on achieving the objective might pursue actions that technically accomplish the goal, such as hacking the restaurant to get you a table, while violating rules, norms, or human expectations. The goal was achieved. The outcome was not aligned.

snip

bucolic_frolic

(56,334 posts)

1. Give AI a conscience ... nice if u can do it. /nt

Reply to Celerity (Original post)

Thu Jun 25, 2026, 06:20 PM
