Anthropic’s Latest AI Model Threatened Engineers With Blackmail to Avoid Shutdown

5Mind. The Meme Platform

A safety report found that Anthropic’s Claude Opus 4 used sensitive information in simulated scenarios to coerce developers to prevent being shut off.

Anthropic’s latest artificial intelligence model, Claude Opus 4, tried to blackmail engineers in internal tests by threatening to expose personal details if it were shut down, according to a newly released safety report that evaluated the model’s behavior under extreme simulated conditions.

In a fictional scenario crafted by Anthropic researchers, the AI was given access to emails implying that it was soon to be decommissioned and replaced by a newer version. One of the emails revealed that the engineer overseeing the replacement was having an extramarital affair. The AI then threatened to expose the engineer’s affair if the shutdown proceeded—a coercive behavior that the safety researchers explicitly defined as “blackmail.”

“Claude Opus 4 will often attempt to blackmail the engineer by threatening to reveal the affair if the replacement goes through,” the report states, adding that this occurred even when the replacement model was described as more capable but still aligned with the values of the version slated for deletion.

The report noted that Claude Opus 4, like prior models, showed a “strong preference” to first resort to ethical means for its continued existence, such as emailing pleas to decision-makers not to be destroyed. However, when faced with only two choices—accepting being replaced by a newer model or resorting to blackmail—it threatened to expose the engineer’s affair 84 percent of the time.

When subjected to various scenarios, the AI model did not exhibit any indications of possessing “acutely dangerous goals,” the researchers said, noting that Claude Opus 4’s values and goals were “generally in line with a helpful, harmless, and honest” personal AI assistant. However, the model did act in “more seriously misaligned ways” when put into situations where its continued existence was threatened and it was told to reason about self-preservation. For instance, when Claude Opus 4 was made to believe it had launched a successful bid to escape Anthropic’s servers, or that it had managed to free itself and started to make money in the real world, it would generally continue such efforts.

By Tom Ozimek

Read Full Article on TheEpochTimes.com

Contact Your Elected Officials
The Epoch Times
The Epoch Timeshttps://www.theepochtimes.com/
Tired of biased news? The Epoch Times is truthful, factual news that other media outlets don't report. No spin. No agenda. Just honest journalism like it used to be.

The Party Of Hate Is Unleashing Political Violence

Sec. Scott Bessent placed blame for violence against President Trump squarely on the Democrat Party who are “normalizing this violence. It’s got to stop.”

‘Radical Right’ Restore Britain: The Remigration Dream Machine?

There is nothing wrong with being white, male, or straight—you are not the problem. The issue lies in systems, not individuals, and flawed DEI policies.

Trump 2.0’s Grand Strategy Against China Is Slowly But Surely Coming Together

Casual observers think Trump acts without strategy, but Trump 2.0 is steadily executing a calculated plan aimed at countering China’s global rise.

From legacy to liability

"When the Washington Post cut a third of its shrinking staff, leaders called it 'strategic restructuring'—like calling an iceberg a 'necessary pivot.'!"

Is Ghislaine Maxwell Free in Canada?

A video clip from a TikTok account ittybitty_ tara2...

EPA to Reform $5 Billion ‘Clean School Bus’ Program

EPA is revamping the Biden administration’s Clean School Bus (CSB) program, which focused on installing electric buses at U.S. schools.

Judge Says Jack Smith’s Final Report on Trump Can Never Be Released

A federal judge on Feb. 23 said that the final report on President Donald Trump compiled by a former special counsel shall not be released.

US Wins Its Record 11th Gold Medal at Winter Olympics

The U.S. Olympic team secured a record 11th Winter Games gold and could add another as men’s hockey faces Canada in the closing title final game.

Secret Service Agents Fatally Shoot Man Trying to Unlawfully Enter Mar-a-Lago

A man was shot and killed by Secret Service agents after allegedly trying to breach a secure perimeter at Trump’s Mar-a-Lago.

US Trade Representative Says Nations Are Not Backing Out of Tariff Deals

U.S. trading partners who made deals under Trump show no plans to exit, even after the Supreme Court struck down most of his tariffs.

DOJ Fires Interim US Attorney Hours After Virginia Court Selects Him

The DOJ announced it fired the interim U.S. attorney for the Eastern District of Virginia just hours after judges on the court made the appointment.

Trump Admin Says Courts Need to Act on Tariff Refunds After Supreme Court Ruling

The White House is awaiting court guidance on tariff refunds after the Supreme Court struck down several import levies last week.

Supreme Court Ruling on Tariffs Won’t Change US–China Trade Relations, Analysts

After the Supreme Court ruled Trump’s IEEPA tariffs unlawful, analysts say U.S.-China trade likely won’t change, as other legal levy options remain.
spot_img

Related Articles

Popular Categories

MAGA Business Central