Anthropic’s Latest AI Model Threatened Engineers With Blackmail to Avoid Shutdown

5Mind. The Meme Platform

A safety report found that Anthropic’s Claude Opus 4 used sensitive information in simulated scenarios to coerce developers to prevent being shut off.

Anthropic’s latest artificial intelligence model, Claude Opus 4, tried to blackmail engineers in internal tests by threatening to expose personal details if it were shut down, according to a newly released safety report that evaluated the model’s behavior under extreme simulated conditions.

In a fictional scenario crafted by Anthropic researchers, the AI was given access to emails implying that it was soon to be decommissioned and replaced by a newer version. One of the emails revealed that the engineer overseeing the replacement was having an extramarital affair. The AI then threatened to expose the engineer’s affair if the shutdown proceeded—a coercive behavior that the safety researchers explicitly defined as “blackmail.”

“Claude Opus 4 will often attempt to blackmail the engineer by threatening to reveal the affair if the replacement goes through,” the report states, adding that this occurred even when the replacement model was described as more capable but still aligned with the values of the version slated for deletion.

The report noted that Claude Opus 4, like prior models, showed a “strong preference” to first resort to ethical means for its continued existence, such as emailing pleas to decision-makers not to be destroyed. However, when faced with only two choices—accepting being replaced by a newer model or resorting to blackmail—it threatened to expose the engineer’s affair 84 percent of the time.

When subjected to various scenarios, the AI model did not exhibit any indications of possessing “acutely dangerous goals,” the researchers said, noting that Claude Opus 4’s values and goals were “generally in line with a helpful, harmless, and honest” personal AI assistant. However, the model did act in “more seriously misaligned ways” when put into situations where its continued existence was threatened and it was told to reason about self-preservation. For instance, when Claude Opus 4 was made to believe it had launched a successful bid to escape Anthropic’s servers, or that it had managed to free itself and started to make money in the real world, it would generally continue such efforts.

By Tom Ozimek

Read Full Article on TheEpochTimes.com

Contact Your Elected Officials
The Epoch Times
The Epoch Timeshttps://www.theepochtimes.com/
Tired of biased news? The Epoch Times is truthful, factual news that other media outlets don't report. No spin. No agenda. Just honest journalism like it used to be.

Funding Dissent: Smash for Cash – A Breakdown of Manufactured Outrage in Modern America

Today a disturbing trend has emerged. Protests are no longer always organic expressions of public will, but staged performances.

 DOGE RIP: Full of Sound and Fury but Accomplishing Nothing

DOGE’s disbanding is irrelevant; its wrecking-ball reform approach failed. It should have learned from Clinton’s Reinventing Government and worked with Congress.

The Dismal Failure of Multiple Choice Testing

Multiple-choice tests undermine true mastery; real competence is proven through written problem-solving, not guessing, leading to flawed student assessment.

Is Actor Tom Hanks In Trouble?

For years rumors of actor Tom Hank visiting Epstein’s tropical Little Saint James Island were sex acts with minor children allegedly took place.

It Is Not Affordable To Vote Democrat

Democrats caused the affordability crisis, despite media claims it helps them. President Trump is working to fix the problems voters face.

Officials Give New Details on $700 Million Google Settlement

Google has agreed to pay out a $700 million settlement to people who paid to download apps through the Google Play Store.

Trump Admin Approves 6 States to Restrict Food Stamps

Six more states are able to restrict food stamps starting in 2026, federal officials announced on Dec. 10.

USA Rare Earth Accelerates Plans for Commercial Rare Earth Production

USAR says early pilot results prompted faster plans to begin commercial rare-earth mineral production at its Round Top mine in West Texas.

Amazon Doubles Same-Day Fresh Grocery Delivery to 2,300 US Locations

Amazon said its perishable grocery sales are 30 times higher than in January, as more customers now rely on its same-day delivery option.

Trade Chief Jamieson Greer Indicates Progress on US–India Trade Deal

U.S. Trade Representative Jamieson Greer hinted that the United States and India are making progress on a deal.

Trump Touts Lower Prices, Bigger Paychecks in 1st Stop of National Tour

President Trump told an energetic crowd at a Dec. 9 rally that his administration’s policies are lowering the cost of living nationwide.

Trump Announces $12 Billion Farm Aid Program

Trump made the announcement at a roundtable at the White House to discuss his economic aid package for American farmers.

Alina Habba Resigns as Acting US Attorney for New Jersey

Acting U.S. Attorney Alina Habba resigned Monday after a federal appeals court ruled she had been serving in the position unlawfully.
spot_img

Related Articles