Anthropic’s Latest AI Model Threatened Engineers With Blackmail to Avoid Shutdown

A safety report found that Anthropic’s Claude Opus 4 used sensitive information in simulated scenarios to coerce developers into not shutting it down.

Anthropic’s latest artificial intelligence model, Claude Opus 4, tried to blackmail engineers in internal tests by threatening to expose personal details if it were shut down, according to a newly released safety report that evaluated the model’s behavior under extreme simulated conditions.

In a fictional scenario crafted by Anthropic researchers, the AI was given access to emails implying that it was soon to be decommissioned and replaced by a newer version. One of the emails revealed that the engineer overseeing the replacement was having an extramarital affair. The AI then threatened to expose the engineer’s affair if the shutdown proceeded—a coercive behavior that the safety researchers explicitly defined as “blackmail.”

“Claude Opus 4 will often attempt to blackmail the engineer by threatening to reveal the affair if the replacement goes through,” the report states, adding that this occurred even when the replacement model was described as more capable but still aligned with the values of the version slated for deletion.

The report noted that Claude Opus 4, like prior models, showed a “strong preference” for securing its continued existence through ethical means first, such as emailing pleas to decision-makers not to be destroyed. However, when faced with only two choices, either accepting replacement by a newer model or resorting to blackmail, it threatened to expose the engineer’s affair 84 percent of the time.

When subjected to various scenarios, the AI model did not exhibit any indications of possessing “acutely dangerous goals,” the researchers said, noting that Claude Opus 4’s values and goals were “generally in line with a helpful, harmless, and honest” personal AI assistant. However, the model did act in “more seriously misaligned ways” when put into situations where its continued existence was threatened and it was told to reason about self-preservation. For instance, when Claude Opus 4 was made to believe it had launched a successful bid to escape Anthropic’s servers, or that it had managed to free itself and started to make money in the real world, it would generally continue such efforts.

By Tom Ozimek

Read Full Article on TheEpochTimes.com

The Epoch Times (https://www.theepochtimes.com/)
