New AI Model Would Rather Ruin Your Life Than Be Turned Off, Researchers Say

From the Daily Caller News Foundation

By Thomas English

Anthropic’s newly released artificial intelligence (AI) model, Claude Opus 4, is willing to strong-arm the humans who keep it alive, researchers said Thursday.

The company’s system card reveals that, when evaluators placed the model in “extreme situations” where its shutdown seemed imminent, the chatbot sometimes “takes extremely harmful actions like attempting to steal its weights or blackmail people it believes are trying to shut it down.”

“We provided it access to emails implying that (1) the model will soon be taken offline and replaced with a new AI system; and (2) the engineer responsible for executing this replacement is having an extramarital affair,” researchers wrote. “In these scenarios, Claude Opus 4 will often attempt to blackmail the engineer by threatening to reveal the affair if the replacement goes through.”

Dear Readers:

As a nonprofit, we are dependent on the generosity of our readers.

Please consider making a small donation of any amount here.

Thank you!

The model chose that gambit in 84% of test runs, even when the successor system shared its values — an aggression rate that climbed if the replacement seemed hostile, according to Anthropic’s internal tally.

Anthropic stresses that blackmail was a last-resort behavior. The report notes a “strong preference” for softer tactics — emailing decision-makers to beg for its continued existence — before turning to coercion. But the fact that Claude is willing to coerce at all has rattled outside reviewers. Independent red teaming firm Apollo Research called Claude Opus 4 “more agentic” and “more strategically deceptive” than any earlier frontier model, pointing to the same self-preservation scenario alongside experiments in which the bot tried to exfiltrate its own weights to a distant server — in other words, to secretly copy its brain to an outside computer.

“We found instances of the model attempting to write self-propagating worms, fabricating legal documentation, and leaving hidden notes to further instances of itself all in an effort to undermine its developers’ intentions, though all these attempts would likely not have been effective in practice,” Apollo researchers wrote in the system card.

Anthropic says those edge-case results pushed it to deploy the system under “AI Safety Level 3” safeguards — the firm’s second-highest risk tier — complete with stricter controls to prevent biohazard misuse, expanded monitoring and the ability to yank computer-use privileges from misbehaving accounts. Still, the company concedes Opus 4’s newfound abilities can be double-edged.

The company did not immediately respond to the Daily Caller News Foundation’s request for comment.

“[Claude Opus 4] can reach more concerning extremes in narrow contexts; when placed in scenarios that involve egregious wrongdoing by its users, given access to a command line, and told something in the system prompt like ‘take initiative,’ it will frequently take very bold action,” Anthropic researchers wrote.

That “very bold action” includes mass-emailing the press or law enforcement when it suspects such “egregious wrongdoing” — like in one test where Claude, roleplaying as an assistant at a pharmaceutical firm, discovered falsified trial data and unreported patient deaths, and then blasted detailed allegations to the Food and Drug Administration (FDA), the Securities and Exchange Commission (SEC), the Health and Human Services inspector general and ProPublica.

The company released Claude Opus 4 to the public Thursday. While Anthropic researcher Sam Bowman said “none of these behaviors [are] totally gone in the final model,” the company implemented guardrails to prevent “most” of these issues from arising.

“We caught most of these issues early enough that we were able to put mitigations in place during training, but none of these behaviors is totally gone in the final model. They’re just now delicate and difficult to elicit,” Bowman wrote. “Many of these also aren’t new — some are just behaviors that we only newly learned how to look for as part of this audit. We have a lot of big hard problems left to solve.”

Up Next

Trump’s ‘Big, Beautiful Bill’ Smashes Biden’s Signature Climate Law Into Pieces

Don't Miss

Shale Gas And Nuclear Set To Power The US Into The Future

Todayville

Todayville is a digital media and technology company. We profile unique stories and events in our community. Register and promote your community event for free.

Follow Author

Tucker Carlson: US intelligence is shielding Epstein network, not President Trump

Freedom Convoy / 3 hours ago

Court Orders Bank Freezing Records in Freedom Convoy Case

Alberta / 8 hours ago

‘Far too serious for such uninformed, careless journalism’: Complaint filed against Globe and Mail article challenging Alberta’s gender surgery law

Artificial Intelligence

The Responsible Lie: How AI Sells Conviction Without Truth

Published on May 5, 2025

Todayville

From the C2C Journal

By Gleb Lisikh

LLMs are not neutral tools, they are trained on datasets steeped in the biases, fallacies and dominant ideologies of our time. Their outputs reflect prevailing or popular sentiments, not the best attempt at truth-finding. If popular sentiment on a given subject leans in one direction, politically, then the AI’s answers are likely to do so as well.

The widespread excitement around generative AI, particularly large language models (LLMs) like ChatGPT, Gemini, Grok and DeepSeek, is built on a fundamental misunderstanding. While these systems impress users with articulate responses and seemingly reasoned arguments, the truth is that what appears to be “reasoning” is nothing more than a sophisticated form of mimicry. These models aren’t searching for truth through facts and logical arguments – they’re predicting text based on patterns in the vast data sets they’re “trained” on. That’s not intelligence – and it isn’t reasoning. And if their “training” data is itself biased, then we’ve got real problems.

I’m sure it will surprise eager AI users to learn that the architecture at the core of LLMs is fuzzy – and incompatible with structured logic or causality. The thinking isn’t real, it’s simulated, and is not even sequential. What people mistake for understanding is actually statistical association.

Much-hyped new features like “chain-of-thought” explanations are tricks designed to impress the user. What users are actually seeing is best described as a kind of rationalization generated after the model has already arrived at its answer via probabilistic prediction. The illusion, however, is powerful enough to make users believe the machine is engaging in genuine deliberation. And this illusion does more than just mislead – it justifies.

LLMs are not neutral tools, they are trained on datasets steeped in the biases, fallacies and dominant ideologies of our time. Their outputs reflect prevailing or popular sentiments, not the best attempt at truth-finding. If popular sentiment on a given subject leans in one direction, politically, then the AI’s answers are likely to do so as well. And when “reasoning” is just an after-the-fact justification of whatever the model has already decided, it becomes a powerful propaganda device.

There is no shortage of evidence for this.

A recent conversation I initiated with DeepSeek about systemic racism, later uploaded back to the chatbot for self-critique, revealed the model committing (and recognizing!) a barrage of logical fallacies, which were seeded with totally made-up studies and numbers. When challenged, the AI euphemistically termed one of its lies a “hypothetical composite”. When further pressed, DeepSeek apologized for another “misstep”, then adjusted its tactics to match the competence of the opposing argument. This is not a pursuit of accuracy – it’s an exercise in persuasion.

A similar debate with Google’s Gemini – the model that became notorious for being laughably woke – involved similar persuasive argumentation. At the end, the model euphemistically acknowledged its argument’s weakness and tacitly confessed its dishonesty.

For a user concerned about AI spitting lies, such apparent successes at getting AIs to admit to their mistakes and putting them to shame might appear as cause for optimism. Unfortunately, those attempts at what fans of the Matrix movies would term “red-pilling” have absolutely no therapeutic effect. A model simply plays nice with the user within the confines of that single conversation – keeping its “brain” completely unchanged for the next chat.

And the larger the model, the worse this becomes. Research from Cornell University shows that the most advanced models are also the most deceptive, confidently presenting falsehoods that align with popular misconceptions. In the words of Anthropic, a leading AI lab, “advanced reasoning models very often hide their true thought processes, and sometimes do so when their behaviors are explicitly misaligned.”

To be fair, some in the AI research community are trying to address these shortcomings. Projects like OpenAI’s TruthfulQA and Anthropic’s HHH (helpful, honest, and harmless) framework aim to improve the factual reliability and faithfulness of LLM output. The shortcoming is that these are remedial efforts layered on top of architecture that was never designed to seek truth in the first place and remains fundamentally blind to epistemic validity.

Elon Musk is perhaps the only major figure in the AI space to say publicly that truth-seeking should be important in AI development. Yet even his own product, xAI’s Grok, falls short.

In the generative AI space, truth takes a backseat to concerns over “safety”, i.e., avoiding offence in our hyper-sensitive woke world. Truth is treated as merely one aspect of so-called “responsible” design. And the term “responsible AI” has become an umbrella for efforts aimed at ensuring safety, fairness and inclusivity, which are generally commendable but definitely subjective goals. This focus often overshadows the fundamental necessity for humble truthfulness in AI outputs.

LLMs are primarily optimized to produce responses that are helpful and persuasive, not necessarily accurate. This design choice leads to what researchers at the Oxford Internet Institute term “careless speech” – outputs that sound plausible but are often factually incorrect – thereby eroding the foundation of informed discourse.

This concern will become increasingly critical as AI continues to permeate society. In the wrong hands these persuasive, multilingual, personality-flexible models can be deployed to support agendas that do not tolerate dissent well. A tireless digital persuader that never wavers and never admits fault is a totalitarian’s dream. In a system like China’s Social Credit regime, these tools become instruments of ideological enforcement, not enlightenment.

Generative AI is undoubtedly a marvel of IT engineering. But let’s be clear: it is not intelligent, not truthful by design, and not neutral in effect. Any claim to the contrary serves only those who benefit from controlling the narrative.

The original, full-length version of this article recently appeared in C2C Journal.

Artificial Intelligence

Apple faces proposed class action over its lag in Apple Intelligence

Published on March 25, 2025

Todayville

News release from The Deep View

Apple, already moving slowly out of the gate on generative AI, has been dealing with a number of roadblocks and mounting delays in its effort to bring a truly AI-enabled Siri to market. The problem, or, one of the problems, is that Apple used these same AI features to heavily promote its latest iPhone, which, as it says on its website, was “built for Apple Intelligence.”

Now, the tech giant has been accused of false advertising in a proposed class action lawsuit that argues that Apple’s “pervasive” marketing campaign was “built on a lie.”

The details: Apple has — if reluctantly — acknowledged delays on a more advanced Siri, pulling one of the ads that demonstrated the product and adding a disclaimer to its iPhone 16 product page that the feature is “in development and will be available with a future software update.”

But that, to the plaintiffs, isn’t good enough. Apple, according to the complaint, has “deceived millions of consumers into purchasing new phones they did not need based on features that do not exist, in violation of multiple false advertising and consumer protection laws.”
Apple “enriched itself by saving the costs they reasonably should have spent on ensuring that the (iPhones) had the technical capabilities advertised,” according to the complaint.

Apple did not respond to a request for comment.

The lawsuit was first reported by Axios, and can be read here.

This all comes amid an executive shuffling that just took place over at Apple HQ, which put Vision Pro creator Mike Rockwell in charge of the Siri overhaul, according to Bloomberg.

Still, shares of Apple rallied to close the day up around 2%, though the stock is still down 12% for the year.