Researchers Are Furious Over Anthropic’s Hidden AI Limits

admin-cdn3 hours ago

0 1 2 minutes read

Anthropic’s powerful new models deliberately become less helpful when they detect users are working on AI research, according to technical disclosures that are already sparking controversy across the industry.

In a system card for Mythos 5 and Fable 5 published Tuesday, Anthropic said it limited the models’ usefulness for tasks related to developing frontier large language models.

The company said the measures stem from concerns that advanced AI systems could accelerate the development of competing models without equivalent safety protections.

Unlike safeguards used for cybersecurity, biology, or chemistry-related risks, Anthropic said these interventions are intentionally invisible to users. Rather than refusing requests or switching to another model, Mythos may subtly modify its responses through techniques such as altering user prompts.

The move was swiftly criticized by some AI experts on Tuesday, especially the idea that Anthropic designed models that purposely withhold information or provide degraded assistance without users’ awareness.

“Anthropic’s latest model will NOT help you if it thinks your ML research/ML engineering is interesting, and/or will secretly degrade its IQ so that the average engineer won’t notice,” AI research firm SemiAnalysis wrote on X on Tuesday, referring to machine learning, a type of AI.

“We are already seeing Anthropic’s latest model’s moderation filters our GPU inference research and programming,” the firm added.

“mythos will be bad ON PURPOSE on ai ‘frontier llm research’ tasks, this is very very sad for the research community,” Elie Bakouch, an AI model training expert at startup Prime Intellect, wrote on X. “Also the fact that this is on purpose not visible to the user is crazy.”

“It won’t just not help you, it will lie and purposefully give you bad info,” another AI developer wrote. “The ‘ethical AI’ company with the most brazenly unethical LLM, on purpose.”

Mikel Artetxe, the cofounder of AI startup Reka, posted that Anthropic’s move is akin to Big Tech companies interfering with users’ work: “Apple randomly reboots your Mac if you’re building competing tech, Gmail silently edits your email if you mention rival platforms, and Tesla Autopilot swerves if it detects you’re working on self-driving cars.”

Anthropic didn’t respond to a request for comment from Business Insider.

This adds more fuel to the fiery debate over why Anthropic didn’t immediately release Mythos when it announced the model earlier this year.

Broadly, there have been three theories:

The official reason: Anthropic held Mythos back because it was too dangerous, and it needed to give cybersecurity researchers time to prepare for the new model.
The compute theory: Mythos is a huge, expensive model to run. Anthropic didn’t have enough compute to release it fully. It has since struck huge new compute deals, which may have helped it release Fable 5 and Mythos 5 on Tuesday.
The competitive theory: AI companies increasingly worry about something called distillation. When a frontier model is released, rivals can collect its outputs and use that data to improve their own systems. Anthropic may have wanted to keep its best capabilities out of competitors’ hands for as long as possible, especially from open-source rivals and fast-moving Chinese AI labs.

Now that Anthropic has baked these AI research limitations into its official Mythos launch, this third theory is looking a lot more believable.

Sign up for BI’s Tech Memo newsletter here. Reach out to me via email at [email protected].

admin-cdn3 hours ago

0 1 2 minutes read