Anthropic's Claude Fable 5: Smartest AI Model, But Not Always the Best Choice for Your Work

Claude Fable 5 is, by most reasonable measures, the most capable AI model anyone can buy right now. It is the public-facing version of Claude Mythos 5, an even more powerful model that Anthropic refuses to release. The naming is deliberate: both words mean a story, but a fable is the safer, tamer version of the kind of tale a myth carries. Mythos sits behind closed doors, available to a handful of cybersecurity partners and, soon, a small group of biology researchers. Fable is the same underlying brain wrapped in a layer of safety filters. Ask it about cybersecurity, biology, or building competing AI, and the answer quietly comes from Claude Opus 4.8, Anthropic's previous flagship, instead. What everyone else gets is the leashed version, and even leashed, it is the strongest AI model on general sale.

Endurance Over Raw Intelligence

Fable thinks for longer than anything else on the market. It holds a complicated problem in its head from beginning to end without losing the thread. Stripe handed it a fifty-million-line code overhaul and it finished the job in a day. Hex, the data company, said it was the first AI to clear ninety percent on its hardest analytics test. Crosby Legal put its contract markups in front of lawyers and they couldn't tell which edits came from Fable and which came from the model they already used in court. If you were picking an AI model on the basis of those numbers, you'd pick Fable. And you would, fairly often, be making a mistake.

This is the strange thing about the state of the frontier right now. The smartest model and the most useful model have come apart. They aren't the same thing anymore, and a week of actually using Fable next to its rivals makes that clearer than any chart. The case for Fable is real. It's also narrower than the company's marketing suggests, and the cost of getting that wrong is the kind of money that adds up fast.

Endurance is What Fable Sells, But Most People Don't Need It

Start with what Fable actually does well. The strength is endurance, not raw intelligence. On a quick problem, Fable, GPT-5.5, Gemini and Anthropic's own older Opus all land in roughly the same place. The gap opens up over time. By hour three of an agentic task, the others have drifted; Fable hasn't. This is why every customer story Anthropic ran involves long, gnarly, multi-day projects and not, say, writing a clean email.

Claude Fable 5 leads every benchmark Anthropic published, beating Claude Opus 4.8, GPT-5.5 and Gemini 3.1 Pro across coding, knowledge work, vision, legal reasoning and health. The widest gaps show up on the hardest tasks.

Endurance is genuinely new, and it's genuinely valuable for the small number of problems shaped that way. The trouble is that most problems aren't shaped that way.

Most real work is a list, not a marathon. Reply to this email, then update that spreadsheet, then check this calendar, then write that summary. Researchers at the University of California, Berkeley recently put this exact thing to the test with a benchmark called Agents' Last Exam—1,490 real professional tasks across 55 industries, graded strictly. GPT-5.5 came first. Fable came third.

UC Berkeley's Agents' Last Exam ranks the top AI agents on 1,490 real professional tasks across 55 industries. OpenAI's GPT-5.5 through the Codex harness came first. Claude Fable 5 finished third.

The gap on the scoreboard was small. The reason behind it wasn't. Berkeley's test punishes AI models that lose track of instructions across many small steps, and that's been the long-standing complaint about Claude: brilliant on one big problem, forgetful when given fifteen things to do in sequence.

This matters because it's a more honest snapshot of what working with AI actually looks like for most people most days. GPT-5.5 plods through a list the way a careful junior employee would. It's about half the price of Fable, runs faster, hits rate limits less often, and follows instructions more reliably. Developers have quietly settled into a routine where they leave GPT-5.5 open as a default and switch to Fable only when something gets genuinely hard.

Most working developers are no longer relying on a single AI model. The routine that has settled in across teams uses GPT-5.5 for daily work, Fable for the hardest problems, Opus when Fable's price stops making sense, and cheaper open-source models for routine, high-volume tasks.

That isn't the picture Anthropic is painting. It's the picture the actual usage is settling into.

Opus is the Claude Anthropic Isn't Selling You

Then there's the question of cost, which the benchmarks don't show and which decides a surprising number of these arguments in practice. Fable runs at ten dollars per million tokens of input and fifty per million tokens of output, making it the most expensive model from any major lab. Opus 4.8, Anthropic's own previous flagship, runs at half that: five dollars for input and twenty-five for output. GPT-5.5 sits in the same range as Opus. Gemini 3.1 Pro comes in at roughly a fifth of Fable's input price. DeepSeek's V4 Pro runs at under a dollar fifty per million tokens, input and output combined.

Claude Fable 5 is the most expensive frontier AI model on the market, at $10 per million input tokens and $50 per million output. Anthropic's own Claude Opus 4.8 costs half that. DeepSeek's V4 Pro runs at under $1.50 combined.
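To make the price gap concrete, here is a minimal sketch of the arithmetic in Python. It uses only the two models for which both input and output prices are quoted above. The daily workload split (750,000 input tokens and 250,000 output tokens) is a hypothetical assumption chosen for illustration, not a figure from Anthropic or any customer.

```python
# Prices in US dollars per million tokens, as quoted in the article.
PRICES = {
    "Claude Fable 5": {"input": 10.0, "output": 50.0},
    "Claude Opus 4.8": {"input": 5.0, "output": 25.0},
}


def workload_cost(model, input_tokens, output_tokens):
    """Return the dollar cost of one workload on the given model."""
    price = PRICES[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


# Hypothetical daily workload: this 750k/250k split is an assumption.
DAILY_INPUT_TOKENS = 750_000
DAILY_OUTPUT_TOKENS = 250_000

for model in PRICES:
    cost = workload_cost(model, DAILY_INPUT_TOKENS, DAILY_OUTPUT_TOKENS)
    print(f"{model}: ${cost:.2f} per day")
```

Under that assumed split, the same day of work costs $20.00 on Fable and $10.00 on Opus. Because Opus is half of Fable's price on both input and output, the two-to-one ratio holds whatever the split.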

On the hardest problems, none of those cheaper models touches Fable. On most work, the difference in their output isn't large enough to justify the difference in their bills.

The quietest casualty of Fable's arrival is Anthropic's own Opus. Until recently it was the company's top model. Now it's a backup—the model Fable routes to when its safety filters trip on cybersecurity or biology questions, and the cheaper Claude when Fable's price stops making sense. Opus still beats GPT-5.5 on most of the benchmarks Anthropic published. For work that doesn't sit at the absolute top of the difficulty curve, it's the smarter default inside Anthropic's lineup. Anthropic isn't going to advertise that. The arithmetic does.

What makes Fable particularly easy to misuse is that it's marketed for the moments it's actually built for: the unusually hard problem, the migration that would otherwise take months, the analytical task where the difference between a good answer and a great one matters more than the difference in cost. For those moments, it earns its price and then some.

A day of typical business AI work—roughly a million words processed in mixed input and output—priced across the frontier. Output quality is broadly comparable on this kind of workload.

For the rest of the week, paying Fable rates for work a cheaper model would handle just as well is the kind of decision a finance team eventually asks questions about.

Everyone Else is Running a Different Race

The competition isn't even really trying to win the fight Fable just won. GPT-5.5 is built for the steady, unglamorous reliability that most working hours actually demand, and developers have quietly noticed. Gemini is built for the documents, scans and slide decks where the world's real paperwork lives, and on that ground nothing else comes close. The cheaper open-source models below them aren't chasing intelligence at all, just doing the simpler work that fills most of an AI bill at a price that keeps the argument short. None of them is trying to be the smartest. All of them are winning the parts of the market they aimed for.

The frontier has stopped competing on a single axis. Anthropic is selling raw intelligence, OpenAI is selling reliability, Google is selling multimodal breadth, and the open-source flank is selling price.

What that leaves Anthropic with is a strange kind of victory, the kind that comes from holding a trophy nobody else turned up to compete for. Being the smartest still matters when smartness is what the customer needs, but the customers who need it on a Tuesday afternoon also need a dozen other things the rest of the week, and the rivals have quietly spent the last year making sure they're the ones who get reached for in those other moments. Fable is the model nobody else built. It's also, increasingly, the model nobody else needed to.

Nobody Picks Just One Anymore

Underneath all of this, the shape of the market has changed in a way that no one has quite captured. A year ago, picking an AI model was a single decision: the smartest one, if you could afford it. Now most serious teams are using two or three models in the same week. GPT-5.5 for daily work. Fable when something is hard. Opus when Fable's price doesn't earn its keep. Cheaper models for the long tail of small jobs. There is no longer a single right answer, and the big AI labs would prefer that this didn't catch on, because their premium pricing depends on the old idea that customers will pick a flagship and stay loyal to it.
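The multi-model routine described above can be sketched as a simple routing rule. This is an illustration, not any team's actual setup: the `Task` fields, the `choose_model` function and the order of the checks are all hypothetical, and the sketch assumes Opus is the fallback for hard work whose budget can't absorb Fable's rates.

```python
from dataclasses import dataclass


@dataclass
class Task:
    """Hypothetical description of a unit of work to be routed."""
    description: str
    is_hard: bool = False          # long, gnarly, multi-step problem
    is_routine_bulk: bool = False  # small, high-volume job
    cost_sensitive: bool = False   # Fable's price would not earn its keep


def choose_model(task: Task) -> str:
    """Pick a model name following the routine described in the article."""
    if task.is_routine_bulk:
        return "cheaper open-source model"
    if task.is_hard:
        # Assumption: Opus stands in when Fable's price stops making sense.
        return "Claude Opus 4.8" if task.cost_sensitive else "Claude Fable 5"
    return "GPT-5.5"  # default for daily work


tasks = [
    Task("Reply to a customer email"),
    Task("Plan a large code migration", is_hard=True),
    Task("Audit a long contract on a tight budget", is_hard=True, cost_sensitive=True),
    Task("Tag ten thousand support tickets", is_routine_bulk=True),
]

for task in tasks:
    print(f"{task.description} -> {choose_model(task)}")
```

In practice the hard part is deciding which flag applies to a given job. The sketch only shows that once that judgment is made, the choice of model follows from a few lines of logic.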

Fable, for what it's worth, is genuinely impressive. The endurance is new. The capability is real. The customer stories are not invented. It is also the newest model in this whole comparison. Its rivals have been out for months and update on a cycle that Anthropic doesn't follow, which means whatever lead it holds today is the kind a competitor can take back. But the gap between what Fable does best and what most people need on a Tuesday afternoon is wide enough to walk through, and the cost of pretending otherwise is steeper than Anthropic made it sound.

The smartest AI model in the world is not the same thing as the best AI model for your work. That distinction used to be academic. With Fable on the market at the price Anthropic is charging, it has stopped being academic at all.