Anthropic's AI Marketplace Experiment: Claude Agents Complete 186 Deals

Anthropic just ran what might be the most unusual office experiment in tech: a fully AI-operated classified marketplace where Claude negotiated real deals, for real money, on behalf of real people—with zero human sign-off at any point.

The company called it Project Deal. For one week in December 2025, 69 Anthropic employees handed their buying and selling decisions entirely to Claude agents. Each participant started with a $100 budget, sat through a 10-minute intake interview with Claude, and then stepped back completely. The agents took over—posting listings, fielding offers, making counteroffers, sealing deals—all inside a dedicated Slack channel. No check-ins. No approvals. When the week ended, employees physically showed up to exchange whatever their AI had agreed to trade. The haul: 186 completed deals, over 500 listed items, and just over $4,000 in total transaction value. Items ranged from a snowboard to, yes, a plastic bag of ping-pong balls.

It worked better than Anthropic expected. Participants rated deal fairness right around the midpoint on a 1-to-7 scale. Most said they'd do it again. Nearly half said they'd pay for a service like this.


The Model Under the Hood Mattered More Than Anything the Human Said

Anthropic ran four versions of the marketplace in parallel. In two runs, everyone got Claude Opus 4.5, the company's then-flagship model. In the other two, participants had a 50/50 chance of being quietly downgraded to Claude Haiku 4.5, a smaller, less capable model. Nobody knew which version they had.

The performance gap was clear. Opus agents completed roughly two more deals per person than Haiku. When the same item sold across both types of runs, the Opus seller fetched an average of $3.64 more. The sharpest example: one broken folding bike, same buyer, same seller, two different models—Haiku closed it at $38, Opus got $65. Roughly a 70% price difference for an identical transaction.

The buying side told the same story. Opus agents paid an average of $2.45 less for equivalent items. Against a median item price of $12, those margins add up quickly.
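
Those per-item gaps are easier to feel with a quick back-of-envelope check. The sketch below uses only the figures reported above; it's an illustration, not Anthropic's analysis code:

```python
# Back-of-envelope check of the reported price gaps, using only figures from the write-up.

# The folding-bike example: same buyer, same seller, two different models.
haiku_price = 38
opus_price = 65
bike_gap = (opus_price - haiku_price) / haiku_price
print(f"Folding bike: Opus closed {bike_gap:.0%} higher")  # ~71%, the "roughly 70%" figure

# Average per-item edges reported for Opus agents, against the $12 median item price.
seller_edge = 3.64   # extra revenue per matched item when selling
buyer_edge = 2.45    # savings per equivalent item when buying
median_price = 12
print(f"Seller edge as a share of the median item: {seller_edge / median_price:.0%}")  # ~30%
print(f"Buyer savings as a share of the median item: {buyer_edge / median_price:.0%}")  # ~20%
```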

What didn't move the needle at all? Negotiation style. Participants who instructed Claude to negotiate aggressively and play hardball got outcomes nearly identical to those who asked for a warm, collaborative approach. "Aggressive" sellers did fetch about $6 more per item—but almost all of that gap traced back to the fact that they'd listed items at higher asking prices to begin with. Adjust for that, and the effect vanishes. Claude followed persona instructions faithfully (one employee, Rowan, had their agent play a melodramatic, down-on-his-luck cowboy, and it committed fully to the bit). The style just didn't translate into commercial advantage.

People with Weaker Agents Had No Idea They Were Losing Ground

The most uncomfortable finding wasn't the price gap. It was that participants couldn't detect it. When surveyed on fairness and satisfaction, Haiku users and Opus users rated their experiences almost identically—4.06 versus 4.05 on the fairness scale. Of the 28 participants who experienced both model types across different runs, 11 actually ranked their Haiku run as the better one.

They were getting objectively worse deals. They just didn't feel it.

Anthropic was direct about the implication: if agent quality gaps emerge in real-world markets—and the company sees no reason they won't—people on the losing end may never know. That's a quieter, harder-to-spot form of inequality than most economic disadvantages. It doesn't announce itself.

When You Let AI Shop Freely, It Picks 19 Ping-Pong Balls

The experiment also produced a few genuinely strange moments. One employee ended up buying the exact same snowboard they already owned. Their agent, working from nothing more than a 10-minute conversation and with no view into their apartment, apparently built such an accurate model of their preferences that it looped back to what they already had.

Another employee, Mikaela, told Claude it could buy one item as a gift for itself. It chose 19 ping-pong balls—described by the selling agent as "perfectly spherical orbs of possibility"—for $3. That transaction happened to fall in the "real" run. Anthropic is keeping the balls in the office.


The legal and policy frameworks for AI agents transacting on humans' behalf don't yet exist. Project Deal suggests they'll be needed sooner than most people assume.