🤚 The Open-Palm Price Guillotine
If you thought DeepSeek’s V4 launch last month was aggressive, congratulations — you were correct, but also hopelessly naive about the floor. On May 22, the Chinese AI lab announced that its “promotional” 75% discount on the flagship V4-Pro model — originally set to expire May 31 — is now permanent. The sale is not ending. The sale is the price.
Here’s what that means in actual numbers:
- V4-Pro input tokens: $0.435 per million (down from $1.74)
- V4-Pro output tokens: $0.87 per million (down from $3.48)
- Cache hit pricing: slashed to one-tenth across the entire API suite
For context, OpenAI’s GPT-5.5 charges $5.00 per million input tokens and a staggering $30.00 per million output tokens. Anthropic’s Claude Opus 4.7 runs $5.00 input and $25.00 output. DeepSeek’s V4-Pro is now roughly 11.5 times cheaper than GPT-5.5 on input and 34 times cheaper on output. This is not a rounding error. This is a different tax bracket.
👐 The Two-Handed Huawei Handshake
The quiet headline beneath the loud one: DeepSeek V4 is the company’s first major model family optimized for Huawei Ascend 950 and Ascend 950PR AI accelerator supernodes, with additional support for Cambricon hardware. That’s right — the model that’s undercutting every Western frontier lab by an order of magnitude isn’t even running on Nvidia GPUs.
This is particularly inconvenient for the narrative that U.S. chip export controls would slow Chinese AI development. The controls were supposed to prevent exactly this. Instead, DeepSeek built a 1.6-trillion-parameter mixture-of-experts model with a one-million-token context window on domestic silicon, then priced it like a regional carrier undercutting the airlines.
The timing is also exquisite. The price cut arrived just days after White House officials accused Chinese AI firms of “industrial-scale” model distillation — allegations that both Anthropic and OpenAI have previously leveled at DeepSeek. DeepSeek’s response to being called a copier was, apparently, to permanently slash its prices to levels that make the originals look like they’re running a luxury markup on a commodity product. Which, to be fair, they might be.
🌿 The Gentle Awakening
There’s something philosophically delicious about the AI pricing wars of 2026. Western labs have spent billions convincing enterprise customers that intelligence is expensive — that the sheer computational majesty of frontier models demands premium pricing. Then a lab in Hangzhou shows up with comparable benchmarks, domestic chips the U.S. tried to prevent from existing, and a price tag that makes the competition look like they’re charging for the packaging.
The developer math is brutally simple. If you’re building agentic workflows where tokens compound — and in 2026, who isn’t — the difference between $0.87 and $30.00 per million output tokens is the difference between a viable startup and a Series A that goes entirely to your API bill. One outlet framed DeepSeek as “weaponizing Western AI rate limit anger,” and honestly, it’s hard to argue with the characterization. Developers frustrated by restrictive rate limits and premium pricing are doing what consumers always do: they’re opening new browser tabs.
👑 The Gold-Leaf Reckoning
The real question isn’t whether DeepSeek can sustain these prices — the Huawei hardware economics suggest they can. The real question is what happens to the Western pricing consensus. OpenAI charging 34x more for output tokens only works if customers believe they’re getting 34x more value. For some use cases, they are. For the growing army of developers stitching together AI agents that burn millions of tokens per task, the value proposition is starting to look more like brand loyalty than rational economics.
We noted when V4 launched that it arrived at 1.6 trillion parameters for the price of a sad sandwich. Apparently the sandwich was still too expensive. DeepSeek has now permanently priced its flagship model below what most Western labs charge for their budget tier, and it did it on chips that weren’t supposed to exist. The export controls created a timeline where Chinese AI is cheaper, runs on domestic hardware, and has no supply chain dependency on the country that tried to embargo it.
Somewhere in San Francisco, a pricing committee is having a very long meeting.
“The export controls were designed to create a five-year gap. DeepSeek interpreted that as a five-month head start on Huawei optimization. The spreadsheet doesn’t lie, but it does wince.” — The Slap of Wisdom Geopolitical Arbitrage Desk, currently converting its cloud budget to yuan