DON’T FORGET: February 25th is my next event, this time on how AI can (maybe) fix the government – where I’ll be speaking to Alexander Iosad, Director of Government Innovation Policy at the Tony Blair Institute. Come and hang out!
I feel like I’m going insane.
Yesterday, the markets woke up to another major technological breakthrough. DeepSeek, a Chinese AI company, recently released a new Large Language Model (LLM) which appears to be equivalently capable to OpenAI’s ChatGPT “o1” reasoning model – the most sophisticated it has available.
However, there was a twist: DeepSeek’s model is 30x more efficient, and was created with only a fraction of the hardware and budget of OpenAI’s best.
As a result, other than Apple, all of the major tech stocks fell – with Nvidia, the company that has a near-monopoly on AI hardware, falling the hardest and posting the biggest one-day loss in market history.
And then something really weird happened. For some reason, many people seemed to lose their minds.
For example, here’s Timnit Gebru, an AI ethics researcher, who famously parted ways with Google after she wrote a paper criticising LLMs for their biases and environmental impact, way back in 2020.
Gebru’s post is representative of many other people who I came across, who seemed to treat the release of DeepSeek as a victory of sorts against the tech bros. DeepSeek’s superiority over the models trained by OpenAI, Google and Meta is treated like evidence that – after all – big tech is somehow getting what it deserves.
And then there were the commentators who are actually worth taking seriously, because they don’t sound as deranged as Gebru.
For example, here’s Ed Zitron, a PR guy who has earned a reputation as an AI sceptic.
And here’s Karen Hao, a long time tech reporter for outlets like The Atlantic.
And here’s what bugs me. I’m just not sure the conclusions that are being drawn following DeepSeek are right.
For example, take the hundreds of billions of dollars that have been invested (or are planned to be invested) in building out AI datacentres and supporting infrastructure. Sure, the markets clearly panicked yesterday,1 but is it really sensible to assume that just because this one, dramatically more efficient, model now exists, all of that compute capacity will go to waste?
Don’t forget to subscribe - for free - to get more takes direct to your inbox!
What seems more likely to me is that we’ll see Jevons paradox kick in – the observation that when the cost of a resource falls, overall demand for it often increases. Because reducing the price of compute will unlock tonnes of new use-cases that would previously have been prohibitively expensive.
In fact, this ‘law’ was specifically cited by Microsoft CEO Satya Nadella – who tweeted a link to the Wikipedia page, presumably worried that his company’s stock price would crater once the markets opened.
While Jevons paradox is not an iron law, it seems obvious to me that there will be an endless appetite for compute, and that any expectation that DeepSeek marks a collapse in demand will look as foolish as IBM CEO Thomas Watson in the 1940s, who (probably apocryphally) suggested that the world might only ever need about five computers.
This was a bad prediction, because we now know that as the price of computing equipment fell, new use cases emerged to fill the gap – which is why today my lightbulbs have semiconductors inside them, and I occasionally have to install firmware updates for my doorbell.
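To make the Jevons point concrete, here’s a toy back-of-the-envelope calculation. The elasticity figures are entirely made up for illustration – this is a sketch of the economic mechanism, not a forecast – but it shows why a 30x price drop in compute can mean *more* total spending, not less, whenever demand is price-elastic:

```python
# Toy illustration of Jevons paradox (hypothetical numbers, not a forecast).
# With a constant-elasticity demand curve, quantity demanded scales as
# price**-elasticity. If elasticity > 1, cutting the price INCREASES
# total spending on the resource.

def total_spend(price, elasticity, base_price=1.0, base_quantity=1.0):
    """Total spend under constant-elasticity demand, relative to a baseline of 1.0."""
    quantity = base_quantity * (price / base_price) ** -elasticity
    return price * quantity

# Suppose compute gets 30x cheaper, per DeepSeek's reported efficiency gain.
cheap_price = 1.0 / 30

for elasticity in (0.5, 1.0, 1.5):
    spend = total_spend(cheap_price, elasticity)
    print(f"elasticity={elasticity}: total spend changes by {spend:.2f}x")
```

With inelastic demand (elasticity 0.5) total spend shrinks; at elasticity 1.5 it grows roughly 5.5x despite – indeed because of – the price collapse. The whole argument turns on which regime AI compute is actually in.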
And similarly, it really seems as though even if models become more efficient, there will be plenty of need for compute – as evidenced by the way DeepSeek actually works: its ability to reason.
What makes DeepSeek’s R1 model and OpenAI’s o1 model so powerful is that they are a new class of LLM – “chain of thought” models, which take time to “reason” through problems.
When you type a prompt (eg, “Write a sea shanty about the Postcode Address File”), instead of spitting out the first thing it “thinks” of, it runs through different possibilities and checks its responses.2
What researchers discovered is that the more the AI reasons, the better it performs when answering questions. (It’s one of the reasons these more sophisticated models are thought to be less prone to hallucination.)
And this is true for both the initial training of the model (when the developers build it) – and for “inference” (when us users ask it questions). In other words, the more compute you throw at a problem, the better the answers you get.
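The “more inference compute, better answers” effect can be sketched with a toy majority-voting loop – sometimes called self-consistency sampling. To be clear, the accuracy numbers below are invented and real reasoning models are vastly more sophisticated; this just shows the mechanism by which extra compute at answer-time buys accuracy:

```python
# Toy sketch of inference-time scaling (hypothetical numbers).
# A single "reasoning attempt" is right 60% of the time; sampling several
# independent attempts and taking a majority vote converts extra compute
# into higher accuracy, because wrong answers are scattered.

import random
from collections import Counter

random.seed(0)

CORRECT = "42"

def one_attempt(p_correct=0.6):
    """One simulated reasoning pass: right 60% of the time, else a scattered wrong answer."""
    if random.random() < p_correct:
        return CORRECT
    return random.choice(["41", "43", "7"])

def majority_vote(n_attempts):
    """Spend n_attempts worth of compute, then return the most common answer."""
    votes = Counter(one_attempt() for _ in range(n_attempts))
    return votes.most_common(1)[0][0]

for n in (1, 5, 25):
    trials = 2000
    accuracy = sum(majority_vote(n) == CORRECT for _ in range(trials)) / trials
    print(f"{n:>2} attempts per question: {accuracy:.0%} correct")
```

The accuracy climbs as you throw more attempts – i.e. more compute – at each question, which is the core of the new inference-time scaling argument.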
So surely the compute freed up by more efficient models will be used to train models even harder, and apply even more “brain power” to coming up with responses? Even if DeepSeek is dramatically more efficient, the logical thing to do will be to use the excess capacity to ensure the answers are even smarter.
At least, this is the conclusion reached by Jeffrey Emanuel, a veteran AI guy, in an extremely long blog post:3
“It doesn't take an AI genius to realize that this development creates a new scaling law that is totally independent of the original pre-training scaling law. Now, you still want to train the best model you can by cleverly leveraging as much compute as you can and as many trillion tokens of high quality training data as possible, but that's just the beginning of the story in this new world; now, you could easily use incredibly huge amounts of compute just to do inference from these models at a very high level of confidence or when trying to solve extremely tough problems that require "genius level" reasoning to avoid all the potential pitfalls that would lead a regular LLM astray.”
So I suspect that we’ll see not just efficient DeepSeek-calibre models developed by American tech firms, but also even more sophisticated models, built using the extra compute headroom.
However, this all said, even if you don’t buy this about LLMs, and they are genuinely a bust, there are plenty of other reasons to be bullish on compute capacity – because there are tonnes of other things we can do with it once it has been built.
For example, we could use the same hardware to run physics simulations to train self-driving cars. We could build highly sophisticated climate models. Or we could just crunch through some more proteins and discover more drugs and achieve new breakthroughs in medicine. And I’m sure the advertising industry will spend even more money to discover new ways to persuade people to click on ads 0.00001% more often too.
So essentially then, while I’m definitely open to the idea that Hao and Zitron are right in the narrow sense about the fate of, say, Nvidia4 and OpenAI, I’m really struggling to see how the AI “bubble” has burst, given that the compute resources freed up by these more efficient models will surely just be taken up by other stuff.
Sputnik moment
There is another reason to doubt that DeepSeek is going to fundamentally change the trajectory of the AI build out, even if the business case has changed.
Because the news yesterday was not just that a cheaper, more efficient AI model has been invented. It is that a cheaper, more efficient AI model has been invented by China.
Marc Andreessen – the Netscape founder, venture capitalist and Trump supporter – called DeepSeek a “Sputnik” moment. Despite that third thing, he’s absolutely right. As things stand on 28th January 2025, a Chinese company, and thus the Chinese government, has a decisive technical advantage over the west in our most important new technology.
And this is where I think arguments like this, from Hao, don’t really work.
The problem with the economics argument is that when it comes to geopolitical matters, economics matters less than security. Because if America (and the west) wants to regain its edge over China, it is going to need to just spend the money – whatever it costs.
Presumably this will involve OpenAI and other firms desperately racing to build more efficient models, now that DeepSeek has proven that it can be done. But when economics matters less, simply throwing compute at the problem also works as a strategy – as what matters ultimately is the functional outcome. A bit like how South Korea has more sophisticated weapon systems than North Korea – but the latter reduces the military gap by having many, many more soldiers.
So I find it hard to believe that America – even an America led by Trump – will simply let this technological disparity slide. I mean, it’s not like after Sputnik America just gave up either. Instead, it marshalled upwards of $250bn (in today’s money) to launch the Apollo program that took humans to the Moon.5
Not quite a reckoning
I might be hilariously wrong about this. Perhaps we’ll look back in a few years and the datacentres will stand empty, collecting dust. Perhaps we really can build ever more sophisticated AI models through code alone, without the need for more physical stuff?
But it really feels as though at least a slice of the crowing I’ve seen in reaction is a result of motivated reasoning. Many people do not like the impact of AI on climate, copyright, or that the major companies have all cosied up to Trump. The tech bros are annoying – so we’re desperately looking for a reason to discredit them. It’s like Elon Musk and the exploding rocket all over again.
However, I’m not sure this is the sceptics’ moment of victory. Because DeepSeek is not discrediting AI. Instead, it’s a breakthrough moment that, once replicated by American firms, will see AI tools become even better.
It’s also, as Ben Ansell points out, a positive signal for the British government. It might be a “get out of jail free” card that lets the Labour government use AI to grow the economy and transform the public sector, without having to spend as much money as it might have had to.
So sure, if DeepSeek heralds a new era of much leaner LLMs, it’s not great news in the short term if you’re a shareholder in Nvidia, Microsoft, Meta or Google.6 But if DeepSeek is the enormous breakthrough it appears, it just became even cheaper to train and use the most sophisticated models humans have so far built, by one or more orders of magnitude. Which is amazing news for big tech, because it means that AI usage is going to be even more ubiquitous.
Why not subscribe (for free!) to more takes on policy, politics, tech and more direct to your inbox?
If you enjoyed this, you will like my forthcoming AI event with Alexander Iosad – we’re going to be talking about how AI can (maybe!) fix the government.
Though to put Nvidia’s fall into context, it is now only as valuable as it was in… early September. It’s now only the third most valuable company in the world. Disaster.
I’m sure AI people will find this offensively over-simplified but I’m trying to keep this comprehensible to my brain, let alone any readers who do not have stupid jobs where they can justify reading blogposts about AI all day.
His language is a bit technical, and there isn’t a great shorter quote to take from that paragraph, so it might be easier just to assume that he agrees with me.
Jeffrey Emanuel, the guy I quote above, actually makes a very persuasive bear case for Nvidia at the above link.
Then there’s the arms race dynamic – if America builds a better model than China, China will then try to beat it, which will lead to America trying to beat it… We’re going to need a lot of compute for a long time, and “be more efficient” won’t always be the answer.
Apple actually closed up yesterday, because DeepSeek is brilliant news for the company – it’s proof that the “Apple Intelligence” bet, that we can run good-enough local AI models on our phones, could actually work one day. Not to mention Apple also makes the best mobile chips, so it will have a decisive advantage running local models too.
I think this is just a stark example of how the AI sceptic crowd in the media believe things for political reasons. "We don't like big tech and big tech is making this thing, therefore it is bad and rubbish. A news story shows how much more potential this thing has - in a way that meaningfully undermines one of my criticisms - and somehow this confirms my prior beliefs".
I'm going to shamelessly steal and expand on a reddit comment that I saw earlier which I think had a good perspective on this situation:
"It's a bit like there's rumours of an AI gold rush and the big American companies have headed off on an expensively funded expedition to Alaska.
DeepSeek have come along and effectively said “Hey guys, I’ve borrowed some of your shovels and I’ve found just as much gold as you have digging in my back garden”."
Meanwhile, we're not sure that the gold is actually useful for anything yet - but we now know that the American companies have overpriced it by overbuilding prior to demand. It's a first adopter problem.
The big winners here aren't necessarily the AI sceptics, but they're certainly the AI Company Sceptics.