"OpenAI is Not God" - The DeepSeek Documentary on Liang Wenfeng, R1 and What's Next

00:00:00.000 |
For early access to future documentaries and 30-plus exclusive ad-free videos, 00:00:04.960 |
check out my Patreon, link in the description. 00:00:18.100 |
Language models were getting ever more expensive as they got more intelligent. 00:00:23.420 |
And research was retreating behind a veil of competitive secrecy. 00:00:28.820 |
But on the 20th of January, 2025, those reading those lines started to stutter. 00:00:36.780 |
A model that visibly seemed to think before it spoke had been released, DeepSeek R1. 00:00:42.820 |
It was unbelievably cheap, competitive with the best the West had to offer, 00:00:47.860 |
and out in the open, available to anyone to download. 00:00:51.960 |
Even OpenAI admit as much, arguing in March that DeepSeek shows, 00:00:57.180 |
quote, that our lead is not wide and is narrowing. 00:01:01.020 |
OpenAI even want models like DeepSeek R1 banned because they say, quote, 00:01:06.220 |
DeepSeek could be compelled by the Chinese Communist Party to manipulate its models to cause harm. 00:01:12.460 |
And because DeepSeek is simultaneously state-subsidized, state-controlled, and freely available, 00:01:19.340 |
it will cost users their privacy and security. 00:01:22.900 |
Now, while Google's Gemini 2.5 and the new ChatGPT image gen have wrestled back the headlines at the beginning of April, 00:01:30.840 |
DeepSeek is preparing to deliver yet another shock to the system, 00:01:34.840 |
with DeepSeek R2 expected later in April or May. 00:01:39.380 |
But truth be told, many of you will already know all of that. 00:01:42.960 |
What you might not know, though, are the aims and beliefs expressed in disparate interviews by the secretive founder behind DeepSeek, 00:01:50.840 |
billionaire Liang Wenfeng, a man who now has to hide from crowds of adoring fans in his own hometown, 00:02:00.400 |
and who has now fled his home province with his family to escape further attention. 00:02:04.400 |
Nor will some of you know about the first AI operation that made Liang his money, 00:02:11.140 |
or the beauty of some of the technical innovations behind the mega-viral DeepSeek R1, 00:02:16.840 |
or just how the Western labs like OpenAI and Anthropic have fired back with their own narratives 00:02:23.200 |
in the days and weeks since the release of R1. 00:02:26.520 |
There is frankly so much that so many people don't know about the company DeepSeek and what it means. 00:02:32.840 |
The truth is that DeepSeek is a whale caught in a net of narratives. 00:02:39.180 |
So let's get as close as we can to the truth behind the narratives, 00:02:43.480 |
and what that truth says about where all of this is going. 00:02:49.000 |
and artificial general intelligence is, quote, 00:02:55.260 |
then this story is about far, far more than one man, one lab, or even one nation. 00:03:01.680 |
Here then is what one of Liang's business partners said of the man who is thought to be 40. 00:03:07.620 |
He was this very nerdy guy with a terrible hairstyle when they first met, 00:03:12.140 |
talking about building a 10,000-chip cluster to train his own AI models. 00:03:18.760 |
Of course, there are many AI leaders with terrible hairstyles, so what sets Liang Wenfeng apart? 00:03:24.720 |
He certainly wasn't always about solving intelligence and making it free. 00:03:28.360 |
It's hard to become a billionaire that way, as you might well guess. 00:03:32.060 |
No, to seek out the origin story here, we must switch to a first-hand account from the man himself. 00:03:37.980 |
Before that, though, a few moments of background. 00:03:40.660 |
Liang graduated university into a world that was falling apart. 00:03:45.240 |
Some of you will be too young, of course, to remember the panic of September 2008, 00:03:50.240 |
when the financial pyramid built on the sands of the US subprime housing market collapsed. 00:03:56.080 |
Either way, you might be able to understand the drive Liang had to try to understand the patterns within the unfolding chaos. 00:04:05.640 |
There were those who tried to tempt him into different directions while he operated out of a small flat in Chengdu, Sichuan. 00:04:12.260 |
Not me, though I was there, actually, in Chengdu at the same time, learning Mandarin. 00:04:17.040 |
No, no, no, it was the founder of what would become DJI, the world's preeminent drone maker, 00:04:23.200 |
who tried to headhunt Liang, but to no avail. 00:04:28.320 |
After getting a master's in information engineering in 2010, 00:04:31.960 |
Liang went on a founding spree between 2013 and 2016, 00:04:36.600 |
culminating in the establishment of the hedge fund High Flyer in February 2016. 00:04:42.200 |
Each entity he started included the core goal of using machine learning 00:04:47.200 |
to uncover the patterns behind microsecond or even nanosecond movements in the financial markets. 00:04:53.600 |
Patterns and paradigms no humans could detect alone. 00:05:01.280 |
As late as May 2023, Liang was still describing his goal in financial terms. 00:05:07.500 |
Our broader research aims to understand what kind of paradigms can fully describe the entire financial market, 00:05:14.260 |
and whether there are simpler ways to express it. 00:05:17.140 |
Anyway, it worked, attracting $9.4 billion in assets under management by the end of 2021, 00:05:23.440 |
and providing returns that in some cases were 20 to 50 percentage points more than stock market benchmarks. 00:05:32.140 |
He was a billionaire by his mid-30s and on top of the world. 00:05:36.380 |
All of High Flyer's market strategies used AI, 00:05:41.140 |
and they even had a supercomputer powered by 10,000 NVIDIA GPUs. 00:05:45.700 |
He might not at this point be scaling up language models like a tiny American startup, 00:05:50.240 |
OpenAI, had done the year earlier in 2020 with GPT-3. 00:05:53.960 |
But had his AI truly solved the chaos of the financial markets? 00:06:00.780 |
This is where the story starts to get interesting. 00:06:03.500 |
Liang's AI system, built with a team of just over 100 individuals, 00:06:12.900 |
It would double down on bets when it felt it was right, and that wasn't all. 00:06:16.800 |
The hedge fund itself, High Flyer, had become hubristic. 00:06:22.960 |
Success as a hedge fund, as you might expect, attracts more investments. 00:06:26.760 |
If you don't limit your fund size, and Liang didn't in time, 00:06:30.340 |
then sometimes you have too much money to deploy in a smart way. 00:06:33.980 |
Your trades get copied, your edge becomes less keen. 00:06:36.540 |
So after seeing a sharp drawdown, High Flyer expressed its deep guilt in public, 00:06:42.560 |
and took measures to further limit who could invest with them. 00:06:45.260 |
Yes, in case you're curious, they did learn their lesson, 00:06:47.580 |
and are still going as a hedge fund today with some degree of success. 00:06:53.960 |
High Flyer has outperformed the Chinese equivalent of the S&P index, 00:06:59.660 |
And yes, as we know, Liang didn't give up on AI. 00:07:02.360 |
He was rich now, and could afford an outfit dedicated to decoding not just financial systems, 00:07:08.020 |
but the nature of general intelligence itself. 00:07:13.720 |
and it was first formed as a research body in April 2023. 00:07:18.580 |
Any scars, perhaps, though, for Liang from his previous AI experience? 00:07:22.500 |
Well, there is one that might have carried over into the paper DeepSeek produced 00:07:27.460 |
on their first large language model, or chatbot. 00:07:30.140 |
From his experience, Liang knew that AI could be fickle, 00:07:36.060 |
So DeepSeek added this disclaimer for their first chatbot, 00:07:43.080 |
We profoundly recognize the importance of safety for general artificial intelligence. 00:07:49.060 |
The premise for establishing a truly helpful artificial intelligence model 00:07:53.440 |
is that it possesses values consistent with those of humans, 00:08:02.100 |
let's not pretend that many of us in the West were paying much attention 00:08:07.980 |
By then, of course, OpenAI were well onto GPT-4, 00:08:17.320 |
well before DeepSeek was even officially founded in July of that year. 00:08:25.460 |
one and a half decades deep into wielding artificial intelligence 00:08:37.760 |
may think there's some hidden business logic behind DeepSeek, 00:08:44.100 |
Why did DeepSeek R1 capture the world's attention at the start of 2025? 00:08:50.940 |
Why did it divide opinions and convulse markets? 00:08:54.860 |
Was it that the wider world could see the thinking process 00:08:58.920 |
of the language model before it gave its final answer? 00:09:03.520 |
Or that the model and the methods behind it were so open and accessible? 00:09:07.860 |
Or was it that such a performant model had come from China, 00:09:11.380 |
which was supposed to be a year behind the Western frontier? 00:09:14.520 |
We will investigate each of these possibilities, 00:09:17.160 |
but there was one thing that was certain of the DeepSeek of summer 2023. 00:09:22.160 |
It was, indeed, deeply behind Western AI labs. 00:09:29.320 |
but so was the first version of Claude from Anthropic 00:09:40.640 |
That model might not have been quite as smart on key benchmarks as GPT-4, 00:09:51.480 |
A model is, of course, nothing without its weights, 00:09:53.720 |
or its billions of tweakable numerical values used to calculate outputs. 00:09:58.740 |
open weights isn't quite the same as open source, 00:10:03.040 |
we would need to see the data that went into training the model, 00:10:10.000 |
Despite some models like Llama 2 being open weights at least, 00:10:13.560 |
key leaders within Western AI labs were saying 00:10:32.740 |
between the open models and the private models, 00:10:38.680 |
Sam Altman, CEO and co-founder of OpenAI, went further. 00:10:42.060 |
It wasn't just that research secrets were becoming a moat, 00:11:00.360 |
and build a truly intelligent language model. 00:11:02.740 |
Look, the way this works is we're going to tell you 00:11:06.740 |
on training foundation models you shouldn't try, 00:11:36.700 |
Liang had launched what would become DeepSeek. 00:11:51.760 |
measured not just in how many tens of thousands 00:12:01.860 |
without the backing of multi-trillion dollar hyperscalers 00:12:29.860 |
rather than vertical domains and applications. 00:12:44.420 |
DeepSeek prioritizes capability over credentials, 00:13:25.660 |
were not exactly stunning in their originality. 00:15:25.980 |
Towards Ultimate Expert Specialization paper,