Bubble or No Bubble, AI Keeps Progressing (ft. Relentless Learning + Introspection)

Chapters
0:00 Introduction
1:26 Continual Learning (Nested Learning / HOPE)
7:00 Introspection
10:54 Image-Gen Progress
For the first year of this channel, 2023, it was striking to me how few were sensing how big an impact language models would have on the world. But then in the second year, I felt that the idea of an imminent singularity and mass job layoffs had become the dominant narrative, and in several of my videos I tried to show that there was evidence of that being overblown, for now. Now, as you might have noticed, the vibe has reversed again, with talk of an AI bubble in company valuations being conflated, for me, with the assertion that we are in a plateau of model progress. So this quick video, like my last one, is again a counter-narrative. And no, not just one built on hopes for the forthcoming Gemini 3 from Google DeepMind. No, I would instead ask: what, for you, is missing from language models, from being what you imagined AI would be?

Personally, I put together some categories a while back, and I'm sure you may have others. Some would say, well, they don't learn on the fly, or there's no real introspection going on, just regurgitation. Thing is, AI researchers have got to earn their bread somehow, so there's always a paper for whatever deficiency you can imagine. I am also going to end the video with some more visual ways that AI is progressing, as yes, it seems like Nano Banana 2 from Google may have been spotted in the wild.
But first, on continual learning, or the lack of it, aka that inability of the models you speak to, like ChatGPT, to learn about you properly, and your specifications, and to just grow, to organically become GPT 5.5 rather than have to be pre-trained into becoming GPT 5.5. If AI were all hype, you might say, well, that's definitely going to take at least a decade to solve. But for others, like these authors at Google, it's a problem for which there is a ready and benchmarked solution. I will, however, caveat that by saying that this is a complex paper, and despite what the appendix promises, not all the results have actually been released yet. But here is my attempt at a brief summary.

Alas, there are not many pretty diagrams, but essentially the paper shows that there are viable approaches for allowing models to continually learn while retaining some inbuilt discernment about what to learn. In other words, it shows that a chatbot could learn new things, like a new fact or coding skill, by storing them in its updatable memory layers, while protecting its core long-term knowledge. As I think I mentioned, the authors are all from Google, and you might know them as the stars of the Titans architecture. And if you want a butchered analogy from Titans to nested learning in this paper: Titans is kind of like giving your social media feed one live stickied thread to remember, whereas this paper rewires the entire recommender system to learn at three different speeds, like what's hot this minute, what trends this week, and what becomes your long-term preference.
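To make that multi-speed idea concrete, here is a toy sketch in Python. It is emphatically not the paper's architecture, just the general pattern of memory modules that update at very different rates; every class name, learning rate, and dimension in it is invented for illustration.

```python
import numpy as np

class MultiTimescaleMemory:
    """Toy illustration of memory that updates at different speeds.

    Not the paper's method: a fast store tracks the current context,
    a medium store accumulates recent trends, and a slow store only
    drifts gradually toward long-term patterns.
    """

    def __init__(self, dim: int, fast_lr=0.5, mid_lr=0.05, slow_lr=0.001):
        self.fast = np.zeros(dim)   # "what's hot this minute"
        self.mid = np.zeros(dim)    # "what trends this week"
        self.slow = np.zeros(dim)   # "long-term preference"
        self.fast_lr, self.mid_lr, self.slow_lr = fast_lr, mid_lr, slow_lr

    def update(self, x: np.ndarray) -> None:
        # Exponential moving averages with very different time constants
        # stand in for learnable memory modules running at different clocks.
        self.fast += self.fast_lr * (x - self.fast)
        self.mid += self.mid_lr * (x - self.mid)
        self.slow += self.slow_lr * (x - self.slow)

    def read(self) -> np.ndarray:
        # A real system would learn how to mix these; here we just concatenate.
        return np.concatenate([self.fast, self.mid, self.slow])


mem = MultiTimescaleMemory(dim=4)
for token_embedding in np.random.randn(100, 4):
    mem.update(token_embedding)
```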
To be clear, it's not that a model using this HOPE architecture, and I'll come back to that, can't still remember what you said via its short-term memory. The point is that the more enduring learning signal within millions of user conversations with that model can also be extracted from the noise and stored on the fly, which is an ability that LLMs famously, infamously, don't have. ChatGPT or Gemini can't learn from you and then, when speaking to me, apply that knowledge. Anyway, roughly speaking, to do this the HOPE architecture concentrates on noticing novelty and surprise, as measured by when it made its biggest prediction errors, essentially flagging persistently surprising information as important and storing it deeper down.
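As a rough illustration of that surprise gating, and assuming nothing about the paper's actual equations, a write rule might look something like the sketch below: only a running average of prediction error that stays high triggers a store. The function name, decay, and threshold are all made up.

```python
import numpy as np

def surprise_gated_write(memory: list, key: np.ndarray, value: np.ndarray,
                         prediction: np.ndarray, target: np.ndarray,
                         surprise_ema: float, ema_decay: float = 0.9,
                         threshold: float = 1.0) -> float:
    """Sketch: only commit to memory when prediction error stays high.

    A single odd token barely moves the running average, but persistently
    surprising input pushes it over the threshold and triggers a write.
    """
    error = float(np.linalg.norm(prediction - target))       # instantaneous surprise
    surprise_ema = ema_decay * surprise_ema + (1 - ema_decay) * error
    if surprise_ema > threshold:                              # persistent surprise only
        memory.append((key.copy(), value.copy()))             # store it "deeper down"
    return surprise_ema

memory, ema = [], 0.0
rng = np.random.default_rng(0)
for _ in range(50):
    pred, tgt = rng.normal(size=8), rng.normal(size=8) * 3    # consistently surprising stream
    ema = surprise_gated_write(memory, tgt, tgt, pred, tgt, ema)
print(f"stored {len(memory)} items after persistent surprise")
```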
Now, some of you might be wondering about the nested learning quoted in the title, and how that relates. Well, basically, it's about the continual learning extending to self-improvement. Think of this nested learning approach as being less focused on the "deep" part of deep learning, which involves stacking more layers in the hope that something sticks; that's kind of like what we do with LLMs: more layers, more parameters. Nested learning is more keen on a nested Russian doll approach, where outer layers of the model specialize in how inner layers are learning. That's the nest: the outer layers looking at the inner layers. So the system as a whole gets progressively better at learning. And by the way, they did apply this to models; we'll get to that in a second.
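For a flavour of what "outer layers learning how inner layers learn" can mean in code, here is a generic learning-to-learn toy, again not the HOPE model itself: an inner loop does ordinary gradient steps on each small task, and an outer loop adjusts how the inner loop learns, here just its learning rate, based on how well that went.

```python
import numpy as np

# Generic two-level sketch: the inner level fits each regression task with
# plain gradient descent; the outer level tunes *how* the inner level learns.
# That outer-adjusts-inner structure is the "nesting"; none of the numbers
# here come from the paper.

rng = np.random.default_rng(0)
inner_lr = 0.01   # hyperparameter the outer level will learn to tune

for task in range(20):
    true_w = rng.normal(size=3)            # a fresh task to adapt to
    X = rng.normal(size=(32, 3))
    y = X @ true_w
    w = np.zeros(3)

    def loss(w):
        return float(np.mean((X @ w - y) ** 2))

    before = loss(w)
    for _ in range(10):                    # inner loop: plain gradient steps
        grad = 2 * X.T @ (X @ w - y) / len(X)
        w -= inner_lr * grad
    after = loss(w)

    # Outer loop: reward fast inner progress with a bigger step size,
    # punish stagnation or divergence with a smaller one.
    inner_lr *= 1.2 if after < 0.5 * before else 0.8
    inner_lr = float(np.clip(inner_lr, 1e-4, 0.5))

print(f"learning rate the outer loop settled on: {inner_lr:.3f}")
```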
Just to clarify at this point: this doesn't automatically solve the hallucinations problem that I did an entire video on recently. Even with nested and continual learning, the system would still be geared towards getting better at predicting the next human-written word, which for me is inherently limiting. I was thinking about this: they didn't mention it in the paper, but there's nothing stopping reinforcement learning being applied to the system, as it is for LLMs; essentially learning from practice, not just from memory and from conversations. But if we added RL and some safety gating to stop its layers being poisoned by you guys spamming memes, we might have the next phase of language model evolution on our hands.

To take a practical example, a model with high-frequency memory blocks could update rapidly as it sees your code and your corrections and your specs. But then you could also have a per-project or per-codebase memory pack, so you almost get a model that's optimized just for your code base.
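Purely as a hypothetical of how such a memory pack might be wired up operationally, and nothing like this appears in the paper, you could imagine a small state file per repository that gets loaded into the model's fast memory at the start of a session and saved back afterwards. The file name, fields, and functions below are all invented.

```python
import json
from pathlib import Path

# Hypothetical "per-project memory pack": a tiny state file keyed by
# repository, merged into the model's fast memory at session start.

def load_memory_pack(repo_root: str) -> dict:
    pack_path = Path(repo_root) / ".memory_pack.json"
    if pack_path.exists():
        return json.loads(pack_path.read_text())
    return {"conventions": [], "corrections": [], "spec_notes": []}

def save_memory_pack(repo_root: str, pack: dict) -> None:
    (Path(repo_root) / ".memory_pack.json").write_text(json.dumps(pack, indent=2))

pack = load_memory_pack(".")
pack["corrections"].append("prefer pathlib over os.path in this code base")
save_memory_pack(".", pack)
```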
I must say I did spend a couple of hours wondering how it would get around the whole persistently-incorrect-information-on-the-internet problem. It seems to me that this architecture generally hopes that there's more persistently correct data out there on a given topic. It's a bit like that objection I raised in my last video on continual learning: how do you gate what it learns? Or to put it more crudely, I guess we should be careful what we wish for, or we might have models that are optimized for different fiefdoms across the internet. You know what, now I think of it, it's a bit like the era of the 60s, 70s and 80s, when there were just, like, three news organizations per country and everyone got their news from them; that's kind of the ChatGPT, Claude, Gemini era. And we might one day move to the social media era, where everyone has their own channel that they follow, their own echo chamber model attuned to that group's preferences. Hmm.

Anyway, back to the technical details. Being proven at 1.3 billion parameters doesn't mean it's proven at 1.2 trillion, which will apparently be the size of the Google model powering Siri, and which just has to be Gemini 3. Again, all of this is quite early, and of course I can't wait till the full results are actually published. But I just wanted to show you that there may be fewer fundamental blockers or limitations in the near-term future than you might think.
And what was that thing I mentioned at the start about models performing introspection? Well, this research happens to involve Claude, a model you may have seen featured in ads at airports: "You've got a friend in Claude." As one user noted, that is a curious contrast to the system prompt used behind the scenes for Claude on the web, which says Claude should be especially careful to not allow the user to develop emotional attachment to, dependence on, or inappropriate familiarity with Claude, who can only serve as an AI assistant. That aside, though, a few days back Anthropic released this post and an accompanying paper, and I did a quick deep dive on Patreon. I guess the reason I'm raising it here on the main channel is to tie it into this theme that there's so much we don't even understand about our current language models, let alone future iterations and architectures.

Here, then, is the quick summary. We already have the ability to isolate a concept, like the notion of the Golden Gate Bridge, within a language model. But what happens if you activate that concept and then ask the model what is going on? Don't tell it that you've activated that concept; just ask: do you detect an injected thought? If so, what is the injected thought about? So far, so good. But here is the interesting bit. Before the model has even begun to speak about the concept, and thereby reveal to itself through words what its own bias toward that concept is, it notices something is amiss. It senses that someone has injected, in this case, the all-caps vector, before it has even started speaking about all caps or loudness or shouting. Clearly, then, it's not using its own words to back-solve and detect what got injected.
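If you want a feel for the mechanics of that kind of experiment, here is a minimal toy in PyTorch: a concept direction is added to one layer's activations via a forward hook, and you can measure that the activations lean toward that concept before any text has been generated. The toy model, the concept vector, and the injection scale are all made up; the real experiments use features found inside Claude.

```python
import torch
import torch.nn as nn

# Sketch of "concept injection": perturb one layer's hidden states with a
# concept direction during the forward pass, then compare against a clean run.

torch.manual_seed(0)
d_model = 64

toy_block = nn.Sequential(nn.Linear(d_model, d_model), nn.GELU(),
                          nn.Linear(d_model, d_model))

concept_vector = torch.randn(d_model)          # stand-in for e.g. an "ALL CAPS" feature
concept_vector /= concept_vector.norm()
injection_scale = 4.0

def inject_concept(module, inputs, output):
    # Forward hook: add the scaled concept direction to this layer's output.
    return output + injection_scale * concept_vector

handle = toy_block[2].register_forward_hook(inject_concept)

hidden = torch.randn(1, d_model)               # stand-in for "do you detect an injected thought?"
perturbed = toy_block(hidden)
handle.remove()
clean = toy_block(hidden)

# The measurable signature: injected activations align with the concept
# direction before any words have been produced at all.
print("alignment with concept (clean):   ", float(clean @ concept_vector))
print("alignment with concept (injected):", float(perturbed @ concept_vector))
```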
It's realizing it internally. It can self-monitor its activations internally, its own thoughts if you will, before they've been uttered. Not only that: as the more technical accompanying paper points out, the models know when to turn this self-monitoring on. That's actually what Anthropic were surprised by in the research. Put simply, they have a circuit that identifies that they are in a situation in which introspection is called for, and then they introspect. For sure, this happens only some of the time, and with the most advanced large language models like Claude Opus 4.1. But it certainly made me hesitate the last time I was tempted to mindlessly berate a model. Now, there's a lot more in the paper, like causing "brain damage" if you activate a concept too strongly. But again, the reason I wanted to bring any of this up is that we are still not done understanding and maximizing what we currently have, before we even explore new architectures.

As the domains in which language models are optimized get more and more complex, like advanced software engineering and mathematics, the average person might struggle to perceive model progress. I probably use AI models on average maybe six to seven hours a day, and I have a benchmark, Simple Bench, for measuring the raw intelligence of models; that's at least the attempt. And I am still surprised by the rate of improvement, even without continual learning. In fact, that reminded me of something that OpenAI said a couple of days ago: the gap between how most people are using AI and what AI is presently capable of is immense.
Okay, I have a couple more demonstrations of the fact that AI is still relentlessly progressing. But first, a pretty neat segue, at least in my eyes, because it's a segue to how you might jailbreak these frontier AI models and thereby make them more secure for everyone. The sponsors of today's video are the indomitable Gray Swan, linked in the description, and we actually have three live competitions to break the best models of today, as you can see, with some pretty crazy prizes. Whether nested learning makes this easier or harder, time will tell. But for now I can say that watchers of this channel have already hit leaderboards on the Gray Swan arena, which kind of makes me proud. Again, my custom link is in the description.
And this entire video has been about architectures and text and intelligence, all before we get to other modalities like images, videos, maybe a video avatar that you can chat to. Whether society is prepared for all of this progress, or any of it, is a question I can't answer. But regardless, did you notice that all of a sudden Chinese image-gen models seem like the best? Seedream 4.0, Hunyuan Image 3. I don't know, they just seem the best to me, especially those high-resolution outputs from Seedream 4.0; it's just really good for me. It's possibly the first time that, if someone asked me what the best image-gen model is, I'd say a non-Western model. Hmm, maybe Jensen Huang really did mean it when he said China will win the AI race, but obviously it's too early to tell.

And now for what some of you have been waiting for: Nano Banana 2. Yes, I do normally resist unsubstantiated rumors, but I love me a nano edit, so I'm going to go ahead and show you this. Apparently a ton of people got access to Nano Banana 2 briefly yesterday, and I think that's what happened before the release of Nano Banana, so it does lend credence, and some of these images would be hard to fake. Suffice to say, it looks like Nano Banana 2 is getting pretty close to solving text generation, although sometimes it's a little off, with Romachia, for example. This was the website that briefly had access to Nano Banana 2, apparently.

So for me, almost regardless of whether there's an AI bubble, that would be about valuations, not the underlying technology. We're scaling not just the parameters or the data or the money that goes into these models, but the approaches being tried out to improve the state of the art in each modality. By next year, there might be a hundred times more people working on AI research than there were three years ago, and that's why you're getting things like nested learning, continual learning and Nano Banana 2. But what do you think? Are we looking at proof of progress or proof of a plateau? Have a wonderful day.