Bubble or No Bubble, AI Keeps Progressing (ft. Relentless Learning + Introspection)

Chapters
0:00 Introduction
1:26 Continual Learning (Nested Learning / HOPE)
7:00 Introspection
10:54 Image-Gen Progress
For the first year of this channel, 2023, it was striking to me how few were sensing how big an impact language models would have on the world. But then in the second year, I felt that the idea of an imminent singularity and mass job layoffs had become the dominant narrative, and in several of my videos I tried to show that there was evidence of that being overblown, for now. Now, as you might have noticed, the vibe has reversed again, with talk of an AI bubble in company valuations being conflated, for me, with the assertion that we are in a plateau of model progress. So this quick video, like my last one, is again a counter-narrative. And no, not just one built on hopes for the forthcoming Gemini 3 from Google DeepMind. No, I would instead ask: what, for you, is missing from language models, from being what you imagined AI would be?

Personally, I put together some categories a while back, and I'm sure you may have others. Some would say, well, they don't learn on the fly, or there's no real introspection going on, just regurgitation. Thing is, AI researchers have got to earn their bread somehow, so there's always a paper for whatever deficiency you can imagine. I am also going to end the video with some more visual ways that AI is progressing, as yes, it seems like Nano Banana 2 from Google may have been spotted in the wild.
But first, on continual learning, or the lack of it, aka that inability of the models you speak to, like ChatGPT, to learn about you properly, and your specifications, and to just grow, to organically become GPT 5.5 rather than have to be pre-trained into becoming GPT 5.5. If AI were all hype, you might say, well, that's definitely going to take at least a decade to solve. But for others, like these authors at Google, it's a problem for which there is a ready and benchmarked solution. I will, however, caveat that by saying that this is a complex paper, and despite what the appendix promises, not all the results have actually been released yet. But here is my attempt at a brief summary.

Alas, there are not many pretty diagrams, but essentially the paper shows that there are viable approaches for allowing models to continually learn while retaining some inbuilt discernment about what to learn. In other words, it shows that a chatbot could learn new things, like a new fact or coding skill, by storing them in its updatable memory layers, while protecting its core long-term knowledge. As I think I mentioned, the authors are all from Google, and you might know them as the stars of the Titans architecture. And if you want a butchered analogy from Titans to nested learning in this paper: Titans is kind of like giving your social media feed one live stickied thread to remember, whereas this paper rewires the entire recommender system to learn at three different speeds, like what's hot this minute, what trends this week, and what becomes your long-term preference.
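To make that multi-speed idea concrete, here is a toy sketch in Python. It is emphatically not the paper's architecture, just the general pattern of memory modules that update at very different rates; every class name, learning rate, and dimension in it is invented for illustration.

```python
import numpy as np

class MultiTimescaleMemory:
    """Toy illustration of memory that updates at different speeds.

    Not the paper's method: a fast store tracks the current context,
    a medium store accumulates recent trends, and a slow store only
    drifts gradually toward long-term patterns.
    """

    def __init__(self, dim: int, fast_lr=0.5, mid_lr=0.05, slow_lr=0.001):
        self.fast = np.zeros(dim)   # "what's hot this minute"
        self.mid = np.zeros(dim)    # "what trends this week"
        self.slow = np.zeros(dim)   # "long-term preference"
        self.fast_lr, self.mid_lr, self.slow_lr = fast_lr, mid_lr, slow_lr

    def update(self, x: np.ndarray) -> None:
        # Exponential moving averages with very different time constants
        # stand in for learnable memory modules running at different clocks.
        self.fast += self.fast_lr * (x - self.fast)
        self.mid += self.mid_lr * (x - self.mid)
        self.slow += self.slow_lr * (x - self.slow)

    def read(self) -> np.ndarray:
        # A real system would learn how to mix these; here we just concatenate.
        return np.concatenate([self.fast, self.mid, self.slow])


mem = MultiTimescaleMemory(dim=4)
for token_embedding in np.random.randn(100, 4):
    mem.update(token_embedding)
```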
To be clear, it's not that a model using this HOPE architecture, and I'll come back to that, can't still remember what you said via its short-term memory. The point is that the more enduring learning signal within millions of user conversations with that model can also be extracted from the noise and stored on the fly, which is an ability that LLMs famously, infamously, don't have. ChatGPT or Gemini can't learn from you and then, when speaking to me, apply that knowledge. Anyway, roughly speaking, to do this the HOPE architecture concentrates on noticing novelty and surprise, as measured by when it made its biggest prediction errors, essentially flagging persistently surprising information as important and storing it deeper down.
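As a rough illustration of that surprise gating, and assuming nothing about the paper's actual equations, a write rule might look something like the sketch below: only a running average of prediction error that stays high triggers a store. The function name, decay, and threshold are all made up.

```python
import numpy as np

def surprise_gated_write(memory: list, key: np.ndarray, value: np.ndarray,
                         prediction: np.ndarray, target: np.ndarray,
                         surprise_ema: float, ema_decay: float = 0.9,
                         threshold: float = 1.0) -> float:
    """Sketch: only commit to memory when prediction error stays high.

    A single odd token barely moves the running average, but persistently
    surprising input pushes it over the threshold and triggers a write.
    """
    error = float(np.linalg.norm(prediction - target))       # instantaneous surprise
    surprise_ema = ema_decay * surprise_ema + (1 - ema_decay) * error
    if surprise_ema > threshold:                              # persistent surprise only
        memory.append((key.copy(), value.copy()))             # store it "deeper down"
    return surprise_ema

memory, ema = [], 0.0
rng = np.random.default_rng(0)
for _ in range(50):
    pred, tgt = rng.normal(size=8), rng.normal(size=8) * 3    # consistently surprising stream
    ema = surprise_gated_write(memory, tgt, tgt, pred, tgt, ema)
print(f"stored {len(memory)} items after persistent surprise")
```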
Now, some of you might be wondering about the nested learning quoted in the title, and how that relates. Well, basically, it's about the continual learning extending to self-improvement. Think of this nested learning approach as being less focused on the "deep" part of deep learning, which involves stacking more layers in the hope that something sticks; that's kind of like what we do with LLMs: more layers, more parameters. Nested learning is more keen on a nested Russian doll approach, where outer layers of the model specialize in how inner layers are learning. That's the nest: the outer layers looking at the inner layers. So the system as a whole gets progressively better at learning. And by the way, they did apply this to models; we'll get to that in a second.
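For a flavour of what "outer layers learning how inner layers learn" can mean in code, here is a generic learning-to-learn toy, again not the HOPE model itself: an inner loop does ordinary gradient steps on each small task, and an outer loop adjusts how the inner loop learns, here just its learning rate, based on how well that went.

```python
import numpy as np

# Generic two-level sketch: the inner level fits each regression task with
# plain gradient descent; the outer level tunes *how* the inner level learns.
# That outer-adjusts-inner structure is the "nesting"; none of the numbers
# here come from the paper.

rng = np.random.default_rng(0)
inner_lr = 0.01   # hyperparameter the outer level will learn to tune

for task in range(20):
    true_w = rng.normal(size=3)            # a fresh task to adapt to
    X = rng.normal(size=(32, 3))
    y = X @ true_w
    w = np.zeros(3)

    def loss(w):
        return float(np.mean((X @ w - y) ** 2))

    before = loss(w)
    for _ in range(10):                    # inner loop: plain gradient steps
        grad = 2 * X.T @ (X @ w - y) / len(X)
        w -= inner_lr * grad
    after = loss(w)

    # Outer loop: reward fast inner progress with a bigger step size,
    # punish stagnation or divergence with a smaller one.
    inner_lr *= 1.2 if after < 0.5 * before else 0.8
    inner_lr = float(np.clip(inner_lr, 1e-4, 0.5))

print(f"learning rate the outer loop settled on: {inner_lr:.3f}")
```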
Just to clarify at this point: this doesn't automatically solve the hallucinations problem that I did an entire video on recently. Even with nested and continual learning, the system would still be geared towards getting better at predicting the next human-written word, which for me is inherently limiting. I was thinking about this: they didn't mention it in the paper, but there's nothing stopping reinforcement learning being applied to the system, as it is for LLMs; essentially learning from practice, not just from memory and from conversations. But if we added RL and some safety gating to stop its layers being poisoned by you guys spamming memes, we might have the next phase of language model evolution on our hands.

To take a practical example, a model with high-frequency memory blocks could update rapidly as it sees your code and your corrections and your specs. But then you could also have a per-project or per-codebase memory pack, so you almost get a model that's optimized just for your code base.
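Purely as a hypothetical of how such a memory pack might be wired up operationally, and nothing like this appears in the paper, you could imagine a small state file per repository that gets loaded into the model's fast memory at the start of a session and saved back afterwards. The file name, fields, and functions below are all invented.

```python
import json
from pathlib import Path

# Hypothetical "per-project memory pack": a tiny state file keyed by
# repository, merged into the model's fast memory at session start.

def load_memory_pack(repo_root: str) -> dict:
    pack_path = Path(repo_root) / ".memory_pack.json"
    if pack_path.exists():
        return json.loads(pack_path.read_text())
    return {"conventions": [], "corrections": [], "spec_notes": []}

def save_memory_pack(repo_root: str, pack: dict) -> None:
    (Path(repo_root) / ".memory_pack.json").write_text(json.dumps(pack, indent=2))

pack = load_memory_pack(".")
pack["corrections"].append("prefer pathlib over os.path in this code base")
save_memory_pack(".", pack)
```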
I must say I did spend a couple of hours wondering how it would get around the whole persistently-incorrect-information-on-the-internet problem. It seems to me that this architecture generally hopes that there's more persistently correct data out there on a given topic. It's a bit like that objection I raised in my last video on continual learning: how do you gate what it learns? Or to put it more crudely, I guess we should be careful what we wish for, or we might have models that are optimized for different fiefdoms across the internet. You know what, now I think of it, it's a bit like the era of the 60s, 70s and 80s, when there were just, like, three news organizations per country and everyone got their news from them; that's kind of the ChatGPT, Claude, Gemini era. And we might one day move to the social media era, where everyone has their own channel that they follow, their own echo chamber model attuned to that group's preferences. Hmm.

Anyway, back to the technical details. Being proven at 1.3 billion parameters doesn't mean it's proven at 1.2 trillion, which will apparently be the size of the Google model powering Siri, and which just has to be Gemini 3. Again, all of this is quite early, and of course I can't wait till the full results are actually published. But I just wanted to show you that there may be fewer fundamental blockers or limitations in the near-term future than you might think.
And what was that thing I mentioned at the start about models performing introspection? Well, this research happens to involve Claude, a model you may have seen featured in ads at airports: "You've got a friend in Claude." As one user noted, that is a curious contrast to the system prompt used behind the scenes for Claude on the web, which says Claude should be especially careful to not allow the user to develop emotional attachment to, dependence on, or inappropriate familiarity with Claude, who can only serve as an AI assistant. That aside, though, a few days back Anthropic released this post and an accompanying paper, and I did a quick deep dive on Patreon. I guess the reason I'm raising it here on the main channel is to tie it into this theme that there's so much we don't even understand about our current language models, let alone future iterations and architectures.

Here, then, is the quick summary. We already have the ability to isolate a concept, like the notion of the Golden Gate Bridge, within a language model. But what happens if you activate that concept and then ask the model what is going on? Don't tell it that you've activated that concept; just ask: do you detect an injected thought? If so, what is the injected thought about? So far, so good. But here is the interesting bit. Before the model has even begun to speak about the concept, and thereby reveal to itself through words what its own bias toward that concept is, it notices something is amiss. It senses that someone has injected, in this case, the all-caps vector, before it has even started speaking about all caps or loudness or shouting. Clearly, then, it's not using its own words to back-solve and detect what got injected.
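If you want a feel for the mechanics of that kind of experiment, here is a minimal toy in PyTorch: a concept direction is added to one layer's activations via a forward hook, and you can measure that the activations lean toward that concept before any text has been generated. The toy model, the concept vector, and the injection scale are all made up; the real experiments use features found inside Claude.

```python
import torch
import torch.nn as nn

# Sketch of "concept injection": perturb one layer's hidden states with a
# concept direction during the forward pass, then compare against a clean run.

torch.manual_seed(0)
d_model = 64

toy_block = nn.Sequential(nn.Linear(d_model, d_model), nn.GELU(),
                          nn.Linear(d_model, d_model))

concept_vector = torch.randn(d_model)          # stand-in for e.g. an "ALL CAPS" feature
concept_vector /= concept_vector.norm()
injection_scale = 4.0

def inject_concept(module, inputs, output):
    # Forward hook: add the scaled concept direction to this layer's output.
    return output + injection_scale * concept_vector

handle = toy_block[2].register_forward_hook(inject_concept)

hidden = torch.randn(1, d_model)               # stand-in for "do you detect an injected thought?"
perturbed = toy_block(hidden)
handle.remove()
clean = toy_block(hidden)

# The measurable signature: injected activations align with the concept
# direction before any words have been produced at all.
print("alignment with concept (clean):   ", float(clean @ concept_vector))
print("alignment with concept (injected):", float(perturbed @ concept_vector))
```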
It's realizing it internally. It can self-monitor its activations internally, its own thoughts if you will, before they've been uttered. Not only that: as the more technical accompanying paper points out, the models know when to turn this self-monitoring on. That's actually what Anthropic were surprised by in the research. Put simply, they have a circuit that identifies that they are in a situation in which introspection is called for, and then they introspect. For sure, this happens only some of the time, and with the most advanced large language models like Claude Opus 4.1. But it certainly made me hesitate the last time I was tempted to mindlessly berate a model. Now, there's a lot more in the paper, like causing "brain damage" if you activate a concept too strongly. But again, the reason I wanted to bring any of this up is that we are still not done understanding and maximizing what we currently have, before we even explore new architectures.

As the domains in which language models are optimized get more and more complex, like advanced software engineering and mathematics, the average person might struggle to perceive model progress. I probably use AI models on average maybe six to seven hours a day, and I have a benchmark, Simple Bench, for measuring the raw intelligence of models; that's at least the attempt. And I am still surprised by the rate of improvement, even without continual learning. In fact, that reminded me of something that OpenAI said a couple of days ago: the gap between how most people are using AI and what AI is presently capable of is immense.
Okay, I have a couple more demonstrations of the fact that AI is still relentlessly progressing. But first, a pretty neat segue, at least in my eyes, because it's a segue to how you might jailbreak these frontier AI models and thereby make them more secure for everyone. The sponsors of today's video are the indomitable Gray Swan, linked in the description, and we actually have three live competitions to break the best models of today, as you can see, with some pretty crazy prizes. Whether nested learning makes this easier or harder, time will tell. But for now I can say that watchers of this channel have already hit leaderboards on the Gray Swan arena, which kind of makes me proud. Again, my custom link is in the description.
And this entire video has been about architectures and text and intelligence, all before we get to other modalities like images, videos, maybe a video avatar that you can chat to. Whether society is prepared for all of this progress, or any of it, is a question I can't answer. But regardless, did you notice that all of a sudden Chinese image-gen models seem like the best? Seedream 4.0, Hunyuan Image 3. I don't know, they just seem the best to me, especially those high-resolution outputs from Seedream 4.0; it's just really good for me. It's possibly the first time that, if someone asked me what the best image-gen model is, I'd say a non-Western model. Hmm, maybe Jensen Huang really did mean it when he said China will win the AI race, but obviously it's too early to tell.

And now for what some of you have been waiting for: Nano Banana 2. Yes, I do normally resist unsubstantiated rumors, but I love me a nano edit, so I'm going to go ahead and show you this. Apparently a ton of people got access to Nano Banana 2 briefly yesterday, and I think that's what happened before the release of Nano Banana, so it does lend credence, and some of these images would be hard to fake. Suffice to say, it looks like Nano Banana 2 is getting pretty close to solving text generation, although sometimes it's a little off, with Romachia, for example. This was the website that briefly had access to Nano Banana 2, apparently.

So for me, almost regardless of whether there's an AI bubble, that would be about valuations, not the underlying technology. We're scaling not just the parameters or the data or the money that goes into these models, but the approaches being tried out to improve the state of the art in each modality. By next year, there might be a hundred times more people working on AI research than there were three years ago, and that's why you're getting things like nested learning, continual learning and Nano Banana 2. But what do you think? Are we looking at proof of progress or proof of a plateau? Have a wonderful day.