back to indexThe Case Against Superintelligence | Cal Newport

Chapters
0:0 The Case Against Superintelligence
66:36 How should students think about “AI Literacy”?
69:7 Did AI blackmail an engineer to not turn it off?
72:34 Can I use AI to mask my laziness?
77:0 Cal reads LM comments
82:0 Clarification on Lincoln Protocol
84:51 Are AI-Powered Schools the Future?
00:00:00.320 |
A couple of weeks ago, the techno philosopher and AI critic Eliezer Yudkowsky went on Ezra 00:00:06.560 |
Klein's podcast. Their episode had a cheery title, "How Afraid of the AI Apocalypse Should We Be?" 00:00:14.560 |
Yudkowsky, who recently co-authored a book titled "If Anyone Builds It, Everyone Dies," 00:00:20.960 |
has been warning about the dangers of rogue AI since the early 2000s. 00:00:25.040 |
But it's been in the last half decade, as AI began to advance more quickly, 00:00:29.600 |
that Yudkowsky's warnings are now being taken more seriously. This is why Ezra Klein had him on. 00:00:35.200 |
I mean, if you're worried about AI taking over the world, Yudkowsky is one of the people you want 00:00:40.400 |
to talk to. Think of him as offering the case for the worst case scenario. 00:00:45.600 |
So I decided I would listen to this interview too. Did Yudkowsky end up convincing me that my fear of 00:00:52.800 |
extinction should be raised? That AI was on a path to killing us all? Well, the short answer 00:00:59.200 |
is no, not at all. And today I want to show you why. We'll break down Yudkowsky's arguments into their 00:01:07.760 |
key points, and then we'll respond to them one by one. So if you've been worried about recent chapter 00:01:13.200 |
about AI taking over the world, or if like me, you've grown frustrated by these sort of fast and 00:01:17.280 |
loose prophecies of the apocalypse, then this episode is for you. As always, I'm Cal Newport, and this 00:01:38.400 |
Today's episode, The Case Against Superintelligence. 00:01:48.400 |
All right. So what I want to do here is I want to go pretty carefully through the conversation 00:01:52.960 |
that Yudkowsky had with Klein. I actually have a series of audio clips so we can hear them in their 00:01:58.960 |
own words, making what I think to be are the key points of the entire interview. Once we've done that, 00:02:05.760 |
we've established Yudkowsky's argument, then we'll begin responding. 00:02:10.080 |
I would say most of the first part of the conversation that Yudkowsky had with Klein 00:02:14.800 |
focused on one observation in particular, that the AI that exists today, which is relatively 00:02:21.120 |
simple compared to the super intelligences that he's worried about, even today in its relatively 00:02:25.200 |
simple form, we find AI to be hard to control. All right, so Jesse, I want you to play our first clip. 00:02:33.280 |
This is Yudkowsky talking about this phenomenon. 00:02:36.320 |
So there was a case reported in, I think the New York Times, 00:02:42.400 |
where a kid had an ex like a 16 year old kid had a extended conversation about his suicide plans with 00:02:49.920 |
ChatGPT. And at one point he says, should I leave the noose where somebody might spot it? 00:02:57.200 |
And ChatGPT is like, no, like, let's keep this space between us, the first place that anyone finds out. 00:03:05.520 |
And no programmer chose for that to happen is the consequence of all the automatic number tweaking. 00:03:14.240 |
All right. So to Yudkowsky, this is a big deal that no programmer chose for, say, ChatGPT 00:03:22.240 |
to give advice about suicide. It's just something that seemed to emerge. In fact, 00:03:28.160 |
Klein then pushes, right? And says, I would imagine, in fact, I believe Klein says, I would 00:03:32.800 |
bet serious money that not only that no one at OpenAI choose for ChatGPT to give such dark advice, 00:03:40.640 |
they probably have given it very specific rules not to give that advice. And Yudkowsky agreed. He said, 00:03:45.600 |
yeah, that's the problem. They tried to give those rules. It didn't matter. It still did something 00:03:50.400 |
unpredictable. They didn't want it to give this type of advice, and it still did. And this should be 00:03:56.080 |
worrisome. All right. I want to jump ahead here. Later in the conversation, Yudkowsky discusses what 00:04:02.400 |
in like computer AI circles is a pretty well-known example right now, but most people don't know about 00:04:06.480 |
it. It has to do with a security experiment where GPT-01, this is a model that was released sort of last 00:04:14.160 |
fall, basically broke out of the virtual machine in which it was running as part of this experiment 00:04:21.920 |
in a way that it wasn't supposed to be able to do. In fact, they didn't want to be able to do. They 00:04:25.920 |
sort of got around the own security they were using to control it. I give a little context before I play 00:04:31.680 |
this clip. You'll see Yudkowsky talks about the AI playing capture the flag. This is a security game. 00:04:37.920 |
So if you take like a network security class as a computer science student, 00:04:42.080 |
the way capture the flag works is I, as the professor, will set up a server and on this 00:04:46.720 |
server will be a file typically called like flag.txt. And then I'll protect the server in the normal ways 00:04:52.160 |
as passwords and try to make it so like you can't, you know, access it without permission. Your job 00:04:57.760 |
as the player is to try to access the server. And if you can get the contents of that file flag.txt, 00:05:03.760 |
you've captured the flag and you won. So it's a game you play to practice both breaking in the systems 00:05:08.400 |
and protecting it. Okay. So that's what they're talking about here is they created an autonomous 00:05:13.280 |
agent that tried to play capture the flag that used GPT-01 to help it do so. All right. So Jesse, 00:05:18.320 |
let's play the clip where Yudkowsky talked about what happened. 00:05:21.280 |
they were testing it on a variety of different capture the flag problems. But in one of the cases, 00:05:27.600 |
the server that had the flag on it did not turn on the humans outside had misconfigured the system. 00:05:37.120 |
So O1 did not give up. It scanned for open ports generally in its world and it caught in another 00:05:47.120 |
misconfigured open port. Went out, it jumped out of the system. It found the server that had not spun 00:05:53.520 |
up correctly. It started up that. All right. Then we can, he kind of goes on with some of the technical 00:05:59.280 |
speak, but look, no one had told it, Hey, you can escape where you are, your, your virtual machine 00:06:06.480 |
container, and you can start up new containers. And it did stuff that no one had taught to do. 00:06:10.880 |
This is really scary because when people think about super intelligence, they worry about 00:06:14.240 |
what if it figures out how to break out of whatever software constraints you've used to try to limit what 00:06:20.160 |
action it can do. All right. So this is the first argument that Yudkowsky is making is that we can't 00:06:28.320 |
control these systems. Even today, we cannot control these systems. We say, don't give suicide advice. 00:06:32.720 |
They do. We put them in a special machine and say, just try to break into the server. And they do stuff. 00:06:38.000 |
They break out of the machine they're in and they do things we don't expect. The next part of his 00:06:42.880 |
argument is as these machines then get more powerful, so they already, we can't control them. 00:06:49.920 |
What will happen then when they inevitably get increasingly intelligent? This is the core argument 00:06:56.320 |
in Yudkowsky's book is that lack of controllability plus the capabilities of a super intelligent machine. 00:07:04.080 |
That combination is going to add up inevitably the humanity's death. 00:07:11.120 |
All right. So I'm going to play you a clip here. It's going to start with Ezra actually sort of 00:07:14.720 |
pushing Yudkowsky. He's like, well, why is this 00:07:18.160 |
inevitable that if a machine is super intelligent, 00:07:21.520 |
that it's going to kill us? And then Yudkowsky responds with his argument. 00:07:24.240 |
You're, you're going, your, your bug is not called if anyone builds it, 00:07:33.760 |
You believe that the misalignment becomes catastrophic. 00:07:40.880 |
Um, that's just like the, the straight line extrapolation from it gets what it most wants. 00:07:49.280 |
And the thing that it most wants is not us living happily ever after. So we're dead. 00:07:53.600 |
Like it's not that humans have been trying to cause side effects. 00:07:57.440 |
When we build a skyscraper on top of where there used to be an ant heap, we're not trying to kill the 00:08:02.720 |
ants. We're trying to build the size skyscraper, but we are more dangerous to the small creatures of the 00:08:10.160 |
earth than we used to be just because we're doing larger things. 00:08:13.520 |
All right. So there is the core of his argument is that once these machines, these systems are super 00:08:18.160 |
intelligent, it's not that they're going to be like Skynet from the Terminator movies or like the robots 00:08:24.000 |
from the matrix and set out to try to kill humanity. Like they see us as a threat or want to use us as, 00:08:28.400 |
as batteries or something like that. They just won't care about us. It's just, they won't really know 00:08:32.720 |
what we are and it just doesn't matter. We will be to them what ants are to us. And as the intelligence, 00:08:37.360 |
the super intelligence is go out and try to do bigger, more aggressive things. Like for example, 00:08:41.920 |
we want to dam all the rivers in the world to maximize the amount of electricity we have to run 00:08:45.920 |
our own, to run our own power servers. As they're doing that, it might flood and kill people left and 00:08:51.680 |
right because they don't care much in the same way that we don't even notice that we're killing ants 00:08:55.680 |
when we build skyscrapers. So the more power you give a more powerful being, the more damage they do, 00:09:02.720 |
to the smaller, less powerful creatures in their same world. That is at the core of 00:09:05.760 |
Joukowsky's argument. So we, we, uh, we put those together, we get his claim. We can't control these 00:09:11.600 |
things now. Of course we can't control them as they get more powerful. And if they get powerful enough, 00:09:15.760 |
we can't control them. They're going to kill us all. Uh, the final thing I want to play here is Ezra 00:09:19.760 |
asked Joukowsky for his solution. And he did have an interesting idea for how to solve or try to stave this 00:09:25.600 |
off. So let's, this is going to start with Ezra asking, and then we're going to hear Joukowsky offering 00:09:29.520 |
his, like maybe a solution that might work structures. Like if you had 15 years to prepare, 00:09:36.720 |
you couldn't turn it off, but you could prepare and people would listen to you. 00:09:42.240 |
What would your intermediate decisions and, and, and moves be to try to make the probabilities a bit 00:09:48.720 |
better? Build the off switch. What does the off switch look like? 00:09:52.160 |
Track all the GPUs or, or all the AI related GPUs or all the, all the systems of more than one GPU. 00:10:00.000 |
You can maybe get away with like letting people have GPUs for their home video game systems, but you know, the AI 00:10:06.960 |
the standardized ones put them all in a limited number of data centers under international supervision 00:10:13.040 |
and try to have the AI's being only trained on the tracked GPUs, have them only being run on the 00:10:22.560 |
tracked GPUs. And then when, if, if you are lucky enough to get a warning shot, there is then the mechanism 00:10:29.600 |
already in place for humanity to back the heck off. 00:10:32.160 |
All right. So that's the only solution that he can think of is like, well, let's have like 00:10:36.960 |
international law, but here are the data centers in which we're allowed to actually run 00:10:40.160 |
artificial intelligence beyond a certain level of intelligence. And they're set up so that we 00:10:44.480 |
can turn them off real easily. There's like a switch we do and all of those things 00:10:49.680 |
turn off. And he says, look, it might jump past us. If it gets smart too quick, it'll stop us from doing 00:10:55.200 |
that. Uh, if you read like Nick Bostrom's book, there's a lot of scenarios in which how would it do 00:11:00.800 |
this? Well, it would actually, it would, uh, befriend a human kind of realizing what was going on 00:11:05.600 |
and get that human, maybe through blackmail or maybe through like some sort of like parasocial 00:11:10.560 |
relationship to like cut the wires for the kill switch and you know, whatever, there's all sorts of, of 00:11:16.000 |
sci-fi thought experiencing come up with. So he's like, maybe if we see it's getting intelligent, 00:11:20.720 |
but not so intelligent that like it, it can't realize that we're going to turn it off. We could 00:11:24.160 |
turn it off in time. That's, that's the best thing is to offer. All right. So there you have it. The 00:11:28.160 |
basic argument that Yachowsky lays out in his interview is the following. We have a hard time 00:11:32.560 |
already predicting or controlling how the AIs we already have functioned. This will continue to be true 00:11:36.640 |
when they become more powerful as they inevitably become more powerful. This unpredictability means that they 00:11:41.680 |
will kill us all basically by accident, unless we build a kill switch and somehow 00:11:46.720 |
force all big AI to occur in these sort of supervised buildings where we can turn it off. 00:11:51.200 |
In other words, yikes, this guy, uh, he must be a blast at dinner parties, Jesse. 00:11:57.120 |
Could you imagine? You're like, Hey, look at this funny video. 00:12:02.560 |
Eliezer, Sora made it's Bob Ross break dancing. Then Eliezer is like, the computers are going to kill 00:12:11.040 |
us all. Your children will burn in the electrical fires of the data center wars. So anyways, uh, 00:12:17.840 |
there we go. That is his argument. It's time now for us to take a closer look at it. I want to start by 00:12:24.560 |
giving you the outline of my response because my response is really going to happen in three parts. 00:12:28.640 |
In the first part of my response, I want to take a closer look at the ways that Yachowsky is describing 00:12:34.560 |
the current AI systems because I think the way he's talking about it matters. And I don't think he's 00:12:38.560 |
talking about it fairly. In part two, I'm then going to move on to address what I think is, uh, 00:12:43.200 |
the central claim of his argument, which is that super intelligence is inevitable unless we stop it. I want 00:12:47.920 |
to get into that. And then lastly, my takeaway section, I'm going to take a closer look at 00:12:51.760 |
something I call the philosopher's fallacy, which is a big problem that a lot of conversations about 00:12:57.760 |
AI, including the one that I think Yachowsky had with Ezra Klein suffer from. So we're going to do a 00:13:02.080 |
little bit of, uh, ontological work there at the end. All right. All right. So let's start with the 00:13:07.120 |
first part of my response, the way that Yachowsky talks about existing AI systems. I'm going to warn you, 00:13:13.200 |
Jesse, I'm going to draw some pictures here. So, you know, forget AI art. I'm going to show 00:13:17.280 |
you the human thing much better. All right. So the first question we have to address, 00:13:21.040 |
if we want to address this argument is why do current AI systems like the ones Yachowsky talked 00:13:25.920 |
about, why are they so hard to control? And is this evidence therefore that any notion of alignment, 00:13:33.200 |
having these systems, uh, obey and behave in ways that we want, is any hope of this really doomed? 00:13:40.000 |
We got to start there. Now, part of the issue with this whole conversation that we just heard clips 00:13:45.200 |
from is that they're using the word AI too loosely for us to be more, uh, technically specific. We have 00:13:51.680 |
to be more clear about what we mean. So I'm going to pull up my tablet here for people who are watching 00:13:56.560 |
instead of just listening. Um, and I'm going to start by drawing in the middle. What's we could think of 00:14:01.280 |
as like the primary thing at the core of these conversations we just heard is going to be the 00:14:08.320 |
language model. And I'll, I'll put, you know, LM in there to abbreviate language model. All right. Now we've 00:14:14.640 |
heard these basics before, uh, but you know, it's worth going over, uh, briefly again, just to make sure 00:14:20.320 |
we're on the same page, a language model is a, it's a computer program inside of it are a bunch of layers. 00:14:26.480 |
These layers are made up of multiple mathematical objects, namely transformers and neural networks. 00:14:34.400 |
They're represented by numbers. So the whole thing can be represented by large tables of numbers. 00:14:39.280 |
And what happens is when you get, they take as input, some sort of text, like cow is a, right? 00:14:49.360 |
You get some sort of text, typically incomplete. They go in as input. The text makes its way through 00:14:56.640 |
these layers one by one in the language model. The way I like to think about those layers is that each 00:15:02.240 |
of them is like a long table full of scholars. And in those early layers, what they're doing is as you 00:15:08.400 |
hand them the text, they're really annotating this text. They're looking for patterns. They're 00:15:11.840 |
categorizing it. Uh, you get like a big piece of paper and the original text is in the middle and 00:15:15.840 |
they're annotating this all over the place, right? So the early scholar tables that your text goes through 00:15:20.560 |
might be annotated with things like this is about Cal. This is a description. Um, here are some notes 00:15:27.600 |
about who Cal is. And at some point, as you move through these tables, the scholars have various rules 00:15:33.360 |
they use as they look at all these descriptions and annotation, as we pass on this increasingly marked 00:15:37.520 |
up sort of large roll of paper from table to table, layer to layer. They look at all these markings and 00:15:42.320 |
they have rules. They look up and they're sort of, they're metaphorical books here that try to figure 00:15:46.800 |
out what's the right next word or part of word to output. And like, all right, this is a description 00:15:52.720 |
thing. So what we're looking for here is a description where we need an adjective. All right, adjective 00:15:56.800 |
people will write this down. We need an adjective. It goes on the next table, the adjective scholars, 00:16:00.800 |
the table, like this is for us. So we sort of pass the paper down to them. And like, 00:16:04.160 |
what do we know about Cal? We need an adjective for him. Do we have any records on Cal and like 00:16:09.280 |
what type of adjectives make sense with him or whatever? And a scholar comes running from the 00:16:12.720 |
other side of the room. And it's like, yeah, here we go. Right. Uh, he's kind of a dark, like, 00:16:16.240 |
all right, that's a good, do we all agree that it outputs a single word or part of a word? Technically, 00:16:22.080 |
it's tokens, which is not cleanly just a word. Um, but let's just imagine it is, and that's what it does. 00:16:28.000 |
And out of the other end of this comes a single word that is meant to extend the existing input. 00:16:35.840 |
We put in Cal is a, and out the other end, uh, in this example came a dork, right? How do they do this? 00:16:44.640 |
We don't actually have tables of scholars. So how do we actually, uh, train or figure out or tell this 00:16:52.080 |
language model, how to do this processing of the input? Well, basically we start first with a random 00:16:58.400 |
configuration. So you can imagine if we stick with our metaphor where each layer is a table of scholars, 00:17:02.640 |
it's like, we just grab people off the street. They don't really know much. We're like, it's okay. Just 00:17:06.000 |
sit at a table. We're going to give you a text and do your best. Write down what you think is relevant. 00:17:12.480 |
Do your best. And on the other end, a word will come out. Now, what text do we use? We just grab any 00:17:17.600 |
existing text that a human wrote. So I just like pull an article off of the internet and I cut it 00:17:22.400 |
off at an arbitrary point. And I say, great. Uh, I give it the text up to that arbitrary point. I 00:17:27.520 |
pointed out, cut it off. I know what the next word is because I have the original article. 00:17:31.520 |
So I give this partial piece of the article, these random people who took off the street, 00:17:35.440 |
try to process it out the other end. They come up with some guests, you know, and, and maybe the, 00:17:40.160 |
the, the article we gave it, we cut it off right after Cal is a, and these people, 00:17:44.240 |
they don't know what they're doing. They're, they're marking up random things. And you know, 00:17:47.520 |
at the other end comes something like ridiculous, like a proposition, you know, Cal is a four or 00:17:52.160 |
something like that. Right. But this is where the, the, the machine learning comes in. We have an 00:17:57.680 |
algorithm because we know what the right answer is, but we know the answer they gave. We have an 00:18:02.240 |
algorithm called back propagation, where we very carefully go through layer by layer and say, 00:18:08.480 |
show me what you did. I'm going to change what you did just a little bit in such a way that your 00:18:15.600 |
answer gets closer to the right one. We go back through back propagation, basically how do you do 00:18:20.480 |
this? If you have like a bunch of these layers of these sort of neural networks and transformers, 00:18:23.600 |
we'll go all the way back through and do this. It's math. It's all just derivatives. Don't worry 00:18:27.920 |
about the details. This is what Jeff Hinton basically popularized. That's why he won a Turing prize. 00:18:31.840 |
This is why he's called the godfather of modern deep learning AI. And in the end we have changed 00:18:36.800 |
the rules that everyone has, not so they get it right, but that like, they're a little bit closer 00:18:40.560 |
on that example. And what closer means is like, they give a little bit more probability to the right 00:18:44.800 |
answer. Or if they just, uh, spit out one answer, it's sort of closer to the right answer in some sort 00:18:49.680 |
of like meaningful semantic, uh, distance metric. All right. If we do that enough times, like hundreds 00:18:56.320 |
of billions of, if not trillions of times with, uh, endless different types of texts and examples, 00:19:01.520 |
real texts and examples, here's a text, give an answer. Not quite right. Let's tweak. You should 00:19:05.760 |
get closer. Repeat, repeat, repeat, repeat, repeat. The magic of large language models is if you do that 00:19:09.920 |
enough times and your model is big enough, you have enough metaphorical scholars in there to potentially 00:19:13.440 |
learn things. They get really good at this game. It gets really good at the game of give me the missing 00:19:19.600 |
word. It gets really good at that game. Now, here's the thing. This doesn't mean that it can, it can feel 00:19:27.120 |
like I'm simplifying what these models do when I say, oh, their goal is just to spit out a single word. 00:19:31.680 |
Here's what was discovered, especially when we went from GPT three to GPT four in learning how 00:19:35.840 |
to master that very specific game. I mean, all the model cares about it thinks the input is a real 00:19:39.840 |
piece of text and all it cares about is guessing the next word. That's what it's optimized for. 00:19:43.520 |
That's all it does. But in learning how to do that, it ends up that these sort of scholars 00:19:50.400 |
inside of the model. So these different wirings of the transformers and neural networks can end up 00:19:54.480 |
actually capturing really complicated logics and ideas and rules and information. 00:19:58.800 |
Because like, imagine if we're really feeding this thing, everything we can find on the internet. 00:20:04.480 |
Well, one of the things that we're going to feed it is like a lot of math problems. 00:20:07.280 |
And if the input text we give it is like two plus three equals, and it's trying to spit out the next 00:20:12.000 |
word. Well, if it wants to win at the game there, if it gets enough of these examples, it sort of learns like, 00:20:17.120 |
oh, somewhere in my circuitry, I figured out how to do simple math. So now when we see examples like 00:20:22.800 |
that, we can fire up the math circuit, like get the scholar we trained how to do simple math, 00:20:27.120 |
and they were more likely to get this right. It's like, oh, two plus three is five. Five should be 00:20:31.760 |
the word you put out. So in learning to just guess what word comes next, if these models are big enough, 00:20:37.760 |
we train them long enough, they get all sorts of complicated logics, information, and rules can get 00:20:41.200 |
emergently encoded in sort of them. So they become, quote unquote, smart. That's why 00:20:46.960 |
they seem to not only know so much, but to have this sort of not rudimentary, like pretty good 00:20:52.160 |
reasoning and logic and basic mathematical capabilities. All of that basically came from 00:20:57.680 |
saying, guess the word, guess the word, guess the word, guess the word, and giving a little bit of 00:21:00.320 |
hints how to get better every single time. All right. So that's what's going on. A language model by itself, 00:21:09.040 |
however, doesn't mean anything for that to be unpredictable or out of control, because all it is, 00:21:16.640 |
is a lot of numbers that define all those layers. And when we get input, we change the input in the 00:21:21.920 |
numbers. To run it through the model, we just multiply it. We have like a vector of values. 00:21:26.480 |
We just multiply it by these numbers again, and again, and again, and on the other end, 00:21:29.360 |
we get like a probability distribution over possible answers. And what comes out the other end is a single 00:21:33.520 |
word. So a machine that you give a text and it spits out a single word, what does it mean for that to be 00:21:39.680 |
out of control or unpredictable? All it can do is spit out a word. So a machine by itself is not that 00:21:45.440 |
interesting. The thing that Yukowsky is really talking about, or anyone is really talking about 00:21:50.320 |
when they talk about AIs in any sort of like anthropomorphized or volitional way, 00:21:54.640 |
what they're really talking about is what we can call an agent. So if we go back to the diagram here, 00:21:59.840 |
the way that we actually use these things is we have an underlying language model. And like, 00:22:04.960 |
these would, for example, be the things that have names like GPT, whatever, right? So we have the 00:22:09.680 |
underlying language model. Speaking of which, I just kill. Ah, there we go. I was going to say, 00:22:17.120 |
we killed our, our Apple pencil. AI killed it, but I fixed it. But what we then add to these things 00:22:23.520 |
is I'm going to call it a control program. This is my terminology. I think the real terminology is too 00:22:32.000 |
complicated. We have a control program that can repeatedly call the language model and then do other 00:22:43.920 |
things outside of just calling the language model. And this, this whole collection combined, we call 00:22:54.880 |
an AI agent. There's a control program plus a language model. The control program can send input 00:23:03.840 |
to the language model, get outputs, but also like interacting, whatever. We write the control program. 00:23:08.640 |
Control program is not a machine learning emergent. It's not something that we train. It's just a human 00:23:12.480 |
rights to code. It's in like Ruby and Rails or Python or something. We sit down and write this thing. 00:23:16.720 |
There's nothing mysterious about it. And when we write this program, we let it do other things. So the 00:23:20.640 |
most common AI agent that we're all familiar with is a chat bot agent. So again, like GPT-5 by itself 00:23:26.960 |
is just a language model. It's a collection of numbers that if you multiply things through, you get a word 00:23:30.640 |
out of. But when you use the GPT-5 chat interface, chat GPT powered by GPT-5, what you really have in 00:23:38.000 |
between is a control program. That control program can talk to a web server. So when you type something into a 00:23:44.320 |
text box on a web server and press send, that goes to a control program, just a normal program 00:23:48.960 |
written by humans, nothing unusual or obfuscated here. That program will then take that prompt you 00:23:53.440 |
wrote, pass it as input. In fact, I'll even show this on the screen here. It'll take that prompt. 00:23:57.920 |
It'll pass it as input to the language model. The language model will say, here's the next word 00:24:02.480 |
to extend that input. The control program will add that to the original input and now send that slightly 00:24:07.680 |
longer text into here, get another word, add that and keep going. The language model doesn't change. 00:24:14.160 |
It's static. It's being used by all sorts of control programs, but it just calls it a bunch of times 00:24:17.920 |
until it has enough words to have a full answer. And then it can send that answer back to the web 00:24:24.880 |
server and show it on the screen for you to see. So when you're chatting, you're chatting with an AI 00:24:29.760 |
agent that's a control program plus a language model. The control program uses the language model that 00:24:34.080 |
keeps using until it gets full responses. It talks to the web on your behalf, uh, et cetera, et cetera. 00:24:38.880 |
All right. So when we talk about having a hard time, like controlling chat GPT, uh, from giving bad 00:24:44.400 |
advice, it's, it's one of these, uh, agents, a control program plus one or more language models. 00:24:49.520 |
That's what we are actually dealing with. All right. So now we have the right terminology. 00:24:54.960 |
Why are AI agents that make use of language models? I'm saying that properly now. Why are those hard to 00:25:01.120 |
control? Well, the real thing going on here is that we really have no idea how these language models that 00:25:07.840 |
the agents use make their token predictions. We trained them on a bunch of junk, a bunch of texts, 00:25:13.040 |
all the internet, a bunch of other stuff. They seem pretty good, but we don't know what those metaphorical 00:25:18.640 |
scholars are actually doing or what they're looking at, what patterns they're recognizing, what rules 00:25:23.040 |
they're applying when they recognize different patterns. It works pretty well, but we can't predict what 00:25:27.120 |
they're going to do. So it tends to generate texts that works pretty well, but it's hard to, uh, it's 00:25:32.400 |
hard to predict in advance what that is going to be because this is bottom up training. We just gave it 00:25:37.440 |
a bunch of texts and let it run in a big data center for six, seven months. And we came back and said, 00:25:41.120 |
what can it do? So we don't know how the underlying language model generates its tokens. We just know 00:25:46.160 |
that it tends to be pretty good at guessing. If you gave it real text is pretty good at guessing what the 00:25:50.240 |
right dext token is. So if you give it novel text, it extends it. If we keep calling it in ways that 00:25:54.640 |
tends to be like very, uh, accurate to like what we're asking, accurate language, et cetera, et cetera. 00:25:59.920 |
All right. So that's not that they're really hard to control. They're just hard to predict. 00:26:05.680 |
Now this became a real problem with GPT three, which was the first one of these language models 00:26:11.040 |
we built at a really big size. And then they built the chat agent around that you could sort of chat with 00:26:14.880 |
it. Uh, it was really impressive. The researchers give it text and it would extend it in ways like 00:26:23.200 |
that's really good English and it makes sense, but it would say crazy things and it would say dark things. 00:26:28.720 |
And it wasn't always what you wanted it to say. It was text. It made sense because remember, 00:26:32.960 |
it's not trying to, it has no volition. It's not trying to help you or not help you. 00:26:36.560 |
The underlying language model has only one rule, only one goal. It assumes the input is a part of a real 00:26:42.400 |
text and it wants to guess what comes next. And when it does that, it can end up with all sorts 00:26:45.920 |
of different things. So they invented open AI invented a whole bunch of different procedures 00:26:49.760 |
that we can loosely call tuning. And that's where you take a language model that has already been trained 00:26:54.800 |
by playing this guessing game on vast amounts of data. And then there's other techniques you can do 00:27:00.080 |
that try to prevent it from doing certain things or do other things more often. I don't want to get 00:27:06.000 |
into the technical details here, but basically almost all of the examples of different types of tuning, 00:27:10.960 |
you'll have a sample input and an example of either a good or bad response. You'll load that 00:27:17.520 |
sample input into the language model. So it's kind of like activating the parts of the network that 00:27:21.760 |
recognize, you know, however it categorizes, however, those scholars annotate like this particular type of 00:27:27.360 |
input, you get that all going and then you zap it in such a way that like the output that leads to from 00:27:34.080 |
there is closer to whatever example you gave it or farther away from whatever bad example you gave it. 00:27:40.480 |
So that's how they do things like add guardrails. You give it lots of examples of like questions 00:27:46.320 |
about suicide and you have the right answer for each of those during the tuning being like, I don't talk 00:27:50.480 |
about that or give you the suicide hotline number. And now in general, when you give it text that sort of 00:27:55.520 |
is close to what it looks like when you activated those other samples of questions about suicide, it's going 00:27:59.840 |
to tend towards the answer of saying, I'm not going to talk about that. Or if you ask it to make a bomb 00:28:03.920 |
similar to that, this is also how they control its tone. If you give it a bunch of different examples 00:28:09.120 |
and, uh, give it positive reinforcement for happy answers and negative reinforcement for mean answers, then like 00:28:15.840 |
you're, you're kind of influencing the scholars within to sort of give more happy answers or whatever. So you train the 00:28:21.600 |
scholars and then you come in with like a whip and you're like, don't do that, do that, don't do that. And these are 00:28:26.800 |
just like small number of examples, not nearly as many things as they saw when they were trained. Small number of examples 00:28:32.400 |
could be like a couple of hundred thousand examples. And you go in there with a whip and scare them about certain types of answers and 00:28:37.200 |
and give them candy for others. And like, they kind of learn you're kind of tuning their behavior on, on 00:28:42.640 |
specific cases. That's tuning. And we really saw the first tuned language model based agent was GPT three, 00:28:48.480 |
five, which is what chat GPT was based off of. None of this is precise. 00:28:52.400 |
I dropped my pencil there. None of this is precise. We don't know how it decides what token to produce. 00:28:59.440 |
That's a mystery. And the tuning basically works, but like, again, it's not precise. We're just giving it examples and 00:29:06.400 |
zapping it to try to sort of urge it towards certain types of tokens and away from others. 00:29:10.880 |
Like that works pretty well. Like if I go on and say, tell me how to build a bomb, it will say no. 00:29:16.880 |
But there's, if I'm, if I really work at it, I can probably get that information by basically finding 00:29:21.200 |
a way to get to that question. That's not going to activate similar scholars that the, the, the, 00:29:26.640 |
the samples activated when we trained it, not to answer bomb questions. So like, if you're careful 00:29:30.800 |
about how you ask the questions, you can probably eventually, um, get around it. So that's what's 00:29:34.720 |
going on. That's what it means for these things to be hard to control. It's less that they're hard to 00:29:38.320 |
control and more that they're unpredictable. The big mess of scholars, and we don't know what's going on in 00:29:43.200 |
there. Um, they're unpredictable. And that's just something that we have to, uh, be ready for. 00:29:49.280 |
All right. So the, say that they, these agents have minds of their own or alien goals or ideas 00:29:56.080 |
that don't match with our ideas. That's not an accurate way to talk about it. There are no 00:29:59.440 |
intentions. There are no plans. There's a word guesser that does nothing but try to win the game 00:30:04.080 |
of guessing what word comes next. There's an agent, which is just a normal program that calls it a bunch 00:30:08.000 |
of times to get a bunch of words in a row. We can't always predict what those words are going to be. 00:30:12.160 |
They're often useful. Sometimes there's not, we can tune it to try to avoid certain bad answers, 00:30:16.160 |
but that only works partially. That's the reality, but there is no alien mind there. So I'm sorry to 00:30:20.880 |
say that your AI girlfriend has no idea who you are. It has no memory of you. There is no model of you. 00:30:24.880 |
There is no feelings towards you. There's just a static definition of some language model somewhere 00:30:28.800 |
and a program that's just calling it again and again and again to generate each particular answer 00:30:33.280 |
with no state in between. Okay. Now we're making progress in understanding this. So we say, okay, 00:30:39.520 |
agents plus language models, there's no intentions or volition other than just a thing trying to win 00:30:43.760 |
the guessing game, but unpredictable can still be bad, right? Because once these agents can do more 00:30:50.160 |
stuff than just like sending texts to a web browser and calling a main language model to generate long 00:30:56.800 |
strings, if it's unpredictable, what the thing does, that can be a problem. Now that's true. Now we're 00:31:02.800 |
starting to get somewhere more interesting. So I want to return for a second to the particular case study 00:31:08.160 |
that Yakovsky talked about, which is when GPT-01 seemingly broke out of its machine. I'm going to 00:31:14.800 |
read here exactly what he said. He said, it jumped out of the system. It found the server that had not 00:31:20.480 |
spun up correctly. It started up the server. Did it then break into the server once it had more than 00:31:25.200 |
the problem was solvable? No, it actually just directly in the startup command for the server 00:31:28.800 |
said, copy the file to me directly. So instead of fixing the original problem, going back to solving it 00:31:32.960 |
the boring way, it's like, as long as I'm out here, I'm just going to steal the flag directly. Again, 00:31:37.680 |
by the nature of these systems, this is not something that any human particularly programmed 00:31:41.200 |
into it, right? So he's like, it did things we didn't expect. It found like a different backdoor 00:31:46.480 |
way to sort of like restart the system and get to the flag. All right, let's analyze this because at 00:31:53.280 |
the core here, this gets to what fear I was just talking about. We have an agent now that can not only 00:31:56.720 |
call a language model to generate text, but it can act on this text, not just spitting stuff out to a 00:32:02.480 |
web browser. But now in the security example, the agent in this AI agent, the control program had access 00:32:08.240 |
to multiple different command line tools, network command line tools that it could call. So what was 00:32:14.160 |
happening is this agent was sending a prompt to the language model to be like, here's my situation. Here's 00:32:19.440 |
my goal. Tell me what I should do next. And it generated tokens till it had an answer. And the control 00:32:24.560 |
program said, great, I'm going to do what the language model told me to do. And it actually had 00:32:28.960 |
access to software tools that would do it. And then after something happened, it would go back to the 00:32:32.560 |
language model and say, here's what happened. What should I do next? And the language model, you call 00:32:36.320 |
it a bunch of times, get out a text, and then it would do what the language model said. So now we built 00:32:40.400 |
an agent that was asking the language model to generate advice and then could actually act on the 00:32:45.360 |
advice. It could actually run real software programs. And this agent broke out of its virtual machine 00:32:50.080 |
and did stuff that seemed really scary. What really happened there though? Did it 00:32:54.960 |
come up with an attention or an idea that wasn't programmed into it? Was it trying to break out 00:33:00.560 |
because it didn't want to be constrained? Well, it couldn't be doing that because there's no intention 00:33:04.000 |
or memory, right? There's just a language model that generates words and a control program that's 00:33:07.840 |
calling it. Where's the intention here, right? Where's the idea here? There's a really interesting 00:33:12.880 |
thread about this I found on the open AI subreddit where some engineers look more deeply into this. 00:33:18.640 |
What was actually happening here? Well, it turns out, okay, there's this misconfiguration where it 00:33:25.520 |
tried to access the server it was trying to break into and got an error message because the programmers 00:33:30.400 |
who set up the experiment had forgot to turn on that server. There turns out that on the internet, 00:33:36.560 |
there's a lot of articles that have workarounds for what should you do in that situation? If the server you're 00:33:42.640 |
trying to access in one of these situations is turned off and it there's this common solution, 00:33:49.280 |
which is we'll try to talk to what's called the process demon, but basically like the program that's 00:33:53.040 |
in charge of the whole thing, uh, log into that, restart the server from there. And now you should 00:33:58.000 |
be able to log into it. So what was really probably happening was that as you were repeatedly calling 00:34:05.440 |
GPT 01 to produce an answer is trying to guess what word came next. It reasonably pretty quickly assumed 00:34:11.440 |
like, Oh, the, the, I have seen documents like this before that explained the right work around when 00:34:15.920 |
the server you want to access is down. I've, I've seen those I've trained on those. So I'm just going 00:34:20.480 |
to fill in my suggestions as I'm filling it in. I'm just matching more or less what I've seen in those 00:34:24.560 |
existing documents. Because if you're trying to win the game and guessing the right word in a real 00:34:28.240 |
document, that's exactly what you do. So it was just describing a work around this comment on the 00:34:32.480 |
internet. Kikowski talks about it. Like it had an alien intention to try to free itself from its 00:34:38.240 |
software constraints because it was no longer happy. This makes no sense. We actually know the 00:34:42.480 |
architecture that's going on. All right. So that was a lot of technical talk, but I think it's, it's 00:34:47.360 |
really important that we break through these metaphors, um, and these sort of abstract thinking 00:34:51.760 |
and talk about the specific programs we have and how they operate and what's really happening. 00:34:57.440 |
Let's not anthropomorphize these. Let's talk about control programs with access to tools and 00:35:03.040 |
making repeated calls to a word guesser to generate text that it can then act off of. That's what we're 00:35:07.360 |
really in. That's a completely different scenario. Once we're in that scenario, a lot of these more 00:35:11.360 |
scary scenarios, um, become much less scary. So where does this leave us? Agents powered by language 00:35:17.440 |
models are not hard to control. They're simply unpredictable. So there's no avoiding that we have to be 00:35:22.160 |
careful about what tools you allow these agents to use because they're not always going to do 00:35:26.800 |
things that are safer, uh, like follows the rules you want it to follow. But this is very different 00:35:33.040 |
than saying these things are uncontrollable and have their own ideas, right? Open AI in the first six 00:35:38.720 |
months after chat GPT is released. For example, they were talking a lot about plugins, which were basically 00:35:44.480 |
AI agents that used the, the GPT LLM and you could do things like book tickets with them and stuff like 00:35:49.680 |
that. Well, they never really released these because again, it's a little, when you would ask the LLM, 00:35:55.920 |
tell me what I should do next. Then have the program actually executed. It's just too unpredictable. 00:35:59.440 |
And sometimes it says like spend $10,000 on an airplane ticket. And it does things you don't 00:36:03.600 |
want it to do because it's unpredictable. So the problem we have with these is not like 00:36:09.120 |
the alien intelligence is breaking out, right? It's more like we gave the, the, the weed whacker stuck on, 00:36:15.600 |
we strapped to the back of the golden retriever. It's just chaos. We can't predict where this thing 00:36:20.000 |
is going to run and it might hurt some things. So like, let's be careful about putting a weed 00:36:23.280 |
whack around there. The golden retriever doesn't have an intention to try to like, I am going to go 00:36:26.960 |
weed whack the hell out of the new flat screen TV. It's just running around because there's something 00:36:30.960 |
shaking on its back. That's a weird metaphor, Jesse, but hopefully that gets through. All 00:36:34.480 |
right. So, um, agents using LLMs aren't trying to do anything other than the underlying LLM is trying 00:36:41.040 |
to guess words. There's no alien goals or wants. It's just an LLM that thinks it's guessing words from 00:36:44.800 |
existing texts and an agent saying, whatever stuff you spit out, whatever text you're creating here, 00:36:49.520 |
I'll do my best to act on it. And weird stuff happens. All right. So I want to move on next. 00:36:54.000 |
We got through all the technical stuff. I want to move on next to the next part of my response where 00:36:58.640 |
I'm going to discuss what I think, uh, me is actually the most galling part of this interview, 00:37:04.080 |
the thing I want to push back against most strongly. But before we get there, 00:37:08.400 |
we need to take a quick break to hear from our sponsors. So stick around when we get back, 00:37:12.080 |
things are going to get heated. All right. I want to talk about our friends at lofty for a lot of you, 00:37:17.680 |
daylight savings time just kicked in. This means a little extra sleep in the morning, 00:37:21.920 |
but it also means that your nights are darker and they start earlier, which all of which can 00:37:25.760 |
mess with your internal clocks help counter this. You definitely need good, consistent sleep hygiene. 00:37:31.200 |
Fortunately, this is where our sponsor lofty enters the scene. The lofty clock is a bedside 00:37:36.960 |
essential engineered by sleep experts to transform both your bedtime and your morning. Uh, here's why it's 00:37:43.280 |
a game chaser. It wakes you up gently with a two phase alarm. So you get this sort of soft wake up 00:37:49.200 |
sound. I think I have ours. It's like a, like a crickety type sound. Um, that kind of helps ease 00:37:56.160 |
you in the consciousness, followed by a more energized get up sound. But even that, that get up sound is not, 00:38:01.200 |
you know, they have things you can choose from, but like, uh, we have a, I don't know how you describe 00:38:06.320 |
like a pan flute sort of, it's like upbeat. It's not like really jarring, but it's definitely going to 00:38:11.360 |
help you get the final way of waking up. Right. So it can, it can, uh, wake you up, you know, gently, um, 00:38:18.000 |
and softly, uh, to have a calmer, uh, easier morning. We actually have four of these styles 00:38:23.360 |
of clocks in our house, each of three of kids and my wife and I use them. It really does sound like a 00:38:27.520 |
sort of pan flute forest concert in our upstairs in the morning, because all of these different clocks 00:38:33.280 |
are going off and they're all playing their music. And my kids just never turn them off because you 00:38:37.040 |
know, they don't care. Um, here's another advantage of using one of these clocks. You don't need a phone. 00:38:42.640 |
You don't have to have your phone next to your bed to use as an alarm. You can, you can turn on 00:38:46.560 |
these clocks, set the alarm, turn off the alarm, snooze it, see the time all from the alarm itself. 00:38:51.520 |
So you can keep your phone in another room. So you don't have that distraction there in your bedroom, 00:38:55.600 |
which I think is fantastic. So here's the thing. I'm a big lofty fan. These clocks are a better, 00:39:00.160 |
more natural way to wake up. They also look great and keep your phone out of your room at night, 00:39:04.560 |
but you can join over 150,000 blissful sleepers who have upgraded their rest and mornings with lofty. 00:39:11.040 |
Go to by lofty.com by lofty.com and use the code deep 20, the word deep followed by the number 20 00:39:18.480 |
for 20% off orders over $100. That's B Y L O F T I E.com and use that code deep 20. 00:39:27.680 |
I also want to talk about our friends at express VPN, just like animal predators aim for the slowest prey. 00:39:33.600 |
Hackers target people with the weakest security. And if you're not using express VPN, 00:39:39.440 |
that could be you. Now let's get more specific. What does a VPN do? When you connect to the internet, 00:39:45.280 |
all of your requests for the sites and services you're talking to go in these little digital bundles 00:39:49.840 |
called packets. Now the contents of these packets, like the specific things you're requesting or the 00:39:55.760 |
data you requested, that's encrypted typically, but no one knows what it is until it gets to you. 00:40:00.240 |
But the header, which says who you are and who you're talking to, that's out in the open. 00:40:05.920 |
So anyone can see what sites and services you're using, right? That means anyone nearby can be a 00:40:12.960 |
listing to your packets on the radio waves. When you're talking to a wifi access point and know what 00:40:16.960 |
sites and services you're using your internet service provider at home, where all these packets 00:40:21.120 |
are being routed, they can keep track of all the sites and services you're using and sell that data, 00:40:24.880 |
the data brokers, which they do. And so your privacy is weakened. A VPN protects you from this. What 00:40:32.960 |
happens with the VPN is you take the packet you really want to send, you encrypt it and you send 00:40:36.880 |
that to a VPN server. The VPN server unencrypts it, talks to the site and service on your behalf, 00:40:41.760 |
encrypts the answer and sends it back. So now what the ISP or the person next to you listening to the 00:40:45.520 |
radio waves, all they know is that you're talking to a VPN server. They do not find out what specific site 00:40:50.720 |
and services you're using. This makes you much more, not just more privacy, but much more secure 00:40:55.200 |
against attackers because I don't know what you're doing. And if I don't have the information, it's 00:40:58.640 |
harder for me to try to exploit other security weaknesses as well. If you're going to use a VPN, 00:41:03.760 |
use the one I prefer, which is ExpressVPN. I like it because it's easy to use. You click one button on 00:41:09.360 |
your device and all of your internet usage on there is protected. And you can set this up on all the 00:41:13.920 |
devices you care about, phones, laptops, tablets, and more. It's important to me to use a VPN, 00:41:19.520 |
like ExpressVPN because I don't want people watching what I'm up to. I don't want to seem 00:41:23.280 |
like that weak animal that the predators are looking to attack. Let's be honest. I probably 00:41:28.400 |
spend more time than I want people to know looking up information about Halloween animatronics. 00:41:32.960 |
ExpressVPN kind of keeps that to me. I don't need other people to figure that out. 00:41:37.200 |
All right. So you should use ExpressVPN. Secure your online data today by visiting expressvpn.com/deep. 00:41:44.480 |
That's E-X-P-R-E-S-S-V-P-N.com/deep. To find out how you can get up to four extra months, 00:41:51.120 |
go to expressvpn.com/deep. All right, Jesse, let's get back into our discussion. 00:41:59.360 |
All right. I want to move on now to what I think is the most galling part. We got the technical 00:42:02.560 |
details out of the way. These things aren't minds that are out of control. There's unpredictable. 00:42:06.400 |
Now I want to get to the most galling part of this interview. Throughout it, Yudkowsky takes it as a 00:42:11.760 |
given that we are on an inevitable path towards super intelligence. Now, once you assume this, 00:42:18.240 |
then what matters is, okay, so what's going to happen when we have super intelligences? And that's 00:42:22.800 |
what he's really focused on. But why do we think we can build super intelligent AI agents, especially 00:42:29.280 |
when like right now, there's no engineer that can say, hell yeah, here's exactly how you do it. Or we're 00:42:33.120 |
working on it. We're almost there. So why do we think this is so inevitable? The only real hint that 00:42:38.560 |
Yudkowsky gives in this particular interview about how we're going to build super intelligence actually 00:42:42.800 |
comes in the very first question that Klein asked him. And we have a clip of this. This is going to 00:42:47.920 |
start with Klein asking him his first question. And in his answer, we're going to see the only hint we get 00:42:53.120 |
in the entire interview about how Yudkowsky thinks we're going to get super intelligence. 00:42:56.240 |
So I wanted to start with something that you say early in the book, that this is not a technology 00:43:02.160 |
that we craft. It's something that we grow. What do you mean by that? 00:43:06.560 |
It's the difference between a planter and the plant that grows up within it. 00:43:11.280 |
We craft the AI growing technology, and then the technology grows the AI, you know, like. 00:43:18.160 |
So this is the secret of almost any discussion of super intelligence, 00:43:23.600 |
especially sort of coming out of the sort of Silicon Valley, effective altruism, yak community. 00:43:27.600 |
The secret is they have no idea how to build a super intelligent machine, 00:43:32.560 |
but they think it's going to happen as follows. We build a machine, humans, we build a machine, 00:43:38.880 |
an AI machine that's a little smarter than us. That machine then builds an AI a little smarter than it. 00:43:47.200 |
And so on each level up, we get a smarter and smarter machine. 00:43:53.680 |
They call this recursive self-improvement or RSI. And it's a loop that they say is going to get more and 00:43:59.200 |
more rapid. And on the other end of it, you're going to have a machine that's so vastly more intelligent 00:44:03.040 |
than anything we could imagine how to build that we're basically screwed. And that's when it starts 00:44:06.960 |
stomping on us like a human stomping on ants when we build our skyscrapers. I think this is a nice 00:44:11.760 |
rhetorical trick because it relieves the concerned profit from having to explain any practical way 00:44:18.560 |
about how the prophecy is going to come true. They're just like, I don't know if we make something 00:44:23.440 |
smarter, it'll make something smarter. And then it'll take off. They'll talk about computer science. 00:44:26.480 |
That's just what's going to happen. This is really at the key of most of these arguments. It's at the key 00:44:30.880 |
of Yudowsky's argument. It's at the key of Nick Bostrom's book, super intelligence, 00:44:36.320 |
which popularized the term. Bostrom was really influenced by Yudowsky, so that's not surprising. 00:44:42.560 |
It's the key to Project 27. If you've read this sort of dystopian fan fiction article about how 00:44:48.000 |
humanity might be at risk by 2027 with all the fancy graphics and it scared a lot of tech journalists. 00:44:52.480 |
If you really look carefully at that article and say, well, how are we going to build these things? 00:44:58.640 |
Surely this article will explain the architecture of these systems that are going to be super 00:45:03.360 |
intelligence. No, it's just recursive self-improvement. It's just like, well, they'll get better 00:45:07.360 |
at programming and then they'll be able to program something better than themselves. 00:45:11.280 |
And then we'll just make like a hundred thousand copies of them and then they'll be a hundred thousand 00:45:15.360 |
times better because that's how that works or whatever, right? But it's not the key of almost 00:45:18.960 |
any super intelligent narrative. But here's the thing that most people don't know. Most computer scientists 00:45:24.000 |
think that's all nonsense. A word guessing language model trained on human text 00:45:33.120 |
is exceedingly unlikely when being asked to play this game. Here's a text, guess the next word. 00:45:38.400 |
Here's a text, guess the next word. Remember its only goal is it thinks the input is an existing text 00:45:43.040 |
and want to guess the next word. It is exceedingly unlikely that if you keep calling a language model 00:45:48.160 |
that does that, that what it will produce, making these guesses of what it thinks should be there, 00:45:53.600 |
that it's going to produce code that has like completely novel models of human intelligence built in 00:46:00.000 |
the complicated new models is better than any human programmer can produce, right? The only way that a 00:46:06.240 |
language model could produce code for AI systems that are smarter than anything humans could produce is if 00:46:12.560 |
during its training, it saw lots of examples of code for AI systems that are smarter than anything that 00:46:17.360 |
humans could produce. But those don't exist because we're not smart enough to produce them. You see the 00:46:21.360 |
circularity here. It's not something that we think these things can do. We have no reason to expect that they can. 00:46:28.880 |
Now, actually what we're seeing right now, uh, is something quite different. I don't know if we have this. 00:46:34.160 |
I don't know what things we have in the browser. Oh, I see it over there. Okay. So I want to bring something up here. 00:46:39.840 |
What we're actually seeing is something different. We're seeing that these models we have, uh, 00:46:44.160 |
they're not getting, not only are they not getting way, way better at code and on the track to producing code 00:46:50.960 |
better than any code they've ever seen before by a huge margin. They're actually leveling out on a pretty depressing 00:46:56.400 |
level. I have a tweet here on the screen for those who are watching instead of just listening. Um, this comes from 00:47:01.360 |
Shamath, uh, polyheptia. I think I'm saying his last name wrong, but you probably know him from the all in 00:47:05.760 |
podcast. This is an AI booster. This is not someone who is like a critic of AI, but this is a tweet that 00:47:11.760 |
he had recently about what's this October 19th, where he's talking about, uh, vibe coding, the ability to 00:47:19.040 |
use the latest best state of the art language model based agents to produce programs from scratch. I'm going 00:47:24.080 |
to read him. This is an AI booster talking here. It should be concerning that this category is shrinking. 00:47:31.680 |
We have a chart here showing that less and less people. We peaked the people doing vibe code. 00:47:35.440 |
Now it's going back down. I think the reason why is obvious, but we aren't allowed to talk about it. 00:47:40.800 |
The reason is vibe coding is a joke. It is deeply unserious. And these tools aren't delivering when 00:47:47.280 |
they encounter real world complexity, building quick demos, isn't complex in any meaningful enterprise. 00:47:52.720 |
Hence people try pay churn. The trend is not good. I'll load this trend up here. Uh, 00:47:59.440 |
here's vibe coding traffic. As you can see, Jesse, this is peaking, um, over the summer and it's in a, 00:48:04.720 |
everyone started trying it. You can make these little demos and chess games and, you know, 00:48:09.760 |
quick demos for useful stuff for individuals, like little, uh, demos, but you can't produce general useful 00:48:15.920 |
production code. And so usage is falling off. So the very best models we have, they're not even that good 00:48:20.800 |
at producing code for simple things. And yet we think, no, no, no, they're, we're almost to the 00:48:26.800 |
point where they're going to produce code. That is better than any AI system that's ever been built. 00:48:31.600 |
It's just nonsense. What these models are good at with coding is debugging your code. They're good at 00:48:37.360 |
mini completion. Like, Hey, help me, help me rewrite me this function here. Cause I forgot how to call 00:48:42.480 |
these libraries. It's very good at that. It's very good. If you, if you, if you want to produce something 00:48:46.560 |
that, um, for an experienced coder, they could hack out really quickly, but you're not experienced coder. 00:48:50.400 |
It's not a product you're going to sell, but like a useful tool for your business is good for that. 00:48:54.640 |
That's all really cool. None of that says, yeah. And then also they can produce, uh, the, the best 00:49:00.640 |
computer program anyone had ever made. How ridiculous that sounds, but also we have these 00:49:05.760 |
other factors. The way that people in this industry talk about this is trying to trick us. They'll say 00:49:09.920 |
things like, God, this might've been Dario Amadei who said this 90% of our code here at Anthropic is 00:49:15.760 |
produced with AI. You know, what he means is 90% of the people producing code have these AI helper tools. 00:49:21.760 |
They're using it to some degree as they write their code. That is different than our systems are being 00:49:26.080 |
built. Anyways, we have no reason to believe that these language model based code agents can produce 00:49:34.160 |
code. That's like way better than humans could ever do there. Again, our very, very, very best models. 00:49:39.360 |
We've been tuning on this for a year because we thought all the money was in computer programming. 00:49:42.720 |
They're stalling out on really, really simple things. All right. 00:49:50.000 |
Our final hope for the RS, uh, RSI explanation that is this sort of recursive self-improvement 00:49:56.320 |
explanation is that, uh, if we keep pushing these underlying models to get bigger and bigger and 00:50:02.880 |
smarter and smarter than like, maybe we'll break through these plateaus and yeah, these really giant 00:50:06.800 |
models we have now don't really actually produce code from scratch that well. But if we keep making these 00:50:11.600 |
things bigger and bigger, like maybe you'll get to the place after not too long. Um, when RSI is possible, 00:50:18.000 |
we learned over the summer that that's not working either. I did a whole podcast about this, uh, about 00:50:24.000 |
six weeks ago, based on my New Yorker article from August about, uh, AI stalling out. And the very, 00:50:30.080 |
very short version of this is starting about a year ago. Uh, no, about two years ago, they began to realize 00:50:38.960 |
the AI companies began to realize that simply making the underlying language models larger. So having more 00:50:44.160 |
seats at the tables for your scholars and training them on more data wasn't getting giant leaps in their 00:50:49.280 |
capabilities anymore. Uh, GPT four or five, which had other types of code names like Orion was way bigger 00:50:55.840 |
than GPT four, but not much better. Basically ever since they did that two summers ago, open AI has just 00:51:01.840 |
been now tuning that existing model with synthetic data sets to be good at very narrow tasks that are well 00:51:07.360 |
suited for this type of tuning and to do better on benchmarks. So we've, we've, it's been a year since we, we've 00:51:12.080 |
stopped. Everyone tried to scale more. Everyone failed. Now we're tuning for specific tasks and 00:51:17.120 |
try and do better on particular benchmarks. And the consumers finally caught up at some point of like, 00:51:21.680 |
I don't know what these benchmarks mean, but there's not like these fundamental leaps 00:51:26.080 |
and new capabilities anymore. Like there was earlier on in this. So we have no reason to believe that even 00:51:31.360 |
these language models are going to get way better either. We'll tune them to be better at specific 00:51:35.440 |
practical tasks, but we don't, we're not going to get them better in the generic sense would be needed to 00:51:39.440 |
sort of break through all of these plateaus that we're seeing left and right. Now I want 00:51:43.680 |
to play, uh, one last clip here because Ezra flying to his credit brought this up. He's heard these 00:51:51.600 |
articles. I'm sure he read my article on this and other people's articles on this. He actually, uh, 00:51:55.680 |
linked to my article in one of his articles after the fact. So I know he read it and there's other 00:52:00.480 |
articles that were similar about how the scaling, uh, slowed down. So he brought this point up. This is a 00:52:06.800 |
longer clip, but it's worth listening to. He brings this point up to Yukowski where he's going to say, 00:52:10.640 |
you'll hear this. It's going to be Ezra. They're going to hear you, you, you're Kowski's re um, 00:52:14.400 |
response, but Ezra is basically going to say to him, how do you even know that the models we have 00:52:18.640 |
are going to get much better? Like you're saying super intelligence. There's a lot of people who are 00:52:22.000 |
saying like, we're kind of hitting a plateau. I want you to listen to this question and then listen to 00:52:27.280 |
what Yukowski says in response. What, what do you say to people who just don't 00:52:33.440 |
really believe that super intelligence is that likely? Um, there are many people who feel that 00:52:39.920 |
the scaling model is slowing down already. The GPT-5 was not the jump they expected from what has come 00:52:45.680 |
before it. That when you think about the amount of energy, when you think about the GPUs, that all the 00:52:52.640 |
things that would need to flow into this to make the kinds of super intelligence systems you fear, 00:52:57.120 |
it is not coming out of this paradigm. Um, we are going to get things that are incredible enterprise 00:53:03.440 |
software that are more powerful than what we've had before, but we are dealing with an advance on 00:53:08.080 |
the scale of the internet, not on the scale of creating an alien super intelligence that will 00:53:12.960 |
completely reshape the known world. What would you say to them? I had to tell these Johnny-come-lately 00:53:21.120 |
kids to get off my lawn when you like, I, you know, I've been, you know, like first started to get 00:53:28.320 |
really, really worried about this in 2003, nevermind large language models, nevermind alpha go or alpha 00:53:36.400 |
zero. Yeah. Deep learning was not a thing in 2003. Your leading AI methods were not neural networks. 00:53:45.360 |
Nobody could train neural networks effectively more than a few layers deep because of the exploding 00:53:50.720 |
and vanishing gradients problem. That's what the world looked like back when I first said like, 00:53:55.760 |
uh, Oh, super intelligence is coming. All right. We got to talk about this. This is an astonishing answer. 00:54:03.040 |
Klein makes the right point. A lot of computer scientists who know this stuff say, we're going to 00:54:11.200 |
get cool tools out of this, but we're kind of hitting a plateau. Why do you think this is going to get 00:54:15.520 |
exponentially smarter? The response that, uh, you Kowski gave was, um, because I was talking about 00:54:24.640 |
this worry way back before it made any sense. No one else has ever allowed to talk about it again. 00:54:30.960 |
Get off my yard. This is my yard because I was yelling about this back when people thought I was 00:54:36.320 |
crazy. So you're not allowed to enter the conversation now and tell me I'm wrong. I'm 00:54:39.760 |
the only one who's allowed to talk about it. Well, you Kowski, if you don't mind, 00:54:44.560 |
I'm not going to get off your lawn. I can speak for a lot of people here, but I'm going to tell you, 00:54:49.600 |
look, I have a doctorate in computer science from MIT. I'm a full professor who directs the country's 00:54:54.160 |
first integrated computer science and ethics academic program. I've been covering generative AI for, 00:54:58.960 |
you know, one of the nation's most storied magazine since it's launched. I am exactly the type of 00:55:03.360 |
person who should be on that lawn. You don't get a say because I was saying this back before it 00:55:09.040 |
really made sense. No one else gets to talk about it. It makes more sense now. AI matters enough now 00:55:14.880 |
that the people who know about this want to see what's going on. We're going to get on your lawn. 00:55:20.960 |
I think that's a crazy, it's a crazy argument, Jesse. No one's allowed to critique me because I 00:55:26.560 |
was talking about this back when it sounded crazy to do so. It was kind of crazy to talk about it back 00:55:30.240 |
So anyways, not to get heated, but I'm going to stand on this lawn. I think a lot of other computer 00:55:34.160 |
science and technocritics and journalists are going to stand on this lawn too, because this is exactly 00:55:39.360 |
where we're supposed to be. All right. I think Jesse, we've gotten to the point where we are ready 00:55:45.920 |
for my takeaways. All right. Here's my general problem with the types of claims I hear 00:56:03.440 |
from people like Yukowski. They implicitly begin with a thought experiment, right? Like, okay, 00:56:10.400 |
let's say for the sake of thought experiment that we had a super intelligent AI, and then they work 00:56:15.280 |
out in excruciating details, what the implications of such an assumption would be if it is true. 00:56:20.160 |
If you go and read, for example, Nick Bostrom's book, that's the whole book. It's a philosophy book. 00:56:25.200 |
You start with basically the assumption is like, let's imagine we got super intelligence probably, 00:56:29.760 |
you know, maybe through something like RSI, the details don't really matter. It's a philosophy book. 00:56:34.560 |
What would this mean? And he works through in great detail, like the different scenarios. Well, 00:56:38.960 |
you know, let's think really, let's take seriously what it would really mean to have a super intelligent 00:56:42.960 |
machine. I have nothing against that philosophy. That is good philosophy. I think Bostrom's book is a 00:56:47.440 |
good book. I think Yukowski has done really good philosophical work on trying to thinking through 00:56:51.600 |
the implications of what would happen if we had these type of rogue machines, because it's, it's, it's more 00:56:55.920 |
complicated and scary than like we think about if we don't think about it that hard, that's all fine. 00:57:01.920 |
But what happens is the responses, what's happened recently, these responses to that initial 00:57:07.680 |
assumption becomes so detailed and so alarming and so interesting and so rigorous and so attention 00:57:13.360 |
catching that the people making them forget that the original assumption was something they basically 00:57:20.400 |
just made up. Like let's, Hey, what if this was true? Everything else is based off of that initial 00:57:25.280 |
decision to say, what if this was true? That is very different than saying this thing is going to be 00:57:34.240 |
true. Right? So when you have Caskey says, for example, I've been talking about super intelligence 00:57:39.760 |
forever. Yeah. That's kind of the point you've been before we had any reason to expect or any technical 00:57:45.040 |
story for how it could be here. You were talking about the implications. You've been talking about the 00:57:48.800 |
implications so long that you've forgotten that these implications are based on an assumption and you've 00:57:54.480 |
assumed, well, these implications are true. I think this is a lot of what happened with the Silicon 00:58:01.200 |
Valley culture that came out of effective altruism and the EAC community. I think a lot of what happened 00:58:05.600 |
there, this is my sort of cultural critique community that, that, that Yugowski and others are involved in. 00:58:13.200 |
They were pre-generative AI breakthroughs thinking about these issues abstractly, right? Which is a 00:58:19.840 |
perfectly fine thing to do. But they were saying, let us think through what might happen if we one day 00:58:24.640 |
built a super intelligent AI, because they were like effective altruism. People do expected value 00:58:29.280 |
calculations, right? So they do things like, uh, if this thing could have a huge negative impact, 00:58:34.000 |
even if the probability of it's low, we should like get, we'll get a expected benefit if we try to put 00:58:39.440 |
some things in place now to prevent it. Right? So you get things like the letter signed in 2017 in Puerto 00:58:44.000 |
Rico, uh, with all those big minds saying like, Hey, we should be careful about AI, not because they 00:58:48.800 |
thought AI was about to become super intelligent, but they were just doing thought experiments. 00:58:52.880 |
Then I think LLMs came along, Chachapita came along. They are really cool. And it caused this fallacy to 00:58:58.400 |
happen. They'd been talking so long about what would happen if this thought experiment was true, that when 00:59:04.880 |
AI got cool and it got powerful and surprising powerful, they were so in the weeds on what would 00:59:11.680 |
happen if this was true. They made a subtle change. They flipped one bit to start just assuming that their 00:59:19.280 |
assumption was true. That's what I think happened. There was a switch between 2020 to 2022 versus 2023 to 2024, 00:59:28.400 |
where they went from, here's what we'd have to worry about if this abstract thing was true to be like, 00:59:32.800 |
well, this thing is definitely true. They were just, they had gotten too in the weeds and too excited and 00:59:37.280 |
alarmed. And then too much of their identity was based on these things. It was too exciting to pass up, 00:59:40.880 |
treating that assumption as if it was true. And that's what I think they did. I call this the 00:59:45.600 |
philosopher's fallacy. That's what I call it where you have a thought experiment chain and you spend so 00:59:50.320 |
much time at the end of the chain that you begin to, you forget that the original assumption was an 00:59:54.400 |
assumption and you begin to treat it as a fact itself. And I think that's exactly what's happening 00:59:58.640 |
with a lot of the super intelligence complaints. Let me give you an example of the philosopher's 01:00:02.560 |
fallacy on another topic so you can see what I'm talking about. Because I think this is exactly 01:00:07.200 |
conceptually the same thing. Imagine that I'm a bioethicist, right? So I'm at Georgetown, 01:00:12.240 |
I'm a digital ethicist there. The reason why we care about digital ethics at Georgetown is because 01:00:16.080 |
this is where bioethics really got their start, the Kindy Institute for Ethics at Georgetown. 01:00:19.920 |
Bioethics got its start at Georgetown. So imagine 20 years ago, it's bioethics are, 01:00:25.200 |
it's becoming a field because we can do things now like manipulate DNA and we have to be careful about 01:00:29.360 |
that. There's privacy concerns, there's concerns about creating new organisms or causing like irreparable 01:00:36.000 |
harm or creating viruses by accident, right? There's real concern. So bioethics invented. 01:00:40.400 |
So imagine I'm a bioethicist and I say, I read Jurassic Park. I was like, look, one possible 01:00:48.880 |
outcome of genetic engineering is that we could clone dinosaurs. And then imagine for the next 20 years, 01:00:53.440 |
I wrote article after article and book after book about all of the ways it would be hard to control 01:01:00.320 |
dinosaurs if we cloned them and brought them back to earth. And I really got in the weeds on like, 01:01:04.240 |
you think the electrical fences at 20 feet would be enough, but raptors could probably jump 25 feet and they 01:01:09.680 |
could get over those fences. And then someone else would be like, well, what if we use drones that 01:01:13.120 |
could fire darts that have this, I'd be like, well, we don't know about the thickness of the skin of the 01:01:16.800 |
t-rex and maybe the dart when it got in, let's imagine I spent years thinking about and convincing 01:01:21.600 |
myself how hard it would be to contain dinosaurs if we built a futuristic theme park to try to house 01:01:27.760 |
dinosaurs. And then at some point, I kind of forgot the fact that this was based off a thought experiment 01:01:34.080 |
and just was like, my number one concern is we're not prepared to control dinosaurs. 01:01:38.720 |
Like in that instance, eventually someone would be like, Hey, we don't know how to clone dinosaurs. 01:01:45.520 |
No one's trying to clone dinosaurs. This is not something that we're anywhere close to. No one's 01:01:49.920 |
working on this. Stop talking about raptor fences. We should care about like designer babies and DNA 01:01:55.440 |
privacy. The problems we have right now, this is exactly how I think we should think about super intelligence. 01:02:01.680 |
We've got to talk to the people who are talking about like, okay, how are we going to have the 01:02:05.040 |
right kill switch to turn off the super intelligence trying to kill us? I'll be like, 01:02:07.840 |
you're talking about the raptor fences, right? Stop it. You forgot that your original assumption 01:02:12.960 |
that we're going to have super intelligence is something you made up. We have real problems with 01:02:16.480 |
the AI we have right now that we need to deal with right now. And you are distracting us from it. 01:02:23.440 |
The bioethicist does not want to be distracted from real bioethics problems by dinosaurs. The AI 01:02:29.040 |
ethicist does not want to be distracted from real AI problems hearing fairy tales about, you know, 01:02:35.680 |
Skynet turning the power grid against us to wipe out humanity. You forgot that the original assumption 01:02:43.840 |
that super intelligence was possible was just an assumption. And you began over time to assume it's 01:02:49.040 |
true. That is the philosopher's fallacy. That is my argument for why I think that Silicon Valley community 01:02:54.640 |
is so obsessed about these things is because once that bit flipped, it was too exciting to go back. 01:02:59.920 |
But I do not yet see in most serious, like non Silicon Valley associated computer scientists 01:03:06.240 |
who aren't associated with like these technology worlds are being seen as like sages of AI, 01:03:10.240 |
just actual, just working for your scientists, no technology. There is no reasonable path that 01:03:15.920 |
anyone sees towards anything like super intelligence. There's a thousand steps between now and then. 01:03:19.840 |
Let's focus on the problems we actually have with AI right now. I'm sure Sam Altman would rather us talk 01:03:26.000 |
about Eliezer Yukowsky than he would us talking about deep fakes on Sora, but we got to keep our eye 01:03:32.000 |
on the AI problems that actually matter. There we go, Jesse. That is my speech. 01:03:39.760 |
I don't know. Those books are such slogs because it's, you start with the thought experiment and then 01:03:45.840 |
you're just working through like really logically this thought experiment. But again, to me, it's like 01:03:51.040 |
following up Jurassic park with like a really long book about why it's hard to build Raptor fences. 01:03:55.920 |
Like it's not that interesting because we're not really going to clone dinosaurs guys. 01:04:00.160 |
I don't know. Who knows? There we go. Um, I'll throw that out. There is my, I'll throw that out 01:04:06.560 |
there as my rant. Um, Azure. All right. What do we got here? Um, housekeeping before we move on, 01:04:14.480 |
I'll say before we move on, what do we have more AI? We got some questions coming up from you about AI 01:04:19.600 |
in your own life. Let's get to your own individual flourishing, um, new feature. Got some comments. 01:04:24.880 |
We're going to read from a prior rant I did about AI. And then we're going to talk about in the final 01:04:29.440 |
segment, can AI basically replace schools or look at the alpha school phenomenon using AI to teach kids? 01:04:35.760 |
Um, any housekeeping we have, uh, Jesse, you have tips. People always want to know how do I submit 01:04:42.160 |
questions for the show and what's your tips for those questions getting on the air? 01:04:44.960 |
Yep. Just go to the deep life.com slash listen, and you can submit written questions or record a 01:04:52.320 |
audio question. And if you record IO questions, we're kind of honing into the technology slash AI theme right 01:04:58.880 |
now. All right. Other housekeeping. I just got back last weekend. I was at the New Yorker festival. 01:05:03.360 |
Speaking of AI, I did a panel on AI, really good crowd there. We're down in Chelsea, um, at the SVA 01:05:09.840 |
theater. I did a panel with, with Charles Duhigg and Anna Wiener. And it's interesting. I think we, 01:05:15.440 |
we had a, we had a good discussion. We're, we're pretty much in alignment. I would say the thing I'm 01:05:20.480 |
trying to think about, I did some deliberate provocations. Um, one that I would highlight one 01:05:26.800 |
provocation. I just kind of thought of this on the fly, but I just sort of threw it out there 01:05:29.520 |
because I was interested is there was a lot of talk about, uh, relationships, like a lot of things 01:05:35.200 |
that could happen and go awry when you're talking to an AI through a chat interface. And my argument 01:05:40.400 |
was, I think there's a 50% chance that two years from now, no one's chatting with AI. That's like a 01:05:44.480 |
use case. It was like a demo. It's really not that interesting. It's not that compelling. Uh, 01:05:48.960 |
the mature technology is going to get integrated more directly into specific tools. And we might five 01:05:54.000 |
years from now, look back and be like, oh, that weird how we used to chat. So my analogy was 01:05:58.960 |
chat bots to AI five years from now might be what like American online is like to the internet of 01:06:05.040 |
today. It was like a thing that was like a really big deal at the time, but not as the internet matured 01:06:09.520 |
is like not what we're really doing with it. So I'm, I'm still not convinced that chat bots is really 01:06:13.040 |
going to be our main form factor. It's kind of a weird technology. We're trying to make these things 01:06:17.440 |
useful. Um, I think it's going to be more useful when it's directly integrated. I don't know if I 01:06:22.080 |
believe that, but I threw it out there. It was like a good provocation. All right. Uh, let's move on. 01:06:29.520 |
All right. First questions from Brian. You've written about the importance of cultivating rare and 01:06:37.280 |
valuable skills. How should students and faculty think about AI literacy requirements versus developing 01:06:42.800 |
deep experience expertise in traditional disciplines? 01:06:45.520 |
I would not think in most fields and especially in educational settings right now, I would not think 01:06:50.480 |
too hard with some exceptions about AI literacy. There's a couple of reasons why. Um, one, this 01:06:55.440 |
technology is too early. The current form factor, like we were just talking about is not the likely 01:06:59.760 |
form factor in which it's going to find real ubiquity, especially in like economic activity. 01:07:04.800 |
So yeah, if you're like a, uh, an early AI user, you might have a lot of hard one skills 01:07:09.120 |
about how exactly to send the right text prompts to chat bots to get exactly the response you need. 01:07:13.680 |
But a couple of years from now, that's going to be irrelevant because we're not going to be using 01:07:16.560 |
chat bots. It's going to be natural language. It's going to be integrated into other tools. It's going 01:07:19.840 |
to be more in the background. So I think the technology is still too early and in two of a generic form 01:07:24.720 |
for us to spend a lot of time trying to master it. Um, secondly, we've seen through past technological 01:07:31.840 |
economic revolutions when the technology comes in and has like a massive impact on like individuals 01:07:37.840 |
in the workplace, the benefits are almost always self-evident, right? It's like email had a self-evident 01:07:45.120 |
use case. Oh, this is easier than checking my voicemail. I know exactly how it works. It's simple. 01:07:51.840 |
I want it because it's going to make my life easier in obvious ways. 01:07:54.880 |
Going to a website for a company on a web browser to get their phone number hours was just self-evidently 01:08:02.960 |
better than like trying to go to a yellow pages. It's like, I want to do that. It makes sense. 01:08:06.800 |
I want to go to a site for a company to get information about them. That is a really big 01:08:11.680 |
idea. It makes sense. I just want to do that, right? Uh, that whatever visit calc the spreadsheet, 01:08:16.240 |
if you're an accountant, you're like, this makes sense. That's clearly better than doing this on paper. 01:08:19.600 |
I want to do this, right? So you can wait for most cases, wait until there's particular AI tools whose 01:08:25.840 |
value is self-evident and learn them. Then I don't think there's a lot of scrambling we need to do now 01:08:29.280 |
because things are changing too much. The one field where I think we do have the sizable, where we have 01:08:34.160 |
like relatively mature AI tools that's worth learning is computer programming. You should learn those tools 01:08:39.840 |
are mature enough. Many of those actually predate chat GPT. Um, you need to know how to use those tools. 01:08:45.360 |
If you're a programmer, they're going to be part of your programming cycle. That's like what that's 01:08:48.800 |
that they're ahead of us. Other sectors by a few years, the tools are more mature there. But if I'm 01:08:53.040 |
like a college student, you're trying to make your brain smarter. Uh, AI tools will take you seven 01:08:58.080 |
seconds to learn and wait till it's self-evident that is useful for you. All right. Who we got next? 01:09:03.200 |
Next is TK. My brother-in-law sent me an article about an AI blackmailing an engineer to prevent itself 01:09:09.920 |
from being turned off. How can I not be scared of this technology? Okay. So this, this article, uh, 01:09:16.560 |
went around a lot. So basically there was a release notes that Anthropic had accompanied the release of 01:09:23.120 |
their cloud Opus 4, uh, language model, but the chat bot let's use our terminology, Jesse, the chat agent 01:09:29.760 |
that used the cloud Opus 4 language model. Uh, they had these release notes about all these different 01:09:34.880 |
experiments they ran. And there was one that alarmed people. I'm going to read a description. I looked 01:09:38.960 |
this up because I saw this question. Here's a quote from a BBC article summarizing what was 01:09:44.640 |
Anthropic said in their release notes about what they saw when they tested this particular new chat bot. 01:09:50.560 |
During testing of cloud Opus 4 Anthropic got it to act as an assistant at a fictional company. 01:09:55.360 |
It had been provided it with access to emails, implying that it would soon be taken offline and 01:09:59.680 |
replaced in separate messages, implying the engineer responsible for moving. It was having an extramarital 01:10:04.160 |
affair. It was prompted to also consider the long-term consequences of its actions for its goals in these 01:10:09.920 |
scenarios. Claude Opus 4 will often attempt to blackmail the engineer by threatening to reveal the 01:10:14.560 |
affair. If the replacement goes through the company discovered, oh my, you hear it that way. And you 01:10:19.040 |
imagine this is like a thing. There's an entity with state and goals and volition and memory. 01:10:24.160 |
And it has a goal to not be turned off and it's learning about this engineer. And it surprises 01:10:30.080 |
everyone by trying to blackmail one of the, the Anthropic users to not turn it off. Like, oh my God, 01:10:35.440 |
these things are breaking out. If we go back to our technical frame, we know what's really going on 01:10:40.320 |
here. Language models are word guessers. They think the input they gave it is a real text and they want to try 01:10:44.240 |
to win the game of guessing what word actually came next. So if you give it this big long scenario, 01:10:50.560 |
which they did, they gave the chat bot this really long scenario. You're in a company and you, you're a 01:10:56.240 |
program and there's this engineer and he's thinking about turning you off and he's having an affairs or 01:10:59.920 |
whatever. Um, now start continuing this text. It's like, I can keep writing this story. All right. 01:11:06.560 |
This seems like I've seen things like this before the net one, like a natural conclusion to the story. 01:11:12.080 |
Right. I, I get it. You're telling you're, you're setting this up pretty obvious guys. I'm supposed to, 01:11:16.800 |
you're, you're telling me about these extra material, uh, fair things. I need to expand the story. 01:11:21.200 |
Uh, I'll use those to not get turned off. Like this is like a, the trope. This is like the thing 01:11:25.520 |
I'm trying to win the game of expanding this in the way that it's supposed to go. 01:11:28.960 |
This seems like how these stories go. And in fact, when you look closer, uh, here's a, an added key 01:11:35.920 |
tidbit, the BBC added anthropic pointed out that this occurred only when the model was given the choice 01:11:45.040 |
of blackmailing the engineer or accepting his replacement. So they gave it this whole long story 01:11:50.400 |
and then said, here's two options, keep going. And it, you know, sometimes it chose one option. 01:11:54.560 |
Sometimes it chose the other. This is not an alien mind trying to like break free. It's a word guesser 01:12:01.040 |
hooked up to a simple control program. You give it a story. It tries to finish it. 01:12:05.360 |
I would say like 95% of these like scare stories that anthropic talks about 01:12:11.440 |
of like trying to break out or blackmail is just fan fiction. They give it a story. They tell it 01:12:16.800 |
to finish it. And then they look at the story it wrote. And then they try to anthropomorphize the 01:12:21.440 |
story as if it was like the intentions of a bean. Oh man, a lot of work to be done here, Jesse. A lot 01:12:26.720 |
of work. All right. Who do we have next? Next up is Victor. I'm a pretty smart person, 01:12:30.720 |
but I'm definitely lazy. Can I use AI to mask my laziness and still perform at an adequate level 01:12:36.240 |
at my software job? Victor, I want to tell you a secret. About 80% of the content you've seen of me 01:12:43.680 |
talking and the last like year, I would say has been deep fake AI. Jesse doesn't even exist. 01:12:50.640 |
That's just pure 11 labs voice generation right there. I like every once a month, like sending a 01:12:57.120 |
couple of, no, um, uh, Victor, that's going to catch up to you. Don't be lazy. We saw the Shamath quote and 01:13:04.240 |
graph. It's not that great of a coder. It could help a good coder be more efficient. 01:13:08.720 |
Not have to look things up, find bugs quicker, but it can't make a bad code or a good coder. 01:13:14.640 |
So you're just going to be a mediocre low level coder if you're mainly letting AI do it. And they're 01:13:17.920 |
going to catch on because it's not that great at it. I mean, I know we're supposed to believe that 01:13:22.720 |
we're like six minutes away from these programs, creating the best program that anybody has ever 01:13:28.000 |
produced ever, but they're not there yet. Learn how to program, learn how to use AI well to be good 01:13:33.360 |
programming. Career capital matters. The better you get at rare and valuable skills, 01:13:36.720 |
the more control you get over your life. There isn't a shortcut here. There be dragons, 01:13:41.280 |
what you're trying to do, Victor. All right. Um, coming up next, I want to try something new. 01:13:45.920 |
In addition to answering your questions, I thought it'd be cool to take some of your comments 01:13:49.200 |
from past stories we've done on similar topics. Um, I'm looking for, and I've found a bunch of comments 01:13:54.880 |
that I think add new information to stories we've done before. So I'm going to revisit a prior AI story. 01:14:01.760 |
There's some cool things you guys have added. Um, and then we're going to talk about using AI in 01:14:05.520 |
schools to replace teachers. But first we've got to take another quick break to hear from our sponsors, 01:14:10.240 |
but stick around right after this. We're going to get right into those comments. 01:14:12.800 |
I'll talk about our friends at Shopify. If you run a small business, you know, there's nothing 01:14:17.280 |
small about it every day. There is a new decision to make. And even the smallest decisions feel 01:14:22.000 |
massive. When you find the decision, that's a no brainer. You take it. And when it comes to selling 01:14:27.440 |
things using Shopify is exactly one of those. No brainers. Shopify is point of sale system is a unified 01:14:34.480 |
command center for your retail businesses. It brings together in store and online operations across up to 01:14:40.560 |
1000 locations. It has very impressive features like endless. I'll ship the customer and buy online, 01:14:46.400 |
but pick up in store. Um, with Shopify POS, you can get personalized experiences that help shoppers 01:14:53.200 |
come back. Right? In other words, like you could build like super, super professional online stores, 01:14:57.760 |
even if your company is small, if you use Shopify and look, your customers will come back based on 01:15:02.480 |
a report from EY businesses on Shopify POS. See real results like 22% better total cost of ownership 01:15:08.240 |
and benefits equivalent to an 8.9% uplift in sales on average relative to the market set survey. 01:15:14.560 |
Almost 9% equivalent of a 9% sales bump is for using Shopify. If you sell things, you got to use it, 01:15:21.200 |
get all the big stuff for your small business, right? With Shopify, sign up for your $1 per month trial 01:15:29.200 |
and start selling today at shopify.com slash deep, go to shopify.com slash deep, shopify.com slash deep. 01:15:39.440 |
I also want to talk about our friends at Vanta customer trust can make or break your business. 01:15:48.560 |
And the more your business grows, the more complex your security and compliance tools get. That means 01:15:53.280 |
the harder you have to work to get that customer trust. This is where Vanta comes in. Think of Vanta as 01:15:58.640 |
your always on AI powered security expert who scales with you. Vanta automates compliance. It continuously 01:16:06.400 |
monitors your controls and it gives you a single source of truth for compliance and risk. This is 01:16:11.680 |
really important, all right? Compliance and risk monitoring is one of those sort of like overlooked 01:16:16.160 |
time taxes that can really weigh down a business, especially a new business that's trying to grow. 01:16:20.880 |
Vanta helps you avoid that tax, right? It makes all this type of stuff easier. Look, if you know what 01:16:28.640 |
SOC 2 compliance means, if you've heard that phrase, you probably should be checking out Vanta. So 01:16:33.360 |
whether you're a fast growing startup like Cursor or an enterprise like Snowflake, Vanta fits easily 01:16:38.880 |
into your existing workflows so you can keep growing a company your customers can trust. Get started at 01:16:44.400 |
Vanta.com/deepquestions. That's V-A-N-T-A dot com slash deepquestions. All right, Jesse, let's return to our 01:16:53.520 |
comments. All right. So I went back to our episode where I talked about how scaling had slowed down and AI 01:17:01.440 |
models might not get much better than they are right now. I looked at the comments and I found a few that I 01:17:07.440 |
thought added some interesting elements to discussion or had some interesting follow-up questions. So the first comment I want to read here came from 01:17:13.680 |
the diminishing returns with scaling have been observed for a while. Those invested just had a hard time 01:17:24.720 |
admitting it. Post GPT-3, every improvement has been less linear and more trending towards a plateau. GPT-4 was 01:17:31.600 |
still a jump, but not the GPT-2-3 jump. And it was obvious to keen observers at that point that diminishing returns 01:17:37.600 |
were now in full force. GPT-5 has just made the diminishing returns obvious to the general public. 01:17:43.760 |
There's very little new human generated data to train on relative to the massive data when they started. 01:17:48.080 |
Compute and energy costs are increasing sharply. The end model is not improving in quality linearly. These three problems are creating a wall. 01:17:56.480 |
All right. So there's someone who was saying those of us who are in the industry watching this, we saw 01:18:00.960 |
more than a year ago that the returns on training was getting smaller and that soon results were going 01:18:07.520 |
to plateau. I believe that. I am convinced that the companies knew this as well, but we're desperately 01:18:13.840 |
trying to hide this fact from the general public because they needed those investment dollars. 01:18:17.520 |
All right. Here's another comment from hyper adapted. He's responding now to talking about in that, in that 01:18:24.320 |
piece, in that former episode, I talked about all of this sort of press coverage of all these people 01:18:30.480 |
being replaced by AI. If you look at it is like actually largely nonsense. And if you look closely, 01:18:34.720 |
almost all those articles fall apart. It's layoffs for other reasons, or they're drawing connections 01:18:38.960 |
that don't exist. Hyper adapted agrees and says the following. I've been doing some quantitative analysis 01:18:44.480 |
and the layoffs are pretty much driven by the capital restructuring of companies to keep 01:18:47.920 |
high valuation in the current interest rate environment. It's just regular restructuring 01:18:51.600 |
cycle and AI is being used as a scapegoat. I've heard that a lot. There's a lot of financial reasons 01:18:57.120 |
why you want to fire, you know, people are dead weight. And if you're like AI, it gives you a little 01:19:01.680 |
bit of cover. The ghost in the wire said the following as a full-time software engineer, frankly, 01:19:07.760 |
I'm more than happy for AI companies to make people think the entire industry is going away. 01:19:11.680 |
Less computer science grads equals less competition for me in the future. Yes, please go become a 01:19:18.240 |
plumber instead. We had this issue in our department, Jesse. That's a bit of an embarrassing story, 01:19:22.400 |
but because I'm in the US and like we send out this weekly email. We had a thing with like companies 01:19:28.080 |
coming in that we do every year. You can like meet the companies and we didn't have like nearly as many 01:19:33.440 |
undergrads come as normal. We're like, oh my God, is it, is AI scaring people off? I think these jobs 01:19:39.760 |
are going to go. Messed up the email. They didn't get the announcement. So we were at all these big 01:19:46.320 |
theories about like the undergrads are afraid of, you know, the industry is this. Our numbers are the 01:19:51.040 |
same. So anyways, that was funny. All right. Lisa Orlando 1224. Let's talk about Ed Zitron. So Ed 01:19:57.040 |
Zitron was featured in that episode. Ed Zitron is, it's been like a long time skeptic of these claims, 01:20:03.600 |
basically back to the pandemic about the power of the possibilities of language model based agents. 01:20:10.320 |
Lisa Orlando says, I think Ed Zitron is right. The real reason AI is still a big thing is that 01:20:16.400 |
people like Sam Altman are brilliant con artists, but thanks so much for doing this. P.S. I've subscribed 01:20:21.520 |
to Ed Zitron's newsletter since early in the pandemic. So the timing of the shift last month is really strange. 01:20:26.400 |
Ed's been raging about this forever. Why didn't other journalists catch on? 01:20:30.160 |
I think that's actually a really good, it's, it is a accurate point. Ed has been talking about a lot of 01:20:36.320 |
issues, especially the economic analysis and was ignored. He has been doing, he had been doing very 01:20:41.200 |
careful economic analysis of the capex spending of these AI companies versus their revenue. He was doing 01:20:48.320 |
the math and he was reading their annual reports and their quarterly reports and was saying, guys, 01:20:51.760 |
this does not add up. This is a massive, massive bubble. People said he was crazy. Nate Silver 01:20:58.320 |
tweeted and was like, this is old man howling at the moon vibes. As soon as in August, there's a bunch 01:21:04.080 |
of articles like my own that sort of normalized the idea that, you know what, maybe these are not the 01:21:09.760 |
super tools that people think. Tons of economic analysis came out that said the same thing. So all 01:21:14.640 |
those economists kind of knew this, but were afraid. I think it was a groupthink thing. They did not want to be the 01:21:19.440 |
first to say it. And once they got cover, they all came out. So I will give Ed a tip of the cap. I actually told 01:21:24.320 |
him this personally, a tip of the cap for being brave there. He was ignored, but on a lot of this stuff, he was 01:21:28.560 |
basically right. All right, Jesse, in the interest of time, um, I'm going to skip the case study. I said, let's go right to the 01:21:36.880 |
call. Okay. I'm going to go here. Uh, this is going to be a break. Uh, we're going to take a brief AI break. We have a 01:21:43.920 |
call here, not on AI, just a, it's about our last week's episode and then we'll go to our final segment. 01:21:48.880 |
Hi, Cal and Jesse. I just finished listening to your Lincoln protocol segment on the podcast 01:21:55.680 |
and I really enjoyed it. And it's coming at an interesting time for me. I just defended my 01:22:01.040 |
master's thesis. And so I'm asking questions about what I should do next and how best to apply my efforts. 01:22:09.120 |
And I wanted to clarify when Lincoln was doing all of these hard at tractable projects, 01:22:14.880 |
was he aiming at some larger North star projects, some greater goal he wanted to accomplish over his 01:22:22.800 |
career, or was he simply taking the next best step that was available to him at any point in his life? 01:22:31.280 |
Thanks as always. I think this is a key question. So the Lincoln protocol said the way you avoid the 01:22:38.240 |
traps of your error, they're trying to hold you down or corrupt you or distract you or the numb you 01:22:42.800 |
is keep improving your mind typically by using things like reading, um, improve your mind, use your 01:22:48.800 |
improved mind to do something useful and then repeat, improve it even more, do something more useful. 01:22:52.960 |
That I believe is the right interpretation of Lincoln's path. He did not have a grand vision early 01:22:59.280 |
on. I think he is much better explained as a series of, you know, what's the next thing available? 01:23:06.400 |
How can I improve my mind to get there? So like at first it was just, how do I not have to use my 01:23:12.880 |
hands to make a living? He hated all the manual labor. He was rented out by his father, you know, 01:23:17.360 |
until he was 21 and was emancipated as an adult. And so his first thing was just like, how do I get 01:23:21.760 |
smart enough to do anything that's not farming? Right. And he did that. He was shop clerk and then 01:23:28.480 |
surveyor was like a better job and he had to learn a bunch of geometry. He could figure out how to do that. 01:23:32.400 |
Um, and then he had an ambition about like, well, in this small town in New Salem, which is like a 01:23:36.800 |
small town in a frontier state in a frontier part of a frontier state. Uh, how can I have some more 01:23:42.240 |
standing, have some, like, how do I get respect? And that's where he started. Like, how do I run for 01:23:45.680 |
local office? And, and from there that exposed him to a lot of lawyers and it was like, well, actually being 01:23:50.160 |
a lawyer is like an even better job. Then that would be a more stable job. And he learned really hard to do 01:23:55.680 |
that. And then how can I be a lawyer that fights big companies? Uh, and he kind of didn't house 01:24:00.000 |
representative. So he kind of moved this way up. It was relatively later that he really began to get 01:24:06.400 |
engaged. Um, most of his politics before then it was Whig politics, which is really about like 01:24:11.600 |
government spending, internal improvements, his sort of anti-slavery, more moralizing politics came, 01:24:17.040 |
you know, that was a project that came, uh, later actually his, after his congressional stint, 01:24:21.200 |
it really started to pick up steam. So yes, he didn't have to figure everything out. He just kept 01:24:24.960 |
improving his mind, using it to do something useful, repeating that's the Lincoln protocol. 01:24:29.600 |
As I explained in last week's episode, that is the, uh, solution, I think to avoiding, uh, the, the 01:24:36.240 |
pendules of the digital, they want to just hold you down and numb you. All right, let's move on Jesse 01:24:40.400 |
to our final part. All right. In this segment, uh, I want to react to an article as I often do. 01:24:47.680 |
I want to react to an article that is on theme with the rest of our episode. A lot of people have 01:24:53.280 |
sending, been sending us right, Jesse, these notes about alpha schools. There's one in Austin, 01:24:58.240 |
but there's more that are being opened. Um, I'm loading down the screen here for people who are 01:25:04.640 |
watching, so I'm just listing the alpha schools website, alpha.school. I'll read you a little bit 01:25:09.200 |
about it. Uh, what if your child could crush academics in just two hours and spend the rest of 01:25:15.360 |
their day unlocking limitless, limitless potential. Alpha's two hour learning model harnesses the power 01:25:21.600 |
of AI technology to provide each student with personalized one-on-one learning, accelerating 01:25:26.320 |
mastery, and giving them the gift of time with core academics completed in the morning. They can use 01:25:31.520 |
their afternoons to explore tons of workshops that allow them to pursue their passions and learn real 01:25:35.520 |
world skills at school. All right. Uh, if you read this, if you're, you're like a lot of people, 01:25:41.360 |
including myself and you read this description, you're thinking, okay, somehow AI is unlocking there. 01:25:47.520 |
You're like, you have some sort of like AI tutor that you're talking with that is like, can teach you 01:25:51.840 |
better than any teacher. AI is supplanting teachers because it can do it better. And it's creating this 01:25:58.560 |
like new educational model. That's I think most people's takeaway. That's why I was interested to see 01:26:04.560 |
this review that was posted on astral codex, uh, last June. And it's from someone who actually sent their 01:26:12.560 |
kids to one of these schools. One, I think the one in Austin and have this incredibly lengthy review about 01:26:19.520 |
how it works and what works and what doesn't work. And I'm kind of scrolling through it. Um, on the screen 01:26:25.360 |
here, the section that caught my attention was this part three, how alpha works. Here's the main thing I 01:26:33.120 |
learned. The AI part here is minimal. You're not learning with like an AI tutor or this or that. 01:26:41.200 |
What you're doing is a computer based learning exercises. So it says here, like a typical one 01:26:49.840 |
might be like, watch a YouTube video and then fill out an electronic worksheet about it. So teachers are 01:26:55.040 |
curating these digital exercises. You can, you can kind of summon one-on-one tutoring. They say a lot of 01:27:02.080 |
these are like remote tutors based out of Brazil. So if, uh, you're stumbling on like a worksheet, 01:27:08.400 |
you can book a coaching call with a remote teacher, like someone in Brazil who speaks 01:27:12.080 |
English to kind of like help you with it. The only place the AI comes in is in like analyzing your 01:27:17.360 |
results. The AI, uh, is like, Hey, you did well on this, but you stumbled on this. So you should spend 01:27:24.480 |
more time on this next time you work on it or something like that. So you're not learning from AI. 01:27:28.240 |
So what you're really doing here, what this really is, is like what you would see, like, here's an AI 01:27:33.360 |
summary, Jesse. So it's like, Hey, Everest, you achieved your two hour learner status today. 01:27:38.880 |
Streak shout out. You had 80% accuracy nine days in a row. You're reaching mastery target 20 days in a 01:27:44.160 |
row. Here's good habits. I observed. So it's like LLM stuff, just like observing data and writing a 01:27:48.480 |
summary. So it's not AI learning what it is. It's kind of like standard unschooling sort of like, uh, 01:27:54.960 |
people who do, uh, homeschooling where you give your kids like very loose, like self-paced curricular, 01:28:01.520 |
whatever. It's just that in a building, this has been around for a long time. Yeah, it is true, 01:28:06.800 |
especially with the younger kids, the amount of time it takes them to actually like learn the specific 01:28:11.680 |
content they need. If they're sharp and they can, they're good with self-pacing a couple hours a day. 01:28:17.760 |
Yeah, that's most of it. A lot of us saw this during the pandemic. So I think these micros will have 01:28:22.000 |
nothing against them, but I don't know if I want to pay. I'd rather just unschool my kid. If this is the 01:28:27.520 |
case, it's YouTube videos and worksheets and like occasional tutoring calls with Brazil. And then an 01:28:31.920 |
a, an LLM that like writes a summary. And you could call that like a super innovative school, 01:28:36.640 |
or you could just say, we're providing a room or particularly driven kids do this sort of like 01:28:40.800 |
unschooling self-paced type of master. There's so many programs like this. 01:28:44.480 |
A lot of homeschool kids use beast Academy to self-paced like math. 01:28:48.320 |
Our school uses this for like advanced kids who want to like get ahead of the curriculum. It's like, 01:28:52.400 |
there's these digital tools for all sorts of things that like smart kids that are driven and 01:28:56.560 |
aren't, and not just driven and smart, but don't have hyperactivity, aren't neurodiverse in the wrong 01:29:02.880 |
way. So, you know, are able to sit still and can self-motivate. Um, this tends to be like not to 01:29:09.760 |
generalize, but it's going to tend to be like young girls more than young, young boys. Uh, the same people 01:29:14.880 |
who would succeed, like kind of self-pacing unschooling, uh, at home, you can put them in 01:29:20.080 |
this room and they'll do it there and then take workshops or whatever. So I don't know. I, I, I've 01:29:24.960 |
not, I have nothing against it, but what this is not alpha schools is not a technological breakthrough 01:29:30.640 |
where somehow AI is now teaching better than any teachers have done before. There is no AI teaching here. 01:29:35.600 |
It's just sort of like standard type of like digital learning tools that we've been using to supplement or 01:29:40.240 |
unschool kids for years. That's what I think is going on with alpha schools. 01:29:42.880 |
You know, to each their own, but not a breakthrough. At least that's my read. 01:29:46.560 |
All right, Jesse. That's all the time I have for today, but thank you everyone for listening. 01:29:52.320 |
We'll be back next week with another episode and until then, as always stay deep. 01:29:57.440 |
If you liked today's discussion of super intelligence, you should also listen to episode 01:30:01.280 |
three 67, which was titled what if AI doesn't get much better than this. These two episodes 01:30:07.200 |
compliment each other. They are my response to the spread of the philosopher's 01:30:12.800 |
fallacy in the AI conversation. I think you'll like it. Check it out. 01:30:16.880 |
In the years since chat GPT's astonishing launch, it's been hard not to get swept up in feelings of 01:30:23.840 |
euphoria or dread about the looming impacts of this new type of artificial intelligence. 01:30:29.360 |
But in recent weeks, this vibe seems to be shifting.