OpenAI Flip-Flops and '10% Chance of Outperforming Humans in Every Task by 2027' - 3K AI Researchers
00:00:00.000 |
If you blinked this week, you might have missed all four of the developments we're going to 00:00:05.040 |
investigate today. From OpenAI's flip-flop on whether to maximize engagement, to doubts at the 00:00:11.600 |
top of that company as to whether they should even be building superintelligence, and from the 00:00:16.720 |
heating up of the battle to control quality data and the share of what Sam Altman calls the $100 00:00:22.800 |
trillion AI future, to this - the startling findings from the biggest survey of AI researchers 00:00:29.680 |
to date. I'll cover the highlights of this 38-page paper, but let's start with something I noticed 00:00:35.440 |
about the GPT store. That's the store basically where you can create your own version of ChatGPT 00:00:41.760 |
based on your own data, your own custom instructions, that kind of thing. Various companies 00:00:46.000 |
are getting involved and creating their own GPTs, and I'll talk about some of the GPTs that I've 00:00:50.960 |
looked at. But it was actually this short paragraph in the announcement of the GPT store that I want 00:00:56.720 |
to focus on first. It might seem like a small deal now, but in a few months or at the latest a few 00:01:02.720 |
years, this short paragraph could have major ramifications. In the paragraph, OpenAI are 00:01:08.320 |
describing the opportunity for builders to create their own GPTs and what they'll get out of it. 00:01:14.480 |
Basically, how can builders monetize those GPTs that they're creating? Well, it seems like the 00:01:19.680 |
way that OpenAI want builders to monetize their GPTs is through engagement. Here's the key sentence. 00:01:25.440 |
As a first step, US-only builders will be paid based on user engagement with their GPTs. In 00:01:31.520 |
other words, get your users to use your GPT as much as possible for as long as possible. And 00:01:37.200 |
you're probably thinking, "Okay, Philip, what's the big deal with that sentence?" Well, here's 00:01:41.600 |
Sam Altman testifying before Congress mid last year. - Companies whose revenues depend upon 00:01:47.200 |
volume of use, screen time, intensity of use, design these systems in order to maximize 00:01:54.320 |
the engagement of all users, including children, with perverse results in many cases. And what I 00:01:59.680 |
would humbly advise you is that you get way ahead of this issue. - First of all, I think we try to 00:02:05.680 |
design systems that do not maximize for engagement. In fact, we're so short on GPUs, the less people 00:02:11.520 |
use our products, the better. But we're not an advertising-based model. We're not trying to get 00:02:15.280 |
people to use it more and more. - So what might explain this flip-flop? Well, I think it's 00:02:20.480 |
Character AI. Their platform, full of addictive chatbots, is apparently catching up to ChatGPT 00:02:26.800 |
in the US. As of just a few months ago, they had 4.2 million monthly active users in the US 00:02:33.840 |
compared to 6 million monthly active US users for ChatGPT's mobile apps. It's also well known 00:02:40.400 |
that people spend longer with Character AI than they do on ChatGPT. There's also, of course, 00:02:45.920 |
competition from Inflection AI, who have their Pi personal intelligence. It's basically a chatbot 00:02:52.320 |
that kind of wants to be your friend, and it's soon going to be powered by Inflection 2, their 00:02:57.600 |
most powerful LLM. Then there's Meta's universe of characters where you can, quote, "chat to your 00:03:03.360 |
favorite celebrity" via a chatbot. So that commitment not to maximize engagement does seem 00:03:09.600 |
to be waning. Indeed, when Sam Altman invited fun speculation about what GPTs will be doing the 00:03:15.040 |
best by the end of the day, he had his own theory: there are incredibly useful GPTs in the store, 00:03:20.480 |
but probably "waifus" runs away with it. I'm not too familiar with that, 00:03:25.200 |
but I believe it's like AI boyfriends and AI girlfriends. Just before we move on from the 00:03:30.320 |
GPT store, there's an experience that happened to me quite a few times. I would test out one of the 00:03:34.640 |
GPTs, like Write For Me, and that GPT says it can do relevant and precise word counts. I tried 00:03:41.760 |
multiple times and every time it failed on word counts. 00:03:47.760 |
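That failure is structural, by the way: LLMs generate token by token and have no reliable internal word counter, so a builder who wanted precise counts would have to verify them outside the model. Here's a minimal sketch of what that could look like; the `generate` callable is a hypothetical stand-in for whatever model call a builder would use, not OpenAI's actual API.

```python
# Hypothetical sketch: LLMs can't count words reliably, so verify the
# count outside the model and re-prompt until it's close enough.
from typing import Callable

def write_with_word_count(generate: Callable[[str], str], topic: str,
                          target: int, tolerance: int = 10,
                          max_retries: int = 3) -> str:
    """`generate` stands in for whatever model call a builder would use."""
    prompt = f"Write about {topic} in roughly {target} words."
    draft = ""
    for _ in range(max_retries):
        draft = generate(prompt)
        count = len(draft.split())  # the external, reliable counter
        if abs(count - target) <= tolerance:
            break
        # Feed the measured count back so the model can adjust.
        prompt = (f"Your draft was {count} words; the target is {target}. "
                  f"Rewrite it closer to {target} words:\n\n{draft}")
    return draft
```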
I'm not trying to pick on one GPT, but very often using one of the GPTs wasn't really an improvement on just using GPT-4. There was, 00:03:53.680 |
however, one notable exception to that, which was the Consensus GPT. Obviously, this is not sponsored, 00:04:00.640 |
but by giving relevant links that I could then follow up with, it was genuinely helpful and 00:04:05.440 |
better than GPT-4 base. This is for the specific task of looking up scientific research on a given 00:04:11.840 |
topic. But now I'm going to give you a little dose of nostalgia. It's almost a year to the day since 00:04:17.600 |
I started AI Explained, and one of my very first videos in that week was a review of the original 00:04:24.080 |
Consensus app. That video is unlisted now because it's simply not relevant anymore, but I do talk 00:04:28.960 |
about the one year anniversary of the channel and how it's changed on my latest podcast episode on 00:04:34.560 |
AI Insiders for Patreon. There's even a link to the unlisted video if you want to see what AI 00:04:39.520 |
Explained was like back in the day. But there was a much more cryptic announcement from OpenAI at 00:04:45.200 |
the same time as the GPT store that I'm not sure was fully intentional. The reason I say that is 00:04:50.800 |
that it was leaked/announced by someone not from OpenAI. It seems a bit like an inadvertent launch 00:04:57.200 |
because a few minutes later, Greg Brockman put out a hastily written tweet, which he then edited into 00:05:02.400 |
this tweet. Greg Brockman is the president and co-founder of OpenAI. The update was about GPT-4 00:05:08.320 |
learning from your chats, carrying what it learns between chats, and improving over time by 00:05:14.800 |
remembering details and preferences. You're then allowed to reset your GPT's memory or turn off this 00:05:21.200 |
feature. I don't know about you, but it reminds me directly of my previous video on what Mamba 00:05:26.880 |
would allow. If you're curious about that architecture or selective state space models, 00:05:31.120 |
then check out that video. 00:05:36.800 |
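For those who haven't watched it, the one-line intuition: a selective state space model like Mamba carries a compressed hidden state forward across the whole sequence, updating it with input-dependent parameters. Roughly:

$$h_t = \bar{A}_t\,h_{t-1} + \bar{B}_t\,x_t, \qquad y_t = C_t\,h_t$$

where $\bar{A}_t$, $\bar{B}_t$ and $C_t$ are computed from the current input $x_t$ (that's the "selective" part), so the model can choose what to write into its state and what to forget. A standing state like that is exactly the kind of mechanism you'd reach for if you wanted memory that persists across chats.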
But the point for this video is this. We don't know if this announcement was intentional and we don't know what powers it. It could of course just be storing your 00:05:41.360 |
conversations to load into the context window, but it feels more significant than that. As Brockman 00:05:46.320 |
says, they are still experimenting, but hoping to roll this out more broadly over upcoming weeks. 00:05:52.560 |
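To make the simpler interpretation concrete, here's a minimal sketch of what a naive cross-chat memory could look like: extract a few facts per conversation, persist them, and prepend them to the context window next time. Every name here is hypothetical; we have no idea how OpenAI have actually built it.

```python
# Hypothetical sketch of naive cross-chat memory: persist a few extracted
# facts, then prepend them to the next conversation's system prompt.
# Illustrative only -- we don't know how OpenAI actually implement this.
import json
from pathlib import Path

MEMORY_FILE = Path("memory.json")  # made-up storage location

def load_memories() -> list[str]:
    if MEMORY_FILE.exists():
        return json.loads(MEMORY_FILE.read_text())
    return []

def remember(fact: str) -> None:
    """Store a detail the model inferred, e.g. 'Prefers concise responses'."""
    memories = load_memories()
    if fact not in memories:
        memories.append(fact)
        MEMORY_FILE.write_text(json.dumps(memories, indent=2))

def build_system_prompt(base: str) -> str:
    """Load remembered facts into the context window for the next chat."""
    memories = load_memories()
    if not memories:
        return base
    facts = "\n".join(f"- {m}" for m in memories)
    return f"{base}\n\nWhat you know about this user:\n{facts}"

def reset_memory() -> None:
    """The 'reset your GPT's memory' toggle, in miniature."""
    MEMORY_FILE.unlink(missing_ok=True)
```

If that's all it is, the feature is just context stuffing; it only gets more interesting than this sketch if the learning lives somewhere other than the prompt.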
And as the investor Daniel Gross says, it's the early signs of a moat more significant than 00:05:58.320 |
compute flops. He shared this screenshot where you can ask ChatGPT, "What do you know about me? 00:06:03.920 |
Where did we leave off on my last project? And remember that I like concise responses." 00:06:09.600 |
So why would that be a moat? Well, just like we saw with user engagement for GPTs, 00:06:14.960 |
it makes models more addictive, more customized to you. Why go to a different AI model, a different 00:06:20.640 |
LLM, if this one from ChatGPT knows who you are, knows what you like, knows what projects you like 00:06:26.640 |
to work on, what code style you use, or, more controversially, feels a bit more like your friend, 00:06:32.160 |
remembers your birthday, that kind of thing. If you think that's far-fetched, by the way, 00:06:36.080 |
remember that photorealistic video avatars are coming soon as well. And of course, the natural 00:06:41.520 |
end point of this is that these chatbots become as intelligent or more intelligent than you. 00:06:46.320 |
After all, that's what OpenAI have said that they're working towards all along. They want to 00:06:50.720 |
build superintelligence, the "super" meaning beyond: beyond human intelligence. In this 00:06:56.080 |
fairly recent interview with the Financial Times, Sam Altman said he splits his time between 00:07:00.720 |
researching how to build superintelligence and ways to build up the computing power to do so. 00:07:06.160 |
Also on his checklist is to figure out how to make it safe and figure out the benefits. 00:07:11.920 |
And I promise I'm getting to a point soon enough here, but OpenAI have also admitted 00:07:18.160 |
that building AGI and superintelligence is going to replace human work. Indeed, 00:07:22.560 |
their definition of AGI on their website is a system that outperforms humans at most 00:07:28.160 |
economically valuable work. And that's just AGI, remember, not even superintelligence. 00:07:32.480 |
Indeed, in a blog post on superintelligence, Sam Altman and Greg Brockman said that they believe 00:07:37.760 |
it would actually be risky and difficult to stop the creation of superintelligence. Yes, 00:07:43.360 |
it might automate human labor, but the upsides are, quote, "so tremendous", and so many people 00:07:49.440 |
are racing to build it. It's inherently part of the technological path that we are on. 00:07:54.640 |
Now, you are probably really wondering why I'm bringing up these recent quotes. To recap: 00:07:59.680 |
they're building superintelligence, they have to build superintelligence, and yes, 00:08:03.760 |
it will mean the automation of human labor. Well, here's why I bring all of that up. It 00:08:09.120 |
seems like there might be the first inkling of second thoughts about that plan. Just four days 00:08:15.120 |
ago, one of the key figures at OpenAI, Andrej Karpathy, said this: the best possible 00:08:21.520 |
thing that we can be is not e/acc or EA, but all about intelligence amplification. In other words, 00:08:27.360 |
we should not seek to build superintelligent, godlike entities that replace humans. So that's 00:08:34.800 |
no superintelligence and not the replacement of humans. It's about tools being the bicycle for 00:08:41.280 |
the mind, things that empower all humans, not just a top percentile. And it's pretty hard to disagree 00:08:47.680 |
with that. It seems like a wonderful vision to me too, but it was quite fascinating to see Sam 00:08:53.280 |
Altman retweet, or repost, that tweet. The obvious questions are: are you trying to build 00:08:59.280 |
superintelligence or are you not? Are you trying to replace human labor or are you not trying to do 00:09:04.320 |
so? You can't keep describing your latest advancements as tools, but then admit that they 00:09:09.760 |
are going to replace human labor and eventually be more intelligent than all of us. I mean, 00:09:13.920 |
I do think almost everyone could get behind this vision from Andrej Karpathy. It just seems to 00:09:19.920 |
contradict some of the other stuff that OpenAI have put out. Also, if I had a chance to ask 00:09:25.360 |
Karpathy or Altman, I would ask: how do you draw the line between a tool and something that 00:09:31.440 |
replaces us? Making an AI model more intelligent indeed makes it a better tool, but it also makes 00:09:37.040 |
us one step closer to being replaced in the labor market. Giving an AI more agency and independence 00:09:43.120 |
will make it a better assistant, but again, makes it one step closer to being able to replace your 00:09:48.800 |
job. Indeed, Richard Ngo, another key figure at OpenAI, thinks we'll continue hearing that 00:09:55.760 |
LLMs are just tools and lack any intentions or goals until well after it's clearly false. I guess 00:10:02.640 |
I'm seeking clarity on what divides a tool from a replacement. And if OpenAI 00:10:09.600 |
are trying to be on the same page on this, they need to state clearly what that dividing line is. 00:10:15.040 |
Now, just quickly before we get to that AI researcher prediction paper, there's one more 00:10:19.040 |
thing from OpenAI that I find hard to reconcile. Last month, they did a deal with Axel Springer, 00:10:25.280 |
a publishing house, to create new financial opportunities that support a sustainable future 00:10:30.960 |
for journalism. And as of two days ago, it was revealed that they're in talks with CNN, Fox, 00:10:36.720 |
and Time to get more content. And finally, The Information revealed what OpenAI is typically 00:10:42.800 |
offering publishers: between $1 million and $5 million annually. And what we also learned from 00:10:49.200 |
this article is that the battle to control data is not just being waged by OpenAI or even OpenAI 00:10:56.240 |
and Google. Apple has actually launched into the fray and they are trying to strike deals with 00:11:01.440 |
publishers for the use of their content. But the difference with Apple is that they want to be able 00:11:06.400 |
to use content for future AI products in any way the company deems necessary. That could include, 00:11:12.320 |
for example, imitating the style of a particular publisher or journal. Or, for example, developing 00:11:18.320 |
a model that acts as the world's newspaper. Remember, you can customize models now so you 00:11:24.000 |
can have your own Fox News, your own MSNBC. You could have, for example, your own personalized 00:11:29.600 |
AI video avatar giving you exactly and only the news that you want. What that does to society 00:11:36.560 |
is a question for another day. But Apple are apparently offering up to $50 million for those 00:11:42.560 |
kind of rights. So what's so hard to reconcile then? Well, remember, this is all about creating 00:11:47.680 |
new financial opportunities that support a sustainable future for journalism. But Sam 00:11:52.320 |
Altman has already said that his grand idea is that OpenAI will capture much of the world's wealth 00:11:58.560 |
through the creation of AGI and then redistribute this wealth. And he's talked about figures like 00:12:04.080 |
$1 trillion and $100 trillion. In a world where OpenAI, Google and Apple are creating $100 trillion 00:12:12.560 |
worth of wealth and profit, it seems like in that world they would have gobbled up independent 00:12:18.720 |
journalism or at least the major profits of independent journalism. Indeed, $100 trillion 00:12:24.240 |
is about the size of the entire global GDP. And that kind of makes sense, right? If AGI or 00:12:29.520 |
superintelligence can do the tasks of any human being, it would make sense to equate its value 00:12:35.200 |
with the size of the global economy. How that fits in with a sustainable future for journalism, I'll have 00:12:40.240 |
to work that one out. But it's time at last for AI researchers to weigh in themselves on all of 00:12:46.240 |
these debates. Thousands of AI authors submitted their predictions for this paper. It's predictions 00:12:52.400 |
about everything. Timelines, safety, economic impact, everything. And of course, as you might 00:12:57.840 |
expect, I've read the paper in full and I'm going to give you only the most juicy bits. This paper, 00:13:03.120 |
by the way, came out just a week ago. But let's start with the very first paragraph. These results, 00:13:08.560 |
by the way, come from a survey of 2,778 AI researchers. So here's the first prediction. 00:13:15.520 |
If science continues undisrupted, the chance of unaided machines outperforming humans in every 00:13:22.080 |
possible task was estimated at 10% by 2027, and at 50% by 2047. But let's focus on that 10% 00:13:30.560 |
by 2027. That's a one in 10 chance that all human tasks are potentially automatable three years 00:13:40.640 |
from now. Now, there is one quick caveat that I'm going to add to that on behalf of the paper. 00:13:45.760 |
Even if there is one model out there that can wire a house and solve a math competition, that 00:13:52.160 |
doesn't mean that there are instantly billions of such models. If we're taking embodiment into 00:13:57.440 |
account, the mass manufacturing of all of those models would take a lot longer than just a few 00:14:04.000 |
months or years. But nevertheless, sit back and just read that prediction from AI 00:14:10.400 |
researchers again. It's easy to get lost in the noise and the news; next week there's going 00:14:17.120 |
to be another model, another development. But a 10% chance, in three, maybe three and a bit years, 00:14:24.240 |
of every human task being potentially automatable unaided is pretty insane. These estimates, 00:14:31.440 |
as they say, are typically earlier than they were when these researchers were surveyed last year. 00:14:36.720 |
Now you may have noticed that this sentence uses the word outperforming, and there's a later 00:14:41.600 |
sentence in the first paragraph talking about being fully automatable. And I'll come back to 00:14:46.080 |
that in a moment. And when I say a moment, I mean like literally right now, because in the paper, 00:14:51.040 |
there was an incredibly stark and unjustified difference between the predictions for high-level 00:14:57.280 |
machine intelligence (that's all human tasks) and the full automation of labor (human jobs), 00:15:03.040 |
with the full automation of labor being predicted for the 2100s. And of course, on hearing that, many 00:15:09.680 |
of you will be like, wait, what's high-level machine intelligence then, and what counts as all human tasks? 00:15:14.320 |
Well, here is where they describe high-level machine intelligence as defined for this survey. 00:15:19.360 |
High-level machine intelligence is achieved when unaided machines can accomplish every task, 00:15:25.680 |
every task, better and more cheaply than human workers. So not just better, but also more cheaply. 00:15:31.600 |
Think feasibility, not adoption. The only caveat is that we're assuming here that human scientific 00:15:36.960 |
activity continues without major negative disruption. But the date by which there is 00:15:41.920 |
apparently a 50% chance of this happening is 2047, down 13 years from 2060. But how on earth 00:15:52.000 |
would it take us from 2047 to the 2100s to go from a machine that can accomplish every human task 00:16:00.160 |
better and more cheaply to the full automation of labor? Like literally how long do they think 00:16:05.120 |
it's going to take to manufacture these robots? And don't forget the manufacturing of these 00:16:09.920 |
embodied AIs is presumably going to be assisted by the high-level machine intelligences. It's 00:16:15.520 |
not like manufacturing is going to continue at the pace it always has. Each factory would have 00:16:20.000 |
its own AGI helping it speed up production. To be honest, the main result of this survey for me 00:16:25.360 |
is that it shows that AI researchers are really not good at thinking through their predictions. 00:16:30.640 |
Later on in the paper, the authors admit that this discrepancy between high-level machine 00:16:35.600 |
intelligence and the full automation of all labor is surprising. And their guesses are maybe it was 00:16:41.680 |
the framing effect of the question, or maybe it's that caveat about the continuation of scientific 00:16:48.240 |
progress. Maybe respondents expect major disruption to scientific progress. And there's 00:16:53.680 |
something else from the paper that I found kind of amusing. What is the last job that AI researchers 00:16:59.120 |
think will be fully automated? Being an AI researcher. That's at 2063. Of course, that's 00:17:05.280 |
the 50% prediction. Some people think much earlier. Some people think much later. Now, number 00:17:09.520 |
one, I think that's kind of funny that AI researchers think what they do is going to be harder 00:17:13.600 |
to automate than anything else. But number two, look at the timeline 2063. That's 40 years or so 00:17:20.480 |
from now. But now think back to OpenAI, who have the goal of solving superintelligence alignment 00:17:26.400 |
in four years. And the key point is this: their method of doing so is to automate machine learning 00:17:32.480 |
safety research. That's right. Build an AI that can do the safety research for them. Indeed, 00:17:37.920 |
one of the things they're working on is trying to align the model that's going to solve alignment. 00:17:42.800 |
But the point is, look at their timeline. They think it's possible, indeed it's their goal, 00:17:47.920 |
to create this automated alignment AI researcher in four years. They call it the first automated 00:17:55.840 |
alignment researcher. Now, yes, it's quite possible that they miss this deadline and it takes five 00:18:00.400 |
years or 10 years, or maybe they do it faster than four years. But look at the kind of numbers we're 00:18:04.480 |
talking about. Four years, five years, 10 years. Now I get that alignment research isn't all of AI 00:18:10.240 |
research and there's a lot more there, but these four-year, five-year, 10-year goals seem a big 00:18:16.240 |
stretch from a 40-year expectation of the automation of AI research. One of these two dates 00:18:23.120 |
is going to be pretty dramatically wrong. Now for a few more interesting little highlights before we 00:18:29.120 |
close out. The paper found that subtle differences in the way that they ask certain questions 00:18:34.560 |
radically changed the results. Apparently, if you ask people in this kind of way: 00:18:38.880 |
"Will we have superintelligence by 2050?", you get much more conservative answers than if you ask: 00:18:44.880 |
"Give me the year by which we'll have a 50% chance of superintelligence." In a related study, 00:18:50.880 |
when people were asked about the chances of AI going wrong, when they were asked for a percentage, 00:18:56.080 |
they said 5%. When they were asked for a fraction, it was one in 15 million. 00:19:02.080 |
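It's worth doing the arithmetic on how far apart those two answers are:

$$5\% = \frac{1}{20}, \qquad \frac{1/20}{1/15{,}000{,}000} = \frac{15{,}000{,}000}{20} = 750{,}000$$

The same question, framed as a percentage rather than as one-in-N, produced estimates 750,000 times larger.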
Now, of course, these weren't the same people, otherwise they'd be pretty bad at mathematics. These were different test 00:19:06.160 |
groups given different versions of questions. One tranche of the researchers might get one style of 00:19:11.840 |
question, another tranche gets a different style. But I guess the take-home here is that human 00:19:16.720 |
psychology and anchoring are still playing an immense role when we talk about timelines and 00:19:22.480 |
predictions. Here's another fascinating highlight. The researchers were asked about the possibility 00:19:27.520 |
of an intelligence explosion. This is the question. Some people have argued the following, 00:19:32.480 |
if AI systems do nearly all research and development, improvements in AI will accelerate 00:19:38.000 |
the pace of technological progress, including further progress in AI. Over a short period, 00:19:43.600 |
less than five years, this feedback loop could cause technological progress to become more than 00:19:49.840 |
an order of magnitude faster. That's a period of less than five years in which technological 00:19:55.280 |
progress becomes 10 times or more faster. 00:20:01.200 |
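To put a number on that (a back-of-envelope calculation, assuming the speedup compounds smoothly over the five years): for progress to get more than 10 times faster within five years, the pace has to multiply by at least

$$g = 10^{1/5} \approx 1.58$$

per year, since $g^5 = 10$. That's research getting roughly 60% faster every single year, on top of the previous year's speedup.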
Now, before you see the results, let's reflect. That would be a crazy world. It's already hard to keep up with AI. And I say that as someone who does it 00:20:07.680 |
for a living. Now imagine research and progress being 10 times or more faster. So is it 5% or 10% 00:20:15.840 |
of AI researchers who think that's possible? No, it's a majority of respondents. In 2023, 00:20:22.240 |
it's 24% who think there's an even chance of that happening, 20% who think it's likely, and 9% who 00:20:28.560 |
think it's quite likely. That's 53% of AI researchers who think we'll get this accelerating 00:20:34.960 |
feedback loop, a proto-singularity, if you like. At that point, it would almost be worth live 00:20:40.240 |
streaming my channel rather than creating videos because it would be like every minute there's a 00:20:44.560 |
new bit of news. Apparently, 86% of the researchers, and I would phrase it as only 86%, 00:20:49.840 |
were worried about deepfakes. To count, by the way, they had to view it as at least a substantial 00:20:55.680 |
concern. I would show those guys this in January of 2024. These were made in Blender and Cinema 4D, 00:21:03.280 |
but they look ridiculously lifelike to me. If 14% of researchers don't think that deepfakes will be 00:21:10.320 |
at least a substantial concern, I don't know what to say to them. And the vast majority of respondents, 00:21:16.080 |
70%, thought that AI safety research should be prioritised more than it currently is. And an 00:21:22.640 |
equally clear message is that timelines are getting shorter. The red line is this year's 00:21:28.160 |
predictions and the blue line is last year's. And the lines being more to the left mean more 00:21:34.160 |
proximate, more close-at-hand predictions. This is for high-level machine intelligence. Now, 00:21:39.680 |
I made a detailed AGI timeline of my own for AI Insiders, but suffice to say I am well to the left 00:21:46.400 |
of that red line. And just one final point before we leave this survey. Some people, I am sure, 00:21:52.320 |
in the comments will point out that it only achieved a 15% response rate, similar to previous 00:21:58.000 |
years. Thing is, they even gave prizes for people to respond to the survey. So they tried their best. 00:22:03.680 |
It seems that most people just don't want to spend the time to answer surveys. And to be 00:22:08.400 |
honest, I can understand. And that response rate is in line with other surveys of a similar size. 00:22:13.840 |
But I can't end on that technicality in the week of CES 2024 in Las Vegas. Now, I can talk about 00:22:20.720 |
more of my highlights if you like, but there's one device that stood out to me. It's ridiculously 00:22:25.840 |
expensive, but shows AI can be insanely useful and fun when we want it to be. It's these almost 00:22:32.320 |
five-grand, AI-powered binoculars that can identify birds while you're looking through them. I just 00:22:38.480 |
think that's insane and super fun. And I'd get it if you dropped off a couple of zeros. Anyway, 00:22:44.080 |
we have covered a lot in this video. I am really curious to hear what you think. Thank you so much