OpenAI Flip-Flops and '10% Chance of Outperforming Humans in Every Task by 2027' - 3K AI Researchers


00:00:00.000 | If you blinked this week, you might have missed all four of the developments we're going to
00:00:05.040 | investigate today. From OpenAI's flip-flop on whether to maximize engagement, to doubts at the
00:00:11.600 | top of that company as to whether they should even be building superintelligence, and from the
00:00:16.720 | heating up of the battle to control quality data and the share of what Sam Altman calls the $100
00:00:22.800 | trillion AI future, to this - the startling findings from the biggest survey of AI researchers
00:00:29.680 | to date. I'll cover the highlights of this 38-page paper, but let's start with something I noticed
00:00:35.440 | about the GPT store. That's the store, basically, where you can create your own version of ChatGPT
00:00:41.760 | based on your own data, your own custom instructions, that kind of thing. Various companies
00:00:46.000 | are getting involved and creating their own GPTs, and I'll talk about some of the ones that I've
00:00:50.960 | looked at. But it was actually this short paragraph in the announcement of the GPT store that I want
00:00:56.720 | to focus on first. It might seem like a small deal now, but in a few months or at the latest a few
00:01:02.720 | years, this short paragraph could have major ramifications. In the paragraph, OpenAI are
00:01:08.320 | describing the opportunity for builders to create their own GPTs and what they'll get out of it.
00:01:14.480 | Basically, how can builders monetize those GPTs that they're creating? Well, it seems like the
00:01:19.680 | way OpenAI want builders to monetize their GPTs is through engagement. Here's the key sentence.
00:01:25.440 | As a first step, US-only builders will be paid based on user engagement with their GPTs. In
00:01:31.520 | other words, get your users to use your GPT as much as possible for as long as possible. And
00:01:37.200 | you're probably thinking, "Okay, Philip, what's the big deal with that sentence?" Well, here's
00:01:41.600 | an exchange from Sam Altman's testimony before Congress in the middle of last year. - Companies whose revenues depend upon
00:01:47.200 | volume of use, screen time, intensity of use, design these systems in order to maximize
00:01:54.320 | the engagement of all users, including children, with perverse results in many cases. And what I
00:01:59.680 | would humbly advise you is that you get way ahead of this issue. - First of all, I think we try to
00:02:05.680 | design systems that do not maximize for engagement. In fact, we're so short on GPUs, the less people
00:02:11.520 | use our products, the better. But we're not an advertising-based model. We're not trying to get
00:02:15.280 | people to use it more and more. - So what might explain this flip-flop? Well, I think it's
00:02:20.480 | Character AI. Their platform, full of addictive chatbots, is apparently catching up to ChatGPT
00:02:26.800 | in the US. As of just a few months ago, they had 4.2 million monthly active users in the US
00:02:33.840 | compared to 6 million monthly active US users for ChatGPT's mobile apps. It's also well known
00:02:40.400 | that people spend longer with Character AI than they do on ChatGPT. There's also, of course,
00:02:45.920 | competition from Inflection AI, who have Pi, their 'personal intelligence'. It's basically a chatbot
00:02:52.320 | that kind of wants to be your friend, and it's soon going to be powered by Inflection-2, their
00:02:57.600 | most powerful LLM. Then there's Meta's universe of characters where you can, quote, "chat to your
00:03:03.360 | favorite celebrity" via a chatbot. So that commitment not to maximize engagement does seem
00:03:09.600 | to be waning. Indeed, when Sam Altman speculated for fun about which GPTs would be doing
00:03:15.040 | best by the end of the day, he had his own theory: there are incredibly useful GPTs in the store,
00:03:20.480 | but 'probably everything is waifus runs away with it'. I'm not too familiar with that,
00:03:25.200 | but I believe it's like AI boyfriends and AI girlfriends. Just before we move on from the
00:03:30.320 | GPT store, there's an experience I had quite a few times. I would test out one of the
00:03:34.640 | GPTs, like Write For Me, and that GPT says it can do relevant and precise word counts. I tried
00:03:41.760 | multiple times, and every time it failed on word counts. I'm not trying to pick on one GPT, but
00:03:47.760 | very often using one of the GPTs wasn't really an improvement on just using GPT-4.
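
(As an aside, word counts are trivial to verify outside the model, which is partly why these failures are so visible: LLMs process tokens rather than words, so they struggle to count their own output. Here's a minimal, generic Python sketch of the kind of check I was doing by hand; the function name and tolerance are mine, not anything from the GPT store.)

```python
# Generic word-count check for LLM output. Token-based models are
# notoriously unreliable at hitting exact word counts, but verifying
# a draft takes one line outside the model.

def check_word_count(text: str, target: int, tolerance: float = 0.05) -> bool:
    """Return True if `text` is within `tolerance` (e.g. 5%) of `target` words."""
    count = len(text.split())
    ok = abs(count - target) <= target * tolerance
    print(f"requested {target} words, got {count} -> {'OK' if ok else 'FAIL'}")
    return ok

draft = "This is a ten word sentence used purely for illustration."
check_word_count(draft, target=10, tolerance=0.0)  # requested 10 words, got 10 -> OK
```
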
00:03:53.680 | There was, however, one notable exception to that: the Consensus GPT. Obviously, this is not sponsored,
00:04:00.640 | but by giving relevant links that I could then follow up with, it was genuinely helpful and
00:04:05.440 | better than GPT-4 base. This is for the specific task of looking up scientific research on a given
00:04:11.840 | topic. But now I'm going to give you a little dose of nostalgia. It's almost a year to the day since
00:04:17.600 | I started AI Explained, and one of my very first videos in that week was a review of the original
00:04:24.080 | Consensus app. That video is unlisted now because it's simply not relevant anymore, but I do talk
00:04:28.960 | about the one-year anniversary of the channel and how it's changed on my latest podcast episode on
00:04:34.560 | AI Insiders for Patreon. There's even a link to the unlisted video if you want to see what AI
00:04:39.520 | Explained was like back in the day. But there was a much more cryptic announcement from OpenAI at
00:04:45.200 | the same time as the GPT store that I'm not sure was fully intentional. The reason I say that is
00:04:50.800 | that it was leaked/announced by someone not from OpenAI. It seems a bit like an inadvertent launch
00:04:57.200 | because a few minutes later, Greg Brockman put out a hastily written tweet, which he then edited into
00:05:02.400 | this tweet. Greg Brockman is the president and co-founder of OpenAI. The update was about GPT-4
00:05:08.320 | learning from your chats, carrying what it learns between chats, and improving over time by
00:05:14.800 | remembering details and preferences. You're then allowed to reset your GPT's memory or turn off this
00:05:21.200 | feature. I don't know about you, but it reminds me directly of my previous video on what Mamba
00:05:26.880 | would allow. If you're curious about that architecture or selective state space models,
00:05:31.120 | then check out that video. But the point for this video is this. We don't know if this announcement
00:05:36.800 | was intentional and we don't know what powers it. It could of course just be storing your
00:05:41.360 | conversations to load into the context window, but it feels more significant than that. As Brockman
00:05:46.320 | says, they are still experimenting, but hoping to roll this out more broadly over upcoming weeks.
00:05:52.560 | And as the investor Daniel Gross says, these are the early signs of a moat more significant than
00:05:58.320 | compute flops. He shared this screenshot where you can ask GPT, "What do you know about me?
00:06:03.920 | Where did we leave off on my last project? And remember that I like concise responses."
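
(We don't know what's under the hood here, but the naive version is easy to picture: extract facts from each chat, persist them, and load them into the context window of the next one. What follows is purely a hypothetical sketch under that assumption; none of the names or structure reflect OpenAI's actual implementation.)

```python
# Hypothetical sketch: persist user facts between chats and prepend
# them to the next conversation's context. This is the "just storing
# your conversations" baseline, not OpenAI's real system.

import json
from pathlib import Path

MEMORY_FILE = Path("memory.json")

def load_memory() -> list[str]:
    """Read previously stored facts, if any."""
    if MEMORY_FILE.exists():
        return json.loads(MEMORY_FILE.read_text())
    return []

def remember(fact: str) -> None:
    """Store a new fact for future conversations."""
    facts = load_memory()
    if fact not in facts:
        facts.append(fact)
        MEMORY_FILE.write_text(json.dumps(facts, indent=2))

def build_system_prompt() -> str:
    """Prepend stored facts to the next chat's context window."""
    facts = load_memory()
    return "Known details about this user:\n" + "\n".join(f"- {f}" for f in facts)

remember("Prefers concise responses")
remember("Last project: a Mamba explainer")
print(build_system_prompt())
```

If it's just this, it's a convenience feature; if it's something closer to learned state in the model itself, it's a different story, which is exactly why the announcement feels significant.
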
00:06:09.600 | So why would that be a moat? Well, just like we saw with user engagement for GPTs,
00:06:14.960 | it makes models more addictive, more customized to you. Why go to a different AI model, a different
00:06:20.640 | LLM, if this one from ChatGPT knows who you are, knows what you like, knows what projects you like
00:06:26.640 | to work on, what code style you use, or, more controversially, feels a bit more like your friend:
00:06:32.160 | remembers your birthday, that kind of thing? If you think that's far-fetched, by the way,
00:06:36.080 | remember that photorealistic video avatars are coming soon as well. And of course, the natural
00:06:41.520 | end point of this is that these chatbots become as intelligent or more intelligent than you.
00:06:46.320 | After all, that's what OpenAI have said they're working towards all along. They want to
00:06:50.720 | build superintelligence, the 'super' meaning beyond human intelligence. In this
00:06:56.080 | fairly recent interview with the Financial Times, Sam Altman said he splits his time between
00:07:00.720 | researching how to build superintelligence and ways to build up the computing power to do so.
00:07:06.160 | Also on his checklist is to figure out how to make it safe and figure out the benefits.
00:07:11.920 | And I promise I'm getting to a point soon enough here, but OpenAI have also admitted
00:07:18.160 | that building AGI and superintelligence is going to replace human work. Indeed,
00:07:22.560 | their definition of AGI on their website is a system that outperforms humans at most
00:07:28.160 | economically valuable work. And that's just AGI, remember, not even superintelligence.
00:07:32.480 | Indeed, in a blog post on superintelligence, Sam Altman and Greg Brockman said that they believe
00:07:37.760 | it would actually be risky and difficult to stop the creation of superintelligence. Yes,
00:07:43.360 | it might automate human labor, but the upsides are, in their words, 'so tremendous', and so many people
00:07:49.440 | are racing to build it. It's inherently part of the technological path that we are on.
00:07:54.640 | Now, you are probably wondering why I'm bringing up these recent quotes. To recap:
00:07:59.680 | they're building superintelligence, they have to build superintelligence, and yes,
00:08:03.760 | it will mean the automation of human labor. Well, here's why I bring all of that up. It
00:08:09.120 | seems like there might be the first inkling of second thoughts about that plan. Just four days
00:08:15.120 | ago, one of the key figures at OpenAI, Andrej Karpathy, said this: the best possible
00:08:21.520 | thing that we can be is not e/acc or EA, but all about intelligence amplification. In other words,
00:08:27.360 | we should not seek to build superintelligent, godlike entities that replace humans. So that's
00:08:34.800 | no superintelligence, and no replacement of humans. It's about tools being the bicycle for
00:08:41.280 | the mind, things that empower all humans, not just a top percentile. And it's pretty hard to disagree
00:08:47.680 | with that. It seems like a wonderful vision to me too, but it was quite fascinating to see Sam
00:08:53.280 | Altman retweet or repost that tweet. The obvious questions are: are you trying to build
00:08:59.280 | superintelligence or are you not? Are you trying to replace human labor or are you not trying to do
00:09:04.320 | so? You can't keep describing your latest advancements as tools, but then admit that they
00:09:09.760 | are going to replace human labor and eventually be more intelligent than all of us. I mean,
00:09:13.920 | I do think almost everyone could get behind this vision from Andrej Karpathy. It just seems to
00:09:19.920 | contradict some of the other stuff that OpenAI have put out. Also, if I had a chance to ask
00:09:25.360 | Karpathy or Altman, I would say, how do you draw the line between a tool and something that
00:09:31.440 | replaces us? Making an AI model more intelligent indeed makes it a better tool, but it also makes
00:09:37.040 | us one step closer to being replaced in the labor market. Giving an AI more agency and independence
00:09:43.120 | will make it a better assistant, but again, makes it one step closer to being able to replace your
00:09:48.800 | job. Indeed, Richard Ngo, another key figure at OpenAI, thinks we'll continue hearing that
00:09:55.760 | LLMs are 'just tools' and lack any intentions or goals until well after that's clearly false. I guess
00:10:02.640 | I'm seeking clarity on what divides something from being a tool and being a replacement. And if OpenAI
00:10:09.600 | are trying to be on the same page on this, they need to state clearly what that dividing line is.
00:10:15.040 | Now, just quickly before we get to that AI researcher prediction paper, there's one more
00:10:19.040 | thing from OpenAI that I find hard to reconcile. Last month, they did a deal with Axel Springer,
00:10:25.280 | a publishing house, to create new financial opportunities that support a sustainable future
00:10:30.960 | for journalism. And as of two days ago, it was revealed that they're in talks with CNN, Fox,
00:10:36.720 | and Time to get more content. And finally, The Information revealed what OpenAI is typically
00:10:42.800 | offering publishers: between $1 million and $5 million annually. And what we also learned from
00:10:49.200 | this article is that the battle to control data is not just being waged by OpenAI, or even OpenAI
00:10:56.240 | and Google. Apple has actually launched into the fray and they are trying to strike deals with
00:11:01.440 | publishers for the use of their content. But the difference with Apple is that they want to be able
00:11:06.400 | to use content for future AI products in any way the company deems necessary. That could include,
00:11:12.320 | for example, imitating the style of a particular publisher or journal. Or, for example, developing
00:11:18.320 | a model that acts as the world's newspaper. Remember, you can customize models now so you
00:11:24.000 | can have your own Fox News, your own MSNBC. You could have, for example, your own personalized
00:11:29.600 | AI video avatar giving you exactly and only the news that you want. What that does to society
00:11:36.560 | is a question for another day. But Apple are apparently offering up to $50 million for those
00:11:42.560 | kinds of rights. So what's so hard to reconcile, then? Well, remember, this is all about creating
00:11:47.680 | new financial opportunities that support a sustainable future for journalism. But Sam
00:11:52.320 | Altman has already said that his grand idea is that OpenAI will capture much of the world's wealth
00:11:58.560 | through the creation of AGI and then redistribute this wealth. And he's talked about figures like
00:12:04.080 | $1 trillion and $100 trillion. In a world where OpenAI, Google and Apple are creating $100 trillion
00:12:12.560 | worth of wealth and profit, it seems like, in that world, they would have gobbled up independent
00:12:18.720 | journalism, or at least the major profits of it. Indeed, $100 trillion
00:12:24.240 | is about the size of the entire global GDP. And that kind of makes sense, right? If AGI or
00:12:29.520 | superintelligence can do the tasks of any human being, it would make sense to equate its value with the
00:12:35.200 | size of the global economy. How that fits in with a sustainable future for journalism, I'll have
00:12:40.240 | to work that one out. But it's time at last for AI researchers to weigh in themselves on all of
00:12:46.240 | these debates. Thousands of AI authors submitted their predictions for this paper, and it has predictions
00:12:52.400 | about everything: timelines, safety, economic impact. And of course, as you might
00:12:57.840 | expect, I've read the paper in full, and I'm going to give you only the juiciest bits. This paper,
00:13:03.120 | by the way, came out just a week ago. But let's start with the very first paragraph. These results,
00:13:08.560 | by the way, come from a survey of 2,778 AI researchers. So here's the first prediction.
00:13:15.520 | If science continues undisrupted, the chance of unaided machines outperforming humans in every
00:13:22.080 | possible task was estimated at 10% by 2027. And yes, at 50% by 2047. But let's focus on that 10%
00:13:30.560 | by 2027. That's a one-in-ten chance that all human tasks are potentially automatable three years
00:13:40.640 | from now. Now, there is one quick caveat that I'm going to add to that on the paper's behalf.
00:13:45.760 | Even if there is one model out there that can wire a house and win a math competition, that
00:13:52.160 | doesn't mean that there are instantly billions of such models. If we're taking embodiment into
00:13:57.440 | account, the mass manufacturing of all of those models would take a lot longer than just a few
00:14:04.000 | months or years. But nevertheless, sit back and read that prediction from AI researchers again.
00:14:10.400 | It's easy to get lost in the noise and the news; next week there's going
00:14:17.120 | to be another model, another development. But a 10% chance, in three, maybe three and a bit years,
00:14:24.240 | of every human task being potentially automatable, unaided, is pretty insane. These estimates,
00:14:31.440 | as they say, are typically earlier than they were when these researchers were surveyed last year.
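
(For the curious: headline numbers like '10% by 2027' are aggregates over thousands of individual forecasts. Each respondent gives probability-by-year answers, the paper fits a distribution to each respondent's answers, and those distributions are combined. Here's a deliberately over-simplified toy version of that idea, with invented respondents and plain linear interpolation in place of the paper's fitted distributions.)

```python
# Toy sketch of aggregating survey timeline forecasts. Each respondent
# gives (year, cumulative probability) points; we interpolate a CDF for
# each and average the CDFs. The respondents below are invented, and the
# paper itself fits parametric distributions rather than interpolating.

import numpy as np

respondents = [
    [(2030, 0.10), (2050, 0.50), (2080, 0.90)],
    [(2027, 0.10), (2040, 0.50), (2060, 0.90)],
    [(2045, 0.10), (2090, 0.50), (2150, 0.90)],
]

years = np.arange(2024, 2151)

def cdf(points):
    """Linearly interpolate one respondent's cumulative forecast over all years."""
    xs, ps = zip(*points)
    # 0 probability before the first point; flat after the last point.
    return np.interp(years, xs, ps, left=0.0, right=ps[-1])

aggregate = np.mean([cdf(r) for r in respondents], axis=0)

for y in (2027, 2047):
    p = aggregate[list(years).index(y)]
    print(f"Aggregate P(HLMI by {y}) ~ {p:.0%}")
```

The numbers this toy prints are meaningless; the point is just the mechanism of turning thousands of individual timelines into one aggregate curve.
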
00:14:36.720 | Now you may have noticed that this sentence uses the word outperforming, and there's a later
00:14:41.600 | sentence in the first paragraph talking about being fully automatable. And I'll come back to
00:14:46.080 | that in a moment. And when I say a moment, I mean literally right now, because in the paper
00:14:51.040 | there was an incredibly stark and unjustified difference between the predictions for high-
00:14:57.280 | level machine intelligence (that's all human tasks) and the full automation of labor (all human jobs),
00:15:03.040 | with the full automation of labor being predicted for the 2100s. And of course, on hearing that, many
00:15:09.680 | of you will be like, wait, what's high-level machine intelligence then, and what's 'all human tasks'?
00:15:14.320 | Well, here is how they define high-level machine intelligence for this survey:
00:15:19.360 | high-level machine intelligence is achieved when unaided machines can accomplish every task,
00:15:25.680 | every task, better and more cheaply than human workers. So not just better, but also more cheaply.
00:15:31.600 | Think feasibility, not adoption. The only caveat is that we're assuming here that human scientific
00:15:36.960 | activity continues without major negative disruption. But the date by which there is
00:15:41.920 | apparently a 50% chance of this happening is 2047, down 13 years from 2060 in the previous survey. But how on earth
00:15:52.000 | would it take us from 2047 to the 2100s to go from a machine that can accomplish every human task
00:16:00.160 | better and more cheaply to the full automation of labor? Like literally how long do they think
00:16:05.120 | it's going to take to manufacture these robots? And don't forget, the manufacturing of these
00:16:09.920 | embodied AIs is presumably going to be assisted by the high-level machine intelligences themselves. It's
00:16:15.520 | not like manufacturing is going to continue at the pace it always has. Each factory would have
00:16:20.000 | its own AGI helping it speed up production. To be honest, the main result of this survey for me
00:16:25.360 | is that it shows that AI researchers are really not good at thinking through their predictions.
00:16:30.640 | Later on in the paper, the authors admit that this discrepancy between high-level machine
00:16:35.600 | intelligence and the full automation of all labor is surprising. And their guesses are that maybe it was
00:16:41.680 | the framing effect of the question, or maybe it's that caveat about the continuation of scientific
00:16:48.240 | progress. Maybe respondents expect major disruption to scientific progress. And there's
00:16:53.680 | something else from the paper that I found kind of amusing. What is the last job that AI researchers
00:16:59.120 | think will be fully automated? Being an AI researcher. That's at 2063. Of course, that's
00:17:05.280 | the 50% prediction. Some people think much earlier. Some people think much later. Now, number
00:17:09.520 | one, I think that's kind of funny that AI researchers think what they do is going to be harder
00:17:13.600 | to automate than anything else. But number two, look at the timeline: 2063. That's 40 years or so
00:17:20.480 | from now. But now think back to OpenAI, who have the goal of solving superintelligence alignment
00:17:26.400 | in four years. And the key point is this: their method of doing so is to automate machine learning
00:17:32.480 | safety research. That's right. Build an AI that can do the safety research for them. Indeed,
00:17:37.920 | one of the things they're working on is trying to align the model that's going to solve alignment.
00:17:42.800 | But the point is, look at their timeline. They think it's possible, indeed it's their goal,
00:17:47.920 | to create this automated alignment AI researcher in four years. They call it the first automated
00:17:55.840 | alignment researcher. Now, yes, it's quite possible that they miss this deadline and it takes five
00:18:00.400 | years or 10 years, or maybe they do it faster than four years. But look at the kind of numbers we're
00:18:04.480 | talking about: four years, five years, 10 years. Now, I get that alignment research isn't all of AI
00:18:10.240 | research and there's a lot more there, but these four-year, five-year, 10-year goals seem a big
00:18:16.240 | stretch from a 40-year expectation of the automation of AI research. One of these two dates
00:18:23.120 | is going to be pretty dramatically wrong. Now for a few more interesting little highlights before we
00:18:29.120 | close out. The paper found that subtle differences in the way that they ask certain questions
00:18:34.560 | radically changed the results. Apparently, if you ask people something like,
00:18:38.880 | 'Will we have superintelligence by 2050?', you get much more conservative answers than if you ask,
00:18:44.880 | 'Give me the year by which we'll have a 50% chance of superintelligence.' In a related study,
00:18:50.880 | when people were asked about the chances of AI going wrong, those asked for a percentage said 5%,
00:18:56.080 | while those asked for a fraction said one in 15 million. For scale, 5% is one in 20, nearly six orders
00:19:02.080 | of magnitude away from one in 15 million. Now, of course, these weren't the same people; otherwise they'd be pretty bad at mathematics. These were different test
00:19:06.160 | groups given different versions of questions. One tranche of the researchers might get one style of
00:19:11.840 | question, another tranche gets a different style. But I guess the take home here is that human
00:19:16.720 | psychology and anchoring is still playing an immense role when we talk about timelines and
00:19:22.480 | predictions. Here's another fascinating highlight. The researchers were asked about the possibility
00:19:27.520 | of an intelligence explosion. This is the question. Some people have argued the following:
00:19:32.480 | if AI systems do nearly all research and development, improvements in AI will accelerate
00:19:38.000 | the pace of technological progress, including further progress in AI. Over a short period,
00:19:43.600 | less than five years, this feedback loop could cause technological progress to become more than
00:19:49.840 | an order of magnitude faster. That's a period of less than five years in which technological
00:19:55.280 | progress becomes 10 times or more faster.
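
(To make 'more than an order of magnitude in under five years' concrete, here's a toy compounding calculation. The 60% annual speedup is an invented number purely for illustration; the point is how quickly a feedback loop crosses 10x.)

```python
# Toy illustration of the feedback-loop arithmetic: if AI doing the
# R&D sped up progress by ~60% per year, compounding would cross 10x
# within five years. The 60% figure is invented for illustration.

rate = 1.0                # today's pace of technological progress (1x)
speedup_per_year = 1.6    # hypothetical annual multiplier once AI does the R&D

for year in range(1, 6):
    rate *= speedup_per_year
    print(f"year {year}: progress at {rate:.1f}x today's pace")

# year 5: progress at 10.5x today's pace, i.e. "more than an order
# of magnitude faster" within the survey question's five-year window.
```
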
00:20:01.200 | Now, before you see the results, let's reflect. That would be a crazy world. It's already hard to keep up with AI. And I say that as someone who does it
00:20:07.680 | for a living. Now imagine research and progress being 10 times or more faster. So is it 5% or 10%
00:20:15.840 | of AI researchers who think that's possible? No, it's a majority of respondents. In 2023,
00:20:22.240 | it's 24% who think there's an even chance of that happening, 20% who think it's likely, and 9% who
00:20:28.560 | think it's quite likely. That's 53% of AI researchers who think we'll get this accelerating
00:20:34.960 | feedback loop, a proto-singularity, if you like. At that point, it would almost be worth live
00:20:40.240 | streaming my channel rather than creating videos because it would be like every minute there's a
00:20:44.560 | new bit of news. Apparently, 86% of the researchers, and I would phrase it as only 86%,
00:20:49.840 | were worried about deepfakes. To count, by the way, they had to view it as at least a substantial
00:20:55.680 | concern. I would show those guys this in January of 2024. These were made in Blender and Cinema 4D,
00:21:03.280 | but they look ridiculously lifelike to me. If 14% of researchers don't think that deepfakes will be
00:21:10.320 | at least a substantial concern, I don't know what to say to them. And the vast majority of respondents,
00:21:16.080 | 70%, thought that AI safety research should be prioritised more than it currently is. And an
00:21:22.640 | equally clear message is that timelines are getting shorter. The red line is this year's
00:21:28.160 | predictions and the blue line is last year's. And lines further to the left mean more
00:21:34.160 | proximate, closer-at-hand predictions. This is for high-level machine intelligence. Now,
00:21:39.680 | I made a detailed AGI timeline of my own for AI Insiders, but suffice to say I am well to the left
00:21:46.400 | of that red line. And just one final point before we leave this survey. Some people, I am sure,
00:21:52.320 | in the comments will point out that it only achieved a 15% response rate, similar to previous
00:21:58.000 | years. Thing is, they even gave prizes for people to respond to the survey. So they tried their best.
00:22:03.680 | It seems that most people just don't want to spend the time answering surveys. And to be
00:22:08.400 | honest, I can understand that. And that response rate is in line with other surveys of a similar size.
00:22:13.840 | But I can't end on that technicality in the week of CES 2024 in Las Vegas. Now, I can talk about
00:22:20.720 | more of my highlights if you like, but there's one device that stood out to me. It's ridiculously
00:22:25.840 | expensive, but shows AI can be insanely useful and fun when we want it to be. It's these almost
00:22:32.320 | five-grand, AI-powered binoculars that can identify birds while you're looking through them. I just
00:22:38.480 | think that's insane and super fun. And I'd get a pair if they dropped a couple of zeros off the price. Anyway,
00:22:44.080 | we have covered a lot in this video. I am really curious to hear what you think. Thank you so much
00:22:49.520 | for watching and have a wonderful day.