AI - 2024AD: 212-page Report (from this morning) Fully Read w/ Highlights
Chapters
0:00 Introduction
0:43 o1 changed the game
1:37 OpenAI costs revealed
2:39 MovieGen and Pika 1.5
3:39 Nobel Prize Winners + job listing
6:51 BrainLM
8:20 Open Chai-1
8:48 Stacked Accelerations
10:10 China Enters Stage Left
10:55 100x Drop in Price? Or 80%? + Transformers
12:16 Copyright Models + Zuck Warning
14:12 Interesting Climate in AI
16:48 Jailbreaking still an Issue
17:42 GenAI Actual Harms (DeepMind)
18:37 Predictions
00:00:00.000 |
This morning the 212-page State of AI Report 2024 was released and, as I did last year, 00:00:07.360 |
I want to bring you just the highlights. Yes, I read it all, and it 00:00:13.600 |
gives a powerful framework around which I can also weave in some of the most interesting developments 00:00:19.600 |
of the past few days. So now that I am done with the gratuitous special effects, courtesy of a new 00:00:25.760 |
Pika AI tool I upgraded my account for to bring you guys, here is the report. The report has been 00:00:31.680 |
going 6 years and it's from Air Street Capital. These are of course my highlights; you will find 00:00:37.120 |
plenty more if you read the full report yourself linked in the description. But this first highlight 00:00:43.360 |
and this page, I suspect, initially read "OpenAI's reign of terror came to an end" and the authors, 00:00:49.520 |
and they can correct me if they are watching, probably had to edit that, with a comma and an 00:00:54.720 |
'until', when o1 came out. Essentially this page is stating that models like Claude 3.5 Sonnet, Grok 2, Gemini 00:01:01.280 |
1.5 caught up with GPT-4. All models are converging, they thought, because of the heavy overlaps in 00:01:08.000 |
pre-training data. Before we get to the research section though, I covered their 2023 predictions 00:01:14.400 |
in a video from about a year ago. Their predictions are on the left and their rating of their own 00:01:19.360 |
predictions is on the right. There was one prediction that they rated as having failed 00:01:24.160 |
that I think is pretty harsh. They said the Gen AI scaling craze would see a group spending over 00:01:30.480 |
a billion dollars to train a single large-scale model, and the rating was "let's give that another 00:01:36.800 |
year". But just yesterday we got this from The Information, which is that OpenAI's projected 00:01:41.680 |
2024 cost for training models is 3 billion dollars. Note that that does not include research 00:01:49.040 |
compute amortization, as in training miniature models to see what works. This is full-on training 00:01:54.640 |
of frontier models. I would be pretty surprised if the full o1 model, if you count the base model 00:02:00.400 |
generator and the fine-tuned final model, didn't cost at least 1 billion dollars. That report in 00:02:06.160 |
The Information, by the way, said that OpenAI likely won't turn a profit until 2029, according 00:02:12.800 |
to internal sources. Oh, and if you thought 1 billion or even 3 billion was quite a lot for 00:02:18.640 |
training LLMs, for training chatbots, well, how about almost 10 billion a year 00:02:24.240 |
in 2026? Note that doesn't include the playing-around research compute costs of more than 5 00:02:31.440 |
billion dollars in that same year. Just imagine how many experiments, or how much tinkering about, 5 billion 00:02:37.440 |
dollars can bring you. The next highlight for me was the section on multi-modality and Meta's 00:02:43.120 |
movie gen. We can't actually play about with the model which is why I haven't done a whole separate 00:02:47.760 |
video on it but I will play this 5 second extract because it is impressive how it can produce audio 00:02:53.600 |
at the same time as video. You might have caught there the chainsaw sound effect when she was doing 00:03:03.840 |
the garden and the car racing sounds as the car was racing on the carpet. To be honest compared 00:03:09.760 |
to reading the movie gen paper I found it more fun to play about with tools like Pika AI which 00:03:16.080 |
are available now and all you have to do is upload an image and then pick a Pika effect like melt or 00:03:22.720 |
explode or squish and you get these amazing little clips. Look at Moo Deng here, picked up and squished. 00:03:29.600 |
And for all my AI Insiders on Patreon, the logo: look at this, boom, there is the full-screen explosion, 00:03:36.000 |
and I think that is pretty cool. Now of course I'm not going to ignore the fact that the Nobel 00:03:39.920 |
prize in physics was won by a pair of neural network scientists, AI scientists and likewise 00:03:46.720 |
Sir Demis Hassabis of Google DeepMind fame co-won the Nobel prize in chemistry for AlphaFold. One 00:03:53.760 |
commentator said this represents AI eating science and it's hard to disagree with that. In fact, now 00:03:59.600 |
that I think of it, I wonder what the odds in prediction markets are for most Nobel prizes in the sciences 00:04:05.360 |
in the 2030s being won by people involved in AI. Obviously some of you may argue that the AI itself 00:04:12.960 |
should win the prize. Actually that reminds me, on a complete tangent, it's kind of related to Google 00:04:17.760 |
DeepMind: I saw this job advert for Google DeepMind. Here it is, it's a new one, it's a research scientist 00:06:24.160 |
job position, but you need to have, where is it, down here, a deep interest in 00:06:30.640 |
machine cognition and consciousness. So maybe the prospect of AI winning the Nobel prize isn't so 00:04:38.720 |
far-fetched. I will also note that almost all of these figures have issued stringent warnings about 00:04:45.040 |
the future of AI. Sir Demis Hassabis comparing the threat of AI going wrong to nuclear war. John 00:04:52.240 |
Hopfield has recently issued a warning about the world turning into something like 1984 because of 00:04:58.320 |
AI controlling the narrative. And of course Geoffrey Hinton has issued myriad warnings 00:05:03.440 |
about the inherent superiority of artificial intelligence compared to human intelligence. 00:05:08.880 |
I watched the interviews that he did today and yesterday and I did pick out these two or three 00:05:13.280 |
highlights. First, he is proud of his former student Ilya Sutskever firing Sam Altman. I'm 00:05:19.760 |
particularly proud of the fact that one of my students fired Sam Altman and I think I better 00:05:24.240 |
leave it there and leave it for questions. Can you please elaborate on your comment earlier on 00:05:29.040 |
the call about Sam Altman? OpenAI was set up with a big emphasis on safety. Its primary objective was 00:05:36.720 |
to develop artificial general intelligence and ensure that it was safe. One of my former students 00:05:42.480 |
Ilya Sutskever was the chief scientist and over time it turned out that Sam Altman was much less 00:05:50.480 |
concerned with safety than with profits and I think that's unfortunate. In a nutshell though, 00:05:58.560 |
here was his warning. Most of the top researchers I know believe that AI will become more intelligent 00:06:06.880 |
than people. They vary on the time scales. A lot of them believe that that will happen sometime in 00:06:12.400 |
the next 20 years. Some of them believe it will happen sooner. Some of them believe it will take 00:06:17.200 |
much longer, but quite a few good researchers believe that sometime in the next 20 years, 00:06:22.400 |
AI will become more intelligent than us and we need to think hard about what happens then. 00:06:28.000 |
My guess is it will probably happen sometime between 5 and 20 years from now. It might be 00:06:33.680 |
longer. There's a very small chance it will be sooner and we don't know what's going to happen 00:06:38.640 |
then. So if you look around, there are very few examples of more intelligent things being 00:06:44.400 |
controlled by less intelligent things which makes you wonder whether when AI gets smarter than us 00:06:49.520 |
it's going to take over control. One thing I am pretty confident in is that a world of superior 00:06:55.840 |
artificial intelligence will be a hell of a lot weirder than ours. Here's an example from page 00:07:02.240 |
54 about BrainLM which I'll be honest I hadn't even heard of. I did read the BrainLM paper after 00:07:08.880 |
this though because this line caught my eye. This model can be fine-tuned to predict clinical 00:07:15.280 |
variables e.g. age and anxiety disorders better than other methods. In simple terms, it can read 00:07:22.720 |
your brain activity and predict better than almost any other method whether you have, for example, 00:07:28.720 |
mental health challenges. That is not actually something I knew existed and the paper is also 00:07:33.680 |
very interesting. Not only by the way is BrainLM inspired by natural language models, it leverages 00:07:40.400 |
a transformer-based architecture. They mask future states, allowing the model to do self-supervised 00:07:46.080 |
training and predict what comes next, if that rings a bell. With enough data and pre-training, 00:07:50.720 |
BrainLM can predict future brain states, decode cognitive variables, and, possibly most 00:07:58.240 |
impactfully, although those other two are pretty impactful, do in-silico perturbation analysis. 00:08:04.160 |
In other words, imagine testing medications for depression in silicon on GPUs rather than 00:08:10.320 |
initially with patients. BrainLM has the ability to simulate brain responses in a biologically meaningful manner. 00:08:16.880 |
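As an aside for the technically curious, here is a minimal sketch, in PyTorch, of the masked self-supervised pre-training idea BrainLM borrows from language models. To be clear, this is my own illustration rather than the authors' code: the class name, the 424-parcel input size, the 20% mask rate and the toy tensors are all assumptions, and positional encodings are omitted for brevity.

```python
# Minimal sketch (not the BrainLM authors' code): pre-train a transformer by
# masking some fMRI timepoints and reconstructing them from the rest.
import torch
import torch.nn as nn

class MaskedBrainModel(nn.Module):
    def __init__(self, n_parcels=424, d_model=256, n_layers=4):
        super().__init__()
        self.embed = nn.Linear(n_parcels, d_model)   # one token per timepoint
        layer = nn.TransformerEncoderLayer(d_model, nhead=8, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)
        self.head = nn.Linear(d_model, n_parcels)    # reconstruct brain activity
        self.mask_token = nn.Parameter(torch.zeros(d_model))

    def forward(self, x, mask):
        # x: (batch, time, parcels); mask: (batch, time), True = hidden timepoint
        h = self.embed(x)  # positional encodings omitted for brevity
        h = torch.where(mask.unsqueeze(-1), self.mask_token.expand_as(h), h)
        return self.head(self.encoder(h))

model = MaskedBrainModel()
x = torch.randn(2, 200, 424)      # toy stand-in for recorded brain activity
mask = torch.rand(2, 200) < 0.2   # hide 20% of timepoints (illustrative rate)
loss = ((model(x, mask) - x)[mask] ** 2).mean()  # score only the hidden states
```

Fine-tuning for clinical variables like age or anxiety would then amount to swapping the reconstruction head for a small prediction head on top of the pre-trained encoder.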
The next highlight is a fairly quick one and you may have heard of AlphaFold3 00:08:22.400 |
from Google DeepMind about predicting the structure of proteins, DNA, and more. But I didn't know that 00:08:29.040 |
there was Chai-1 from Chai Discovery, backed by OpenAI, which is an open-source alternative, 00:08:36.400 |
nor did I know that its performance in certain domains is comparable or superior to AlphaFold3. 00:08:43.600 |
AlphaFold3, don't forget, has not been fully open-sourced. One other highlight, on page 96, 00:08:48.240 |
I'm going to pick out not because it's particularly revealing, 00:08:52.800 |
but because it exemplifies the kind of stacking accelerations that we're experiencing. 00:08:58.080 |
The chart below shows the generations of NVIDIA data center GPUs and you can see the number of 00:09:04.640 |
months between releases is gradually on average declining. You may wonder about the next generation 00:09:10.960 |
which is apparently the Rubin R100 which is going to come in late 2025. So yes, we have the release 00:09:17.360 |
dates coming closer and closer together on average. But also, as you can see from the line on the right, 00:09:23.120 |
the number of teraflops within each GPU is accelerating. You can think of this as charting 00:09:28.960 |
the trillions of floating-point operations, or calculations, per second per GPU. 00:09:35.520 |
That's already fairly accelerated, right? Until you learn that the number of GPUs that we're 00:09:40.960 |
clustering, not just in one data center but across combined data centers, is also massively increasing. 00:09:46.480 |
And of course, the amount of money that people are spending on these GPUs. 00:09:50.560 |
And that's all before we consider the algorithmic efficiencies within models like the o1 series. 00:09:56.240 |
This is why, and I've said this before on the channel, I think the next two years of progress 00:10:00.560 |
is pretty much baked in. I did a massive analysis video on my Patreon, but suffice to say the next 00:10:06.560 |
10,000X of scale is pretty much baked in. 00:10:12.400 |
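A toy calculation makes the stacking point concrete. Every multiplier below is a hypothetical placeholder, not a figure from the report; the point is just that independent accelerations multiply rather than add.

```python
# Hypothetical growth factors over a multi-year window; placeholders only.
flops_per_gpu = 4   # faster chips with each hardware generation
cluster_size = 10   # more GPUs networked together, even across data centers
spend = 5           # more money buying those GPUs
algorithmic = 3     # efficiency gains, e.g. o1-style methods
print(f"Stacked scale-up: {flops_per_gpu * cluster_size * spend * algorithmic}x")
# -> 600x from individually modest factors; that is the flavor of the argument
#    for the next 10,000x being "baked in".
```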
Which brings me to the next highlight, and it's a fairly quick one, but China wants some of that scale. I just thought you guys might enjoy this quick 00:10:16.960 |
anecdote given that H100s aren't allowed to be sold to China. But a Malaysian broker had a way 00:10:23.200 |
of sticking to those rules, kind of. The broker coordinated the rental, installation and 00:10:28.800 |
activation of servers based in a Malaysian town adjacent to the Singapore border. NVIDIA 00:10:34.800 |
inspectors checked the servers there and left. Shortly afterward, the servers were whisked 00:10:39.120 |
away to China via Hong Kong. Depending on the modality, China, it seems to me, 00:10:44.000 |
is between 3 and 12 months behind the frontier, but not much more than that. 00:10:49.280 |
I'm not drawing any conclusions, it was just interesting for me to read this and hopefully 00:10:53.280 |
for you too. Speaking of renting and costs, there is this chart going around about the 00:10:59.360 |
100X drop in price of frontier models from OpenAI. Or with Anthropic, a 60X drop from the cost of 00:11:06.160 |
Claude 3 Opus to Claude 3 Haiku. But I've never believed those kinds of drops because we're not 00:11:11.040 |
comparing like for like. In SimpleBench performance for example, Opus is just streets ahead of Haiku, 00:11:17.680 |
it's not even close. But on page 110, the report had this comparison which for me feels a bit more 00:11:23.840 |
accurate to what has been achieved, and it is still dramatic. Even comparing, quote, "the same model", 00:11:28.960 |
Gemini 1.5 Pro, from launch to the second half of 2024, it's a 76% price cut. Likewise for Gemini 00:11:37.040 |
1.5 Flash, it's an 86% price cut. This, remember, is for roughly equivalent, if not superior, 00:11:43.600 |
performance. So if we try to keep the average bar of performance the same, that feels about right, 00:11:49.840 |
an 80% cut in price for the same actual performance. We wouldn't want to extrapolate too far, 00:11:55.680 |
but if that rate of price cut carried on for 1, 2, 5 years, it would be a wild world we would live in. 00:12:02.880 |
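For concreteness, here is that extrapolation as arithmetic. The $10-per-million-tokens starting price is purely illustrative; the flat 80% yearly cut is the rough rate just described.

```python
# Compounding an ~80% annual price cut at roughly constant performance.
price = 10.0  # illustrative $ per million tokens today
for years in (1, 2, 5):
    print(f"after {years} year(s): ${price * 0.2 ** years:.4f} per million tokens")
# -> $2.0000, $0.4000, $0.0032: roughly a 3,000x drop over five years.
```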
And yes, by the way, it would be a world largely dominated, I think, still by Transformers. 00:12:08.320 |
As we've seen with BrainLM and o1, the full capabilities of Transformers are still being 00:12:14.160 |
discovered, and they represent a three-quarters market share. The next highlight is a quick one on 00:12:19.600 |
copyright and we all know probably that OpenAI transcribed millions of hours of YouTube videos 00:12:25.200 |
using its Whisper model. Did you know that RunwayML and NVIDIA also mass-scraped YouTube? 00:12:31.200 |
But I thought some of you might be interested in the fact that people are trying to create 00:12:34.400 |
business models now, like Calliope Networks, so that creators can sell their YouTube videos 00:12:39.840 |
to be scraped. Basically get paid for what's already happening for free. I know there are 00:12:44.160 |
plenty of YouTube creators who wouldn't mind a tiny slice of that hundred billion in revenue 00:12:49.360 |
that OpenAI are projecting. However, we probably should bear in mind Mark Zuckerberg's words, 00:12:53.760 |
which are that creators and publishers overestimate the value of their work for training AI. 00:12:58.800 |
Obviously it's too long to go into here, but I think o1 slightly justifies this claim. It seems 00:13:04.960 |
to me, and obviously this is a very early take, but as long as there is some data within the 00:13:10.000 |
model that pertains to the kind of reasoning that must be done to answer a particular problem, 00:13:15.520 |
the o1 method will find it. It doesn't necessarily need abundant examples, just some. I will likely 00:13:21.360 |
not be making that point though; I'm just wondering if anyone will offer me a check for 00:13:25.120 |
scraping my old YouTube videos. By the way, it's not just videos. Teams are working on 00:13:29.760 |
getting authors paid for their work's use by AI companies. Now I've just had a thought that 00:13:35.520 |
if it were possible somehow to reverse engineer the data sources that were most responsible for 00:13:43.200 |
model performance, I don't think any of these companies would have an incentive to find out 00:13:48.240 |
those data sources. While it's all just this mass aggregated data scraped from the internet, 00:13:53.360 |
there's little kind of clear responsibility to pay this person or that. I'm basically saying 00:13:58.240 |
that the research talent that is probably most able to do the kind of engineering necessary to 00:14:03.760 |
ascertain the most useful unique sources of data for the models is the very same talent least 00:14:09.520 |
incentivized to do that research. Anyway, speaking of kind of externalities like copyright payment, 00:14:14.800 |
how about environmental effects? Well, all of these companies, as I have covered before, 00:14:19.200 |
made various pledges saying that they're going to be completely carbon neutral. I should probably 00:14:23.280 |
reflect on my use of the word pledges there. They're more kind of aspirational brainstorming 00:14:28.240 |
notes. Without making things too complicated, power usage because of AI, among other things, 00:14:33.280 |
is going to skyrocket. Now bear with me though. That doesn't matter according to Eric Schmidt, 00:14:37.840 |
formerly CEO of Google, because we weren't going to hit our targets anyway. My own opinion is that 00:14:43.600 |
we're not going to hit the climate goals anyway because we're not organized to do it. And the 00:14:50.080 |
way to do it is with the things that we're talking about now. And yes, the needs in this area will be 00:14:57.840 |
a problem, but I'd rather bet on AI solving the problem than constraining it and having the 00:15:03.120 |
problem. Now, just in case that doesn't reassure you, here is Sam Altman and Ilya Sutskever back 00:15:08.960 |
when they still worked together saying that AI will solve climate change. I don't want to say 00:15:14.400 |
this because climate change is so serious and so hard of a problem, but I think once we have 00:15:19.920 |
a really powerful super intelligence, addressing climate change will not be particularly difficult 00:15:26.000 |
for a system like that. We can even explain how. 00:15:35.520 |
Here's how you solve climate change. You need a very large amount of efficient carbon capture. 00:15:41.520 |
You need the energy for the carbon capture, you need the technology to build it, and you need 00:15:44.720 |
to build a lot of it. If you can accelerate the scientific progress, which is something that a 00:15:50.240 |
powerful AI could do, we could get to a very advanced carbon capture much faster. We could 00:15:55.520 |
get to a very cheap power much faster. We could get to cheaper manufacturing much faster. Now 00:16:01.440 |
combine those three. Cheap power, cheap manufacturing, advanced carbon capture. 00:16:06.000 |
Now you build lots of them. And now you sucked out all this, all the excess CO2 from the atmosphere. 00:16:11.840 |
You know, if you think about a system where you can say, 00:16:13.840 |
tell me how to make a lot of clean energy cheaply. 00:16:17.200 |
With one addition that not only you ask it to tell it, you ask it to do it. 00:16:22.160 |
Of course, I don't want to be too facetious. AI genuinely does lead to efficiencies which 00:16:28.960 |
reduce energy usage. I just thought that it might be worth flagging that until a super intelligent 00:16:35.040 |
AI solves climate change, AI data centers might raise your electricity bills and increase the 00:16:40.720 |
risk of blackouts. For many of you, this will be just a trivial externality. For others, 00:16:46.320 |
pretty annoying. Drawing to an end now, just three more points. The first is a quick one, 00:16:50.880 |
and it's basically this. In a nutshell, jailbreaking has not been solved. There was 00:16:55.280 |
page after page after page proving that many, many jailbreaking techniques still work. 00:17:01.040 |
Stealth attacks, sleeper agents, instruction hierarchies being compromised within hours, 00:17:07.280 |
and on and on and on. I just wanted to summarize for anyone who wasn't up to date with this, 00:17:12.480 |
that jailbreaking has definitely not been solved. It brings to mind early last year 00:17:17.120 |
when some researchers thought that jailbreaking was essentially a solved problem. Amodei, 00:17:22.400 |
to his credit, said at the time, this is 2023, "We are finding new jailbreaks every day. People 00:17:28.080 |
jailbreak Claude. They jailbreak the other models. I'm deeply concerned that in two to three years, 00:17:32.800 |
that would be one to two years from now, a jailbreak could be life or death." 00:17:37.280 |
But as page 203 points out, are we focused on the wrong harms? And this was such an interesting chart 00:17:44.640 |
on the right of the actual misuses of Gen AI that I tried to find the original source. I will say, 00:17:51.120 |
as a slight critique of this report, they do make finding the original sources a bit harder than I 00:17:56.400 |
would have thought. But anyway, it was this June paper from Google DeepMind: "Generative AI Misuse: 00:18:01.760 |
A Taxonomy of Tactics and Insights from Real-World Data". So in short, these are the actual misuses 00:18:08.400 |
that are happening now with Gen AI. I'll put the definitions up on screen, but you can see 00:18:14.080 |
that the most frequent one is impersonating people. Like those robocalls in New Hampshire 00:18:19.360 |
pretending to be Joe Biden, but obviously not being from him. Or generating non-consensual 00:18:24.320 |
intimate images. Obviously the chart might look very different when we get AGI and ASI, artificial superintelligence, 00:18:30.400 |
but for 2024, these are the current harms. 00:18:35.120 |
Last, of course, I had to end on the report's predictions for next year. And I do have one 00:18:40.720 |
slight critique of these predictions. There are a lot of unquantitative words thrown in. "Frontier 00:18:47.360 |
labs will implement meaningful changes to data collection practices." Well, who defines 'meaningful'? 00:18:53.840 |
"Lawmakers will worry that they've overreached with the EU AI Act." This one is more firm: 00:19:00.160 |
"An open-source alternative to OpenAI o1 surpasses it across a range of reasoning benchmarks." 00:19:05.360 |
That one, I'm going to go with probably not. That's a close one. I think end of 2025, 00:19:10.960 |
it might get really close. But o1 is going to be something quite special. Even o1-preview is 00:19:16.320 |
already. Yes, OpenAI have given a lot of hints about how they did it. But I will stick my neck 00:19:21.760 |
out and say that an open-source alternative won't surpass o1. Maybe Llama 4, mid to late next year, 00:19:28.560 |
gets close, but I don't think surpasses it. Anyway, back to subjective words. We've got 00:19:32.880 |
"challengers fail to make any meaningful dent in NVIDIA's market 00:19:37.840 |
position". Well, what does that mean? "Levels of investment in humanoid robots will trail off." Does that mean go down or just 00:19:42.960 |
increase less quickly? I certainly don't think levels of investment in humanoid robots will go down. 00:19:48.320 |
I think it will keep going up. Nine is obviously a very firm prediction, which is that a 00:19:53.120 |
research paper generated by an AI scientist is accepted at a major ML conference or workshop. 00:19:58.560 |
I'm going to say no to that one. So that one was a clear, objective prediction, and I'm going to 00:20:03.680 |
take the other side of it. I'll give you one quick reason why, and you can let me know what you think 00:20:07.760 |
in the comments. But I think the authors of the paper will have to make it clear that an AI wrote 00:20:12.800 |
it for ethical reasons, and then the conference will likely reject it even if it's a good paper 00:20:18.000 |
because it was done by AI. That's even assuming it's good enough for a proper major ML conference, 00:20:24.000 |
which I don't think it would be, not in 2025. 2027? Very different story. And again, we have 00:20:29.680 |
the language: "a video game based around interacting with GenAI-based elements will achieve breakout 00:20:34.400 |
status". Well, what does that mean? Before I make my own cheeky prediction, here's how I ended 00:20:39.120 |
last year's video. Now, if I'm criticizing them, it would be remiss of me not to end the video with 00:20:44.080 |
my own prediction. I predict that a model will be released in the next year that will break 00:20:49.200 |
state-of-the-art benchmarks across at least four different modalities simultaneously. I would claim 00:20:54.880 |
that was passed with GPT-4o in May, which broke certain benchmarks across text domains and in 00:21:01.440 |
audio, vision understanding, and other domains. But here's my prediction, which I think at the 00:21:06.480 |
very least is quantitative: that OpenAI's valuation will double again next year, absent 00:21:13.600 |
an invasion of Taiwan by China. When you combine the o1 method with a GPT-5 or Orion-scale model, 00:21:20.240 |
amazing things will happen, and I think the world will hear about it. Thank you so much 00:21:25.360 |
for watching. Would love to see you over on Patreon, but regardless, have a wonderful day.