This morning the 212-page State of AI Report 2024 was released, and as I did last year, I want to bring you just the highlights. Yes, I read it all, and it gives a powerful framework around which I can also weave in some of the most interesting developments of the past few days.
So now that I am done with the gratuitous special effects, courtesy of a new Pika AI tool that I upgraded my account to bring you guys, here is the report. It has been going for 6 years and it's from Air Street Capital. These are of course my highlights; you will find plenty more if you read the full report yourself, linked in the description.
But this first highlight, and this page, I suspect initially read "OpenAI's reign of terror came to an end", and the authors, and they can correct me if they are watching, probably had to edit that, adding a comma and an "until o1 came out", once o1 dropped. Essentially this page is stating that models like Claude 3.5 Sonnet, Grok-2 and Gemini 1.5 caught up with GPT-4.
All models are converging, they suggest, because of the heavy overlaps in pre-training data. Before we get to the research section though, I covered their 2023 predictions in a video from about a year ago. Their predictions are on the left and their rating of their own predictions is on the right.
There was one prediction that they rated as having failed where I think they were pretty harsh on themselves. They said the Gen AI scaling craze would see a group spending over a billion dollars to train a single large-scale model, and the rating was "let's give that another year". But just yesterday we got this from The Information, which is that OpenAI's projected 2024 cost for training models is 3 billion dollars.
Note that that does not include research compute amortization, as in training miniature models to see what works; this is full-on training of frontier models. I would be pretty surprised if the full o1 model, if you count the base model, the generator and the fine-tuned final model, didn't cost at least 1 billion dollars.
That report in The Information, by the way, said that OpenAI likely won't turn a profit until 2029, according to internal sources. Oh, and if you thought 1 billion or even 3 billion dollars was quite a lot for training LLMs, for training chatbots as you might think of them, well, how about almost 10 billion dollars a year in 2026?
Note that that doesn't include the playing-around research compute costs of more than 5 billion dollars in that same year. Just imagine how many experiments, how much tinkering about, 5 billion dollars can bring you. The next highlight for me was the section on multi-modality and Meta's Movie Gen. We can't actually play about with the model, which is why I haven't done a whole separate video on it, but I will play this 5-second extract, because it is impressive how it can produce audio at the same time as video.
You might have caught there the chainsaw sound effect when she was doing the garden, and the car racing sounds as the car was racing on the carpet. To be honest, compared to reading the Movie Gen paper, I found it more fun to play about with tools like Pika AI, which are available now: all you have to do is upload an image and then pick a Pika effect, like melt or explode or squish, and you get these amazing little clips.
Look at Moo Deng here, picked up and squished. And for all my AI Insiders on Patreon, the logo, look at this: boom, there is the full-screen explosion, and I think that is pretty cool. Now of course I'm not going to ignore the fact that the Nobel Prize in Physics was won by a pair of neural network scientists, AI scientists, and likewise Sir Demis Hassabis of Google DeepMind fame co-won the Nobel Prize in Chemistry for AlphaFold.
One commentator said this represents AI eating science and it's hard to disagree with that. In fact now I think of it I wonder what the odds in prediction markets are for most Nobel prizes in the sciences in the 2030s being won by people involved in AI. Obviously some of you may argue that the AI itself should win the prize.
Actually, that reminds me, on a complete tangent, though it's kind of related to Google DeepMind: I saw this job advert for Google DeepMind. Here it is, it's a new one, a research scientist position, but you need to have, where is it, down here, a deep interest in machine cognition and consciousness.
So maybe the prospect of AI winning the Nobel Prize isn't so far-fetched. I will also note that almost all of these figures have issued stringent warnings about the future of AI: Sir Demis Hassabis comparing the threat of AI going wrong to nuclear war, and John Hopfield recently warning about the world turning into something like 1984 because of AI controlling the narrative.
And of course Geoffrey Hinton has issued a myriad of warnings about the inherent superiority of artificial intelligence compared to human intelligence. I watched the interviews that he did today and yesterday and picked out these two or three highlights. First, he is proud of his former student Ilya Sutskever firing Sam Altman.
I'm particularly proud of the fact that one of my students fired Sam Altman and I think I better leave it there and leave it for questions. Can you please elaborate on your comment earlier on the call about Sam Altman? OpenAI was set up with a big emphasis on safety.
Its primary objective was to develop artificial general intelligence and ensure that it was safe. One of my former students, Ilya Sutskever, was the chief scientist, and over time it turned out that Sam Altman was much less concerned with safety than with profits, and I think that's unfortunate. In a nutshell though, here was his warning.
Most of the top researchers I know believe that AI will become more intelligent than people. They vary on the time scales. A lot of them believe that that will happen sometime in the next 20 years. Some of them believe it will happen sooner. Some of them believe it will take much longer, but quite a few good researchers believe that sometime in the next 20 years, AI will become more intelligent than us and we need to think hard about what happens then.
My guess is it will probably happen sometime between 5 and 20 years from now. It might be longer. There's a very small chance it will be sooner and we don't know what's going to happen then. So if you look around, there are very few examples of more intelligent things being controlled by less intelligent things which makes you wonder whether when AI gets smarter than us it's going to take over control.
One thing I am pretty confident in is that a world of superior artificial intelligence will be a hell of a lot weirder than our one. Here's an example from page 54 about BrainLM which I'll be honest I hadn't even heard of. I did read the BrainLM paper after this though because this line caught my eye.
This model can be fine-tuned to predict clinical variables e.g. age and anxiety disorders better than other methods. In simple terms, it can read your brain activity and predict better than almost any other method whether you have, for example, mental health challenges. That is not actually something I knew existed and the paper is also very interesting.
Not only, by the way, is BrainLM inspired by natural language models, it leverages a transformer-based architecture. They mask future states, allowing the model to do self-supervised training and predict what comes next, if that rings a bell. With enough data and pre-training, BrainLM can predict future brain states, decode cognitive variables and, possibly most impactfully, although those other two are pretty impactful, do in-silico perturbation analysis.
In other words, imagine testing medications for depression in silico, on GPUs, rather than initially with patients. BrainLM has the ability to simulate brain responses in a biologically meaningful manner.
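If that masked-prediction recipe sounds abstract, here is a rough toy sketch of the idea in PyTorch: hide the last stretch of an fMRI-like recording and train a small transformer to reconstruct it. Every name, shape and number below is an illustrative assumption of mine, not BrainLM's actual code or configuration.

```python
# A minimal sketch of BrainLM-style masked pretraining on fMRI-like time series.
# All shapes, region counts and hyperparameters are illustrative assumptions,
# not the actual BrainLM implementation.
import torch
import torch.nn as nn

class ToyBrainLM(nn.Module):
    def __init__(self, n_regions=424, d_model=128, n_layers=4, n_heads=4, max_len=512):
        super().__init__()
        # Each timestep is a vector of activity across brain regions (parcels).
        self.embed = nn.Linear(n_regions, d_model)
        self.pos = nn.Parameter(torch.zeros(1, max_len, d_model))
        layer = nn.TransformerEncoderLayer(
            d_model, n_heads, dim_feedforward=4 * d_model, batch_first=True
        )
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.mask_token = nn.Parameter(torch.zeros(1, 1, d_model))
        self.head = nn.Linear(d_model, n_regions)  # reconstruct activity per region

    def forward(self, x, mask):
        # x: (batch, time, regions); mask: (batch, time) bool, True = hidden from the model
        h = self.embed(x) + self.pos[:, : x.size(1)]
        h = torch.where(mask.unsqueeze(-1), self.mask_token.expand_as(h), h)
        return self.head(self.encoder(h))

def pretrain_step(model, optimizer, x, horizon=20):
    # Hide the final `horizon` timesteps and train the model to reconstruct them,
    # i.e. predict future brain states from the visible past (masked self-supervision).
    mask = torch.zeros(x.shape[:2], dtype=torch.bool)
    mask[:, -horizon:] = True
    pred = model(x, mask)
    loss = ((pred - x) ** 2)[mask].mean()  # MSE only on the hidden timesteps
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

if __name__ == "__main__":
    model = ToyBrainLM()
    opt = torch.optim.AdamW(model.parameters(), lr=1e-4)
    fake_fmri = torch.randn(8, 200, 424)  # 8 recordings, 200 timesteps, 424 regions
    print(pretrain_step(model, opt, fake_fmri))
```

The only point of the sketch is the training signal: no labels are needed, because the recording itself provides the target, which is exactly the trick that lets language models pre-train on raw text.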
The next highlight is a fairly quick one, and you may have heard of AlphaFold 3 from Google DeepMind, which predicts the structure of proteins, DNA and more. But I didn't know that there was Chai-1 from Chai Discovery, backed by OpenAI, which is an open-source alternative, nor did I know that its performance in certain domains is comparable or superior to AlphaFold 3. AlphaFold 3, don't forget, has not been fully open-sourced. One other highlight I'm going to pick out, on page 96, I'm choosing not because it's particularly revealing, but because it exemplifies the kind of stacking accelerations that we're experiencing.
The chart below shows the generations of NVIDIA data center GPUs and you can see the number of months between releases is gradually on average declining. You may wonder about the next generation which is apparently the Rubin R100 which is going to come in late 2025. So yes, we have the release dates coming closer and closer together on average.
But also as you can see from the right line, the accelerating number of teraflops within each GPU. You can think of this as charting the thousands of trillions of floating point operations or calculations per second per GPU. That's already fairly accelerated, right? Until you learn that the number of GPUs that we're clustering, not just in one data center, but combining data centers is also massively increasing.
And of course, the amount of money that people are spending on these GPUs. And that's all before we consider the algorithmic efficiencies within models like the o1 series. This is why, and I've said this before on the channel, I think the next two years of progress are pretty much baked in.
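To make that compounding concrete, here is a back-of-the-envelope sketch. Every multiplier below is an illustrative assumption of mine, not a figure from the report; the point is just how quickly a few independent gains multiply together.

```python
# Back-of-the-envelope: how independent scale-ups compound over a couple of years.
# All multipliers are illustrative assumptions, not figures from the State of AI Report.
per_gpu_flops = 5    # faster chips between GPU generations
cluster_size  = 20   # more GPUs per training run, including linked data centers
algorithmic   = 10   # better methods and data (o1-style techniques, etc.)

effective = per_gpu_flops * cluster_size * algorithmic
print(f"~{effective:,}x more effective training compute")  # ~1,000x from modest assumptions
```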
I did a massive analysis video on my Patreon, but suffice to say the next 10,000X of scale is pretty much baked in. Which brings me to the next highlight and it's a fairly quick one, but China wants some of that scale. I just thought you guys might enjoy this quick anecdote given that H100s aren't allowed to be sold to China.
But a Malaysian broker had a way of sticking to those rules, kind of. NVIDIA coordinated the rental, installation and activation of servers based in a Malaysian town adjacent to the Singapore border. NVIDIA inspectors checked the servers there and left. Shortly afterward, the servers were whisked away to China via Hong Kong.
Depending on the modality, China, it seems to me, is between 3 and 12 months behind the frontier, but not much more than that. I'm not drawing any conclusions, it was just interesting for me to read this and hopefully for you too. Speaking of renting and costs, there is this chart going around about the 100X drop in price of frontier models from OpenAI.
Or, with Anthropic, a 60X drop from the cost of Claude 3 Opus to Claude 3 Haiku. But I've never believed those kinds of drops, because we're not comparing like for like. In SimpleBench performance, for example, Opus is just streets ahead of Haiku; it's not even close. But on page 110, the report had this comparison, which for me feels a bit more accurate to what has been achieved, and it is still dramatic.
Even comparing quote the same model, Gemini 1.5 Pro, from launch to second half of 2024, it's a 76% price cut. Likewise for Gemini 1.5 Flash, it's an 86% price cut. This remember is for roughly equivalent, if not superior performance. So if we try to keep the average bar of performance the same, that feels about right, an 80% cut in price for the same actual performance.
We wouldn't want to extrapolate too far, but if that rate of price cut carried on for 1, 2 or 5 years, it would be a wild world we would live in; I'll sketch the arithmetic in a moment. And yes, by the way, it would be a world largely dominated, I think, still by Transformers. As we've seen with BrainLM and o1, the full capabilities of Transformers are still being discovered, and they represent a three-quarters market share.
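Here is that quick arithmetic, treating the roughly 80% cut as an annual rate purely for illustration; this is a toy extrapolation, not a forecast.

```python
# Toy extrapolation: what an ~80% annual price cut compounds to if it simply continued.
annual_cut = 0.80  # roughly the Gemini 1.5 Pro / Flash cuts cited in the report
for years in (1, 2, 5):
    remaining = (1 - annual_cut) ** years
    print(f"After {years} year(s): {remaining:.4%} of today's price")
# 20% of today's price after 1 year, 4% after 2, and about 0.03% after 5
```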
The next highlight is a quick one on copyright, and we probably all know that OpenAI transcribed millions of hours of YouTube videos using its Whisper model. Did you know that RunwayML and NVIDIA also mass-scraped YouTube? But I thought some of you might be interested in the fact that people are now trying to create business models, like Calliope Networks, so that creators can sell their YouTube videos to be scraped.
Basically get paid for what's already happening for free. I know there are plenty of YouTube creators who wouldn't mind a tiny slice of that hundred billion in revenue that OpenAI are projecting. However, we probably should bear in mind Mark Zuckerberg's words, which are that creators and publishers overestimate the value of their work for training AI.
Obviously it's too long to go into here, but I think o1 slightly justifies this claim. It seems to me, and obviously this is a very early take, that as long as there is some data within the model that pertains to the kind of reasoning that must be done to answer a particular problem, the o1 method will find it.
It doesn't necessarily need abundant examples, just some. I will likely not be making that point, though; I'm just wondering whether anyone will offer me a check for scraping my old YouTube videos. By the way, it's not just videos: teams are working on getting authors paid for their work's use by AI companies.
Now I've just had a thought that if it were possible somehow to reverse engineer the data sources that were most responsible for model performance, I don't think any of these companies would have an incentive to find out those data sources. While it's all just this mass aggregated data scraped from the internet, there's little kind of clear responsibility to pay this person or that.
I'm basically saying that the research talent that is probably most able to do the kind of engineering necessary to ascertain the most useful unique sources of data for the models is the very same talent least incentivized to do that research. Anyway, speaking of kind of externalities like copyright payment, how about environmental effects?
Well, all of these companies, as I have covered before, made various pledges saying that they're going to be completely carbon neutral. I should probably reflect on my use of the word pledges there. They're more kind of aspirational brainstorming notes without making things too complicated. Power usage because of AI, among other things, is going to skyrocket.
Now bear with me though. That doesn't matter according to Eric Schmidt, formerly CEO of Google, because we weren't going to hit our targets anyway. My own opinion is that we're not going to hit the climate goals anyway because we're not organized to do it. And the way to do it is with the things that we're talking about now.
And yes, the needs in this area will be a problem, but I'd rather bet on AI solving the problem than constraining it and having the problem. Now, just in case that doesn't reassure you, here is Sam Altman and Ilya Sutskever back when they still worked together saying that AI will solve climate change.
I don't want to say this because climate change is so serious and so hard of a problem, but I think once we have a really powerful super intelligence, addressing climate change will not be particularly difficult for a system like that. We can even explain how. Here's how you solve climate change.
You need a very large amount of efficient carbon capture. You need the energy for the carbon capture, you need the technology to build it, and you need to build a lot of it. If you can accelerate the scientific progress, which is something that a powerful AI could do, we could get to a very advanced carbon capture much faster.
We could get to a very cheap power much faster. We could get to cheaper manufacturing much faster. Now combine those three. Cheap power, cheap manufacturing, advanced carbon capture. Now you build lots of them. And now you sucked out all this, all the excess CO2 from the atmosphere. You know, if you think about a system where you can say, tell me how to make a lot of clean energy cheaply.
With one addition that not only you ask it to tell it, you ask it to do it. Of course, I don't want to be too facetious; AI genuinely does lead to efficiencies which reduce energy usage. I just thought it might be worth flagging that, until a superintelligent AI solves climate change, AI data centers might raise your electricity bills and increase the risk of blackouts.
For many of you, this will be just a trivial externality. For others, pretty annoying. Drawing to an end now, just three more points. The first is a quick one, and it's basically this. In a nutshell, jailbreaking has not been solved. There was page after page after page proving that many, many jailbreaking techniques still work.
Stealth attacks, sleeper agents, instruction hierarchies being compromised within hours, and on and on and on. I just wanted to summarize for anyone who wasn't up to date with this, that jailbreaking has definitely not been solved. It brings to mind early last year when some researchers thought that jailbreaking was essentially a solved problem.
Dario Amodei, to his credit, said at the time, and this is 2023: "We are finding new jailbreaks every day. People jailbreak Claude. They jailbreak the other models. I'm deeply concerned that in two to three years, that would be one to two years from now, a jailbreak could be life or death." But as page 203 points out, are we focused on the wrong harms?
And this was such an interesting chart on the right of the actual misuses of Gen AI that I tried to find the original source. I will say, as a slight critique of this report, they do make finding the original sources a bit harder than I would have thought. But anyway, it was this June paper from Google DeepMind.
Generative AI Misuse: A Taxonomy of Tactics and Insights from Real-World Data. So in short, these are the actual misuses that are happening now with Gen AI. I'll put the definitions up on screen, but you can see that the most frequent one is impersonating people, like those robocalls in New Hampshire pretending to be Joe Biden, but obviously not being from him.
Or generating non-consensual intimate images. Obviously the chart when we get AGI and ASI, artificial superintelligence, might look very different, but for 2024, these are the current harms. Last, of course, I had to end on the report's predictions for next year. And I do have one slight critique of these predictions.
There are a lot of non-quantitative words thrown in. "Frontier labs will implement meaningful changes to data collection practices." Well, who defines meaningful? "Lawmakers will worry that they've overreached with the EU AI Act." This one is more firm: "An open-source alternative to OpenAI's o1 surpasses it across a range of reasoning benchmarks."
That one, I'm going to go with probably not. It's a close one; I think by the end of 2025 it might get really close. But o1 is going to be something quite special; even o1-preview already is. Yes, OpenAI have given a lot of hints about how they did it, but I will stick my neck out and say that an open-source alternative won't surpass o1.
Maybe Llama 4, mid to late next year, gets close, but I don't think it surpasses it. Anyway, back to subjective words. We've got "challenges fail to make any meaningful dent in NVIDIA's market position". Well, what does meaningful mean? "Levels of investment in humanoid robots will trail off." Does that mean go down, or just increase less quickly?
I certainly don't think levels of investment in humanoid robots will go down; I think they will keep going up. Prediction nine is obviously a very firm one, which is that a research paper generated by an AI scientist is accepted at a major ML conference or workshop. I'm going to say no to that one.
So that one was a clear, objective prediction, and I'm going to take the other side of it. I'll give you one quick reason why, and you can let me know what you think in the comments. But I think the authors of the paper will have to make it clear that an AI wrote it for ethical reasons, and then the conference will likely reject it even if it's a good paper because it was done by AI.
That's even assuming it's good enough for a proper major ML conference, which I don't think it would be, not in 2025. 2027? Very different story. And again, we have the language: "a video game based around interacting with gen-AI-based elements will achieve breakout status". Well, what does that mean?
Before I make my own cheeky prediction, here's how I ended last year's video. Now, if I'm criticizing them, it would be remiss of me not to end the video with my own prediction. I predict that a model will be released in the next year that will break state-of-the-art benchmarks across at least four different modalities simultaneously.
I would claim that was passed with GPT-4o in May, which broke certain benchmarks across text domains and in audio, vision understanding and other domains. But here's my prediction, which I think is at the very least quantitative: OpenAI's valuation will double again next year, absent an invasion of Taiwan by China.
When you combine the o1 method with a GPT-5- or Orion-scale model, amazing things will happen, and I think the world will hear about it. Thank you so much for watching. Would love to see you over on Patreon, but regardless, have a wonderful day.