
AI - 2024AD: 212-page Report (from this morning) Fully Read w/ Highlights


Chapters

0:00 Introduction
0:43 o1 changed the game
1:37 OpenAI costs revealed
2:39 MovieGen and Pika 1.5
3:39 Nobel Prize Winners + job listing
6:51 BrainLM
8:20 Open Chai-1
8:48 Stacked Accelerations
10:10 China Enters Stage Left
10:55 100x Drop in Price? Or 80%? + Transformers
12:16 Copyright Models + Zuck Warning
14:12 Interesting Climate in AI
16:48 Jailbreaking still an Issue
17:42 GenAI Actual Harms DeepMind
18:37 Predictions

Whisper Transcript

00:00:00.000 | This morning the 212-page State of AI Report 2024 was released and as I did last year,
00:00:07.360 | I want to bring you just the highlights. Yes, I read it all and it's been going 6 years and it
00:00:13.600 | gives a powerful framework around which I can also weave in some of the most interesting developments
00:00:19.600 | of the past few days. So now that I am done with the gratuitous special effects courtesy of a new
00:00:25.760 | Pika AI tool that I upgraded my account to bring you guys, here is the report. The report has been
00:00:31.680 | going 6 years and it's from Air Street Capital. These are of course my highlights, you will find
00:00:37.120 | plenty more if you read the full report yourself linked in the description. But this first highlight
00:00:43.360 | and this page I suspect initially read "OpenAI's reign of terror came to an end" and the authors,
00:00:49.520 | and they can correct me if they are watching, probably had to edit that with a comma until
00:00:54.720 | when o1 came out. Essentially this page is stating that models like Claude 3.5 Sonnet, Grok 2, Gemini
00:01:01.280 | 1.5 caught up with GPT-4. All models are converging, they thought, because of the heavy overlaps in
00:01:08.000 | pre-training data. Before we get to the research section though, I covered their 2023 predictions
00:01:14.400 | in a video from about a year ago. Their predictions are on the left and their rating of their own
00:01:19.360 | predictions is on the right. There was one prediction that they rated as having failed
00:01:24.160 | that I think is pretty harsh. They said the Gen AI scaling craze would see a group spending over
00:01:30.480 | a billion dollars to train a single large scale model and the rating was let's give that another
00:01:36.800 | year. But just yesterday we got this from the information which is that OpenAI's projected
00:01:41.680 | 2024 costs for training models are 3 billion dollars. Note that that does not include research
00:01:49.040 | compute amortization as in training miniature models to see what works. This is full-on training
00:01:54.640 | of frontier models. I would be pretty surprised if the full o1 model, if you count the base model
00:02:00.400 | generator and the fine-tuned final model, didn't cost at least 1 billion dollars. That report in
00:02:06.160 | the information by the way said that OpenAI likely won't turn a profit until 2029 according
00:02:12.800 | to internal sources. Oh and if you thought 1 billion or even 3 billion was quite a lot for
00:02:18.640 | training LLMs, training chatbots as you might think of them, well, how about almost 10 billion a year
00:02:24.240 | in 2026. Note that doesn't include the playing around research compute costs of more than 5
00:02:31.440 | billion dollars in that same year. Just imagine how many experiments or tinkering about 5 billion
00:02:37.440 | dollars can bring you. The next highlight for me was the section on multi-modality and Meta's
00:02:43.120 | movie gen. We can't actually play about with the model which is why I haven't done a whole separate
00:02:47.760 | video on it but I will play this 5 second extract because it is impressive how it can produce audio
00:02:53.600 | at the same time as video. You might have caught there the chainsaw sound effect when she was doing
00:03:03.840 | the garden and the car racing sounds as the car was racing on the carpet. To be honest compared
00:03:09.760 | to reading the movie gen paper I found it more fun to play about with tools like Pika AI which
00:03:16.080 | are available now and all you have to do is upload an image and then pick a Pika effect like melt or
00:03:22.720 | explode or squish and you get these amazing little clips. Look at Moo Deng here, picked up and squished.
00:03:29.600 | All my AI Insiders on Patreon, the logo, look at this, boom, there is the full screen explosion
00:03:36.000 | and I think that is pretty cool. Now of course I'm not going to ignore the fact that the Nobel
00:03:39.920 | prize in physics was won by a pair of neural network scientists, AI scientists and likewise
00:03:46.720 | Sir Demis Hassabis of Google DeepMind fame co-won the Nobel prize in chemistry for AlphaFold. One
00:03:53.760 | commentator said this represents AI eating science and it's hard to disagree with that. In fact now
00:03:59.600 | I think of it I wonder what the odds in prediction markets are for most Nobel prizes in the sciences
00:04:05.360 | in the 2030s being won by people involved in AI. Obviously some of you may argue that the AI itself
00:04:12.960 | should win the prize. Actually that reminds me on a complete tangent it's kind of related to Google
00:04:17.760 | DeepMind. I saw this job advert for Google DeepMind. Here it is, it's a new one, it's a research scientist
00:04:24.160 | job position, but you need to have, where is it, down here, you need to have a deep interest in
00:04:30.640 | machine cognition and consciousness. So maybe the prospect of AI winning the Nobel prize isn't so
00:04:38.720 | far-fetched. I will also note that almost all of these figures have issued stringent warnings about
00:04:45.040 | the future of AI. Sir Demis Hassabis comparing the threat of AI going wrong to nuclear war. John
00:04:52.240 | Hopfield has recently issued a warning about the world turning into something like 1984 because of
00:04:58.320 | AI controlling the narrative. And of course Geoffrey Hinton has issued a myriad warnings
00:05:03.440 | about the inherent superiority of artificial intelligence compared to human intelligence.
00:05:08.880 | I watched the interviews that he did today and yesterday and I did pick out these two or three
00:05:13.280 | highlights. First he is proud of his former student Ilya Sutskever firing Sam Altman. I'm
00:05:19.760 | particularly proud of the fact that one of my students fired Sam Altman and I think I better
00:05:24.240 | leave it there and leave it for questions. Can you please elaborate on your comment earlier on
00:05:29.040 | the call about Sam Altman? OpenAI was set up with a big emphasis on safety. Its primary objective was
00:05:36.720 | to develop artificial general intelligence and ensure that it was safe. One of my former students
00:05:42.480 | Ilya Sutskever was the chief scientist and over time it turned out that Sam Altman was much less
00:05:50.480 | concerned with safety than with profits and I think that's unfortunate. In a nutshell though,
00:05:58.560 | here was his warning. Most of the top researchers I know believe that AI will become more intelligent
00:06:06.880 | than people. They vary on the time scales. A lot of them believe that that will happen sometime in
00:06:12.400 | the next 20 years. Some of them believe it will happen sooner. Some of them believe it will take
00:06:17.200 | much longer, but quite a few good researchers believe that sometime in the next 20 years,
00:06:22.400 | AI will become more intelligent than us and we need to think hard about what happens then.
00:06:28.000 | My guess is it will probably happen sometime between 5 and 20 years from now. It might be
00:06:33.680 | longer. There's a very small chance it will be sooner and we don't know what's going to happen
00:06:38.640 | then. So if you look around, there are very few examples of more intelligent things being
00:06:44.400 | controlled by less intelligent things which makes you wonder whether when AI gets smarter than us
00:06:49.520 | it's going to take over control. One thing I am pretty confident in is that a world of superior
00:06:55.840 | artificial intelligence will be a hell of a lot weirder than ours. Here's an example from page
00:07:02.240 | 54 about BrainLM which I'll be honest I hadn't even heard of. I did read the BrainLM paper after
00:07:08.880 | this though because this line caught my eye. This model can be fine-tuned to predict clinical
00:07:15.280 | variables e.g. age and anxiety disorders better than other methods. In simple terms, it can read
00:07:22.720 | your brain activity and predict better than almost any other method whether you have, for example,
00:07:28.720 | mental health challenges. That is not actually something I knew existed and the paper is also
00:07:33.680 | very interesting. Not only by the way is BrainLM inspired by natural language models, it leverages
00:07:40.400 | a transformer-based architecture. They mask future states, allowing the model to do self-supervised
00:07:46.080 | training and predict what comes next if that rings a bell. With enough data and pre-training,
00:07:50.720 | BrainLM can predict future brain states and decode cognitive variables and possibly most
00:07:58.240 | impactfully, although those other two are pretty impactful, it can do in-silico perturbation analysis.
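The masked future-state objective mentioned above can be sketched in a few lines. This is a hedged illustration only: the array shapes, the carry-last-value-forward baseline, and the loss function are mine for demonstration, not BrainLM's actual architecture or data dimensions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for recorded brain activity: 8 regions x 50 timepoints.
# (Illustrative shapes only -- not BrainLM's real data dimensions.)
signal = (np.sin(np.linspace(0, 6 * np.pi, 50))[None, :]
          + 0.1 * rng.standard_normal((8, 50)))

def masked_future_loss(signal, mask_frac=0.2):
    """Hide the trailing fraction of timepoints and score a naive
    carry-last-value-forward predictor against them. A BrainLM-style
    transformer would instead be trained to minimise this kind of
    reconstruction loss, which is what makes the training
    self-supervised: the targets come from the data itself."""
    n_masked = int(signal.shape[1] * mask_frac)
    visible, hidden = signal[:, :-n_masked], signal[:, -n_masked:]
    prediction = np.repeat(visible[:, -1:], n_masked, axis=1)
    return float(np.mean((prediction - hidden) ** 2))

loss = masked_future_loss(signal)
print(f"masked reconstruction loss: {loss:.4f}")
```

The point of the sketch is only the shape of the objective: nothing is labelled by hand, the model is scored purely on how well it reconstructs the timepoints that were hidden from it.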
00:08:04.160 | In other words, imagine testing medications for depression in silico on GPUs rather than
00:08:10.320 | initially with patients. BrainLM has the ability to simulate brain responses in a biologically
00:08:16.880 | meaningful manner. The next highlight is a fairly quick one and you may have heard of AlphaFold3
00:08:22.400 | from Google DeepMind about predicting the structure of proteins, DNA, and more. But I didn't know that
00:08:29.040 | there was Chai-1 from Chai Discovery backed by OpenAI which is an open-source alternative,
00:08:36.400 | nor did I know that its performance in certain domains is comparable or superior to AlphaFold3.
00:08:43.600 | AlphaFold3, don't forget, has not been fully open-sourced. One other highlight I'm going to
00:08:48.240 | pick out, on page 96, and I choose it not because it's particularly revealing,
00:08:52.800 | but because it exemplifies the kind of stacking accelerations that we're experiencing.
00:08:58.080 | The chart below shows the generations of NVIDIA data center GPUs and you can see the number of
00:09:04.640 | months between releases is gradually on average declining. You may wonder about the next generation
00:09:10.960 | which is apparently the Rubin R100 which is going to come in late 2025. So yes, we have the release
00:09:17.360 | dates coming closer and closer together on average. But also as you can see from the right-hand line,
00:09:23.120 | the accelerating number of teraflops within each GPU. You can think of this as charting
00:09:28.960 | the thousands of trillions of floating point operations or calculations per second per GPU.
00:09:35.520 | That's already fairly accelerated, right? Until you learn that the number of GPUs that we're
00:09:40.960 | clustering, not just in one data center, but combining data centers is also massively increasing.
00:09:46.480 | And of course, the amount of money that people are spending on these GPUs.
00:09:50.560 | And that's all before we consider the algorithmic efficiencies within models like the o1 series.
00:09:56.240 | This is why, and I've said this before on the channel, I think the next two years of progress
00:10:00.560 | is pretty much baked in. I did a massive analysis video on my Patreon, but suffice to say the next
00:10:06.560 | 10,000X of scale is pretty much baked in. Which brings me to the next highlight and it's a fairly
00:10:12.400 | quick one, but China wants some of that scale. I just thought you guys might enjoy this quick
00:10:16.960 | anecdote given that H100s aren't allowed to be sold to China. But a Malaysian broker had a way
00:10:23.200 | of sticking to those rules, kind of. NVIDIA coordinated the rental, installation and
00:10:28.800 | activation of servers based in a Malaysian town adjacent to the Singapore border. NVIDIA
00:10:34.800 | inspectors checked the servers there and left. Shortly afterward, the servers were whisked
00:10:39.120 | away to China via Hong Kong. Depending on the modality, China, it seems to me,
00:10:44.000 | is between 3 and 12 months behind the frontier, but not much more than that.
00:10:49.280 | I'm not drawing any conclusions, it was just interesting for me to read this and hopefully
00:10:53.280 | for you too. Speaking of renting and costs, there is this chart going around about the
00:10:59.360 | 100X drop in price of frontier models from OpenAI. Or with Anthropic, a 60X drop from the cost of
00:11:06.160 | Claude 3 Opus to Claude 3 Haiku. But I've never believed those kinds of drops because we're not
00:11:11.040 | comparing like for like. In SimpleBench performance for example, Opus is just streets ahead of Haiku,
00:11:17.680 | it's not even close. But on page 110, the report had this comparison which for me feels a bit more
00:11:23.840 | accurate to what has been achieved and it is still dramatic. Even comparing quote the same model,
00:11:28.960 | Gemini 1.5 Pro, from launch to second half of 2024, it's a 76% price cut. Likewise for Gemini
00:11:37.040 | 1.5 Flash, it's an 86% price cut. This remember is for roughly equivalent, if not superior
00:11:43.600 | performance. So if we try to keep the average bar of performance the same, that feels about right,
00:11:49.840 | an 80% cut in price for the same actual performance. We wouldn't want to extrapolate too far,
00:11:55.680 | but if that rate of price cut carried on for 1, 2, 5 years, it would be a wild world we would live
00:12:02.880 | in. And yes, by the way, it would be a world largely dominated I think still by Transformers.
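To put a number on that extrapolation, assuming purely for illustration that the roughly 80% annual cut holds steady:

```python
# Compound an assumed steady 80% annual price cut over five years.
price = 100.0  # arbitrary starting price (e.g. $ per million tokens)
for year in range(5):
    price *= 0.2  # an 80% cut leaves 20% of the previous price
print(round(price, 3))  # 0.032 -- roughly 1/3000th of the original price
```

That is of course the "wild world" scenario: a single year of real pricing data extrapolated five years out, so treat it as a thought experiment rather than a forecast.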
00:12:08.320 | As we've seen with BrainLM and o1, the full capabilities of Transformers are still being
00:12:14.160 | discovered and they represent a three-quarters market share. Next highlight is a quick one on
00:12:19.600 | copyright, and we probably all know that OpenAI transcribed millions of hours of YouTube videos
00:12:25.200 | to power its Whisper model. Did you know that RunwayML and NVIDIA also mass scraped YouTube?
00:12:31.200 | But I thought some of you might be interested in the fact that people are trying to create
00:12:34.400 | business models now, like Calliope Networks, so that creators can sell their YouTube videos
00:12:39.840 | to be scraped. Basically get paid for what's already happening for free. I know there are
00:12:44.160 | plenty of YouTube creators who wouldn't mind a tiny slice of that hundred billion in revenue
00:12:49.360 | that OpenAI are projecting. However, we probably should bear in mind Mark Zuckerberg's words,
00:12:53.760 | which are that creators and publishers overestimate the value of their work for training AI.
00:12:58.800 | Obviously it's too long to go into here, but I think O1 slightly justifies this claim. It seems
00:13:04.960 | to me, and obviously this is a very early take, but as long as there is some data within the
00:13:10.000 | model that pertains to the kind of reasoning that must be done to answer a particular problem,
00:13:15.520 | the o1 method will find it. It doesn't necessarily need abundant examples, just some. I will likely
00:13:21.360 | not be making that point though. I'm just wondering if anyone will offer me a check for
00:13:25.120 | scraping my old YouTube videos. By the way, it's not just videos. Teams are working on
00:13:29.760 | getting authors paid for their work's use by AI companies. Now I've just had a thought that
00:13:35.520 | if it were possible somehow to reverse engineer the data sources that were most responsible for
00:13:43.200 | model performance, I don't think any of these companies would have an incentive to find out
00:13:48.240 | those data sources. While it's all just this mass aggregated data scraped from the internet,
00:13:53.360 | there's little kind of clear responsibility to pay this person or that. I'm basically saying
00:13:58.240 | that the research talent that is probably most able to do the kind of engineering necessary to
00:14:03.760 | ascertain the most useful unique sources of data for the models is the very same talent least
00:14:09.520 | incentivized to do that research. Anyway, speaking of kind of externalities like copyright payment,
00:14:14.800 | how about environmental effects? Well, all of these companies, as I have covered before,
00:14:19.200 | made various pledges saying that they're going to be completely carbon neutral. I should probably
00:14:23.280 | reflect on my use of the word pledges there. They're more kind of aspirational brainstorming
00:14:28.240 | notes. Without making things too complicated, power usage because of AI, among other things,
00:14:33.280 | is going to skyrocket. Now bear with me though. That doesn't matter according to Eric Schmidt,
00:14:37.840 | formerly CEO of Google, because we weren't going to hit our targets anyway. My own opinion is that
00:14:43.600 | we're not going to hit the climate goals anyway because we're not organized to do it. And the
00:14:50.080 | way to do it is with the things that we're talking about now. And yes, the needs in this area will be
00:14:57.840 | a problem, but I'd rather bet on AI solving the problem than constraining it and having the
00:15:03.120 | problem. Now, just in case that doesn't reassure you, here is Sam Altman and Ilya Sutskever back
00:15:08.960 | when they still worked together saying that AI will solve climate change. I don't want to say
00:15:14.400 | this because climate change is so serious and so hard of a problem, but I think once we have
00:15:19.920 | a really powerful super intelligence, addressing climate change will not be particularly difficult
00:15:26.000 | for a system like that. We can even explain how.
00:15:35.520 | Here's how you solve climate change. You need a very large amount of efficient carbon capture.
00:15:41.520 | You need the energy for the carbon capture, you need the technology to build it, and you need
00:15:44.720 | to build a lot of it. If you can accelerate the scientific progress, which is something that a
00:15:50.240 | powerful AI could do, we could get to a very advanced carbon capture much faster. We could
00:15:55.520 | get to a very cheap power much faster. We could get to cheaper manufacturing much faster. Now
00:16:01.440 | combine those three. Cheap power, cheap manufacturing, advanced carbon capture.
00:16:06.000 | Now you build lots of them. And now you sucked out all this, all the excess CO2 from the atmosphere.
00:16:11.840 | You know, if you think about a system where you can say,
00:16:13.840 | tell me how to make a lot of clean energy cheaply.
00:16:17.200 | With one addition that not only you ask it to tell it, you ask it to do it.
00:16:22.160 | Of course, I don't want to be too facetious. AI genuinely does lead to efficiencies which
00:16:28.960 | reduce energy usage. I just thought that it might be worth flagging that until a super intelligent
00:16:35.040 | AI solves climate change, AI data centers might raise your electricity bills and increase the
00:16:40.720 | risk of blackouts. For many of you, this will be just a trivial externality. For others,
00:16:46.320 | pretty annoying. Drawing to an end now, just three more points. The first is a quick one,
00:16:50.880 | and it's basically this. In a nutshell, jailbreaking has not been solved. There was
00:16:55.280 | page after page after page proving that many, many jailbreaking techniques still work.
00:17:01.040 | Stealth attacks, sleeper agents, instruction hierarchies being compromised within hours,
00:17:07.280 | and on and on and on. I just wanted to summarize for anyone who wasn't up to date with this,
00:17:12.480 | that jailbreaking has definitely not been solved. It brings to mind early last year
00:17:17.120 | when some researchers thought that jailbreaking was essentially a solved problem. Amodei,
00:17:22.400 | to his credit, said at the time, this is 2023, "We are finding new jailbreaks every day. People
00:17:28.080 | jailbreak Claude. They jailbreak the other models. I'm deeply concerned that in two to three years,
00:17:32.800 | that would be one to two years from now, a jailbreak could be life or death."
00:17:37.280 | But as page 203 points out, are we focused on the wrong harms? And this was such an interesting chart
00:17:44.640 | on the right of the actual misuses of Gen AI that I tried to find the original source. I will say,
00:17:51.120 | as a slight critique of this report, they do make finding the original sources a bit harder than I
00:17:56.400 | would have thought. But anyway, it was this June paper from Google DeepMind. Generative AI misuse,
00:18:01.760 | a taxonomy of tactics and insights from real world data. So in short, these are the actual misuses
00:18:08.400 | that are happening now with Gen AI. I'll put the definitions up on screen, but you can see
00:18:14.080 | that the most frequent one is impersonating people. Like those robocalls in New Hampshire
00:18:19.360 | pretending to be Joe Biden, but obviously not actually from him. Or generating non-consensual
00:18:24.320 | intimate images. Obviously the chart when we get AGI and ASI, artificial superintelligence,
00:18:30.400 | might look very different, but for 2024, these are the current harms.
00:18:35.120 | Last, of course, I had to end on the report's predictions for next year. And I do have one
00:18:40.720 | slight critique of these predictions. There are a lot of unquantified words thrown in. Frontier
00:18:47.360 | labs will implement meaningful changes to data collection practices. Well, who defines meaningful?
00:18:53.840 | Lawmakers will worry that they've overreached with the EU AI Act. This one is more firm.
00:19:00.160 | An open source alternative to OpenAI's o1 surpasses it across a range of reasoning benchmarks.
00:19:05.360 | That one, I'm going to go with probably not. That's a close one. I think end of 2025,
00:19:10.960 | it might get really close. But o1 is going to be something quite special. Even o1-preview is
00:19:16.320 | already. Yes, OpenAI have given a lot of hints about how they did it. But I will stick my neck
00:19:21.760 | out and say that an open source alternative won't surpass o1. Maybe Llama 4 mid to late next year
00:19:28.560 | gets close, but I don't think surpasses it. Anyway, back to subjective words. We've got
00:19:32.880 | challengers failing to make any meaningful dent in NVIDIA's market position. Well, what does that
00:19:37.840 | mean? Levels of investment in humanoid robots will trail off. Does that mean go down or just
00:19:42.960 | increase less quickly? I don't think levels of investment in humanoid robots will go down for
00:19:48.320 | sure. I think it will keep going up. Number nine is obviously a very firm prediction, which is that a
00:19:53.120 | research paper generated by an AI scientist is accepted at a major ML conference or workshop.
00:19:58.560 | I'm going to say no to that one. So that one was a clear, objective prediction, and I'm going to
00:20:03.680 | take the other side of it. I'll give you one quick reason why, and you can let me know what you think
00:20:07.760 | in the comments. But I think the authors of the paper will have to make it clear that an AI wrote
00:20:12.800 | it for ethical reasons, and then the conference will likely reject it even if it's a good paper
00:20:18.000 | because it was done by AI. That's even assuming it's good enough for a proper major ML conference,
00:20:24.000 | which I don't think it would be, not in 2025. 2027, very different story. And again, we have
00:20:29.680 | the language, a video game based around interacting with gen AI based elements will achieve breakout
00:20:34.400 | status. Well, what does that mean? Before I make my own cheeky prediction, here's how I ended
00:20:39.120 | last year's video. Now, if I'm criticizing them, it would be remiss of me not to end the video with
00:20:44.080 | my own prediction. I predict that a model will be released in the next year that will break
00:20:49.200 | state-of-the-art benchmarks across at least four different modalities simultaneously. I would claim
00:20:54.880 | that was passed with GPT-4o in May, which broke certain benchmarks across text domains and in
00:21:01.440 | audio, vision understanding, and other domains. But here's my prediction, which I think at the
00:21:06.480 | very least is quantitative: that OpenAI's valuation will double again next year, absent
00:21:13.600 | an invasion of Taiwan by China. When you combine the o1 method with a GPT-5 or Orion-scale model,
00:21:20.240 | amazing things will happen, and I think the world will hear about it. Thank you so much
00:21:25.360 | for watching. Would love to see you over on Patreon, but regardless, have a wonderful day.