Large Scale AI on Apple Silicon — Alex Cheema, EXO Labs

I'm sure you're wondering what this has to do with AI, 00:00:32.940 |
said that there would be an infinite amount of energy 00:00:53.740 |
that all the energy is quantized in some way, 00:01:02.020 |
which is like this constant that comes out of nowhere. 00:01:04.100 |
And, you know, they didn't really know at the time 00:01:09.960 |
So another chap in 1909 called Millikan came up with an experiment 00:01:20.820 |
So he did it indirectly by measuring the charge of an electron. 00:01:24.140 |
And the way he did it is he sprayed these tiny charged oil droplets 00:01:29.900 |
in between some charged plates and then looked at how fast the charge moves. 00:01:35.760 |
The details aren't so important, but he got a result, and, you know, it was like big news in the scientific community. 00:01:44.220 |
And everyone was like, oh, man, now we know what this charge is, we know what this h is. 00:01:50.280 |
And all was good, and then, you know, sort of, he had this result and everyone was using it for their calculations and stuff. 00:01:58.960 |
And, you know, many years went by and all the experiments seemed to agree with his reading. 00:02:04.660 |
And, you know, it took until just after that last data point, like 1929, for people to sort of realize that, 00:02:26.140 |
okay, well, you know, this isn't the actual value. 00:02:29.380 |
And I think what's interesting is when you look at the history, 00:02:33.220 |
you've got this, like, period of 15 years or so where everyone thought that this was sort of the right value. 00:02:42.440 |
And, you know, you look back and you think, okay, why did they think this? 00:02:48.780 |
And it turns out that, you know, people were doing the right experiments. 00:02:56.200 |
And there are a lot of data points that would be in between here as well that have just, you know, been lost in the history books. 00:03:03.200 |
But, you know, part of that is because of embarrassment. 00:03:07.220 |
This is, like, a very embarrassing thing for the scientific community. 00:03:12.420 |
Well, basically, these scientists, they were running their experiments. 00:03:16.640 |
They got a result and then they saw, oh, wait, this great Millikan guy that had this result in 1909, it doesn't agree with him. 00:03:27.640 |
And then they sort of, like, fudged, you know, the experiments to, you know, make it work such that they get the same value as him. 00:03:38.860 |
And this is, like, this is not a trivial thing. 00:03:40.660 |
This is quite an important thing to science in general. 00:03:42.700 |
So this kind of gets me up to the point of, like, you know, sort of scientific rigor. 00:03:49.180 |
And, you know, it's actually very tricky to do science properly. 00:03:54.640 |
And, you know, the fact that these subsequent experiments basically all agreed on the wrong thing 00:04:04.700 |
says something about, you know, sort of the way in which scientific progress happens. 00:04:11.120 |
So, you know, there's a lot of inertia behind the current way of doing things. 00:04:15.160 |
And, you know, there's another example, which is sort of about questioning assumptions. 00:04:20.320 |
And, you know, there's another guy who was looking at something completely different. 00:04:28.800 |
And he basically had a hypothesis he wanted to test, 00:04:32.380 |
like, some weird, esoteric thing. 00:04:35.040 |
He wanted to test that, like, you know, he could basically get rats to navigate a maze in a specific way. 00:04:43.760 |
He wanted them to go through this corridor of doors, right? 00:04:49.380 |
And he wanted the rats to come out of a door that was three doors along from the one that they went in. 00:04:58.620 |
So he wanted to show that, like, they could actually think and, you know, be able to consistently go, like, three doors along. 00:05:05.320 |
And basically, he tried, like, a bunch of stuff. 00:05:10.640 |
He put, like, a piece of food at the door that was three doors along to make them go through that one. 00:05:21.820 |
But what ended up happening is they just always went back to the previous door. 00:05:25.080 |
So if it was, like, you know, door one that they went through and he wanted them to come out of door four, and then he tested it again where they go through door two and they should go out of door five, but they'd still just go back to that same door as before. 00:05:41.660 |
So, like, he was, like, okay, why is this happening? 00:05:43.660 |
Like, how did the rats know, like, to go back to that same door? 00:05:47.780 |
So, like, he very meticulously went through and made sure that there was no pattern or anything that they could distinguish on the doors. 00:06:03.460 |
So maybe there was, like, a smell that came from the food. 00:06:06.900 |
He tried basically putting chemicals in so that they couldn't smell the food. 00:06:11.980 |
And then he thought, okay, maybe it's something to do with the lighting. 00:06:15.880 |
You know, like, a human could do this, like, through common sense, just sort of see, like, okay, the lighting is in such and such a way and see the pattern. 00:06:22.820 |
And so he covered up the corridors and stuff and made sure that that couldn't be a thing. 00:06:26.560 |
And still, you know, the same thing happened. 00:06:29.640 |
And eventually what he found out was that the reason they could consistently go to that same door was because they remembered the sounds. 00:06:38.480 |
So as they walked along, they remembered the pattern of the sounds in this corridor. 00:06:44.620 |
So what he did was he put some sand in there so that they couldn't distinguish the sounds, basically. 00:06:52.120 |
Now, from a scientific perspective, this is, like, S-tier science in terms of, you know, really clearly looking at, like, what are all the assumptions I'm making and just, like, systematically eliminating them. 00:07:08.900 |
But, you know, the problem was that the scientific community didn't agree. 00:07:12.540 |
So the people that were conducting experiments at the time, they made a lot of these assumptions. 00:07:22.540 |
So, you know, none of it was cited; this was basically just forgotten. 00:07:27.780 |
And so I think, you know, there's sort of this tendency to stick to the current way that things are done. 00:07:38.000 |
And even when the methodology is spot on, if there's already a certain way of doing things, then that kind of, you know, has a lot of inertia behind it. 00:07:47.080 |
And, you know, Feynman talked about this in one of his talks called Cargo Cult Science, in, like, the 70s. 00:07:54.360 |
And he had this, like, you know, quote, the first principle is that you must not fool yourself. 00:08:03.620 |
And I think, you know, there's this tendency to sort of oversimplify the science. 00:08:14.920 |
And if you're interested, Gwern wrote, like, a blog post all about this, and there's some controversy about, like, who this guy was and stuff. 00:08:22.060 |
But getting to AI, so, like, you know, a very similar thing happened in AI, sort of about questioning assumptions, right? 00:08:33.480 |
And, you know, just having a good idea out there is not enough. 00:08:36.720 |
So, you know, in 1963, backpropagation was introduced in this paper. 00:08:43.540 |
And then it was, you know, reinvented in 1976 in this paper and reinvented again in 1988. 00:08:50.320 |
And then, you know, sort of deep convolutional neural networks were introduced here. 00:08:56.760 |
And then it was only in 1989 that these two things were combined. 00:08:59.480 |
So, you had, like, deep CNNs and backpropagation. 00:09:04.420 |
And then it was only, like, three decades later that, you know, CNNs were widely accepted. 00:09:14.240 |
There was still, like, this massive skepticism, even though the ideas were out there. 00:09:20.060 |
I think, like, a big part of it is sort of, again, being stuck in the current way of doing things. 00:09:27.140 |
So, if you look at, obviously, CPUs, they have this von Neumann bottleneck. 00:09:31.680 |
You know, they have really good single-core performance. 00:09:35.180 |
But, you know, if you're sort of having to read memory often, you end up bottlenecked waiting on memory rather than on compute. 00:09:41.980 |
And, you know, you can sort of look at it at a systems level: 00:09:47.940 |
hardware acceleration sort of changed this, like, ratio of how many 00:09:52.840 |
bytes you have to load to how many flops you can execute. 00:09:55.680 |
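To make that bytes-to-flops ratio concrete, here is a rough back-of-envelope sketch (mine, not from the talk): it estimates the arithmetic intensity of a square matrix multiply, i.e. flops per byte moved, and compares it against a machine's balance point, peak flops divided by memory bandwidth. The 10 TFLOP/s and 400 GB/s figures are made-up placeholders rather than numbers for any particular chip, and the model assumes each matrix is moved to or from memory exactly once.

```python
# Back-of-envelope "bytes loaded vs. flops executed" sketch (illustrative only).
# A kernel is roughly memory-bound when its arithmetic intensity (flops per byte)
# is below the machine's balance point (peak flops divided by memory bandwidth).
# Hardware numbers below are placeholders, not measurements of any real chip.

def matmul_intensity(n: int, bytes_per_elem: int = 4) -> float:
    """Flops per byte for an n x n x n matrix multiply, assuming each
    matrix is read or written from memory exactly once (ideal caching)."""
    flops = 2 * n ** 3                         # one multiply + one add per term
    bytes_moved = 3 * n ** 2 * bytes_per_elem  # read A, read B, write C
    return flops / bytes_moved


def balance_point(peak_flops: float, mem_bw_bytes: float) -> float:
    """Flops the machine can issue per byte of memory traffic."""
    return peak_flops / mem_bw_bytes


if __name__ == "__main__":
    # Hypothetical machine: 10 TFLOP/s of compute, 400 GB/s of bandwidth.
    machine = balance_point(peak_flops=10e12, mem_bw_bytes=400e9)

    for n in (64, 512, 4096):
        ai = matmul_intensity(n)
        bound = "compute-bound" if ai > machine else "memory-bound"
        print(f"n={n:5d}  intensity={ai:8.1f} flop/byte  "
              f"machine balance={machine:.1f}  -> {bound}")
```

The point is just that a big dense matrix multiply does a lot of work per byte it loads, while branchy, pointer-chasing CPU-style code does not, which is the kind of ratio being talked about here.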
And, you know, it's kind of striking, you know, like, 00:10:02.840 |
this was, like, a groundbreaking paper and a very famous paper, 00:10:05.940 |
like, where they trained a network on 1,000 machines, 00:10:15.700 |
and then there was this other paper that got the exact same results, 00:10:18.500 |
but it took, like, you know, three machines and a couple of days. 00:10:22.580 |
And this was using, you know, hardware acceleration. 00:10:26.980 |
So, you know, this gets me to the hardware lottery, 00:10:33.260 |
which is essentially this idea introduced by Sara Hooker in 2020 00:10:39.800 |
that the best research ideas don't necessarily win. 00:10:42.460 |
There are a lot of factors that sort of make it, 00:10:48.240 |
but, you know, it doesn't necessarily get adopted and accepted. 00:11:06.020 |
like, the models are kind of creating inertia as well, 00:11:12.400 |
So if they're good at generating Python code, 00:11:22.800 |
if you wanted to come out with a new programming language, 00:13:46.580 |
Like, the Mac Studio has a lot more memory bandwidth,