
GPT-3 vs Human Brain



00:00:00.000 | The human brain has at least 100 trillion synapses,
00:00:03.440 | and it could be as high as 1,000 trillion.
00:00:05.880 | And a synapse is a channel connecting two neurons
00:00:08.500 | through which an electrical or chemical signal is transferred
00:00:12.000 | and is the loose inspiration for the synapses, the weights
00:00:15.560 | or parameters, of an artificial neural network.
00:00:18.640 | GPT-3, the recently released language model from OpenAI
00:00:23.280 | that has been captivating people's imagination
00:00:25.640 | with zero shot or few shot learning,
00:00:28.400 | has 175 billion synapses or parameters.
00:00:33.320 | As mentioned in the OpenAI paper,
00:00:35.120 | the amount of compute that was used to train
00:00:37.240 | the final version of this network
00:00:38.680 | was 3.14 times 10 to the 23rd flops.
00:00:43.520 | And if we use reasonable cost estimates
00:00:45.440 | based on Lambda's Tesla V100 cloud instance pricing,
00:00:48.640 | the cost of training this neural network is $4.6 million.
00:00:52.320 | Now, the natural question I had is,
00:00:55.420 | if the model with 175 billion parameters does very well,
00:00:59.420 | how well will a model do that has the same number
00:01:02.620 | of parameters as our human brain?
00:01:05.300 | Setting aside the fact that both the uncertainty in our estimate
00:01:07.900 | of the number of synapses and the intricate structure
00:01:10.580 | of the brain might mean a much, much larger
00:01:13.260 | neural network is required to approximate the brain,
00:01:15.500 | it's very possible that even just this 100 trillion
00:01:18.420 | synapse number will allow us to see
00:01:20.820 | some magical performance from these systems.
00:01:23.700 | And one way of asking the question of how far away we are
00:01:27.500 | is how much does it approximately cost
00:01:29.340 | to train a model with 100 trillion parameters?
00:01:32.880 | So GPT-3 is 175 billion parameters
00:01:36.540 | and cost $4.6 million to train in 2020.
00:01:39.440 | Let's call a model with 100 trillion parameters GPT-4HB.
00:01:45.660 | Assuming linear scaling of compute requirements
00:01:48.780 | with respect to number of parameters,
00:01:51.580 | the cost in 2020 for training this neural network
00:01:54.800 | is $2.6 billion.
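
As a quick check on the arithmetic behind that figure, here is a minimal sketch in plain Python, using only the numbers quoted above; the linear cost scaling is the video's simplifying assumption, not an exact cost model:

```python
# Back-of-the-envelope check of the linear cost scaling described above.
gpt3_params = 175e9        # GPT-3 parameter count
gpt3_cost_usd = 4.6e6      # Lambda's 2020 estimate for training GPT-3
brain_params = 100e12      # lower-bound synapse count used in the video

scale = brain_params / gpt3_params      # roughly 571x more parameters
cost_2020 = gpt3_cost_usd * scale       # roughly $2.6 billion
print(f"{scale:.0f}x the parameters, about ${cost_2020 / 1e9:.1f} billion in 2020")
```
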
00:01:56.980 | Now, another interesting OpenAI paper
00:01:58.900 | that I've talked about in the past,
00:02:00.420 | titled "Measuring the Algorithmic Efficiency
00:02:02.740 | of Neural Networks," indicates that for the past seven years
00:02:06.740 | the neural network training efficiency
00:02:09.480 | has been doubling every 16 months.
00:02:12.180 | So if this trend continues, then in 2024,
00:02:16.060 | the cost of training this GPT-4HB network
00:02:20.740 | would be $325 million, decreasing to $40 million in 2028,
00:02:25.740 | and in 2032, coming down to approximately the same price
00:02:29.860 | as the GPT-3 network today at $5 million.
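
To see where the $325 million, $40 million, and roughly $5 million figures come from, here is a small sketch that applies the 16-month efficiency-doubling trend (equivalently, a cost halving every 16 months) to the $2.6 billion 2020 estimate:

```python
# Projects the 2020 training-cost estimate forward, assuming algorithmic
# efficiency keeps doubling (so cost keeps halving) every 16 months.
cost_2020_usd = 2.6e9      # linear-scaling estimate from above
doubling_months = 16

for year in (2024, 2028, 2032):
    halvings = (year - 2020) * 12 / doubling_months
    cost = cost_2020_usd / 2 ** halvings
    print(f"{year}: about ${cost / 1e6:.0f} million")
```

Run as written, this reproduces the video's numbers to within rounding: about $325 million in 2024, $41 million in 2028, and $5 million in 2032.
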
00:02:34.140 | Now, it's important to note, as the paper indicates,
00:02:36.300 | that as the size of the network and the compute increase,
00:02:39.460 | the improvement in the performance of the network
00:02:41.620 | follows a power law.
00:02:43.380 | Still, given some of the impressive
00:02:45.620 | Turing test passing performances of GPT-3,
00:02:49.700 | it's fascinating to think what a language model
00:02:53.100 | with 100 trillion parameters might be able to accomplish.
00:02:57.340 | I might make a few short videos like this,
00:03:00.140 | each focusing on a single, simple idea about the basics of GPT-3,
00:03:04.540 | including technical and even philosophical implications,
00:03:08.500 | along with highlighting how others are using it.
00:03:12.060 | So if you enjoy this kind of thing, subscribe,
00:03:14.620 | and remember, try to learn something new every day.
00:03:17.180 | (upbeat music)