
Occam's Razor (Marcus Hutter) | AI Podcast Clips


Chapters

0:00 Occam's Razor
0:48 The most important principle in science
2:20 Why is Einstein so beautiful
3:43 Induction
4:29 Theory
6:31 Weighting
8:02 Compression
9:38 Kolmogorov Complexity
11:18 The Whole Universe
11:58 Noise and Chaos
13:38 Library of All Books
15:03 Game of Life
17:48 Finding Simple Programs

Whisper Transcript

00:00:00.000 | - What is Occam's Razor?
00:00:03.200 | - So Occam's Razor says that you should not
00:00:06.080 | multiply entities beyond necessity,
00:00:09.680 | which sort of if you translate it to proper English,
00:00:12.680 | means, and you know, in a scientific context,
00:00:15.280 | means that if you have two theories or hypotheses
00:00:18.000 | or models which equally well describe the phenomenon
00:00:22.000 | you are studying or the data,
00:00:23.760 | you should choose the more simple one.
00:00:26.160 | - So that's just the principle?
00:00:27.680 | - Yes.
00:00:28.520 | - So that's not like a provable law perhaps?
00:00:32.280 | Perhaps we'll kind of discuss it and think about it,
00:00:35.720 | but what's the intuition of why the simpler answer
00:00:40.320 | is the one that is likelier to be a more correct descriptor
00:00:45.320 | of whatever we're talking about?
00:00:47.320 | - I believe that Occam's Razor
00:00:48.800 | is probably the most important principle in science.
00:00:52.480 | I mean, of course we need logical deduction
00:00:54.280 | and we do experimental design,
00:00:56.760 | but science is about finding, understanding the world,
00:01:01.760 | finding models of the world,
00:01:03.680 | and we can come up with crazy complex models
00:01:05.920 | which explain everything but predict nothing,
00:01:08.200 | but the simple models seem to have predictive power
00:01:12.440 | and it's a valid question why.
00:01:15.360 | And there are two answers to that.
00:01:18.200 | You can just accept it, that is the principle of science,
00:01:21.440 | and we use this principle and it seems to be successful.
00:01:25.040 | We don't know why, but it just happens to be.
00:01:28.120 | Or you can try, you know, find another principle
00:01:30.760 | which explains Occam's Razor.
00:01:33.320 | And if we start with the assumption
00:01:36.320 | that the world is governed by simple rules,
00:01:39.800 | then there's a bias towards simplicity
00:01:43.600 | and applying Occam's Razor
00:01:46.120 | is the mechanism to finding these rules.
00:01:49.320 | And actually in a more quantitative sense,
00:01:51.280 | and we come back to that later in the case of Solomonoff induction
00:01:53.920 | you can rigorously prove that.
00:01:55.240 | You assume that the world is simple,
00:01:57.840 | then Occam's Razor is the best you can do
00:01:59.920 | in a certain sense.
00:02:01.240 | - So I apologize for the romanticized question,
00:02:03.840 | but why do you think outside of its effectiveness,
00:02:08.440 | why do you think we find simplicity
00:02:10.600 | so appealing as human beings?
00:02:12.120 | Why does it just, why does E equals MC squared
00:02:15.720 | seem so beautiful to us humans?
00:02:19.760 | - I guess mostly in general,
00:02:22.840 | many things can be explained by an evolutionary argument.
00:02:27.200 | And, you know, there's some artifacts in humans
00:02:29.560 | which are just artifacts and not evolutionary necessary.
00:02:33.480 | But with this beauty and simplicity,
00:02:36.400 | it's, I believe, at least the core
00:02:41.080 | is about, like science,
00:02:45.000 | finding regularities in the world,
00:02:46.760 | understanding the world,
00:02:48.240 | which is necessary for survival, right?
00:02:50.280 | If I look at a bush, right,
00:02:53.000 | and I just see noise,
00:02:54.760 | and there is a tiger, right,
00:02:56.000 | and eats me, then I'm dead.
00:02:57.240 | But if I try to find a pattern,
00:02:59.080 | and we know that humans are prone to find more patterns
00:03:04.080 | in data than there are,
00:03:06.640 | like the Mars face and all these things,
00:03:10.120 | but this bias towards finding patterns,
00:03:12.160 | even if they're not really there,
00:03:13.360 | but I mean, it's best, of course, if they are,
00:03:16.080 | helps us for survival.
00:03:17.480 | - Yeah, that's fascinating.
00:03:19.680 | I haven't thought really about the,
00:03:22.080 | I thought I just loved science,
00:03:23.680 | but indeed, in terms of just survival purposes,
00:03:28.480 | there is an evolutionary argument
00:03:30.880 | for why we find the work of Einstein so beautiful.
00:03:35.440 | Maybe a quick, small tangent.
00:03:39.040 | Could you describe what Solomonoff induction is?
00:03:42.880 | - Yeah, so that's a theory which I claim,
00:03:47.240 | and Ray Solomonoff sort of claimed a long time ago
00:03:49.880 | that this solves the big philosophical problem of induction.
00:03:54.480 | And I believe the claim is essentially true.
00:03:57.240 | And what it does is the following.
00:03:59.320 | So, okay, for the picky listener,
00:04:04.240 | induction can be interpreted narrowly and widely.
00:04:08.080 | Narrow means inferring models from data.
00:04:11.200 | And widely means also then using these models
00:04:15.280 | for doing predictions,
00:04:16.240 | so prediction is also part of the induction.
00:04:18.760 | So I'm a little sloppy sort of with the terminology,
00:04:21.640 | and maybe that comes from Ray Solomonoff being sloppy.
00:04:25.840 | Maybe I shouldn't say that.
00:04:27.240 | (both laughing)
00:04:28.280 | He can't complain anymore.
00:04:30.360 | So let me explain a little bit this theory in simple terms.
00:04:34.360 | So assume you have a data sequence,
00:04:36.520 | make it very simple, the simplest one,
00:04:37.960 | say 1, 1, 1, 1, 1, and you see 100 1s.
00:04:41.080 | What do you think comes next?
00:04:43.240 | The natural answer, I'm gonna speed up a little bit.
00:04:44.920 | The natural answer is, of course, 1.
00:04:46.960 | And the question is why?
00:04:50.400 | Well, we see a pattern there.
00:04:52.880 | Okay, there's a 1, and we repeat it.
00:04:55.040 | And why should it suddenly after 100 1s be different?
00:04:57.760 | So what we're looking for is simple explanations or models
00:05:01.320 | for the data we have.
00:05:03.000 | And now the question is,
00:05:03.960 | a model has to be presented in a certain language.
00:05:07.720 | Which language do we use?
00:05:09.800 | In science, we want formal languages,
00:05:11.680 | and we can use mathematics,
00:05:13.080 | or we can use programs on a computer.
00:05:16.120 | So abstractly on a Turing machine, for instance,
00:05:18.680 | or it can be a general purpose computer.
00:05:20.720 | So, and there are, of course, lots of models.
00:05:23.640 | You can say maybe it's 100 1s,
00:05:25.120 | and then 100 0s, and 100 1s, that's a model, right?
00:05:27.520 | But there are simpler models.
00:05:29.240 | There's a model: 'print 1 in a loop'.
00:05:31.480 | Now that also explains the data.
00:05:33.440 | And if you push that to the extreme,
00:05:36.840 | you are looking for the shortest program,
00:05:39.120 | which if you run this program,
00:05:40.800 | reproduces the data you have.
00:05:43.160 | It will not stop, it will continue naturally.
00:05:45.960 | And this you take for your prediction.
00:05:48.360 | And on the sequence of ones, it's very plausible, right?
00:05:50.680 | That 'print 1 in a loop' is the shortest program.
00:05:53.080 | We can give some more complex examples,
00:05:55.240 | like one, two, three, four, five.
00:05:57.480 | What comes next?
00:05:58.320 | The shortest program is again, you know, a counter.
00:06:00.440 | And so that is roughly speaking
00:06:03.640 | how Solomonoff induction works.
00:06:05.400 | The extra twist is that it can also deal with noisy data.
00:06:10.000 | So if you have, for instance, a coin flip,
00:06:12.200 | say a biased coin, which comes up head
00:06:13.960 | with 60% probability,
00:06:15.600 | then it will predict, it will learn and figure this out.
00:06:20.040 | And after a while it predicts,
00:06:20.960 | oh, the next coin flip will be head with probability 60%.
00:06:24.880 | So it's the stochastic version of that.
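
Solomonoff induction itself is incomputable, so as a toy illustration of the stochastic case described above, here is a minimal sketch in Python using Laplace's rule of succession in place of the full universal mixture (the 60% bias is taken from the example; everything else is assumed for illustration):

```python
# Toy stand-in for learning a biased coin, NOT Solomonoff induction itself
# (which is incomputable). Laplace's rule of succession: after seeing h heads
# in n flips, predict the next flip is heads with probability (h+1)/(n+2).
import random

random.seed(0)
TRUE_P = 0.6                       # hypothetical bias of the coin
heads = 0
n_flips = 1000
for _ in range(n_flips):
    heads += random.random() < TRUE_P

pred = (heads + 1) / (n_flips + 2)
print(f"P(next flip = heads) ~ {pred:.3f}")   # converges towards 0.6
```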
00:06:27.080 | - But the goal is, the dream is always
00:06:29.320 | the search for the short program.
00:06:30.960 | - Yes, yeah.
00:06:31.800 | Well, in Solomonoff induction, precisely what you do is,
00:06:34.440 | so you combine, so looking for the shortest program
00:06:38.200 | is like applying Opac's Razor,
00:06:39.880 | like looking for the simplest theory.
00:06:41.880 | There's also Epicurus principle, which says,
00:06:44.600 | if you have multiple hypotheses,
00:06:46.080 | which equally well describe your data,
00:06:47.840 | don't discard any of them,
00:06:49.000 | keep all of them around, you never know.
00:06:51.360 | And you can put it together and say,
00:06:53.240 | okay, I have a bias towards simplicity,
00:06:55.520 | but I don't rule out the larger models.
00:06:57.600 | And technically what we do is,
00:06:59.560 | we weigh the shorter models higher
00:07:03.080 | and the longer models lower.
00:07:05.280 | And you use a Bayesian technique,
00:07:06.840 | so you have a prior,
00:07:08.480 | and which is precisely two to the minus
00:07:12.800 | the complexity of the program.
00:07:14.920 | And you weigh all these hypotheses and take this mixture,
00:07:17.640 | and then you get also the stochasticity in.
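
In symbols, the mixture described above is Solomonoff's universal prior. In one standard formulation, each program $p$ for a universal monotone Turing machine $U$ gets weight $2^{-\ell(p)}$, where $\ell(p)$ is its length in bits, and a sequence $x$ receives probability

$$ M(x) \;=\; \sum_{p \,:\, U(p) = x*} 2^{-\ell(p)}, $$

where $U(p) = x*$ means the output of $p$ starts with $x$. Prediction is the conditional $M(x_{t+1} \mid x_{1:t}) = M(x_{1:t} x_{t+1}) / M(x_{1:t})$, which is where the stochasticity comes in.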
00:07:20.080 | - Yeah, like many of your ideas,
00:07:21.400 | that's just a beautiful idea of weighing
00:07:23.520 | based on the simplicity of the program.
00:07:25.040 | I love that.
00:07:26.040 | That seems to me maybe a very human-centric concept,
00:07:30.000 | seems to be a very appealing way
00:07:32.200 | of discovering good programs in this world.
00:07:37.320 | You've used the term compression quite a bit.
00:07:40.480 | I think it's a beautiful idea.
00:07:42.960 | Sort of, we just talked about simplicity
00:07:45.320 | and maybe science or just all of our intellectual pursuits
00:07:50.000 | is basically the attempt to compress the complexity
00:07:53.760 | all around us into something simple.
00:07:55.840 | So what does this word mean to you, compression?
00:08:00.840 | - I essentially have already explained it.
00:08:04.280 | So compression means for me,
00:08:06.680 | finding short programs for the data
00:08:11.160 | or the phenomenon at hand.
00:08:12.480 | You could interpret it more widely
00:08:13.960 | as finding simple theories,
00:08:16.680 | which can be mathematical theories,
00:08:18.160 | or maybe even informal, like just in words.
00:08:21.720 | Compression means finding short descriptions,
00:08:24.600 | explanations, programs for the data.
00:08:27.560 | - Do you see science as a kind of,
00:08:31.800 | our human attempt at compression?
00:08:34.280 | So we're speaking more generally,
00:08:35.720 | 'cause when you say programs,
00:08:37.600 | we're kind of zooming in on a particular sort of,
00:08:39.480 | almost like a computer science,
00:08:40.760 | artificial intelligence focus.
00:08:42.880 | But do you see all of human endeavor
00:08:44.600 | as a kind of compression?
00:08:47.040 | - Well, at least all of science I see
00:08:48.520 | as an endeavor of compression,
00:08:50.240 | not all of humanity, maybe.
00:08:52.360 | And well, there are also some other aspects of science,
00:08:54.800 | like experimental design, right?
00:08:56.280 | I mean, we create experiments specifically
00:09:00.120 | to get extra knowledge.
00:09:01.400 | And that is then part of the decision-making process.
00:09:05.000 | But once we have the data,
00:09:08.080 | to understand the data is essentially compression.
00:09:10.840 | So I don't see any difference between compression,
00:09:13.520 | understanding, and prediction.
00:09:17.240 | - So we're jumping around topics a little bit,
00:09:20.200 | but returning back to simplicity,
00:09:22.760 | a fascinating concept of Kolmogorov complexity.
00:09:26.560 | So in your sense, do most objects
00:09:29.360 | in our mathematical universe
00:09:31.920 | have high Kolmogorov complexity?
00:09:34.200 | And maybe what is, first of all,
00:09:36.320 | what is Kolmogorov complexity?
00:09:38.200 | - Okay, Kolmogorov complexity
00:09:40.040 | is a notion of simplicity or complexity.
00:09:43.400 | And it takes the compression view to the extreme.
00:09:48.200 | So I explained before that if you have some data sequence,
00:09:51.920 | just think about a file on a computer,
00:09:53.960 | which is basically, you know, just a string of bits.
00:09:57.360 | And if you, and we have data compressors,
00:10:01.640 | like we compress big files into, say, zip files
00:10:04.240 | with certain compressors.
00:10:05.920 | And you can also produce self-extracting archives.
00:10:08.560 | That means as an executable, if you run it,
00:10:11.320 | it reproduces your original file
00:10:12.960 | without needing an extra decompressor.
00:10:15.040 | It's just a decompressor plus the archive together in one.
00:10:18.440 | And now there are better and worse compressors,
00:10:21.040 | and you can ask, what is the ultimate compressor?
00:10:23.320 | So what is the shortest possible self-extracting archive
00:10:27.080 | you could produce for a certain data set, yeah,
00:10:30.160 | which reproduces the data set?
00:10:31.800 | And the length of this is called the Kolmogorov complexity.
00:10:35.520 | And arguably, that is the information content
00:10:38.920 | in the data set.
00:10:40.160 | I mean, if the data set is very redundant or very boring,
00:10:42.680 | you can compress it very well,
00:10:43.960 | so the information content should be low.
00:10:46.960 | And, you know, it is low according to this definition.
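
Formally, the Kolmogorov complexity of a string $x$ is $K(x) = \min \{ \ell(p) : U(p) = x \}$, the length of the shortest program (equivalently, shortest self-extracting archive) that outputs $x$. It is incomputable, but any real compressor gives an upper bound; a quick sketch using only Python's standard library:

```python
# Any off-the-shelf compressor upper-bounds the information content:
# redundant data compresses to almost nothing, random data barely at all.
import os
import zlib

boring = b"1" * 10_000          # highly redundant, like the 1,1,1,... sequence
noise = os.urandom(10_000)      # incompressible with overwhelming probability

print(len(zlib.compress(boring, 9)))   # a few dozen bytes
print(len(zlib.compress(noise, 9)))    # roughly 10,000 bytes, or a bit more
```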
00:10:49.120 | - So it's the length of the shortest program
00:10:51.920 | that summarizes the data?
00:10:53.240 | - Yes, yeah.
00:10:54.280 | - And what's your sense of our sort of universe
00:10:58.480 | when we think about the different objects in our universe,
00:11:03.480 | that we try to describe, concepts or whatever, at every level,
00:11:07.680 | do they have high or low Kolmogorov complexity?
00:11:10.520 | So what's the hope?
00:11:11.600 | Do we have a lot of hope
00:11:13.640 | in being able to summarize much of our world?
00:11:16.640 | - That's a tricky and difficult question.
00:11:20.720 | So as I said before,
00:11:23.760 | I believe that the whole universe,
00:11:25.800 | based on the evidence we have, is very simple.
00:11:28.960 | So it has a very short description.
00:11:31.480 | - Sorry, to linger on that, the whole universe,
00:11:35.420 | what does that mean?
00:11:36.260 | Do you mean at the very basic fundamental level
00:11:39.000 | in order to create the universe?
00:11:40.820 | - Yes, yeah.
00:11:41.660 | So you need a very short program,
00:11:44.360 | and you run it--
00:11:45.200 | - To get the thing going.
00:11:46.280 | - To get the thing going,
00:11:47.280 | and then it will reproduce our universe.
00:11:49.720 | There's a problem with noise.
00:11:51.600 | We can come back to that later, possibly.
00:11:54.360 | - Is noise a problem, or is it a bug or a feature?
00:11:57.520 | - I would say it makes our life as a scientist
00:12:01.680 | really, really much harder.
00:12:04.440 | I mean, think about without noise,
00:12:05.760 | we wouldn't need all of the statistics.
00:12:08.200 | - But then maybe
00:12:09.040 | we wouldn't feel like there's free will.
00:12:11.140 | Maybe we need that for the--
00:12:12.400 | - Yeah, this is an illusion
00:12:14.600 | that noise can give you free will.
00:12:16.880 | - At least in that way, it's a feature.
00:12:18.940 | But also, if you don't have noise,
00:12:21.280 | you have chaotic phenomena,
00:12:23.000 | which are effectively like noise.
00:12:25.000 | So we can't do away with statistics even then.
00:12:27.960 | I mean, think about rolling a dice
00:12:29.800 | and forget about quantum mechanics,
00:12:31.500 | and you know exactly how you throw it.
00:12:33.440 | But I mean, it's still so hard to compute the trajectory
00:12:36.280 | that effectively it is best to model it
00:12:38.640 | as coming up with each number
00:12:42.360 | with probability one over six.
00:12:45.380 | But from this sort of philosophical
00:12:48.620 | Kolmogorov complexity perspective,
00:12:50.380 | if we didn't have noise,
00:12:52.180 | then arguably you could describe the whole universe
00:12:55.420 | with the Standard Model plus general relativity.
00:12:59.660 | I mean, we don't have a theory of everything yet,
00:13:01.860 | but sort of assuming we are close to it or have it, yeah.
00:13:04.460 | Plus the initial conditions,
00:13:05.740 | which may hopefully be simple.
00:13:07.660 | And then you just run it,
00:13:08.860 | and then you would reproduce the universe.
00:13:11.300 | But that's spoiled by noise or by chaotic systems
00:13:15.800 | or by initial conditions, which may be complex.
00:13:18.560 | So now if we don't take the whole universe,
00:13:21.960 | but just a subset, just take planet Earth.
00:13:26.020 | Planet Earth cannot be compressed
00:13:27.880 | into a couple of equations.
00:13:29.820 | This is a hugely complex system.
00:13:31.520 | - So interesting.
00:13:32.360 | So when you look at the window,
00:13:33.920 | like the whole thing might be simple,
00:13:35.280 | but when you just take a small window, then--
00:13:38.360 | - It may become complex, and that may be counterintuitive,
00:13:41.040 | but there's a very nice analogy.
00:13:44.000 | The library of all books.
00:13:46.520 | So imagine you have a normal library with interesting books
00:13:49.240 | and you go there, great, lots of information
00:13:51.600 | and quite complex, yeah?
00:13:54.260 | So now I create a library which contains all possible books,
00:13:57.280 | say, of 500 pages.
00:13:59.080 | So the first book just has AAAA over all the pages.
00:14:01.960 | The next book, AAAA, and ends with B, and so on.
00:14:04.520 | I create this library of all books.
00:14:06.460 | I can write a super short program
00:14:08.000 | which creates this library.
00:14:09.560 | So this library which has all books
00:14:11.280 | has zero information content.
00:14:13.560 | And you take a subset of this library
00:14:15.140 | and suddenly you have a lot of information in there.
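
The analogy is easy to make concrete: the program that generates every possible book is constant-sized, while pinning down one particular subset of the $2^n$ books can take up to one bit per book. A toy sketch with hypothetical four-letter 'books' over a two-letter alphabet:

```python
# The library of ALL books has a short description; a particular shelf does not.
from itertools import product

ALPHABET = "AB"     # toy alphabet standing in for a real one
LENGTH = 4          # toy "book length" standing in for 500 pages

def all_books():
    # A constant-size program that enumerates every possible book.
    for letters in product(ALPHABET, repeat=LENGTH):
        yield "".join(letters)

print(len(list(all_books())))        # 2**4 = 16 books from a tiny generator
interesting = {"ABBA", "BAAB"}       # an arbitrary subset must be listed out
print([b for b in all_books() if b in interesting])
```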
00:14:17.580 | - So that's fascinating.
00:14:18.960 | I think one of the most beautiful
00:14:20.600 | mathematical objects that, at least today,
00:14:22.700 | seems to be understudied or under-talked about
00:14:24.800 | is cellular automata.
00:14:27.200 | What lessons do you draw from sort of the game of life
00:14:30.800 | for cellular automata,
00:14:31.720 | where you start with the simple rules,
00:14:33.060 | just like you're describing with the universe,
00:14:35.080 | and somehow complexity emerges?
00:14:38.560 | Do you feel like you have an intuitive grasp
00:14:42.640 | on the fascinating behavior of such systems,
00:14:46.360 | where, like you said, some chaotic behavior could happen,
00:14:49.800 | some complexity could emerge,
00:14:51.640 | it could die out in some very rigid structures?
00:14:55.920 | Do you have a sense about cellular automata
00:14:59.000 | that somehow transfers maybe
00:15:00.440 | to the bigger questions of our universe?
00:15:03.200 | - Yeah, the cellular automata,
00:15:04.240 | and especially Conway's Game of Life,
00:15:06.480 | is really great because the rules are so simple,
00:15:08.480 | you can explain it to every child,
00:15:09.960 | and even by hand, you can simulate a little bit,
00:15:12.520 | and you see these beautiful patterns emerge,
00:15:16.240 | and people have proven that it's even Turing-complete.
00:15:19.040 | You can not only use a computer to simulate the Game of Life,
00:15:22.080 | but you can also use the Game of Life
00:15:23.400 | to simulate any computer.
00:15:25.720 | That is truly amazing,
00:15:28.760 | and it's the prime example, probably,
00:15:31.560 | to demonstrate that very simple rules
00:15:35.000 | can lead to very rich phenomena.
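
For reference, those rules fit in a dozen lines. A minimal sketch in Python (the rules are the standard ones; the glider is a classic pattern that re-forms, shifted one cell diagonally, every four steps):

```python
# Conway's Game of Life: a live cell survives with 2 or 3 live neighbours,
# a dead cell becomes live with exactly 3.
from collections import Counter

def step(live):
    neighbours = Counter(
        (x + dx, y + dy)
        for x, y in live
        for dx in (-1, 0, 1) for dy in (-1, 0, 1)
        if (dx, dy) != (0, 0)
    )
    return {cell for cell, n in neighbours.items()
            if n == 3 or (n == 2 and cell in live)}

glider = {(1, 0), (2, 1), (0, 2), (1, 2), (2, 2)}
for _ in range(4):
    glider = step(glider)
print(sorted(glider))   # the same glider, translated one cell diagonally
```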
00:15:37.480 | And people sometimes ask,
00:15:39.040 | how is chemistry and biology so rich?
00:15:41.960 | I mean, this can't be based on simple rules,
00:15:44.640 | but no, we know quantum electrodynamics
00:15:46.760 | describes all of chemistry,
00:15:48.600 | and we come later back to that.
00:15:51.160 | I claim intelligence can be explained
00:15:53.160 | or described in one single equation,
00:15:55.240 | this very rich phenomenon.
00:15:56.840 | You asked also about whether I understand this phenomenon,
00:16:02.120 | and the answer is probably not,
00:16:06.520 | and there's this saying,
00:16:07.800 | you never understand really things,
00:16:09.040 | you just get used to them.
00:16:10.600 | And I think I'm pretty used to cellular automata,
00:16:15.600 | so you believe that you understand now
00:16:17.880 | why this phenomenon happens,
00:16:19.360 | but I give you a different example.
00:16:21.520 | I didn't play too much with Conway's Game of Life,
00:16:24.040 | but a little bit more with fractals
00:16:27.280 | and with the Mandelbrot set,
00:16:28.480 | and these beautiful patterns,
00:16:30.760 | just look at the Mandelbrot set.
00:16:32.240 | And well, when the computers were really slow,
00:16:35.560 | and I just had a black and white monitor
00:16:37.600 | and programmed my own programs in assembler too.
00:16:41.360 | - Assembler, wow.
00:16:43.240 | Wow, you're legit.
00:16:44.680 | (both laughing)
00:16:46.040 | - To get these fractals on the screen,
00:16:48.000 | and I was mesmerized. And much later,
00:16:49.600 | I returned to this every couple of years,
00:16:52.560 | and then I tried to understand what is going on,
00:16:55.080 | and you can understand a little bit,
00:16:57.080 | so I tried to derive the locations,
00:17:01.000 | there are these circles and the apple shape,
00:17:05.800 | and then you have smaller Mandelbrot sets
00:17:09.640 | recursively in this set.
00:17:11.280 | And there's a way to mathematically,
00:17:14.000 | by solving high order polynomials,
00:17:15.760 | to figure out where these centers are
00:17:17.960 | and what size they are approximately.
00:17:20.360 | And by sort of mathematically approaching this problem,
00:17:24.840 | you slowly get a feeling of why things are like they are,
00:17:30.360 | and that sort of is a first step to understanding
00:17:35.360 | why this rich phenomenon emerges.
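
The pictures come from a single line of dynamics: iterate z ← z² + c and ask whether the orbit stays bounded. A minimal escape-time sketch in Python, rendering a crude ASCII view of the cardioid-and-circles ('apple') shape:

```python
# c belongs to the Mandelbrot set (to this approximation) if iterating
# z -> z*z + c from z = 0 never leaves the disk |z| <= 2.
def in_set(c, max_iter=50):
    z = 0j
    for _ in range(max_iter):
        z = z * z + c
        if abs(z) > 2:          # once |z| > 2 the orbit provably diverges
            return False
    return True

for im in range(-10, 11):       # imaginary axis from -1.0 to 1.0
    print("".join("#" if in_set(complex(re / 10, im / 10)) else " "
                  for re in range(-20, 11)))   # real axis from -2.0 to 1.0
```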
00:17:37.160 | - Do you think it's possible, what's your intuition,
00:17:39.440 | do you think it's possible to reverse engineer
00:17:41.160 | and find the short program that generated these fractals
00:17:45.960 | by looking at the fractals?
00:17:48.640 | - Well, in principle, yes.
00:17:50.040 | So, I mean, in principle, what you can do is,
00:17:54.240 | you take any data set, you take these fractals,
00:17:56.800 | or you take whatever your data set, whatever you have,
00:18:00.400 | say a picture of Conway's Game of Life,
00:18:03.280 | and you run through all programs,
00:18:05.480 | you take a program of size one, two, three, four,
00:18:07.560 | and all these programs, run them all in parallel
00:18:09.320 | in so-called dovetailing fashion,
00:18:11.360 | give them computational resources,
00:18:13.600 | the first one 50%, the second one half of that, and so on,
00:18:16.160 | and let them run, wait until they halt,
00:18:19.240 | give an output, compare it to your data,
00:18:21.440 | and if some of these programs produce the correct data,
00:18:24.640 | then you stop, and then you have already some program.
00:18:26.760 | It may be a long program that just happens to run faster,
00:18:29.160 | and then you continue,
00:18:30.200 | and you get shorter and shorter programs
00:18:31.960 | until you eventually find the shortest program.
00:18:34.760 | The interesting thing, you can never know
00:18:36.280 | whether it's the shortest program
00:18:37.800 | because there could be an even shorter program,
00:18:39.680 | which is just even slower,
00:18:41.680 | and you just have to wait, yeah?
00:18:44.440 | But asymptotically, and actually after a finite time,
00:18:47.240 | you have the shortest program.
00:18:48.720 | So this is a theoretical but completely impractical way
00:18:52.680 | of finding the underlying structure
00:18:58.200 | in every data set,
00:18:59.640 | and that is what Solomonoff induction does
00:19:01.240 | and Kolmogorov complexity.
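
A schematic of that dovetailing schedule, with hypothetical stand-in 'programs' (generators that compute for some number of steps and then halt with an output); a real search would enumerate binary programs for a universal machine in length order and compare each output against the data:

```python
# Dovetailing: interleave infinitely many computations so that every one
# eventually gets as many steps as it needs, even though new programs keep
# being admitted forever.
from itertools import count, islice

def program(i):
    # Hypothetical stand-in: "program i" computes for i steps, then halts.
    for _ in range(i):
        yield None                     # one unit of computation, no output yet
    yield ("halted with output", i)

def dovetail():
    running = []
    for i in count(1):                 # admit a new program each round
        running.append((i, program(i)))
        for pid, gen in running:       # give every program one more step
            result = next(gen, None)
            if isinstance(result, tuple):
                yield pid, result      # report programs as they halt

for pid, out in islice(dovetail(), 3):
    print(pid, out)
```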
00:19:02.880 | In practice, of course, we have to approach the problem
00:19:04.880 | more intelligently, and then
00:19:07.320 | if you take resource limitations into account,
00:19:12.960 | there's, for instance, the field of pseudorandom numbers,
00:19:15.920 | and these are not truly random numbers,
00:19:18.080 | they are deterministic sequences,
00:19:20.720 | but no algorithm which is fast,
00:19:23.120 | fast means runs in polynomial time,
00:19:24.840 | can detect that it's actually deterministic.
00:19:27.800 | So we can produce interesting,
00:19:30.120 | I mean, random numbers maybe not that interesting,
00:19:31.720 | but just an example.
00:19:32.640 | We can produce complex-looking data,
00:19:36.560 | and we can then prove that no fast algorithm
00:19:39.320 | can detect the underlying pattern.
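
A sketch of such a deterministic-but-random-looking source: SHA-256 in counter mode over a short seed. That no polynomial-time test can distinguish this stream from true randomness is a standard cryptographic assumption, not a proven theorem:

```python
# A short seed deterministically expands into a long bit stream that
# (conjecturally) passes every fast statistical test.
import hashlib

def pseudo_bits(seed: bytes, blocks: int):
    for counter in range(blocks):
        digest = hashlib.sha256(seed + counter.to_bytes(8, "big")).digest()
        for byte in digest:
            for k in range(8):
                yield (byte >> k) & 1

bits = list(pseudo_bits(b"short secret seed", 4))   # 4 * 256 = 1024 bits
print(len(bits), sum(bits) / len(bits))             # ones fraction near 0.5
```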
00:19:41.200 | - Which is, unfortunately,
00:19:46.760 | a big challenge for our search for simple programs
00:19:52.040 | in the space of artificial intelligence, perhaps.
00:19:54.560 | - Yes, it definitely is for artificial intelligence,
00:19:56.640 | and it's quite surprising that it's,
00:19:59.360 | I can't say easy, I mean,
00:20:00.920 | physicists worked really hard to find these theories,
00:20:04.280 | but apparently it was possible for human minds
00:20:08.160 | to find these simple rules in the universe.
00:20:09.760 | It could have been different, right?
00:20:11.320 | - It could have been different.
00:20:13.200 | It's awe-inspiring.