
Stephen Wolfram: Computational Universe | MIT 6.S099: Artificial General Intelligence (AGI)


Chapters

0:00
10:57 Random Graph
22:13 Feature Space Plots
27:20 What Does the Space of All Possible Programs Look Like
27:42 Cellular Automata
27:51 Cellular Automaton
33:52 Boolean Algebra
37:17 Computational Irreducibility
37:22 Principle of Computational Equivalence
38:39 The First Axiom System That Corresponds to Boolean Algebra
39:49 Proof Graph
50:13 What Is AI Going To Allow Us To Automate
55:56 Symbolic Discourse
56:15 Smart Contracts
63:52 Key Influences
73:53 Gödel's Theorem
77:49 Algorithmic Drugs
79:43 Molecular Computing
85:19 Teach Kids To Be Useful in a World Where Everything Is Computational
100:09 Multi-Digit Arithmetic

Transcript

Welcome back to 6.S099, Artificial General Intelligence. Today we have Stephen Wolfram. (audience applauding) Wow. That's a first, I didn't even get started, you're already clapping. In his book, A New Kind of Science, he has explored and revealed the power, beauty and complexity of cellular automata, simple computational systems from which incredible complexity can emerge.

It's actually one of the books that really inspired me to get into artificial intelligence. He's created the Wolfram Alpha computational knowledge engine, and created Mathematica, which has now expanded to become the Wolfram Language. Both he and his son were involved in helping analyze and create the alien language for the movie Arrival, for which they used the Wolfram Language.

Please again, give Stephen a warm welcome. (audience applauding) All right, so I gather the brief here is to talk about how artificial general intelligence is going to be achieved. Is that the basic picture? So maybe I'm reminded of kind of a story, which I don't think I've ever told in public, about something that happened just a few buildings over from here.

So this was 2009 and Wolfram Alpha was about to arrive on the scene. I assume most of you have used Wolfram Alpha or seen Wolfram Alpha, yes? How many of you have used Wolfram Alpha? Okay, that's good. (audience laughing) So I had long been a friend of Marvin Minsky's and Marvin was a sort of pioneer of the AI world and I'd kind of seen for years question answering systems that tried to do sort of general intelligence question answering and so had Marvin.

And so I was gonna show Marvin Wolfram Alpha. He looks at it and he's like, oh, okay, that's fine, whatever. I said, no Marvin, this time it actually works. You can try real questions. This is actually something useful. This is not just a toy. And it was kind of interesting to see.

It took about five minutes for Marvin to realize that this was finally a question answering system that could actually answer questions that were useful to people. And so one question is how did we achieve that? So you go to Wolfram Alpha and you can ask it, I mean, it's, I don't know what we can ask it.

I don't know, what's the, some random question. What is the population of Cambridge? Actually, here's a question: divided by, let's try that. What's the population of Cambridge? It's probably gonna figure out that we mean Cambridge, Massachusetts; it's gonna give us some number, it's gonna give us some plot. Actually, what I wanna know is the number of students at MIT divided by the population of Cambridge.

See if it can figure that out. And, okay, it's kind of interesting. Oh, no, that's divided by, ah, that's interesting. It guessed that we were talking about Cambridge University as the denominator there. So it says the number of students at MIT divided by the number of students at Cambridge University.

That's interesting, I'm actually surprised. Let's see what happens if I say Cambridge MA there. No, it'll probably fail horribly. No, that's good. Okay, so, no, that's interesting. That's a plot as a function of time, of the fraction of the, okay, so anyway. So I'm glad it works. So one question is how did we manage to get, so many things have to work in order to get stuff like this to work.

You have to be able to understand the natural language, you have to have data sources, you have to be able to compute things from the data, and so on. One of the things that was a surprise to me, in terms of natural language understanding, was that the critical thing turned out to be just knowing a lot of stuff.

The actual parsing of the natural language is kind of, I think it's kind of clever, and we use a bunch of ideas that came from my new kind of science project and so on. But I think the most important thing is just knowing a lot of stuff about the world is really important to actually being able to understand natural language in a useful situation.

I think the other thing is having, actually having access to lots of data. Let me show you a typical example here of what is needed. So I ask about the ISS, and hopefully it'll wake up and tell us something here, come on, what's going on here? There we go, okay.

So it figured out that we probably are talking about a spacecraft, not a file format, and now it's gonna give us a plot that shows us where the ISS is right now. So to make this work, we obviously have to have some feed of radar tracking data about satellites and so on, which we have for every satellite that's out there.

But then it's not good enough to just have that feed. You also have to be able to do celestial mechanics to work out, well, where is the ISS actually right now based on the orbital elements that have been deduced from radar, and then if we want to know things like, okay, when is it going to, it's not currently visible from Boston, Massachusetts, it will next rise at 7:36 p.m.

today, Monday. So this requires a mixture of data about what's going on in the world, together with models about how the world is supposed to work, being able to predict things, and so on. And I think another thing that I kind of realized about AI and so on from the Wolfram Alpha effort has been that one of the earlier ideas for how one would achieve AI was: let's make it work kind of like brains do, and let's make it figure stuff out, and so if it has to do physics, let's have it do physics by pure reasoning, like people at least used to do physics.

But in the last 300 years, we've had a different way to do physics that wasn't sort of based on natural philosophy. It was instead based on things like mathematics. And so one of the things that we were doing in Wolfram Alpha was to kind of cheat relative to what had been done in previous AI systems, which was instead of using kind of reasoning-type methods, we're just saying, okay, we want to compute where the ISS is going to be, we've got a bunch of equations of motion that correspond to differential equations, we're just gonna solve the equations of motion and get an answer.

That's kind of leveraging the last 300 years or so of exact science that had been done, rather than trying to make use of kind of human reasoning ideas. And I might say that in terms of the history of the Wolfram Alpha project, when I was a kid, a disgustingly long time ago, I was interested in AI kinds of things, and I, in fact, I was kind of upset recently to find a bunch of stuff I did when I was 12 years old, kind of trying to assemble a pre-version of Wolfram Alpha way back before it was technologically possible, but it's also a reminder that one just does the same thing one's whole life, so to speak, at some level.

But what happened was when I started off working mainly in physics, and then I got involved in building computer systems to do things like mathematical computation and so on, and I then sort of got interested in, okay, so can we generalize this stuff, and can we really make systems that can answer sort of arbitrary questions about the world, and for example, sort of the promise would be if there's something that is systematically known in our civilization, make it automatic to answer questions on the basis of that systematic knowledge.

And back in around late 1970s, early 1980s, my conclusion was if you wanted to do something like that, the only realistic path to being able to do it was to build something much like a brain. And so I got interested in neural nets, and I tried to do things with neural nets back in 1980, and nothing very interesting happened, well, I couldn't get them to do anything very interesting.

And that, so I kind of had the idea that the only way to get the kind of thing that now exists in Wolfram Alpha, for example, was to build a brain-like thing. And then many years later, for reasons I can explain, I kind of came back to this and realized, actually, it wasn't true that you had to build a brain-like thing, sort of mere computation was sufficient.

And that was kind of what got me started actually trying to build Wolfram Alpha. When we started building Wolfram Alpha, one of the things I did was go to a sort of a field trip to a big reference library, and you see all these shelves of books and so on, and the question is, can we take all of this knowledge that exists in all of these books and actually automate being able to answer questions on the basis of it?

And I think we've pretty much done that, at least for the books you find in a typical reference library. It looked kind of daunting at the beginning because there's a lot of knowledge and information out there, but actually it turns out there are a few thousand domains, and we've steadily gone through and worked on these different domains.

Another feature of the Wolfram Alpha project was that we didn't really, you know, I'd been involved a lot in doing basic science and in trying to have sort of grand theories of the world. One of my principles in building Wolfram Alpha was not to start from a grand theory of the world.

That is, not to kind of start from some global ontology of the world and then try and build down into all these different domains, but instead to work up from having, you know, hundreds, then thousands of domains that actually work, whether it's information about cars or information about sports or information about movies or whatever else, building up from the bottom in each of these domains. And then we found that there were common themes in these domains that we could build into frameworks, and then sort of construct the whole system on the basis of that. That's kind of how it's worked, and I can talk about some of the actual frameworks that we end up using and so on.

But maybe I should explain a little bit more. So one question is, how does Wolfram Alpha actually work inside? And the answer is, it's a big program. The core system is about 15 million lines of Wolfram Language code, plus some number of terabytes of raw data.

And so the thing that made building Wolfram Alpha possible was this language, the Wolfram Language, which started with Mathematica, which came out in 1988, and has been progressively growing since then. So maybe I should show you some things about the Wolfram Language, and it's easy.

You can go use this. MIT has a site license for it. You can use it all over the place. You can find it on the web, et cetera, et cetera, et cetera. But, okay, the basics work. Let's start off with something like, let's make a random graph, and let's say we have a random graph with 200 nodes and 400 edges.

Okay, so there's a random graph. The first important thing about the Wolfram Language is that it's a symbolic language. So I can just pick up this graph, and I could, you know, do some analysis of it. That graph is just a symbolic thing that I can do computations on.
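
A sketch of what that input looks like in code (the follow-on analysis function is just an illustrative choice):

  g = RandomGraph[{200, 400}]  (* a pseudorandom graph with 200 nodes and 400 edges *)
  GraphDiameter[g]             (* the graph is symbolic data we can compute on directly *)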

Or I could say, let's get a, another good thing to always do is get a current image. See, there we go. And now I could go and say something like, let's do some basic thing. Let's say, let's edge detect that image. Again, this image is just a thing that we can manipulate.

We could take the image, we could make it, I don't know, we could take the image and partition it into little pieces, do computations on that. I don't know, simple. Let's do, let's just say, sort each row of the image, assemble the image again, oops. Assemble that image again, we'll get some mixed up picture there.

If I wanted to, I could, for example, let's say, let's make that the current image, and let's say, make that dynamic. I can be just running that code, hopefully, in a little loop, and there we can make that work. So, one general point here is, an image is just a piece of data like anything else.
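
A rough sketch of those image demos, assuming a camera is available for CurrentImage:

  img = CurrentImage[];                              (* capture a frame from the camera *)
  EdgeDetect[img]                                    (* edge-detect the image *)
  Image[Sort /@ ImageData[img]]                      (* sort each row of pixels and rebuild the image *)
  Dynamic[Image[Sort /@ ImageData[CurrentImage[]]]]  (* keep re-running that in a little loop *)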

If we just have a variable, a thing called x, it just says, okay, that's x, I don't need to know a particular value, it's just a symbolic thing that corresponds to, that's a thing called x. Now, what gets interesting when you have symbolic language and so on is, we're interested in having it represent stuff about the world, as well as just abstract kinds of things.

I mean, I can abstractly say, find some funky integral, I don't know what; that's then using symbolic variables to represent algebraic kinds of things. But I could also just say, I don't know, something like Boston. And Boston is another kind of symbolic thing. If I say, what is it really inside?

That's the entity: a city, Boston, Massachusetts, United States. Actually, you notice when I typed that in, I was using natural language, and it gave me a bunch of disambiguation here. It said, assuming Boston is a city, assuming Boston, Massachusetts; use Boston, New York, or... okay, let's use the Boston in the Philippines, which I've never heard of, but let's try using that instead.

And now, if I look at that, it'll say it's a Boston in some province of the Philippines, et cetera, et cetera, et cetera. Now, I might ask it, of that, I could say something like, what's the population of that? And, okay, it's a fairly small place. Or I could say, for example, let me do this.

Let me say, a GeoListPlot: from that Boston, let's go to, and now let's type in Boston again, and now let's have it use the default meaning of the word Boston, and then let's join those up, and now this should show me a plot.

There we go, okay, so there's the path from the Boston that we picked in the Philippines to the Boston here. Or we could ask it, I don't know, I could just say, I could ask it the distance from one to another or something like that. So, one of the things here, one of the things we found really, really useful, actually, in Wolfram Language, so first of all, there's a way of representing stuff about the world, like cities, for example.
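
Underneath, those inputs are symbolic expressions along these lines (the Massachusetts entity key is the one shown above; the Philippine one is my guess):

  boston = Entity["City", {"Boston", "Massachusetts", "UnitedStates"}];
  boston["Population"]                              (* ask the entity for a property *)
  bostonPH = Entity["City", {"Boston", "DavaoOriental", "Philippines"}];  (* hypothetical key *)
  GeoListPlot[{bostonPH, boston}, Joined -> True]   (* the path between the two Bostons *)
  GeoDistance[bostonPH, boston]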

Or let's say I want to say, let's do this. Let's do something with cities. Let's say capital cities in South America. Okay, so notice, this is a piece of natural language. This will get interpreted into something which is precise, symbolic Wolfram Language code that we can then compute with, and that will give us the capital cities in South America.

I could, for example, say FindShortestTour. So now I'm going to use some, oops, no, I don't want to do that. What I want to do first is to say, show me the geo positions of all those cities on line 21 there. So now it will find the geo positions, and now it will compute the shortest tour.

So that's saying there's a 10,000 mile traveling salesman tour around those cities, so I could take those cities that are on line 21, and I could say, order the cities according to this, and then I could make another geolist plot of that, join it up, and this should now show us a traveling salesman tour of the capital cities in South America.
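
Spelled out without the natural-language box, that computation is roughly this sketch (fetching the capitals via CountryData is my substitution for the natural-language input):

  capitals = CountryData[#, "CapitalCity"] & /@ CountryData["SouthAmerica"];
  {length, order} = FindShortestTour[GeoPosition /@ capitals];
  GeoListPlot[capitals[[order]], Joined -> True]  (* the roughly 10,000 mile tour *)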

So, it's sort of interesting to see what's involved in making stuff like this work. My goal has been to sort of automate as much as possible of what has to be computed, and that means knowing as many algorithms as possible, and also knowing as much data about the world as possible.

And I kind of view this as a knowledge-based programming approach. The typical idea in programming languages is, you have some small programming language that has a few primitives that are pretty much tied into what a machine can intrinsically do, and then maybe you'll have libraries that add on to that and so on.

My kind of crazy idea of many, many years ago has been to build an integrated system where all of the stuff about different domains of knowledge and so on are all just built into the system and designed in a coherent way. I mean, this has been kind of the story of my life for the last 30 years, is trying to keep the design of the system coherent, even as one adds all sorts of different areas of capability.

So, as, I mean, we can go and dive into all sorts of different kinds of things here, but maybe as an example, well, let's do, what could we do here? We could take, let's try, how about this? Is that a bone? I think so, that's a bone. So let's try that.

(keyboard clicking) As a mesh region. See if that works. So this will now use a completely different domain of human endeavor. Okay, oops, there's two of those bones. Let's try, let's just try, let's try left humerus, and let's try that, the mesh region for that, and now we should have a bone here.

Okay, there's a representation of a bone. Let's take that bone, and we could, for example, say, let's take the surface area of that, in some units. Or I could, let's do some much more outrageous thing. Let's say we take RegionDistance. So we're going to take the distance from that bone to a point, let's say, {0, 0, z}, and let's make a plot of that distance with z going from, let's say, I have no idea where this bone is, but let's try something like this.

So that was really boring. Let's try, so what this is doing, again, a whole bunch of stuff has to work in order for this to operate. This has to be, this is some region in 3D space that's represented by some mesh. You have to compute, you know, do the computational geometry to figure out where it is.

If I wanted to, let's try anatomy plot 3D, and let's say something like left hand, for example, and now it's going to show us probably the complete data that it has about the geometry of a left hand. There we go. Okay, so there's the result, and we could take that apart and start computing things from it and so on.
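
A sketch of those anatomy computations; the entity names are my guesses at the canonical forms, and the plot range is arbitrary:

  bone = AnatomyData[Entity["AnatomicalStructure", "LeftHumerus"], "MeshRegion"];
  Area[bone]                             (* surface area, assuming a surface mesh *)
  dist = RegionDistance[bone];           (* a distance function we can evaluate at points *)
  Plot[dist[{0, 0, z}], {z, -500, 500}]  (* distance from the bone to points on an axis *)
  AnatomyPlot3D[Entity["AnatomicalStructure", "LeftHand"]]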

So there's a lot of kind of computational knowledge that's built in here. Now let's talk a little bit about the modern machine learning story. So for instance, if I say, let's get a picture here. Let's say, let's just say picture of, has anybody got a favorite kind of animal?

What? Panda. Okay, so let's try, okay, giant panda. Okay, okay, there's a panda. Let's see what, now let's try saying, let's try for this panda, let's try saying image identify, and now here we'll be embarrassed probably, but let's just see, let's see what happens. If I say image identify that, and now it'll hopefully, wake up, wake up, wake up.

This only takes a few hundred milliseconds. Okay, very good, giant panda. Let's see what the runners up were to the giant panda. Let's say we want to say the 10 runners up in all categories for that thing, okay. So a giant panda, a procyonid, which I've never heard of.

Are pandas carnivorous? They eat bamboo shoots, okay. So it was lucky it didn't get that one. It's really sure it's a mammal, and it's absolutely certain it's a vertebrate. Okay, so you might ask, how did it figure this out? And so then you can kind of look under the hood and say, well, we have a whole framework for representing neural nets symbolically.
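
The identification calls just shown are roughly this sketch, with panda standing for the photo above:

  ImageIdentify[panda]                          (* best guess: the giant panda entity *)
  ImageIdentify[panda, All, 10, "Probability"]  (* the 10 best matches across all categories, with probabilities *)

Underneath those calls is that symbolic neural-net framework.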

And so this is the actual model that it's using to do this. So this is a, so there's a neural net, and it's got, we can drill down, and we can see there's a piece of the neural net. We can drill down even further to one of these, and we can probably see what, that's a batch normalization layer, somewhere deep, deep inside the entrails of the, not panda, but of this thing, okay.

So now let's take that object, which is just a symbolic object, and let's feed it the picture of the panda. And we can see, and there, oops. I was not giving it the right thing. What did I just do wrong here? Oh, here, let's take, oh, I see what I did.

Okay, let's take this thing and feed it the picture of the panda, and it says it's a giant panda, okay. How about we do something more outrageous? Let's take that neural net, and let's only use the first, let's say, 10 layers. So let's just take the first 10 layers of the neural net and feed it the panda.

And now what we'll get is something from the insides of the neural net, and I could say, for example, let's just make those into images. Okay, so that's what the neural net had figured out about the panda after 10 layers of going through the neural net. And maybe, actually, it'd be interesting to see, let's do a feature space plot.

So now we're going to, of those intermediate things in the brain of the neural net, so to speak, this is now taking, so what this is just doing is to do dimension reduction on this space of images, and so it's not very exciting. It's probably mostly distinguishing these by total gray level, but that's kind of showing us the space of different sort of features of the insides of this neural net.
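
Roughly what those under-the-hood steps look like, assuming the deployed identification network is the one in the Neural Net Repository:

  net = NetModel["Wolfram ImageIdentify Net V1"]  (* symbolic representation of the net *)
  net[panda]                       (* apply the whole net: giant panda *)
  partial = Take[net, 10];         (* keep just the first 10 layers *)
  acts = partial[panda];           (* activations from the insides of the net *)
  FeatureSpacePlot[Image /@ acts]  (* view each channel as an image and dimension-reduce *)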

What's also interesting to see here is things like the symbolic representation of the neural net, and if you're wondering how that actually works inside: underneath, it's using MXNet, which we happen to have contributed to a lot, and there's a bunch of symbolic layers on top of that that feed into it.

And maybe I can show you here. Let me show you how you would train one of these neural nets. That's also kind of fun. So we have a data repository that has all sorts of useful data. One piece of data it has is a bunch of neural net training sets, so this is the standard MNIST training set of handwritten digits.

Okay, so there's MNIST, and you notice that these things here, that's just an image which I could copy out, and I could do, let's say, ColorNegate on that image 'cause it's just an image, and there's the result and so on. And now I could say, let's take a neural net, like let's take a simple neural net, like LeNet, for example, okay, so let's take LeNet, and then let's take the untrained evaluation network.

So this is now a version of LeNet, a simple, standard neural net, that didn't get trained. So for example, if I take that symbolic representation of LeNet, and I say NetInitialize, then it will take that, and it'll just put random weights into LeNet, okay, so if I take those random weights, and I feed it that image of a zero, it will presumably produce something completely random, in this particular case, two, right?

So now what I would like to do is to take this, so that was just randomly initializing the weights. So now what I'd like to do is to take the MNIST training set, and I'd like to actually train LeNet using the MNIST training set, so let's take this, and let's take a random sample of, let's say, I don't know, 1,000 pieces of MNIST.

Come on, why is it having to load it again? There we go, okay, so there's a random sample there, it was on line 21, and now let me go down here, and say, where was it? Well, we can just take this thing here, so this is the uninitialized version of LeNet, and we can say take that, and then let's say NetTrain of that with the thing on line 21, which was those 1,000 examples.
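
Assembled as one sketch (LeNet and MNIST both come from the repositories; I'm assuming the MNIST resource's default content is the list of image -> digit rules):

  mnist = ResourceData["MNIST"];    (* handwritten digits, as image -> label rules *)
  digit = mnist[[1, 1]];            (* one handwritten digit image *)
  lenet = NetModel["LeNet", "UninitializedEvaluationNet"];
  NetInitialize[lenet][digit]       (* random weights: essentially a random guess *)
  trained = NetTrain[lenet, RandomSample[mnist, 1000]];  (* train on 1,000 examples *)
  trained[digit]                    (* now it classifies the digit correctly *)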

So now what it's doing is it's running training, and you see the loss going down and so on. It's training LeNet on those 1,000 examples, and we can stop it if we want to. Actually, this is a new display, this is very nice.

This is a new version of Wolfram Language, which is coming out next week, which I'm showing you, but it's quite similar to what exists today, but because that's one of the features of running a software company is that you always run the very latest version of things, for better or worse, and this is also a good way to debug it, 'cause it's supposed to come out next week.

If I find some horrifying bug, maybe it will get delayed, but let's try, let's try this. Okay, now it says it's zero, okay, and so this is now a trained version of LeNet, trained with that training data. So we can talk about all kinds of details of neural nets and so on, but maybe I should zoom out to talk a little bit about the bigger picture as I see it.

So one question is, sort of a question of what is in principle possible to do with computation? So we have, as we're building all kinds of things, we're making image identifiers, we're figuring out all kinds of things about where the International Space Station is and so on. Question is, what is in principle possible to compute?

And so one of the places one can ask that question is when one looks at, for example, models of the natural world. One can say, how do we make models of the natural world? Kind of a traditional approach has been, let's use mathematical equations to make models of the natural world.

A question is, if we want to kind of generalize that and say, well, what are all possible ways to make models of things, what can we say about that question? So I spent many years of my life trying to address that question. And basically what I've thought about a lot is that if you want to make a model of a thing, you have to have definite rules by which the thing operates.

What's the most general way to represent possible rules? Well, in today's world, we think of that as a program. So the next question is, well, what does the space of all possible programs look like? Most of the time, we're writing programs like the Wolfram Language, which is 50 million lines of code, a big, complicated program that was built for a fairly specific purpose.

But the question is, if we just look at sort of the space of possible programs, more or less at random, what's out there in the space of possible programs? So I got interested many years ago in cellular automata, which are a really good example of a very simple kind of program.

So let me show you an example of one of these. So these are the rules for a typical cellular automaton. You have a row of black and white squares, and the rule just says: look at a square, ask what color it is and what color its left and right neighbors are, and decide what color the square will be on the next step on that basis.

Okay, so really simple rule. So now let's take a look at what actually happens if we use that rule a bunch of times. So we can take that rule, the 254 is just the binary digits that correspond to those positions in this rule. So now I can say this, I could say let's do 50 steps, let me do this.

And now if I run according to the rule I just defined, it turns out to be pretty trivial. It's just saying: if a square or either of its neighbors is black, make the square black on the next step. So we've used a very simple program.

We got a very simple result out. Okay, let's try a different program. We can try changing this. We'll get, that's a program with one bit different. Now we get that kind of pattern. So the question is, well, what happens, you might say, okay, if you've got such a trivial program, it's not surprising you're just gonna get trivial results out.

So, but you can do an experiment to test that hypothesis. You can just say, let's take all possible programs, there are 256 possible programs that are based on these eight bits here. Let's just take, well, let's just, whoops. Let's just take, let's say the first 64 of those programs and let's just make a, there we go.

Let's just make a table of the results that we get by running those first 64 programs here. So here we get the result. And what you see is, well, most of them are pretty trivial. They start off with one black cell in the middle, and it just goes off to one side.

Occasionally we get something more exciting happening like here's a nice nested pattern that we get. If we were to continue it longer, it would make more detailed nesting. But then, my all time favorite science discovery, if you go on and just look at these, after a while you find this one here, which is rule 30 in this numbering scheme.

And that's doing something a bit more complicated. You say, well, what's going on here? We just started off with this very simple rule. Let's see what happens. Maybe after a while, if we run rule 30 long enough, it will resolve into something simpler. So let's try running it, let's say 500 steps.

And that's the, whoops, that's the result we get. Let's just make it full screen. Okay, it's aliasing a bit on the projector there, but you get the basic idea. This just started off from one black cell at the top, and this is what it made.

And that's pretty weird, because this is sort of not the way things are supposed to work. 'Cause what we have here is just that little program down there, and it makes this big complicated pattern here. And, you know, we can see there's a certain amount of regularity on one side, but, for example, the center column of this pattern is, for all practical purposes, completely random.

In fact, we used it as the random number generator in Mathematica and the Wolfram Language for many years. It was recently retired after excellent service because we found a somewhat more efficient one. But, so, what do we learn from this? What we learn from this is that out in the computational universe of possible programs, it's possible to get, even with very simple programs, very rich, complicated behavior.
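
The whole sequence of experiments, as one compact sketch in the standard elementary-rule numbering (the last line pulls out the effectively random center column just mentioned):

  RulePlot[CellularAutomaton[254]]                 (* the 8 cases of rule 254 *)
  ArrayPlot[CellularAutomaton[254, {{1}, 0}, 50]]  (* 50 steps from a single black cell *)
  Table[ArrayPlot[CellularAutomaton[r, {{1}, 0}, 50]], {r, 0, 63}]  (* survey the first 64 rules *)
  ArrayPlot[CellularAutomaton[30, {{1}, 0}, 500]]  (* rule 30, run for 500 steps *)
  CellularAutomaton[30, {{1}, 0}, 200][[All, 201]] (* rule 30's center column of bits *)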

Well, that's important if you're interested in modeling the natural world, because you might think that there are programs that represent systems in nature that work this way and so on. It's also important for technology, because it says, okay, let's say you're trying to find a program that's a good random number generator.

How are you gonna do that? Well, you could start thinking very hard and you could try and make up, you know, you could try and write down all kinds of flow charts about how this random number generator is going to work. Or you can say, forget that, I'm just gonna search the computational universe of possible programs and just look for one that serves as a good random number generator.

In this particular case, after you've searched 30 programs, you'll find one that makes a good random number generator. Why does it work? That's a complicated story. It's not a story that I think necessarily we can really tell very well. But what's important is that this idea that out in the computational universe, there's a lot of rich, sophisticated stuff that can be essentially mined for our technological purposes.

That's the important thing. Whether we understand how this works is a different matter. I mean, it's like when we look at the natural world, the physical world, we're used to kind of mining things. You know, we started using magnets to do magnetic stuff long before we understood the theory of ferromagnetism and so on.

And so similarly here, we can sort of go out into the computational universe and find stuff that's useful for our purposes. Now, in fact, the world of sort of deep learning and neural nets and so on is a little bit like this. It uses the trick that there's a certain degree of differentiability there, so you can kind of home in on let's try and find something that's incrementally better.

And for certain kinds of problems, that works pretty well. I think the thing that we've done a lot, I've done a lot, is just sort of exhaustive search in the computational universe of possible programs. Just search a trillion programs and try and find one that does something interesting and useful for you.

There's a lot to say about that. Well, actually, on the theme of searching a trillion programs and finding one that's useful, let me show you another example of that. Let's see. So I was interested a while ago in, I have to look something up here, sorry. Let me see here.

In Boolean algebra, and I was interested in the space of all possible mathematics. And let me just see here. I'm not finding what I wanted to find, sorry. That was a good example. I should have memorized this, but I haven't. So here we go. There it is. So we talked about looking at the space of all possible programs.

Another thing you can do is say, if you're gonna invent mathematics from nothing, what possible axiom systems could we use in mathematics? So I was curious, where do, and that, again, might seem like a completely crazy thing to do, to just say, let's just start enumerating axiom systems at random and see if we find one that's interesting and useful.

But it turns out, once you have this idea that out in the computational universe of possible programs, there's actually a lot of low-hanging fruit to be found, it turns out you can apply that idea in lots of places. I mean, the thing to understand is, why do we not see a lot of engineering structures that look like this?

The reason is because our traditional model of engineering has been, we engineer things in a way where we can foresee what the outcome of our engineering steps are going to be. And when it comes to something like this, we can find it out in the computational universe, but we can't readily foresee what's going to happen.

We can't do sort of a step-by-step design of this particular thing. And so in engineering, in human engineering, as it's been practiced so far, most of it has consisted of building things where we can foresee step-by-step what the outcome of our engineering is going to be. And we see that in programs, we see that in other kinds of engineering structures.

And so there's sort of a different kind of engineering, which is about mining the computational universe of possible programs. And it's worth realizing there's a lot more that can be done a lot more efficiently by mining the computational universe of possible programs than by just constructing things step-by-step as a human.

So for example, if you look for optimal algorithms for things, like, I don't know, even something like sorting networks, the optimal sorting networks look very complicated. They're not things that you would construct by step-by-step thinking in a kind of typical human way. And so, if you're really going to have computation work efficiently, you are going to end up with these programs that are sort of just mined from the computational universe.

Mining the computational universe like this makes use of computation much more efficiently than a typical thing that we might construct. Now, one feature of this is that it's hard to understand what's going on. And there's actually a fundamental reason for that, which is that in our efforts to understand what's going on, we get to use our brains, our computers, our mathematics, or whatever.

And our goal is to outrun the computation: this particular little program did a certain amount of computation to work out this pattern. The question is, can we kind of outrun that computation and say, oh, I can tell that actually this particular bit down here is going to be a black bit, without having to go and do all that computation?

But it turns out that, and maybe this is a digression, there's this phenomenon I call computational irreducibility, which I think is really common. And it's a consequence of this thing I call the principle of computational equivalence. And that principle basically says: as soon as you have a system whose behavior isn't fairly easy to analyze, the chances are that the computation it's doing is essentially as sophisticated as it could be.

And that has consequences like it implies that the typical thing like this will correspond to a universal computer that you can use to program anything. It also has the consequence of this computational irreducibility phenomenon that says you can't expect our brains to be able to outrun the computations that are going on inside the system.

If there were computational reducibility, then we could expect that even though this thing went to a lot of trouble and did a million steps of evolution, just by using our brains we could jump ahead and see what the answer will be. Computational irreducibility says that isn't the case. If we're going to make the most efficient use of computational resources, we will inevitably run into computational irreducibility all over the place.

It has the consequence that we get the situation where we can't readily foresee and understand what's going to happen. So back to mathematics for a second. So this is just an axiom system; I looked through sort of all possible axiom systems, starting off with really tiny ones.

And I asked the question, what's the first axiom system that corresponds to Boolean algebra? So it turns out this tiny little thing here generates all theorems of Boolean algebra. It is the simplest axiom system for Boolean algebra. Now, I have to show you this 'cause it's a new feature, you see.

If I say FindEquationalProof, let's say I want to prove commutativity of the NAND operation. I'm gonna show you something here. This is going to try to generate, let's see if this works. This is going to try to generate an automated proof, based on that axiom system, of that result.

So it had 102 steps in the proof. And let's try and say, let's look at, for example, the proof network here. Actually, let's look at the proof dataset. No, that's not what I wanted. Oh, I should learn how to use this, shouldn't I? (audience laughing) Let's see. What I want is the, yeah, proof dataset.

There we go. Very good. Okay, so this is, actually, let's say, first of all, let's say the proof graph. Okay, so this is gonna show me how that proof was done. So there are a bunch of lemmas that got proved, and from those lemmas, those lemmas were combined, and eventually it proved the result.
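
In code, that new feature looks roughly like this; I'm writing the axiom in its published simplest-axiom form, with \[CenterDot] standing for the NAND-like operator:

  axiom = ForAll[{p, q, r}, ((p\[CenterDot]q)\[CenterDot]r)\[CenterDot](p\[CenterDot]((p\[CenterDot]r)\[CenterDot]p)) == r];
  proof = FindEquationalProof[ForAll[{p, q}, p\[CenterDot]q == q\[CenterDot]p], axiom]
  proof["ProofGraph"]    (* the lemma structure shown here *)
  proof["ProofDataset"]  (* the individual lemmas as a dataset *)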

So let's take a look at what some of those lemmas were. Okay, so here's the result. So after, so it goes through, and these are various lemmas it's using, and eventually, after many pages of nonsense, it will get to the result. Okay, each one of these, some of these lemmas are kind of complicated there.

That's that lemma. It's a pretty complicated lemma, et cetera, et cetera, et cetera. So you might ask, what on earth is going on here? And the answer is, so I first generated a version of this proof 20 years ago, and I tried to understand what was going on, and I completely failed.

And it's sort of embarrassing because this is supposed to be a proof. It's supposed to be demonstrating some results, and what we realize is that, you know, what does it mean to have a proof of something? What does it mean to explain how a thing is done? You know, what is the purpose of a proof?

The purpose of a proof is basically to let humans understand why something is true. And so, for example, let's say we go to Wolfram Alpha, and we do some random thing where we say, let's do an integral of something or other; it will be able to work out the answer to that integral very quickly, in fact it takes only milliseconds internally, okay?

But then somebody who wants to hand in a piece of homework or something like that needs to explain why this is true. Okay, well, we have this handy step-by-step solution thing here, which explains why it's true. Now, the thing I should admit about the step-by-step solution is it's completely fake.

That is, the steps that are described in the step-by-step solution have absolutely nothing to do with the way that integral was computed internally. These are steps created purely for the purpose of telling a story to humans about why this integral came out the way it did. So there's one thing, knowing the answer; the other thing is being able to tell a story about why the answer worked out that way.

Well, what we see here is this is a proof, but it was an automatically generated proof, and it's a really lousy story for us humans. I mean, if it turned out that one of these theorems here was one that had been proved by Gauss or something and appeared in all the textbooks, we would be much happier because then we would start to have a kind of human representable story about what was going on.

Instead, we just get a bunch of machine-generated lemmas that we can't understand, that we can't kind of wrap our brains around. And it's sort of the same thing that's going on in when we look at one of these neural nets. We're seeing, you know, when we were looking wherever it was at the innards of that neural net, and we say, well, how is it figuring out that that's a picture of a panda?

Well, the answer is it decided that, you know, if we humans were saying, how would you figure out if it's a picture of a panda? We might say, well, look and see if it has eyes. That's a clue for whether it's an animal. Look and see if it looks like it's kind of round and furry and things.

That's a version of whether it's a panda and et cetera, et cetera, et cetera. But what it's doing is it learned a bunch of criteria for, you know, is it a panda or is it one of 10,000 other possible things that it could have recognized? And it learned those criteria in a way that was somehow optimal based on the training that it got and so on.

But it learned distinctions which are different from the distinctions that we humans make in the languages we use. And so in some sense, you know, when we start talking about, well, describe a picture, we have a certain human language for describing that picture.

In typical human languages, we have maybe 30,000 to 50,000 words that we use to describe things. Those words have sort of evolved as being useful for describing the world that we live in. When it comes to this neural net, it has effectively invented words that allow it to make the distinctions that matter in the analysis it's doing, but those words have nothing to do with our historically invented words that exist in our languages.

So it's kind of an interesting situation: it has its own way of thinking, so to speak. If you say, well, what's it thinking about? How do we describe what it's thinking? That's a tough thing to answer, because just like with the automated theorem, we're sort of stuck having to say, well, we can't really tell a human story, because the things it invented are things for which we don't even have words in our languages and so on.

Okay, so one thing to realize is in this kind of space of sort of all possible computations, there's a lot of stuff out there that can be done. There's this kind of ocean of sophisticated computation. And then the question that we have to ask for us humans is, okay, how do we make use of all of that stuff?

So what we've got kind of on the one hand is we've got the things we know how to think about, human languages, our way of describing things, our way of talking about stuff, that's the one side of things. The other side of things we have is this very powerful kind of seething ocean of computation on the other side where lots of things can happen.

So the question is, how do we make use of this sort of ocean of computation in the best possible way for our human purposes, in building technology and so on? And the way I see a big part of what I've spent a very long time doing is building a language that takes human thinking on the one hand, and provides a sort of computational communication language that lets us get the benefit of what's possible over in that ocean of computation, in a way that's rooted in what we humans actually want to do.

And so I kind of view Wolfram Language as being sort of an attempt to make a bridge between, so on the one hand, there's all possible computations. On the other hand, there's things we think we want to do. And I view Wolfram Language as being my best attempt right now to make a way to take our sort of human computational thinking and be able to actually implement it.

So in a sense, it's a language which works on two sides. It's a language where the machine can understand, okay, it's looking at this and that's what it's going to compute. But on the other hand, it's also a language for us humans to think about things in computational terms.

So if I take, I don't know, one of these things that I'm doing here, whatever it is, this wasn't that exciting, but FindShortestTour of the GeoPosition of the capital cities in South America: that's a representation, in a precise language, of something.

And the idea is that that's a language which we humans can find useful in thinking about things in computational terms. It also happens to be a language that the machine can immediately understand and execute. And so I think this is sort of a general, when I think about AI in general, what is the sort of what's the overall problem?

Well, part of the overall problem is, how do we tell the AIs what to do, so to speak? There's this very powerful ocean of computation that we get to mine for purposes of building AI kinds of things. But then the question is, how do we tell the AIs what to do?

And what I see, what I've tried to do with Wolfram Language is to provide a way of kind of accessing that computation and sort of making use of the knowledge that our civilization has accumulated. And because that's the, you know, there's the general computation on this side, and there's the specific things that we humans have thought about.

And the question is how to make use of the things that we've thought about to do things that we care about doing. Actually, if you're interested in these kinds of things, I happened to write a blog post in the last couple of days. It's kind of a funny blog post.

It's about, well, you can see the title there. It came about because a friend of mine has this crazy project to put little sort of disks, or something, that represent kind of the best achievements of human civilization, so to speak, to send out hitchhiking on various spacecraft that are going out into the solar system in the next little while.

And the question is what to put on this little disk that kind of represents, you know, the achievements of civilization. It's kind of depressing when you go back and look at what people have tried to do on this before and realize how hard it is to tell even whether something is an artifact or not.

But this was sort of a, yeah, that's a good one. That's from 11,000 years ago. The question is can you figure out what on earth it is and what it means? But so what's relevant about this is this whole question of there are things that are out there in the computational universe.

And when we think about extraterrestrial intelligence, I find it kind of interesting that artificial intelligence is our first example of an alien intelligence. We don't happen to have found what we'd view as extraterrestrial intelligence right now, but we are in the process of building a pretty decent version of an alien intelligence here.

And if you ask questions like, well, you know, what is it thinking? Does it have a purpose in what it's doing, and so on? You're confronted with things like this. You can kind of do a test run of: what's its purpose? What is it trying to do? In a way that is very similar to the kinds of questions you would ask about extraterrestrial intelligence.

But in any case, the main point is that I see this sort of ocean of computation, and then there's the problem of describing what we actually want to do with that ocean of computation. That's one of the primary problems we have. Now, people talk about AI, and what is AI going to allow us to automate?

And my basic answer to that would be, we'll be able to automate everything that we can describe. The problem is, it's not clear what we can describe. Or put another way: you can imagine various jobs where people are doing repeated-judgment work, things like this; those are where we can readily automate things.

But the thing that we can't really automate is saying, well, what are we trying to do? That is what are our goals? Because in a sense, when we see one of these systems, let's say it's a cellular automaton here. The question is, what is this cellular automaton trying to do?

Maybe I'll give you another cellular automaton that is a little bit more exciting here. Let's do this one. So the question is, what is this cellular automaton trying to do? It's got this whole big structure here and things are happening with it. We can go, we can run it for a couple of thousand steps.

We can ask, it's a nice example of kind of undecidability in action, what's gonna happen here? This is kind of the halting problem. Is this gonna halt? What's it gonna do? There's computational irreducibility, so we actually can't tell. There's a case where we know this is a universal computer, in fact, eventually, well, I won't even spoil it for you.

If I went on long enough, it would go into some kind of cycle, but we can ask, what is this thing trying to do? What is it, you know, is it, what's it thinking about? What's its, you know, what's its goal? What's its purpose? And, you know, we get very quickly in a big mess thinking about those kinds of things.

One of the things that comes out of this principle of computational equivalence is thinking about what kinds of things are capable of sophisticated computation. So I mentioned a while back sort of my personal history with Wolfram Alpha, of having thought about doing something like Wolfram Alpha when I was a kid, and then believing that you sort of had to build a brain to make that possible and so on.

And one of the things that I then thought was that there was some kind of bright line between what is intelligent and what is merely computational, so to speak. In other words, that there was something which is like, oh, we've got this great thing that we humans have that, you know, is intelligence and all these things in nature and so on and all the stuff that's going on there, it's just computation or it's just, you know, things operating according to rules, that's different.

There's some bright line distinction between these things. Well, I think the thing that came about after I'd looked at all these cellular automata and all kinds of other things like that is I sort of came up with this principle of computational equivalence idea, which we've now got quite a lot of evidence for, which I talk about people are interested in, but that basically there isn't a, that once you reach a certain level of computational sophistication, everything is equivalent.

And that implies that there really isn't a bright line distinction between, for example, the computations going on in our brains and the computations going on in these simple cellular automata and so on. And that essentially philosophical point is what actually got me to start trying to build Wolfram Alpha, because I realized that, gosh, you know, I'd been looking for the sort of magic bullet of intelligence, and I just decided probably there isn't one.

And actually it's all just computation. And so that means we can actually, in practice, build something that does this kind of intelligent-like thing. So that's what I think is the case: there really isn't sort of a bright line distinction, and that has more extreme consequences.

Like people will say things like, you know, the weather has a mind of its own, okay? Sounds kind of silly, sounds kind of animistic, primitive and so on, but in fact, the, you know, fluid dynamics of the weather is as computationally sophisticated as the stuff that goes on in our brains.

But we can start asking, but then you say, but the weather doesn't have a purpose. You know, what's the purpose of the weather? Well, you know, maybe the weather is trying to equalize the temperature between the, you know, the North Pole and the tropics or something. And then we have to say, well, but that's not a purpose in the way that we think about purposes.

That's just, you know, and we get very confused. And in the end, what we realize is when we're talking about things like purposes, we have to have this kind of chain of provenance that goes back to humans and human history and all that kind of thing. And I think it's the same type of thing when we talk about computation and AI and so on.

The thing that we, this question of sort of purpose, goals, things like this, that's the thing which is intrinsically human and not something that we can ever sort of automatically generate. It makes no sense to talk about automatically generating it because these computational systems, they do all kinds of stuff.

You know, we can say they've got a purpose, we can attribute purposes to them, et cetera, et cetera, et cetera, but, you know, ultimately it's sort of a human thread of purpose that we have to deal with. So that means, for example, when we talk about AIs, we're interested in things like how we tell them what to do; we talk about AI ethics, for example.

We'd like to be able to make a statement to the AIs like, you know, please be nice to us humans. And so one of the issues there is, how are we going to make a statement like 'be nice to us humans'?

What's the, you know, how are we going to explain that to an AI? And this is where, again, you know, my efforts to build a language, a computational communication language that bridges the world of what we humans think about and the world of what is possible in computation is important, and so one of the things I've been interested in is actually building what I call a symbolic discourse language that can be a general representation for sort of the kinds of things that we might want to put in, that we might want to say in things like be nice to humans.

So, sort of a little bit of background to that. In the modern world, people are keen on smart contracts. They often think of them as being deeply tied into blockchain, which I don't think is really quite right. The important thing about smart contracts is that they're a way of having an agreement between parties which can be executed automatically. You may choose to anchor that agreement in a blockchain, you may not. But the point is, when people write legal contracts, they write them in an approximation to English.

They write them in legalese typically 'cause they're trying to write them in something a little bit more precise than regular English, but the limiting case of that is to make a symbolic discourse language in which you can write the contract in code basically. And I've been very interested in using Wolfram Language to do that because in Wolfram Language, we have a language which can describe things about the world and we can talk about the kinds of things that people actually talk about in contracts and so on.

And we're most of the way there to being able to do that. And then when you start thinking about that, you start thinking about, okay, so we've got this language to describe things that we care about in the world. And so when it comes to things like tell the AIs to be nice to the humans, we can imagine using Wolfram Language to sort of build an AI constitution that says this is how the AI is supposed to work.

But the untethered AI doesn't have any particular purpose; it's just gonna do what it does. And if we want to somehow align it with human purposes, we have to have some way to sort of talk to the AI.

And I view my efforts to build Wolfram Language as a way to do that. I mean, as I was showing at the beginning, you can take natural language, and with natural language you can say a certain number of things.

You can then say, well, how do we make this more precise in a precise symbolic language? If you want to build up more complicated things, it gets hard to do that in natural language, and so you have to build up more serious programs in symbolic language. And I've probably been yakking a while here. I'm happy to talk about all kinds of different things, but maybe I've not seen as many reactions as I might've expected.

So I'm not sure which things people are interested in and which they're not. But maybe I should stop here and we can have discussion, questions, comments. Yes. (audience applauding) - Yes, two microphones; if you have questions, please come up. - So I have a quick question. It goes to the earlier part of your talk where you say you don't build a top-down ontology, you actually build from the bottom up with disparate domains.

What do you feel are the core technologies of the knowledge representation you use within Wolfram Alpha that allow, you know, different domains to reason about each other, to come up with solutions? And is there any feeling of differentiability, for example: if you were to come up with a plan to do something new within the Wolfram Language, how would you go about doing that?

- Okay, so we've done maybe a couple of thousand domains. What is actually involved in doing one of these domains? It's a gnarly business. Every domain has some crazy different thing about it. I actually tried to make up, a while ago, let me show you something, a kind of hierarchy of what it means to make a domain computable. See if I can find this here.

Where is it? There we go. Okay, here we go. So this is sort of a hierarchy of levels of what it means to make a domain computable from just, you know, you've got some array of data that's quite structured. Forget, you know, the separate issue about extracting things from unstructured data, but let's imagine that you were given, you know, a bunch of data about landing sites of meteorites or something, okay?

So you go through various levels. So, you know, things like: okay, the landing sites of the meteorites, are the positions just strings, or are they some kind of canonical representation of geoposition? The type of meteorite, you know, some of them are iron meteorites, some of them are stone meteorites.

Have you made a canonical representation? Have you made some kind of way to identify what-- - Sorry, go ahead. - No, no, I mean, to do that, so-- - So my question is, like, if you did have positions as a string as well as a canonical representation, do you have redundant representations of the same information in the different-- - No, I mean, our goal-- - Is everything canonical that you have?

Do you have a minimal representation of everything? - Yeah, our goal is to make everything canonical. Now, there is a lot of complexity in doing that. Okay, so here's another feature of these domains, another thing to say.

You know, it would be lovely if one could just automate everything and cut the humans out of the loop. Turns out this doesn't work. And in fact, whenever we do these domains, it's fairly critical to have expert humans who really understand the domain or you simply get it wrong.

Having said that, once you've done enough domains, you can do a lot of cross-checking between domains, and we are the number one reporters of errors in pretty much all standardized data sources, because we can do that kind of cross-checking. But if you ask the question, what's involved in bringing online a new domain, it's that sort of hierarchy of things; some of those take a few hours.

You can get to the point of having, you know, we've got good enough tools for ingesting data, figuring out: oh, those are names of cities in that column, let's canonicalize those. Some may be questions, but many of them we'll be able to nail down.
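
The built-in machinery for that step is real: Interpreter canonicalizes one string, and SemanticImport applies the same idea column-by-column during ingestion (the file name below is just a stand-in):

    (* a free-form string becomes an Entity with computable properties *)
    city = Interpreter["City"]["cambridge"];
    EntityValue[city, "Population"]

    (* column-wise canonicalization while ingesting; hypothetical file *)
    data = SemanticImport["meteorites.csv"];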

And to get to the full level of you've got some complicated domain and it's fully computable is probably a year of work. And you might say, well, gosh, why are you wasting your time? You've got to be able to automate that. So you can probably tell we're fairly sophisticated about machine learning kinds of things and so on.

And we have tried to automate as much as we can, and we have got a pretty efficient pipeline. But if you actually want to get it right... here's an example of what happens. There's a level of difficulty even going between Wolfram Alpha and Wolfram Language. So, for example, let's say you're looking at lakes in Wisconsin, okay?

So people are querying about lakes in Wisconsin in Wolfram Alpha: they'll name a particular lake and they want to know, you know, how big is the lake? Okay, fine. In Wolfram Language, they'll be doing a systematic computation about lakes in Wisconsin. So if there's a lake missing, you're gonna get the wrong answer.
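
In code terms, the difference is between looking up one named entity and computing over the whole class, where completeness suddenly matters. A sketch; GeoEntities and EntityValue are real functions, but the exact type and property names ("Lake", "Area") are assumptions about the curated data:

    wisconsin = Entity["AdministrativeDivision", {"Wisconsin", "UnitedStates"}];
    lakes = GeoEntities[wisconsin, "Lake"];   (* every curated lake in the state *)
    Total[EntityValue[lakes, "Area"]]         (* silently wrong if a lake is missing *)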

And so that's a kind of higher level of difficulty. - Okay. - But there's, yeah, I think you're asking some more technical questions about ontologies and I can try and answer those. - Actually, one quick question. Can you-- - Wait, wait, wait, wait, wait. No, there's a lot of other questions.

- Yeah, that's fine. - Okay. - Thank you very much, that was a great question. - We'll recycle this. - To the left here, please. - I've got a simple question. Who or what are your key influences? - Oh gosh. In terms of language design for Wolfram Language, for example-- - So in the context of machine intelligence, if you like, if you want to make it tailored to this audience.

- I don't know, I've been absorbing stuff forever. In terms of language design, probably Lisp and APL were my early influences. But in terms of thinking about AI, hmm. I mean, I'm kind of quite knowledgeable; I like history of science.

I'm pretty knowledgeable about the early history of mathematical logic, symbolic kinds of things. But maybe I can answer that in the negative. For example, in building Wolfram Alpha, I thought: gosh, let me do my homework, let me learn all about computational linguistics, let me hire some computational linguistics PhDs.

That will be a good way to get this started. Turns out, we used almost nothing from the previous sort of history of computational linguistics, partly because what we were trying to do, namely short question natural language understanding, is different from a lot of the natural language processing, which has been done in the past.

I also have made, to my disappointment, very little use of, you know, people like Marvin Minsky, for example. I knew Marvin for years, and in fact some of his early work on simple Turing machines and things was probably more influential to me than his work on AI.

And, you know, probably my mistake, not understanding that better, but really, I would say that I've been rather uninfluenced by the traditional AI kinds of things. It probably hasn't helped that I've lived through a time when, you know, when I was a kid, AI was gonna solve everything in the world, and then it kind of decayed for a while and then sort of came back.

So I would say that I can describe my non-influences better than my influences. - The impression you give is that you made it up out of your own head, and it sounds as though that's pretty much right. - Yeah, I mean, yes. Insofar as there are things to point to... okay, so for example, studying simple programs and trying to understand the universe of simple programs, the personal history of that is sort of interesting.

I used to do particle physics when I was a kid, basically, and then I got interested... okay, so I'll tell you the history of that, just as an example of a sort of history-of-ideas type thing. So I was interested in how order arises in the universe.

You know, you start off from the hot Big Bang and then pretty soon you end up with a bunch of humans and galaxies and things like this. How does this happen? So I got interested in that question. I was also interested in things like neural networks for sort of AI purposes. And I thought, let me make a minimal model that encompasses how complex things arise from other stuff. I ended up making simpler and simpler and simpler models and eventually wound up with cellular automata, which I didn't know were called cellular automata when I started looking at them, and then found they did interesting things. And, as it turned out, the two areas where cellular automata have been singularly unuseful in analyzing things are large-scale structure in the universe and neural networks.

But the fact that I even imagined that one could just start there... I should say, I'd been doing physics, and in physics the intellectual concept is you take the world as it is and you try and drill down and find out what makes the world, out of primitives and so on.

It's kind of a reductionist thing. Then I built my first computer language, called SMP, which went the other way around. I was just like: I'm just gonna make up this computer language, just make up what I want the primitives to be, and then I'm gonna build stuff up from it.

I think that the fact that I kind of had the idea of doing things like making up cellular automata as possible models for the world was a consequence of the fact that I worked on this computer language, which was a thing which worked the opposite way around from the way that one is used to doing natural science, which is sort of this reductionist approach.

And that's, I mean, so that's just an example of, you know, I found, I happen to have spent a bunch of time studying, as I say, history of science. And one of my hobbies is sort of history of ideas. I even wrote this little book called Idea Makers, which is about biographies of a bunch of people who for one reason or another I've written about.

And so I'm always curious about this thing about how do people actually wind up figuring out the things they figure out. And, you know, one of the conclusions of my, you know, investigations of many people is there are very rarely moments of inspiration. Usually it's long, multi-decade kinds of things, which only later get compressed into something short.

And also, what can I say, the steps are quite small, but the path is often kind of complicated. And that's what it's been for me. So I-- - Simple question, complex answer. - Sorry about that. (laughing) - Go ahead, please.

- Hello. So what I basically see from the Wolfram language is it's a way to describe all of objective reality. It's kind of formalizing just about the entire domain of discourse, to use a philosophical term. And you kind of hinted at this in your lecture where it sort of leaves off, is that when we start to talk about more esoteric philosophical concepts, purpose, I guess this would lead into things like epistemology, because essentially you only have science there.

And as amazing as science is, there are other things that are talked about, you know, like idealism versus materialism, et cetera. Do you have an idea of how Wolfram might or might not be able to branch into those discourses? Because I'm hearing echoes in my head of that time

Bostrom said that when you give an AI a purpose... I think he said philosophers are divided completely evenly between the top four ways to measure how good something should be. It's like utilitarianism and-- - Sure. - (inaudible)

- Yeah, right. So the first thing is... okay, about 300 years ago, people like Leibniz were interested in the same problem I'm interested in, which is: how do you formalize sort of everyday discourse? And Leibniz had the original idea. He was originally trained as a lawyer, and he had this idea that if he could only reduce all law, all legal questions, to matters of logic, he could have a machine that would basically answer every legal case, right?

He was unfortunately a few hundred years too early. He did try to do all kinds of things very similar to things I've tried to do; like, he tried to get various dukes to assemble big libraries of data and stuff like this. But the point is, what he tried to do was to make a formalized representation of everyday discourse, and for whatever reason, for the last 300 years, basically people haven't tried to do that.

It's an almost completely barren landscape. There was this period of time in the 1600s when people talked about philosophical languages. Leibniz was one; a guy called John Wilkins was another. They tried to break down human thought into something symbolic. People haven't done that for a long time.

In terms of what we can do that with, you know, I've been trying to figure out what the best way to do it is. I think it's actually not as hard as one might think. One thing you have to understand: these areas like philosophy and so on are on the harder end.

A good, typical example: you know, "I want to have a piece of chocolate," okay? In Wolfram Language right now, we have a pretty good description of pieces of chocolate. We probably know 100 different kinds of chocolate.

We know how big the pieces are, all that kind of thing. The "I want" part of that sentence, we can't do right now, but I don't think that's that hard. Now, I think the different thing you're asking is: let's say we had the omnipotent AI, so to speak, where we turn over the control of the central bank to the AI, we turn over all these other things to the AI.

Then we say to the AI: now do the right thing. And the problem with that, and this is why I talk about creating AI constitutions and so on, is that we have absolutely no idea what "do the right thing" is supposed to mean. Philosophers have been arguing about that forever; utilitarianism is an example of one of the answers to that, although it's not a complete answer by any means. It's not really an answer; it's just a way of posing the question.

So I think it's a really hard problem. You think to yourself, what should the AI constitution actually say? The first thing you might think is: oh, there's going to be something like Asimov's laws of robotics.

There's going to be one golden rule for AIs, and if we just follow that golden rule, all will be well. Okay, I think that that is absolutely impossible. And in fact, I think you can even sort of mathematically prove that it's impossible. Essentially what you're trying to do is put in constraints, and as soon as you have a system that shows computational irreducibility, I think it is inevitable that you have unintended consequences, which means that you never get to just put everything in one very nice box.

You always have to say: let's put in a patch here, let's put in a patch there, and so on. A much more abstract version of this is Gödel's theorem. Gödel's theorem is about the integers.

It says: start off with Peano's axioms. Peano's axioms, you might say, and Peano thought, describe the integers and nothing but the integers. So anything that's provable from Peano's axioms will be true about the integers, and vice versa, okay? What Gödel's theorem shows is that that will never work: there is an infinite hierarchy of patches that you have to put on to Peano's axioms if you want to describe the integers and nothing but the integers.
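
The standard statement he's paraphrasing, for reference: for any consistent, recursively axiomatizable theory T that interprets Peano arithmetic, there is an arithmetic sentence G_T such that

    T ⊬ G_T   and   T ⊬ ¬G_T,

and adding G_T as a new axiom gives a stronger theory T + G_T with its own undecidable sentence, so the patches never terminate in a complete axiomatization.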

And I think the same is true if you want to have, effectively, a legal system that has no bizarre unintended consequences. When you're describing something in the world that's complicated like that, I don't think it's possible to have just a small set of rules that will always do what we want, so to speak.

I think it's inevitable that you have to have a long, essentially, code of laws. So my guess is that, as we try and describe what we want the AIs to do... you know, I don't know the sociopolitical aspects of how we'll figure out whether it's one AI constitution or one per city or whatever.

We can talk about that; that's a separate issue. But I think what will happen is it'll be much like human laws: it'll be a complicated thing that gets progressively patched. And these ideas like, oh, we'll just make the AIs run the world according to John Stuart Mill's idea... it's not gonna work.

Which is not surprising, 'cause philosophy has been making the point that it's not an easy problem for the last 2,000 years, and they're right. It's not an easy problem. - Thank you. - Yeah. - Hi, you talk about computational irreducibility and computational equivalence, and also, earlier on in your intellectual adventures, you were interested in particle physics and things like that.

I've heard you make the comment before, in other contexts, that things like molecules compute, and I was curious to ask you exactly what you mean by that: in what sense does a molecule-- - I mean, what would you like it to compute, so to speak? In other words, one definition of computing is: given a particular computation, like, I don't know, finding square roots or something, can you program the thing to do it? And the surprising thing is that an awful lot of stuff can be programmed to do any computation you want.

And when you look at nanotechnology and so on, one of the current beliefs is that to make very small computers, you should take what we know about making big computers and just make it smaller, so to speak.

I don't think that's the approach you have to use. I think you can take the components that exist at the level of molecules and say: how do we assemble those components to be able to do complicated computations? It's like the cellular automata: the underlying rule for the cellular automaton is very simple, yet when that rule is applied many times, it can do a sophisticated computation.
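
Rule 110 is the standard example: a two-color, nearest-neighbor rule that has been proved computation-universal (Cook, 2004). One line of Wolfram Language shows the kind of behavior he means:

    (* evolve elementary rule 110 from random initial conditions for 200 steps *)
    ArrayPlot[CellularAutomaton[110, RandomInteger[1, 250], 200]]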

So I think that's the sense in which, what can I say, there's a great diversity in the raw material that you can use for computation. Our particular human stack of technologies that we use for computation right now is just one particular path. A very practical example of this is algorithmic drugs.

Right now, most drugs work like this: there is a binding site on a molecule, drug fits into binding site, does something. The question is, can you imagine having something where the molecule has computations going on in it? Where it goes around and looks at the thing it's supposed to be binding to, and it figures out: oh, there's this knob here and that knob there. It reconfigures itself; it's computing something; it's trying to figure out, is this likely to be a tumor cell or whatever, based on some more complicated thing.

That's the type of thing I mean by computations happening at molecular scale. - Okay, I guess I meant to ask whether it follows from that, in your view, that the molecules in the chalkboard and in my face and in the table are, in any sense, currently doing computing.

- Sure. I mean, on the question of what computation... look, one of the things to realize, if you look at the sort of past and future of things... okay, here's an observation about Leibniz, actually. In Leibniz's time, Leibniz made a calculator-type computer out of brass; took him 30 years, okay?

So in his day, there was, you know, at most one computer in the world, as far as he was concerned, right? In today's world, there may be 10 billion computers, maybe 20 billion, I don't know. The question is, what's that gonna look like in the future? And I think the answer is that, in time, probably everything we have will be made of computers, in the following sense: in today's world, things are made of metal, plastic, whatever else, but actually there won't be any point in doing that.

Once we know how to do molecular-scale manufacturing and so on, we might as well just make everything out of programmable stuff. And, you know, the one example of molecular computing we have right now is us, in biology.

Biology does a reasonable job of specific kinds of molecular computing. It's kind of embarrassing, I think, that the only molecule we know that's sort of a memory molecule is DNA; that's the particular biological solution.

In time, we'll know lots of others. And I think the end point is... so if you're asking, is computation going on in this water bottle, the answer is absolutely. Probably many aspects of that computation are even pretty sophisticated.

If we wanted to know what would happen to particular molecules here, it's gonna be hard to tell; there's going to be computational irreducibility and so on. Can we make use of that for our human purposes? Can we piggyback on that to achieve something technological? That's a different issue. For that, we have to build up this whole chain of technology to be able to connect it, which is what I keep on talking about: how do we connect what is possible computationally in the universe to what we humans can conceptualize that we want to do in computation?

That's the bridge that we have to make, and that's the hard part. But the intrinsic getting of the computation done... you know, there's computation going on all over the place. - Maybe a couple more questions. - I was hoping you could elaborate on what you were talking about earlier, searching the entire space of possible programs.

So that's very broad. So maybe: what kinds of searching of that space are we good at, what are we not, and I guess what are the differences? - Yeah, right. So I would say that we're at an early stage in knowing how to do that, okay?

So I've done lots of these things, and the thing that I've noticed is: if you do an exhaustive search, then you don't miss even things that you weren't looking for. If you do a non-exhaustive search, there is a tremendous tendency to miss things that you weren't looking for.

And so, you know, a bunch of function evaluation in Wolfram Language was done by searching for optimal approximations in some big space. A bunch of stuff with hashing is done that way. A bunch of image processing is done that way. We're just doing exhaustive searches over maybe trillions of programs to find things.
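
A toy version of that kind of search, scanning all 256 elementary cellular automaton rules; the "interestingness" test here is a crude heuristic of my own (look for a center column with no short repeating period), not the criterion actually used in those production searches:

    (* crude filter: a periodic center column yields few distinct windows,
       a complex one (rule 30, say) yields many *)
    interestingQ[rule_] := Module[{ev, col},
      ev = CellularAutomaton[rule, {{1}, 0}, 300];
      col = ev[[All, 301]];                      (* center column; width is 601 *)
      Length[Union[Partition[col, 16, 1]]] > 50]

    Select[Range[0, 255], interestingQ]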

Now, on the other side of that story is the incremental-improvements story with deep learning and neural networks and so on, where, because there is differentiability, you're able to sort of incrementally get to a better solution. In fact, people are using less and less differentiability in deep learning neural nets.

And so I think eventually there's going to be sort of a grand unification of these kinds of approaches. Right now, on the exhaustive-search side of things, which you can use for all sorts of purposes, the surprising thing that makes exhaustive search not crazy is that there is rich, sophisticated stuff near at hand in the computational universe.

If you had to go through a quadrillion cases before you ever found anything, exhaustive search would be hopeless. But in many cases you don't. And I would say that we are in a fairly primitive stage of the science of how to do those searches.

My guess is that there'll be some sort of unification, which, needless to say, I've thought a bunch about, between the neural net side and the search side. You know, the trade-off typically in neural nets is: you can have a neural net that uses its computational resources well but is really hard to train, or you can have a neural net that doesn't use its computational resources so well but is very easy to train, because it's very, you know, smooth.

And my guess is that somewhere in the regime of harder to train, but making use of things that are closer to the complete computational universe, is where one's going to see progress. But it's a really interesting area, and I consider us only at the beginning of figuring that out.

- One last question. - Hi. - Hello, keep going? - Yeah, okay. - All right, let's do it. - Thank you for your talk. Just to give a bit of context for my question: I research how we could teach AI to kids, developing platforms for that, how we could teach artificial intelligence and machine learning to children, and I know you develop resources for that as well.

So I was wondering: where do you think it's problematic that we have computation that is very efficient and, from a utilitarian and problem-solving perspective, achieves all the goals, but we don't understand how it works, so we have to create these fake steps? Could you think of scenarios where that could become very problematic over time, and why do we approach it in such a deterministic way?

And when you mentioned that computation and intelligence are differentiated by this very thin line, how does that affect the way you learn, and how do you think that will affect the way kids learn, the way we all learn? - Right. So, look, my general principle about future generations and what they should learn: the first point, a very obvious point, is that for every field that people study, from archeology to zoology, there either is now a computational X, or there will be soon.

So, you know, every field, the paradigm of computation is becoming important, perhaps the dominant paradigm in that field. Okay, so how do you teach kids to be useful in a world where everything is computational? I think the number one thing is to teach them how to think in computational terms.

What does that mean? It doesn't necessarily mean writing code. I mean, one of the things that's happening right now, as a practical matter, is there've been these waves of enthusiasm for teaching coding of various kinds. Actually, we're at the end of an uptick wave, I think.

It's going down again. It's been up and down for 40 years or so. Okay, why doesn't that work? Well, it doesn't work because, while there are people, like students at MIT, for example, who really want to learn engineering-style coding, and it really makes sense for them to learn that, for the vast majority of people it's just not going to be relevant, because they're not going to write a low-level C program or something.

And it's the same thing that's happened in math education, which has been sort of a disaster, in that the number one takeaway for most people from the math they learn in school is: I don't like math. That's not all of them, obviously, but it's what you find if you ask on a general scale. And why is that?

Well, part of the reason is that what's been taught is rather low-level and mechanical. It's not about mathematical thinking, particularly. It's mostly about what teachers can teach and what assessment processes can assess and so on. Okay, so how should one teach computational thinking? I'm kind of excited about what we can do with Wolfram Language, because I think we have a high-enough-level language that people can actually write in. For example, I reckon by age 11 or 12, and I've done many experiments on this... the only problem with my experiments is that most of them end up being with kids who are high-achieving kids.

Despite many efforts to reach lower-achieving kids, it always ends up that the kids who actually do the things I set up are the high-achieving kids. But, setting that aside, you take typical 11-, 12-, 13-year-olds and so on, and they can learn how to write stuff in this language, and what's interesting is they learn to start thinking. Here, I'll show you; let's be very practical.

Every Sunday I do a little thing with some middle school kids, and I might even be able to find my stuff from yesterday. This is, okay, let's see. Programming Adventures, January 28th. Okay, let's see what I did. Oh, look at that. That was why I thought of the South America thing here, because I'd just done that with these kids.

So what were we doing? We were trying to figure out the shortest-tour thing that I just showed you; this is where I got what to show you, it's what I was doing with these kids. This was my version of it, but the kids all had various different versions, and somebody suggested: let's just enumerate, let's look at all possible permutations of these cities and figure out what their distances are.

There's the histogram of those; that's what we get. Okay, how do you get the largest distance, et cetera, et cetera. This was my version of it, but the kids had similar stuff. And it probably went off into... oh yeah, there we go, there's the one for the whole Earth. And then they wanted to know, how do you do that in 3D?
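
For reference, the enumeration part of that exercise fits in a few lines of Wolfram Language; random points stand in for the kids' city coordinates, and FindShortestTour is the built-in solver:

    pts = RandomReal[{0, 100}, {8, 2}];          (* stand-ins for city coordinates *)
    tourLength[p_] := Total[EuclideanDistance @@@ Partition[p, 2, 1, 1]]
    Histogram[tourLength /@ Permutations[pts]]   (* distribution over all 8! tours *)
    FindShortestTour[pts]                        (* {shortest length, visiting order} *)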

So I was showing them how to convert to XYZ coordinates in 3D and make the corresponding thing in 3D. Now, this is a random example from yesterday, so it's not a highly considered example. But what I think is interesting is that we seem to have finally reached the point where we've automated enough of the actual doing of the computation that the kids can be exposed mostly to the thinking about what you might want to compute.

And part of our role in language design, as far as I'm concerned, is to get it as much as possible to the point where, for example, you can do a bunch of natural language input, you can do things which make it as easy as possible for kids not to get mixed up in how the computation gets done, but rather to just think about how you formulate the computation.

So, a typical example I've used a bunch of times of what it means to write code versus do other things. A typical test example would be a practical problem we had in Wolfram Alpha: you're given a lat-long position on the Earth, and you're going to make a map of that lat-long position.

What scale of map should you make? Right, so if the lat-long is in the middle of the Pacific, making a 10-mile radius map isn't very interesting. If it's in the middle of Manhattan, a 10-mile radius map might be quite a sensible thing to do. So the question is, come up with an algorithm, come up with even a way of thinking about that question.

What do you do? How should you figure that out? Well, you might say: let's look at the visual complexity of the image. Let's look at how far it is to another city. You know, there are various different things. But thinking about that as a kind of computational-thinking exercise, that's the kind of thing I mean.
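
One hedged way to code that up: GeoNearest and GeoDistance are real functions, but the specific rule, scaling the radius by the distance to the nearest city and capping it, is just one arbitrary answer to his exercise:

    mapRadius[pos_GeoPosition] := Module[{nearestCity},
      nearestCity = First[GeoNearest["City", pos]];
      Min[Quantity[100, "Miles"], 2 GeoDistance[pos, nearestCity]]]

    (* usage: GeoGraphics[GeoMarker[p], GeoRange -> mapRadius[p]] *)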

So, in terms of what one automates and whether people need to understand how it works inside: the main point is that, in the end, it will not be possible to know how it works inside, so you might as well stop having that be a criterion. I mean, there are plenty of things one teaches people in lots of areas of biology, medicine, whatever else, where maybe we'll know how it works inside one day, but there's still an awful lot of useful stuff you can teach without knowing how it works inside.

And I think also, as we get computation to be more efficient, inevitably we will be dealing with things where we don't know how they work inside. Now, you know, we've seen this in math education, 'cause I've happened to make tools that automate a bunch of things that people do in math education.

To tell a silly story: my older daughter, who at some point in the past was doing calculus, you know, learning to do integrals and things, I was saying to her, you know, I didn't think humans still did that stuff anymore. Which was a very unendearing comment.

But in any case, there's a question of whether humans need to know how to do that stuff or not. I haven't done an integral by hand in probably 35 years. Is that true? More or less true. But back when I used to do physics and was using computers to do this stuff, I was a really, really good integrator, except that it wasn't really me; it was me plus the computer.

So how did that come to be? Well, the answer was that because I was doing things by computer, I was able to try zillions of examples, and I got a much better intuition than most people got for how these things would work, roughly, and what you did to make the thing go and so on.
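
That kind of experiment is a one-liner now; sweep a family of integrals and watch how the form of the answer changes:

    (* how the closed form changes as the family parameter moves *)
    Table[{n, Integrate[1/(1 + x^n), x]}, {n, 1, 4}]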

Whereas if you're just working this one thing out by hand, you don't get that intuition. So, two points. First of all, how do you think about things computationally? How do you formulate the question computationally?

That's really important, and something that we are now in a position, I think, to actually teach. And it is not really something you teach by teaching traditional, quote, coding, because a lot of that is: okay, we're gonna make a loop, we're gonna define variables. I think I probably have a copy here, yeah.

I wrote this book, kind of for kids, about the Wolfram Language, except it seems to be useful to adults as well, but I wrote it for kids. One of the amusing things in this book is it doesn't talk about assigning values to variables until chapter 38.

In other words, a thing that you would find in chapter one of most low-level programming, coding-type things turns out not to be that relevant to know how to do. It's also kind of confusing and not necessary. Now, you asked where we will get in trouble when people don't know how the stuff works inside.

I think one just has to get used to that. You might say, well, we live in the world and it's full of natural processes where we don't know how they work inside, but somehow we manage to survive, and we go to a lot of effort to do natural science to try and figure out how stuff works inside.

But it turns out we can still use plenty of things even when we don't know how they work inside. We don't need to know. And I think the main point is that computational irreducibility guarantees that we will be using things where we don't know, and can't know, how they work inside.

And, you know, the thing that is a little bit unfortunate to me, as a typical human type thing, is that I can readily see that the AI stuff we build is effectively creating languages and things that are completely outside our domain to understand.

By that I mean: our human language, with its 50,000 words or whatever, has been developed over the last however many tens of thousands of years, and as a society, we've developed this way of communicating and explaining things. The AIs are reproducing that process very quickly, but they're coming up with an ahistorical way of describing the world; it doesn't happen to relate at all to our historical way of doing it.

And it's a little bit disquieting to me as a human that there are things going on inside where, in principle, I could learn that language, but it's not the historical one that we've all learned. And it really wouldn't make a lot of sense to do that, 'cause you learn it for one AI and then another one gets trained and it's gonna use something different.

So, I guess my main point for education... another point about education I'll just make, which is something I haven't figured out: when do we get to make a good model of the human learner using machine learning? In other words, part of what we're trying to do, like with that automated proof I showed, I would really like to figure out the best way to present that proof so a human can understand it.

And basically, for that, we have to have a bunch of heuristics about how humans understand things. As an example: for a lot of the visualization stuff in Wolfram Language, we have tried to do automated aesthetics. So when we're laying out a graph, which way of laying out that graph is most likely to let humans understand it?

And we've done that by building a bunch of heuristics and so on. That's an example of what, if we could do it for learning, would let us say: given that the person is trying to understand this proof, for example, what's the optimal path to lead them through understanding that proof?
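
The graph-layout case is easy to see directly: the same graph under different built-in layout heuristics, where choosing among them is exactly the automated-aesthetics judgment he describes:

    g = RandomGraph[{30, 60}];
    Row[Table[Graph[g, GraphLayout -> layout, ImageSize -> 200],
      {layout, {"SpringElectricalEmbedding", "CircularEmbedding", "SpectralEmbedding"}}]]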

I suspect we will learn a lot more about that in probably a fairly small number of years. And you can do simple things already: you go to Wikipedia and you look at the path of, if you wanna learn this concept, what other concepts do you have to learn first?

We have much more detailed symbolic information about what is actually necessary to know in order to understand this and so on. So it is, I think, reasonably likely that we will be able to do this. I mean, I was interested recently in the history of math education.

So I wanted to look at the complete path of math textbooks for the past, well, basically since about 1200; you know, Fibonacci produced one of the early math textbooks. So there've been these different ways of teaching math. And I think we've gradually evolved a fairly optimized way, for the typical person, though the variation across the population is probably not well understood, of how to explain certain concepts.

And we've gone through some pretty weird ways of doing it, from the 1600s and so on, which have gone out of style, and who knows exactly why. But anyway, we've kind of learned this path of what's the optimal way to explain adding fractions or something, for the typical human.

But I think, by essentially making a model of the human, a machine model of the human, we'll learn a lot more about how to optimize how we explain stuff to humans. A coming attraction. But-- - Thanks. - By the way, do you think we're close to that at all?

'Cause you said that there's something in Wolfram Alpha that presents things to the human in a nice way. How far off are we? You said coming attraction; 10 years? - Yeah, right. So that explaining-stuff-to-humans thing is a lot of human work right now. Being able to automate explaining stuff to humans...

Okay, so some of these things, let's see. Actually, just today I was working on something related to this. The question is: can we, for example, train a machine learning system from explanations that it can see, and can we train it to give explanations that are likely to be understandable?

Maybe. Okay, so an example that I'd like to do: a debugging assistant, where the typical thing is, program runs, program gives wrong answer, human asks why it gave the wrong answer. Well, the first piece of information to the computer is that that was the wrong answer, because the computer just did what it was told; it didn't know that was supposed to be the wrong answer.

So then the question is, can you, in that domain, actually have a reasonable conversation in which the human is explaining to the computer what they thought it was supposed to do, and the computer is explaining what happened and why it happened and so on? Same kind of thing for math tutoring.

You know, we're very widely used by people who want to understand the steps in math. Can we make a thing where people tell us, "I think it's this"? Okay, I'll tell you one little factoid that did work out.

So, multi-digit arithmetic, multi-digit addition, okay? The basis of this is kind of a silly thing, but, you know, if a student gets the right answer on an addition sum, you don't get very much information. If the student gives the wrong answer, the question is, can you tell them where they went wrong?

So let's say you have a four-digit addition sum and the student gives the wrong answer. Can you backtrace and figure out what they likely did wrong? And the answer is you can. You just make this graph of all the different things that can happen; certain things are more common, like transposing digits, or having a one and a seven mixed up, those kinds of things.

With very high probability, given a four-digit addition sum with the wrong answer, you can say: this is the mistake you made. Which is sort of interesting. And that's being done in a fairly symbolic way; whether one can do that in a more machine-learning kind of way, for more complicated derivations, I'm not sure.
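
A toy version of that backtrace, using just the two error models he names (transposed adjacent digits, and reading 1 as 7 or 7 as 1); the real system presumably weights many more error types by frequency:

    (* candidate wrong answers generated by each error model *)
    transposals[n_] := Module[{d = IntegerDigits[n]},
      Table[FromDigits[Permute[d, Cycles[{{i, i + 1}}]]], {i, Length[d] - 1}]]
    misread17[n_] := FromDigits[IntegerDigits[n] /. {1 -> 7, 7 -> 1}]

    diagnose[a_, b_, wrong_] := Module[{cands},
      cands = Join[
        ("transposed digits in " <> ToString[a] -> # + b) & /@ transposals[a],
        ("transposed digits in " <> ToString[b] -> a + #) & /@ transposals[b],
        {"misread 1/7 in " <> ToString[a] -> misread17[a] + b,
         "misread 1/7 in " <> ToString[b] -> a + misread17[b]}];
      Select[cands, Last[#] == wrong &][[All, 1]]]

    diagnose[1234, 5678, 2134 + 5678]   (* -> {"transposed digits in 1234"} *)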

But that's one that works. - Hi, I just had a follow-up question. Do you think, in the future, it's possible to simulate virtual environments which can actually understand how the human mind works, and then build, you know, finite state machines inside the virtual environment to provide a better, more personalized learning experience?

- Well, so the question is: if you're playing a video game or something and that video game is supposed to be educational, can you optimize the experience based on a model of you, so to speak? Yeah, I'm sure the answer is yes.

And the question of how complicated the model of you will be is an interesting question I don't know the answer to. I mean, I've kind of wondered a similar question. I'm a kind of personal-analytics enthusiast, so I collect tons of data about myself.

And I do it mostly 'cause it's been super easy to do, and I've done it for like 30 years. I have, you know, every keystroke I've typed on a computer, like every keystroke I've typed here. And the screen of my computer: every 30 seconds or so, maybe 15 seconds, I'm not sure, there's a screenshot.

It's a super boring movie to watch. But anyway, I've been collecting all this stuff. And so the question that I've asked is: is there enough data that a bot of me could be made? In other words, I've written a million emails, and I have all of those; I've received 3 million emails over that period of time.

I've got endless things I've typed, et cetera, et cetera. Is there enough data to reconstruct me, basically? I think the answer is probably yes. Not sure, but I think the answer is probably yes. And so the question is: in an environment where you're interacting with some video game, trying to learn something, whatever else, how long is it going to be before it can learn enough about you to change that environment in a way that's useful for explaining the next thing to you?

I would guess that this is comparatively easy. I might be wrong. But it's an interesting thing, because one's dealing with a space of human personalities, a space of human learning styles.

You know, I'm sort of always interested in the space of all possible XYZ. And there's that question of how do you parameterize the space of all possible human learning styles. Is there a way we can do that symbolically and say these are the 10 learning styles, or is it a case, as I suspect, where it's better to use sort of soft machine-learning-type methods to kind of feel out that space?

- Thank you. - Yeah, maybe, very last question. - I was just intuitively thinking, when you spoke about an ocean, of Isaac Newton and his famous seashell quote. And I thought, instead of Newton on the beach, what if Franz Liszt were there?

What question would he ask? What would he say? I'm trying to understand your alien ocean and humans through, maybe, Franz Liszt and music. - Well, so, the quote from Newton is sort of an interesting quote. I think it goes something like this: people are talking about how wonderful calculus and all that kind of thing are, and Newton says, "To others, I may seem like I've done a lot of stuff, but to me, I seem like a child who picked up a particularly elegant seashell on the beach.

And I've been studying this seashell for a while, even though there's this ocean of truth out there waiting to be discovered." That's roughly the quote, okay? I find that quote interesting for the following reason. What Newton did, calculus and things like it, is a small corner of the computational universe of all possible programs.

Newton was exactly right in what he said. That is, he picked off calculus, which is a corner of the possible things that can happen in the computational universe that happened to be an elegant seashell, so to speak. It happened to be a case where you can figure out what's going on and so on, while there is still this ocean of other computational possibilities out there.

But when it comes to music, which you were asking about... oh, I think my computer stopped being able to get anywhere. But, sort of interesting, let's see if we can get to the site. Yeah, so this is a website that we made years ago, and now my computer isn't playing anything, but...

(upbeat music) Let's try that. Okay, so these things are created by basically just searching the computational universe of possible programs. It's sort of interesting, because every one has kind of a story. Some of them look more interesting than others. Let's try that one. Anyway, what was interesting to me about this was, what this is doing is very trivial at some level.

It actually happens to use cellular automata. You can even have it show you, I think, someplace here... Where is it? Somewhere there's a way of showing the evolution. This is showing, behind the scenes, what was actually happening, what it chose to use to generate that musical piece.
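
The idea in miniature, a cellular automaton evolution read as a sequence of chords; the rule choice and the mapping from cell positions to pitches are arbitrary choices of mine, not what WolframTones actually uses:

    (* each row of the evolution becomes a chord: live cells pick the pitches *)
    ca = CellularAutomaton[30, {{1}, 0}, {16, {-8, 8}}];
    Sound[SoundNote[Flatten[Position[#, 1]] - 9, 0.25] & /@ ca]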

And what I thought was interesting about this site... I had thought, well, how would computers be relevant to music, et cetera? What would happen is, a human would have an idea, and then the computer would kind of dress up that idea.

And then a bunch of years go by, and I talk to people who are composers and things, and they say, "Oh yeah, I really like that WolframTones site." Okay, that's nice. They say, "It's a very good place for me to get ideas." So that's sort of the opposite of what I would have expected. Namely, what's happening is, a human comes here, listens to some 10-second fragment, and says, "Oh, that's an interesting idea," and then they kind of embellish it using something that is humanly meaningful.

It's like taking a photograph: you see some interesting configuration, and then you're filling that in with some human sort of context. But I'm not quite sure what you were asking about. I mean, back to the Newton quote: another way to think about that quote is that us humans, with our sort of historical intellectual development, have explored this very small corner of what's possible in the computational universe.

And everything that we care about is contained in that small corner. And that means that what we end up wanting to talk about are the things that we as a society have decided we care about. There's an interesting feedback loop I'll just mention, and then I should end. So here's a funny thing.

So let's take language, for example. Language evolves; we make up language to describe what we see in the world, okay? Fine, that's a fine idea. Imagine, in Paleolithic times, people making up language. They probably didn't have a word for table, because they didn't have any tables.

They probably had a word for rock. But then, as a result of the particular development our civilization has gone through, we end up building tables. And there was sort of a synergy between coming up with a word for table and deciding tables were a thing and we should build a bunch of them.

And so there's this sort of complicated interplay between the things that we learn how to describe and how to think about, the things that we build and put in our environment, and then the things that we end up wanting to talk about because they're things that we have experience of in our environment.

And I think as we look at the progress of civilization, there are various layers: first we invent a thing that we can then think about and talk about; then we build an environment based on that; then that allows us to do more stuff, and we build on top of that.

And that's why, for example, when we talk about computational thinking and teaching it to kids and so on, that's one reason it's kind of important: we're building a layer of things that people are then familiar with that's different from what we've had so far. And it gives people a way to talk about things.

I'll give you one example. Let's see, did I have that still up? Yeah, okay, one example here. (keyboard clicking) From this blog post of mine, actually. So, where is it? Okay, so that thing there is a nested pattern. You know, it's a Sierpinski pattern. That tile pattern was created in 1210 AD, okay?

And it's the first example I know of a fractal pattern. Okay, well, the art historians wrote about these patterns; there are a bunch in this particular style. They wrote about these for years, and they never discussed that nested pattern. These patterns also have, you know, pictures of lions and elephants and things like that in them.

They wrote about those kinds of things, but they never mentioned the nested pattern until basically about 25 years ago when fractals and so on became a thing. And then it's, ah, I can now talk about that. It's a nested pattern, it's a fractal. And then, you know, before that time, the art historians were blind to that particular part of this pattern.

It was just like: I don't know what that is; I don't have a word to describe it; I'm not gonna talk about it. So that's part of this feedback loop of things that we learn to describe; then we build in terms of those things; then we build another layer.
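
For reference, the nested pattern in that 1210 tilework is the same one that rule 90 generates from a single cell:

    ArrayPlot[CellularAutomaton[90, {{1}, 0}, 64]]   (* the Sierpinski pattern *)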

One thing I'm really interested in is the evolution of purposes. If you look back in human history, what was thought to be worth doing a thousand years ago is different from what's thought to be worth doing today.

Good examples are things like walking on a treadmill or buying goods in virtual worlds. Both of these are hard to explain to somebody from a thousand years ago, because each one ends up being a whole sort of societal story about: we're doing this because we do that, because we do that.

And so the question is, how will these purposes evolve in the future? One thing that I view as a sort of sobering thought, actually one thing I found rather disappointing, though I later became less pessimistic about it, is this: think about the future of the human condition, where we've been successful in making our AI systems, and we can read out brains, and we can upload consciousnesses and things like that.

And we've eventually got this box with trillions of souls in it. And the question is, what are these souls doing? And to us today, it looks like they're playing video games for the rest of eternity, right? And that seems like a kind of a bad outcome. It's like, we've gone through all of this long history and what do we end up with?

We end up with a trillion souls in a box playing video games, okay? And I thought, this is a very depressing outcome, so to speak. And then I realized that, actually, if you look at the arc of human history, my main conclusion is that at any time in history, the things people do seem meaningful and purposeful to them at that time, and history moves on.

And, you know, like a thousand years ago, there were a lot of purposes that people had that, you know, were to do with weird superstitions and things like that that we say, why the heck were you doing that? That just doesn't make any sense, right? But to them at that time, it made all the sense in the world.

And I think the thing that makes me less depressed about the future, so to speak, is that at any given time in history, you can still have meaningful purposes, even though they may not look meaningful from a different point in history. And there's sort of a whole theory you can build up based on the trajectories that you follow through the space of purposes. It's sort of interesting if you can jump, like you say: let's get cryonically frozen for 300 years and then be back again.

The interesting question then is whether the purposes you find yourself among have any continuity with what we know today. I should stop with that. - That's a beautiful way to end it. Please give Stephen a big hand. (audience applauding)