What is Wolfram Language? (Stephen Wolfram) | AI Podcast Clips
Chapters
0:00 Intro
1:00 Symbolic language
3:45 Random sample
7:50 History of AI
11:30 The Dream of Machine Learning
13:45 The State of Wolfram Language
19:02 Wolfram Knowledge Base
26:16 Optimism
27:20 The Internet
30:57 Encoding ideologies
33:08 Different value systems
sort of, I mean I can answer the question for you,
in terms of stuff you can play with, what is it?
What are the different ways to interact with it?
one is Mathematica, the other is Wolfram Alpha.
is you're typing little pieces of computational language,
- It's very kind of, there's like a symbolic,
is just stuff that computers intrinsically do,
it's aimed to be an abstract language from the beginning,
and Wolfram Language will just say, oh, that's X.
but in terms of the internals of the computer.
Now, that X could perfectly well be the city of Boston.
of some spacecraft represented as a symbolic thing,
sort of computationally work with these different,
these kinds of things that exist in the world
or describe the world, that's really powerful,
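A minimal sketch of that symbolic idea in Wolfram Language (the particular entity here is just an illustration, not what was typed on the show):

  x                     (* an undefined symbol evaluates to itself, symbolically *)
  boston = Entity["City", {"Boston", "Massachusetts", "UnitedStates"}]
  boston["Population"]  (* an entity computes like any other expression *)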
I kind of wanted to have this sort of infrastructure
for computation, which was as fundamental as possible.
I mean, this is what I got for having been a physicist
and tried to find fundamental components of things
of transformation rules for symbolic expressions
and that's what we've been building from in Wolfram Language
and it's really been built in a very different direction
It really is kind of wrapped around the operations
is to have the language itself be able to cover
and that means that there are 6,000 primitive functions
So let's just say random sample of 10 of them,
Wow, okay, so these are really different things from--
between different types of Boolean expressions.
I want to pick another 10, 'cause I think this is some,
okay, so yeah, there's a lot of infrastructure stuff here
that you see if you just start sampling at random,
there's a lot of kind of infrastructural things.
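A sketch of how that random sampling looks in practice (Names["System`*"] lists the built-in symbols; this is the obvious way to do it, not necessarily the exact input used on the show):

  RandomSample[Names["System`*"], 10]   (* 10 built-in functions at random *)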
- Some of the exciting machine learning stuff
- Oh yeah, yeah, I mean, so one of those functions
let's say current image, and let's pick up an image,
we can say image identify, open square brackets,
and then we just paste that picture in there.
- Image identify function running on the picture.
I look like a plunger, because I got this great big thing
- Classify, so this image identify classifies
let's see what it does, let's pick the top 10.
- That hopefully will not give you an existential crisis,
and then 8%, or I shouldn't say percent, but--
There we go, let's try that, let's see what that did.
- We took a picture with a little bit more of your--
so this is image identify as an example of one--
- And that's part of the, that's like part of the language.
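Roughly what that demo looks like as code (a sketch; CurrentImage and ImageIdentify are the built-ins being described, and the four-argument form is one way to get a top-10 list with probabilities):

  img = CurrentImage[]                          (* grab a frame from the webcam *)
  ImageIdentify[img]                            (* the single most likely identification *)
  ImageIdentify[img, All, 10, "Probability"]    (* top 10 candidates with probabilities *)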
I could say, I don't know, let's find the geo nearest,
Let's find the 10, I wonder where it thinks here is.
Let's try finding the 10 volcanoes nearest here, okay?
- So geo nearest volcano here, 10 nearest volcanoes.
- Of the East Coast and the Midwest, and it's the,
well, no, we're okay, we're okay, it's not too bad.
But, you know, the fact that right in the language,
it knows about all the volcanoes in the world,
it knows, you know, computing what the nearest ones are,
it knows all the maps of the world, and so on.
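As code, that query is essentially one line (a sketch using the built-in GeoNearest and GeoListPlot; Here resolves to the current location):

  volcanoes = GeoNearest["Volcano", Here, 10]   (* the 10 volcanoes nearest here *)
  GeoListPlot[volcanoes]                        (* put them on a map *)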
- Yeah, right, that's why I like to talk about it
and from that extract the different hierarchies
The context of history of AI, if you could just comment on,
And there's just some sense where in the 80s and 90s,
- But then out of that emerges kind of Wolfram Language,
and you actually can't do it with a particular area.
it's critical to have broad knowledge of the world
if you want to do good natural language understanding.
And you kind of have to bite off the whole problem.
If you say we're just gonna do the blocks world over here,
so the relationship between what we've tried to do
you know, in a sense, if you look at the development
there was kind of this notion, pre-300 years ago or so now,
you want to figure something out about the world,
You can do things which just use raw human thought.
And then along came sort of modern mathematical science.
And we found ways to just sort of blast through that
Now we also know we can do that with computation and so on.
So when we look at how do we sort of encode knowledge
one way we could do it is start from scratch,
it's just a neural net figuring everything out.
But in a sense that denies the sort of knowledge
Because in our civilization, we have learned lots of stuff.
We've surveyed all the volcanoes in the world.
We've done, you know, we figured out lots of algorithms
Those are things that we can encode computationally.
you don't have to start everything from scratch.
is to try and sort of capture the knowledge of the world
which were for a long time undoable by computers
which actually were pretty easy for humans to do
I think the thing that's interesting that's emerging now
and this kind of sort of much more statistical
kind of things like image identification and so on.
by having this sort of symbolic representation
that that's where things get really interesting
and where you can kind of symbolically represent patterns
I think that's kind of a part of the path forward,
- Yeah, so the dream of, so the machine learning
is not anywhere close to building the kind of wide world
of computable knowledge that Wolfram Language has built.
you've done the incredibly hard work of building this world,
- And that's what you've added with version 12, right?
it's sort of interesting to see the sort of the,
it's running in sort of a very efficient computational way,
but then there's sort of things like the interface
How do you do natural language understanding to get there?
That's, I mean, actually a good example right now
we've done a lot of stuff, natural language understanding,
using essentially not learning-based methods,
and then converting, so the process of converting,
NLU defined beautifully as converting their query
super practical definition, a very useful definition,
go pick out all the cities in that text, for example.
And so a good example of, you know, so we do that.
We're using modern machine learning techniques.
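The "pick out all the cities" task is itself a built-in one-liner (a sketch; TextCases is the actual function, the sample sentence is made up):

  TextCases["I flew from Boston to San Francisco via Chicago.", "City"]
  (* should return something like {"Boston", "San Francisco", "Chicago"} *)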
And it's actually kind of an interesting process
that's going on right now, is this loop between
what do we pick up with NLP, we're using machine learning,
And so we've got this kind of loop going between those,
- Yeah, and I think you have some of the state-of-the-art
transformers, like you have BERT in there, I think.
- You know, that's a, it's a complicated issue,
- And then there's sort of, like what we're talking about,
and some of the absorption mechanisms of ideas
So it's interesting how the spread of ideas works.
- You know what's funny with Wolfram Language
if you look at the, I would say, very high end of R&D
"Wow, that's a really, you know, impressive, smart person,"
they're very often users of Wolfram Language,
If you look at the more sort of, it's a funny thing,
if you look at the more kind of, I would say,
people who are like, "Oh, we're just plodding away
you know, the high end, we've really been very successful
because it's kind of, you know, I have a company
which really emphasizes sort of creating products
doing the commercial side of things and pumping it out
- And there's an interesting idea that, you know,
by opening everything up, sort of the GitHub model,
but there's an interesting, I think I've heard you
discuss this, that that turns out not to work
in a lot of cases, like in this particular case,
that you want it, that when you deeply care about
- Yeah, it's not the nature of how things work.
maintaining a coherent vision over a long period of time,
and doing not only the cool vision-related work,
That's the mundane, the fascinating and the mundane,
- Yeah, I mean, that's probably not the most,
in all these different cloud environments and so on.
That's pretty, you know, that's very practical stuff.
and, you know, have there be, take only, you know,
a fraction of a millisecond to do this or that.
it's an interesting thing over the period of time,
you know, Wolfram Language has existed basically
for more than half of the total amount of time
that any language, any computer language has existed.
That is, computer languages are maybe 60 years old,
So it's kind of a, and I think I was realizing recently,
there's been more innovation in the distribution of software
of programming languages over that period of time.
because I have a simple private company and so on
that doesn't have, you know, a bunch of investors,
you know, telling us we're gonna do this or that,
And so, for example, we're able to, oh, I don't know,
we have this free Wolfram Engine for developers,
And we've been, you know, we've, there are site licenses
And, you know, we've been doing a progression of things.
I mean, different things like Wolfram Alpha, for example,
- Okay, Wolfram Alpha is a system for answering questions
where you ask a question with natural language
So the question could be something like, you know,
what's the population of Boston divided by New York
And it'll take those words and give you an answer.
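The same kind of free-form query can also be issued programmatically from the language (a sketch; WolframAlpha is the built-in that calls the Wolfram|Alpha engine, so it needs connectivity):

  WolframAlpha["population of Boston divided by population of New York"]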
belongs to Wolfram Alpha or to the Wolfram Language?
- We just call it the Wolfram Knowledge Base.
- I mean, that's been a big effort over the decades
And, you know, more of it flows in every second.
Like, that's one of the most incredible things.
Wolfram Language itself is the fundamental thing.
So what's the process of building that knowledge base?
to the incredible knowledge base that you have now?
- Well, yeah, it was kind of scary at some level.
I mean, I had wondered about doing something like this
So it wasn't like I hadn't thought about it for a while.
such a difficult engineering notion at some point.
it's a live-your-own-paradigm kind of theory.
I had assumed that to build something like Wolfram Alpha
would require sort of solving the general AI problem.
and I thought I don't really know how to do that,
Then I worked on my New Kind of Science project
and sort of exploring the computational universe
between the intelligent and the merely computational.
So I thought, look, that's this paradigm I've built.
Now I have to eat that dog food myself, so to speak.
I remember I took the early team to a big reference library
The fact that you can walk into the reference library,
it's a big, big thing with lots of reference books
it's not the infinite corridor of, so to speak,
It was like, let's implement this area, this area,
get used by sort of the world's experts in lots of areas.
and we're able to ask them for input and so on.
who helped us figure out what to do wouldn't be right.
'Cause our goal was to kind of get to the point
where we had sort of true expert-level knowledge
on the basis of general knowledge in our civilization,
make it be automatic to be able to answer that question.
from the very beginning, and it's now also used in Alexa.
And so people are kind of getting more of the,
I mean, in a sense, the question-answering problem
was viewed as one of the sort of core AI problems
who was a well-known AI person from right around here.
And I remember when Wolfram Alpha was coming out,
it was a few weeks before it came out, I think.
And then he's talking about something different.
I said, "No, Marvin, this time it actually works.
Of course, we have a record of what he typed in,
- Can you share where his mind was in the testing space?
medical stuff and chemistry stuff and astronomy and so on.
And it was like, after a few minutes, he was like,
But that kind of told you something about the state,
in a sense, by trying to solve the bigger problem,
we were able to actually make something that would work.
we had a bunch of completely unfair advantages.
For example, we'd already built a bunch of Wolfram Language,
I had the practical experience of building big systems.
to not just sort of give up in doing something like this.
I've worked on a bunch of big projects in my life.
And usually it does, something happens in a few years,
And that's, and from a personal point of view,
always the challenge is you end up with these projects
And that's an interesting sort of personal challenge.
but it's kind of making a bet that I can kind of,
I can do that as well as doing the incredibly energetic
things that I'm trying to do with Wolfram Language and so on.
I just talked for the second time with Elon Musk
and that you two share that quality a little bit
of that optimism of taking on basically the daunting,
And he, and you take it on out of, you can call it ego,
you can call it naivety, you can call it optimism,
but that's how you solve the impossible things.
it's been, I progressively got a bit more confident
oh, I've done these projects and they're big.
And often these projects are of completely unknown,
- On the sort of building this giant knowledge base
that's behind Wolfram Language, Wolfram Alpha,
What do you think about, for example, Wikipedia,
that's not converted into computable knowledge?
Do you think, if you look at Wolfram Language,
Wolfram Alpha, 20, 30, maybe 50 years down the line,
it doesn't include the understanding of information.
represented within--
- Sure, I would hope so.
- How hard is that problem, like closing that gap?
of answering general knowledge questions about the world,
we're in pretty good shape on that right now.
when it encounters this or that or the other?
and be able to express things about the world.
If the creature that you see running across the road
is a thing at this point in the tree of life,
then swerve this way, otherwise don't, those kinds of things.
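A heavily hedged sketch of what such a rule could look like in Wolfram Language terms; everything lowercase here (creature, withinTaxon, swerve, maintainCourse) is a made-up placeholder, not a built-in:

  If[withinTaxon[creature, "Mammalia"],   (* withinTaxon: hypothetical taxonomy test *)
    swerve["Left"],                       (* hypothetical actuator command *)
    maintainCourse[]]                     (* hypothetical actuator command *)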
when you start to get to some of the messy human things,
are those encodable into computable knowledge?
- Well, I think that it is a necessary feature
you know, is able to be dealt with by computer.
in the question of automated content selection
So, you know, the Facebooks, Googles, Twitters,
And what are the kind of principles behind that?
And what I kind of, well, a bunch of different things
I realized about that, but one thing that's interesting
you have to build an AI ethics module, in effect,
that, you know, there's not gonna be one of these things.
It's not possible to decide, or it might be possible,
but it would be really bad for the future of our species
if we just decided there's this one AI ethics module
And I kind of realized one has to sort of break it up,
of how one does that and how one sort of has people
I'm buying in in the case of just content selection,
that kind of cuts across sort of societal boundaries.
sort of maybe in the, sort of have different AI systems
I don't know, whether it's conservative or liberal,
and then libertarian, and there's an Ayn Randian objectivist
I mean, it's almost encoding some of the ideologies
which we've been struggling, I come from the Soviet Union,
that didn't work out so well with the ideologies
that system could be encoded into computational knowledge,
in the digital space, that's a really exciting possibility.
Are you playing with those ideas in Wolfram Language?
- Yeah, yeah, I mean, that's, Wolfram Language
has sort of the best opportunity to kind of express
those essentially computational contracts about what to do.
is this a credible news story, what does that mean,
because there are these big projects that I think about,
like, find the fundamental theory of physics,
in the case of, figure out how you rank all content,
that's kind of a box number two, so to speak.
- Depends who you ask, it's one of these things
that's exactly like, what's the ranking, right?
- Having multiple modules is a really compelling notion
to us humans, that in a world where it's not clear
perhaps you have systems that operate under different,
I mean, I'm not really a politics-oriented person,
it's kind of like, you're gonna have this system,
of sort of a market-based system where you have,
okay, I as a human, I'm gonna pick this system,
I as another human, I'm gonna pick this system.
this case of automated content selection is a non-trivial,
but it is probably the easiest of the AI ethics situations,
because each person gets to pick for themselves,
By the time you're dealing with other societal things,
and all those kind of centralized kind of things.
each person can pick for themselves, so to speak.
where that's not, where that doesn't get to be
something which people can pick for themselves,
we need to move away into digital currency and so on,
And that's where, that's sort of the motivation
The idea of a computational contract is just to say,
of the contract are represented in computational form.
So in principle, it's automatic to execute the contract.
And I think that will surely be the future
of the idea of legal contracts written in English
and where people have to argue about what goes on
if everything can be represented computationally
and the computers can kind of decide what to do.
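A heavily hedged sketch of the computational-contract idea, not any actual contract API: the terms are just ordinary symbolic expressions a machine can evaluate. Here deliveryDate, agreedDate, payment, and penalty are all made-up placeholders:

  amountDue = If[deliveryDate > agreedDate,
    payment - penalty,   (* late delivery: the penalty clause applies *)
    payment]             (* on time: full payment *)

  (* with the placeholders still undefined, this stays a symbolic
     expression, which is the point: it executes once they get values *)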
but he [Leibniz] had, his pinnacle of technical achievement
was this brass four-function mechanical calculator thing
And so he was like 300 years too early for that idea.
But now that idea is pretty realistic, I think.
than what we have now in Wolfram Language to express,
being able to express sort of everything in the world
I mean, I think it's a, you know, I don't know,
to have a pretty well-built-out version of that
that will allow one to encode the kinds of things
can you try to define the scope of what it is?
- So we're having a conversation, it's in natural language.
Can we have a representation of the sort of actionable parts
of that conversation in a precise computable form
some of the things we think of as common sense, essentially,
I'm getting hungry and want to eat something, right?
That's something we don't have a representation of,
if I was like, I'm eating blueberries and raspberries
and things like that, and I'm eating this amount of them,
we know all about those kinds of fruits and plants
and nutrition content and all that kind of thing,
but the "I want to eat them" part of it is not covered yet.
to be able to have a natural language conversation.
- Right, right, to be able to express the kinds of things
that say, you know, if it's a legal contract,
it's, you know, the parties' desire to have this and that.
- But isn't that, isn't this, just to let you know,
the dream of Turing and formulating the Turing test.
- So, do you hope, do you think that's the ultimate test
look, if the test is, does it walk and talk like a human?
You know, people have attached the Wolfram Alpha API
'Cause all you have to do is ask it five questions
It's actually legitimately, Wolfram Alpha is legitimately,
he thought about taking Encyclopedia Britannica
and, you know, making it computational in some way.
he was a bit more pessimistic than the reality.
'cause we had a lot, we had layers of automation
it's hard to imagine those layers of abstraction