Regina Barzilay: Deep Learning for Cancer Diagnosis and Treatment | Lex Fridman Podcast #40
Chapters
0:00 Intro
0:48 Regina Barzilay
1:11 Books that had profound impact
3:07 Americanah
5:13 Personality
7:02 Chemistry
8:59 Computer Science
16:00 Community
18:37 Machine Learning for Cancer
21:55 Breast Cancer
22:43 Machine Learning
23:42 Access to Data
25:30 Privacy Concerns
27:30 Technical Solutions
31:33 Data Exchange
33:08 AI in Healthcare
34:22 Traditional Breast Cancer Risk Assessment
39:47 Who will be successful
41:20 Drug design
45:22 Drug design process
48:24 Property prediction
50:34 Regina's NLP journey
52:39 Machine Translation
00:00:00.000 |
The following is a conversation with Regina Barzilay. 00:00:03.240 |
She's a professor at MIT and a world-class researcher 00:00:08.360 |
in applications of deep learning to chemistry and oncology 00:00:12.480 |
or the use of deep learning for early diagnosis, 00:00:21.040 |
of several successful AI related courses at MIT, 00:00:36.400 |
support it on Patreon or simply connect with me on Twitter 00:00:43.760 |
And now here's my conversation with Regina Barzilay. 00:00:51.960 |
it would be a literature course with a friend of yours 00:00:56.360 |
Just out of curiosity, 'cause I couldn't find anything on it, 00:01:00.240 |
are there books or ideas that had profound impact 00:01:06.760 |
perhaps outside of computer science and the technical fields? 00:01:10.760 |
- I think because I'm spending a lot of my time at MIT 00:01:14.680 |
and previously in other institutions where I was a student, 00:01:18.280 |
I have a limited ability to interact with people. 00:01:27.240 |
that had profound impact on me and how I view the world. 00:01:31.360 |
Let me just give you one example of such a book. 00:01:42.480 |
It's a book about, it's kind of a history of science book 00:01:45.760 |
on how the treatments and drugs for cancer were developed. 00:02:05.840 |
and what makes science succeed and be implemented. 00:02:11.080 |
And sometimes it's actually not the strength of the idea, 00:02:14.120 |
but devotion of the person who wants to see it implemented. 00:02:19.800 |
at least for the last year, quite changed the way 00:02:36.040 |
which is not kind of, which is a fiction book, 00:02:43.200 |
And this is a book about a young female student 00:02:48.760 |
who comes from Africa to study in the United States. 00:03:00.600 |
in a new country and kind of adaptation to a new culture. 00:03:14.920 |
but it also kind of gave me the lens on different events. 00:03:19.920 |
And some events that I never actually paid attention to, 00:03:35.720 |
because that's how she was educated in her country. 00:03:40.960 |
And then she notices that the person who talks to her, 00:03:44.640 |
talks to her in a very funny way, in a very slow way. 00:03:48.280 |
And she's thinking that this woman is disabled 00:03:50.560 |
and she's also trying to kind of to accommodate her. 00:03:54.440 |
And then after a while, when she finishes her discussion 00:03:58.560 |
she sees how she interacts with other students, 00:04:02.040 |
with American students, and she discovers that actually 00:04:09.640 |
because she saw that she doesn't understand English. 00:04:12.160 |
And I thought, wow, this is a funny experience. 00:04:26.560 |
And then I noticed that this person is talking 00:04:37.160 |
and I was with another professor, Ernest Fraenkel, 00:04:43.160 |
that I don't get that the guy is talking in this way 00:04:47.000 |
So it was really kind of a mirroring experience 00:04:50.080 |
and it led me to think a lot about my own experiences 00:05:13.720 |
- It's not necessarily that they are more important 00:05:15.760 |
than ideas, but I think that ideas on their own 00:05:20.520 |
And many times, at least at the local horizon, 00:05:24.720 |
it's the personalities and their devotion to their ideas 00:05:29.200 |
that really change the landscape locally. 00:05:33.040 |
If you're looking at AI, like let's say 30 years ago, 00:05:37.680 |
the dark ages of AI, or whatever the word is, symbolic times, 00:05:46.600 |
and we're kind of thinking this was not really 00:05:50.640 |
but you can see that some people managed to take it 00:05:53.040 |
and to make it so shiny and dominate the academic world 00:06:02.320 |
If you look at the area of natural language processing, 00:06:05.160 |
it is a well-known fact that the reason that statistics 00:06:09.120 |
in NLP took such a long time to become mainstream 00:06:13.960 |
is that there were quite a number of personalities 00:06:18.400 |
and didn't stop research progress in this area. 00:06:22.040 |
So I do not think that kind of asymptotically 00:06:28.920 |
but I think locally it does make quite a bit of impact 00:06:33.920 |
and it generally speeds up the rate of adoption 00:06:43.480 |
is in the early days of particular discipline, 00:06:57.840 |
the trying to, the medicine, was it centered on-- 00:07:07.200 |
Like for me, it was really a discovery how people, 00:07:10.680 |
what was the science of chemistry behind drug development, 00:07:17.240 |
like coloring industry, that people who develop chemistry 00:07:34.480 |
and like historically, yeah, this is fascinating 00:07:38.680 |
and look under the microscope and do all this discovery. 00:08:03.320 |
let's put it this way, and they tried it on the patients 00:08:05.960 |
and those were children with leukemia and they died. 00:08:11.640 |
You look at the process, how imperfect is this process? 00:08:15.000 |
And you know, like if we're again looking back 00:08:24.600 |
were really happening, you know, maybe decades ago. 00:08:35.640 |
the way I'm thinking computer science, scientific. 00:08:38.200 |
- So from the perspective of computer science, 00:08:40.400 |
you've gotten a chance to work on the application 00:08:44.880 |
From a perspective of an engineer and a computer scientist, 00:08:48.440 |
how far along are we from understanding the human body, 00:09:02.220 |
And if you're thinking as a computer scientist 00:09:06.040 |
about this problem, I think one of the reasons 00:09:16.280 |
we are not trying to understand in some ways. 00:09:19.000 |
Like if you're thinking about like e-commerce, Amazon, 00:09:24.240 |
and that's why it recommends you certain books 00:09:45.560 |
If you're looking at recommendation systems, 00:09:47.320 |
they're not claiming that they're understanding somebody, 00:10:03.200 |
at any way, you know, educated in this field, 00:10:23.200 |
of this process is, you know, beyond our capacity. 00:10:30.520 |
when you do recommendation in many other areas, 00:10:44.480 |
but in parallel, we can actually do this kind of matching 00:11:00.720 |
what exactly does it mean to understand here? 00:11:19.560 |
of being able to reduce human knowledge into logic 00:11:34.840 |
what seems to work today, and we'll talk about it more, 00:11:37.880 |
is as much as possible, reduce stuff into data, 00:11:40.780 |
reduce whatever problem you're interested in to data 00:11:51.120 |
you were diagnosed with breast cancer in 2014. 00:11:54.160 |
What did facing your mortality make you think about? 00:12:01.840 |
and I think that I was interviewed many times, 00:12:09.680 |
and it's the first time I realized in my life 00:12:11.480 |
that I may die, and I never thought about it before. 00:12:14.480 |
And there was a long time since you were diagnosed 00:12:20.160 |
For me, it was like maybe two and a half months. 00:12:23.520 |
And I didn't know where I was during this time 00:12:47.400 |
It was really a mixture with many components at the time, 00:13:03.360 |
and you're hopeful, and then you're desperate. 00:13:04.720 |
So it's like there is a whole slew of emotions 00:13:08.720 |
But what I remember is that when I came back to MIT, 00:13:19.800 |
But when I came back, really finished my treatment, 00:13:24.560 |
you know, I looked back at what my group was doing, 00:13:33.240 |
on improving some parser by around 2% or 3% or whatever. 00:13:40.760 |
like a language that nobody speaks, and whatever. 00:13:46.160 |
When all of a sudden, you know, I walked out of MIT, 00:13:49.000 |
which is, you know, when people really do care, 00:13:51.600 |
you know, what happened to your ICLR paper, 00:14:01.920 |
and I'm kind of totally shielded on it on daily basis. 00:14:04.880 |
And it's like the first time I've seen, like, 00:14:10.720 |
why are we trying to improve the parser there, 00:14:16.120 |
when we have capacity to really make a change? 00:14:25.760 |
who really want to do their papers and their work, 00:14:28.720 |
and they want to continue to do what they were doing, 00:14:31.920 |
And then it was me who really kind of reevaluated 00:14:52.960 |
We have, like, millions of papers on topic models 00:15:19.760 |
maybe matter to you at that particular point, 00:15:28.880 |
as what matters to the rest of your scientific community, 00:15:39.600 |
of just the general amount of suffering in the world. 00:15:56.960 |
Is that the way you started to see the world, perhaps? 00:16:06.040 |
where you need to go to the hospital every day, 00:16:08.520 |
and you see, you know, the community of people that you see, 00:16:11.600 |
and many of them are much worse than I was at the time, 00:16:28.480 |
you take it totally for granted that you feel well, 00:16:30.800 |
that if you decide to go running, you can go running, 00:16:32.920 |
and you can, you know, you're pretty much free 00:16:42.800 |
And I remember one of my friends, Dina Katabi, 00:16:47.480 |
took me to Prudential to buy me a gift for my birthday, 00:16:58.960 |
and they are laughing, and they're very different 00:17:04.640 |
they're like laughing and wasting their money 00:17:06.360 |
on some stupid gifts, and, you know, they may die. 00:17:11.360 |
They already may have cancer, and they don't understand it. 00:17:33.880 |
and see how little means sometimes the system 00:17:37.640 |
has to help them, you really feel that we need 00:17:41.000 |
to take a lot of our brilliance that we have here at MIT 00:17:48.040 |
- Yeah, and useful can have a lot of definitions, 00:17:52.320 |
alleviating trying to cure cancer is a beautiful mission. 00:17:57.320 |
So I of course know theoretically the notion of cancer, 00:18:13.480 |
So this has a huge impact, United States globally. 00:18:19.340 |
When broadly, before we talk about how machine learning, 00:18:24.340 |
how MIT can help, when do you think we as a civilization 00:18:39.340 |
What I do believe will happen with the advancement 00:18:42.100 |
in machine learning, that a lot of types of cancer 00:18:48.500 |
and more effectively utilize existing treatments. 00:18:53.420 |
I think, I hope at least, that with all the advancements 00:19:04.700 |
What I'm not sure about is how long it will take 00:19:08.220 |
the medical establishment and regulatory bodies 00:19:14.780 |
And I think this is a very big piece of the puzzle 00:19:21.820 |
So first, a small detail that I think the answer is yes, 00:19:25.460 |
but is cancer one of the diseases that when detected earlier 00:19:43.020 |
And I think where machine learning can really help 00:19:49.660 |
For instance, the vast majority of pancreatic cancer patients 00:19:53.940 |
are detected at the stage that they are incurable. 00:19:57.300 |
That's why they have such a terrible survival rate. 00:20:16.740 |
And in fact, I know a number of people who were diagnosed 00:20:20.740 |
and saved just because they had food poisoning. 00:20:35.820 |
So as we become better, we would be able to help 00:20:41.220 |
to many more people that are likely to develop diseases. 00:20:46.220 |
And I just want to say that as I got more into this field, 00:20:50.980 |
I realized that cancer is of course terrible disease, 00:20:53.580 |
but there are really the whole slew of terrible diseases 00:20:56.660 |
out there like neurodegenerative diseases and others. 00:21:00.780 |
So we, of course, a lot of us are fixated on cancer 00:21:04.540 |
just because it's so prevalent in our society 00:21:12.580 |
that we still don't have a good solution for. 00:21:17.180 |
And we, you know, and I felt as a computer scientist, 00:21:22.180 |
we kind of decided that it's other people's job 00:21:26.940 |
because it's like traditionally people in biology 00:21:30.460 |
or in chemistry or MDs are the ones who are thinking 00:21:33.660 |
about it and have to kind of start paying attention. 00:21:43.020 |
- So how, it seems like in cancer specifically, 00:21:46.540 |
that there's a lot of ways that machine learning can help. 00:21:57.260 |
we really don't know what is your likelihood to get cancer. 00:22:11.180 |
80% of the patients are the first in their families. 00:22:15.400 |
And I never saw that I had any increased risk 00:22:18.500 |
because, you know, nobody had it in my family. 00:22:32.420 |
that are currently used and in clinical practice, 00:22:40.380 |
the same true for non-smoking lung cancer and many others. 00:22:53.140 |
and using all the information that is already there, 00:22:59.980 |
and, you know, eventually liquid biopsies and others 00:23:04.860 |
where the signal itself is not sufficiently strong 00:23:15.620 |
a machine which is trained on large volumes of data 00:23:20.700 |
And that's what we've seen with breast cancer 00:23:22.500 |
and people are reporting it in other diseases as well. 00:23:28.260 |
And in the different kinds of sources of data. 00:23:42.660 |
So it took me after I decided that I want to work on it 00:23:53.580 |
there is no publicly available data set of modern mammograms 00:24:07.560 |
There are data that came out of clinical trials. 00:24:11.300 |
But we're talking about you as a computer scientist 00:24:22.900 |
And there is a set which is called like Florida Dataset, 00:24:33.880 |
Whatever you're learning on them doesn't scale up. 00:24:46.300 |
and the hospital decides whether they would give it 00:24:56.120 |
assuming that you're doing research collaboration, 00:24:59.220 |
you can submit, there is a proper approval process 00:25:22.780 |
MGH or any kind of hospital, are they scanning the data? 00:25:31.580 |
You don't need to do any extra processing steps. 00:25:41.180 |
because the hospital is legally responsible for the data. 00:25:53.220 |
but they may not have a lot to gain if they give it, 00:25:56.540 |
as a hospital, as a legal entity is giving it to you. 00:26:02.820 |
happening in the future is the same thing that happens 00:26:06.860 |
You can decide whether you want to donate your organs. 00:26:09.900 |
You can imagine that whenever a person goes to the hospital, 00:26:12.900 |
it should be easy for them to donate their data for research 00:26:21.340 |
or only imaging data or the whole medical record? 00:26:33.900 |
And it's not like you say, I want to keep my data private, 00:26:36.100 |
but I would really love to get it from other people 00:26:38.820 |
because other people are thinking the same way. 00:26:40.780 |
So if there is a mechanism to do this donation 00:26:48.060 |
how they want to use their data for research, 00:26:54.140 |
- People, when they think about this problem, 00:27:02.460 |
Generally, not just medical data, just any kind of data. 00:27:05.900 |
It's what you said, my data, it should belong kind of to me. 00:27:15.660 |
Because that seems like a problem that needs to be, 00:27:21.660 |
needs to be solved before we build large datasets 00:27:30.220 |
- So I think there are two things that could be done. 00:27:40.180 |
we today have the ability to improve de-identification, 00:27:40.180 |
There are other data, like if it is a raw text, 00:28:10.060 |
and actually some of them are developed at MIT, 00:28:22.420 |
and then you send the outcome back to the hospital 00:28:28.020 |
There are a lot of people who are working in this space 00:28:30.660 |
where the learning happens in the encoded form. 00:28:45.180 |
processing community, how to do de-identification better. 00:28:49.560 |
But even today, there are already a lot of data 00:28:58.720 |
Where you can just, you know the name of the patient, 00:29:00.960 |
you just want to extract the part with the numbers. 00:29:27.880 |
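As a rough illustration of the de-identification she describes, here is a minimal sketch in Python that redacts obvious identifiers, names known from the record, dates, and long digit runs, from a free-text note while keeping the clinical measurements. The patterns and the example note are illustrative assumptions, not a production pipeline.

```python
import re

# Minimal rule-based de-identification sketch (illustrative only):
# redact known patient names, dates, and long digit runs (record numbers),
# keeping the clinical measurements that research actually needs.
def deidentify(note: str, patient_names: list[str]) -> str:
    redacted = note
    for name in patient_names:  # names known from the record header
        redacted = re.sub(re.escape(name), "[NAME]", redacted, flags=re.IGNORECASE)
    redacted = re.sub(r"\b\d{1,2}/\d{1,2}/\d{2,4}\b", "[DATE]", redacted)  # dates
    redacted = re.sub(r"\b\d{6,}\b", "[ID]", redacted)  # record numbers and similar IDs
    return redacted

note = "Jane Doe, MRN 00123456, seen 03/14/2019. Tumor size 2.1 cm, ER positive."
print(deidentify(note, ["Jane Doe"]))
# -> "[NAME], MRN [ID], seen [DATE]. Tumor size 2.1 cm, ER positive."
```

Real systems use learned NLP models rather than hand-written rules, but the goal is the same: strip who the patient is, keep what was measured.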
and I still remember myself when I really needed an answer. 00:29:33.360 |
and there was no information to make a choice. 00:29:36.720 |
And at that moment, you feel that your life is at stake, 00:29:41.100 |
but you just don't have information to make the choice. 00:29:52.840 |
can you please run statistics and see what are the outcomes? 00:30:00.080 |
that comes by mail to my office at MIT, I'm serious. 00:30:03.500 |
That people ask to run because they need to make 00:30:10.060 |
And of course, I'm not planning to open a clinic here, 00:30:13.000 |
but we do run and give them the results for their doctors. 00:30:20.080 |
that we all at some point, or our loved ones, 00:30:23.760 |
will be in the situation where you need information 00:30:35.120 |
And then the question is, you know, what do I care more? 00:30:37.880 |
Because at the end, everything is a trade-off, correct? 00:30:55.060 |
Is it possible for patients to own their data 00:31:01.040 |
Of course, theoretically, I guess patients own their data, 00:31:06.640 |
containing everything, or upload it to the cloud, 00:31:21.880 |
Basically, companies helping you upload your data 00:31:24.780 |
to the cloud, so that you can move from hospital to hospital, 00:31:29.240 |
Do you see a promise of that kind of possibility? 00:31:38.160 |
I don't know now who's the biggest player in this field, 00:31:49.300 |
and many of us are sent to some specialized treatment, 00:31:59.420 |
need to go to the hospital, find some small office, 00:32:01.780 |
which gives them the CD, and they ship us the CD, 00:32:06.460 |
a kind of decades-old mechanism of data exchange. 00:32:30.620 |
and Microsoft Health Vault, or whatever it's called, 00:32:36.100 |
either regulatory pressure, or there's not a business case, 00:32:46.520 |
the two biggest that I was aware of closed their doors. 00:32:54.780 |
It seems like one of those Elon Musk style problems 00:33:07.540 |
- So I know there is an initiative in Massachusetts, 00:33:11.780 |
to try to create this kind of health exchange system, 00:33:30.340 |
is actually who are the successful players in this space, 00:33:33.820 |
and the whole implementation, how does it go? 00:33:37.260 |
To me, it is from the anthropological perspective, 00:33:40.300 |
it's more fascinating than the AI that today goes into healthcare. 00:33:50.380 |
And it's interesting to understand that I by no means, 00:33:59.620 |
- Yeah, it's interesting, 'cause data is really fuel 00:34:04.980 |
And when that data requires regulatory approval, 00:34:26.500 |
we still don't know what exactly do you need to demonstrate 00:34:41.100 |
So in traditional breast cancer risk assessment, 00:34:47.100 |
which determines the likelihood of a woman to get cancer. 00:34:54.220 |
The whiter it is, the more likely the tissue is dense. 00:34:58.980 |
And the idea behind density, it's not a bad idea. 00:35:06.860 |
decided to look back at women who were diagnosed 00:35:12.420 |
Can we look back and say that they're likely to develop? 00:35:16.180 |
It was the best that his human eye can identify. 00:35:22.660 |
into four categories, and that's what we are using today. 00:35:38.780 |
where women are supposed to be advised by their providers 00:35:50.220 |
supplementary screening paid by your insurance 00:35:53.700 |
Now you can say, how much science do we have behind it? 00:35:56.780 |
Whatever, biological science or epidemiological evidence. 00:36:00.860 |
So it turns out that between 40 and 50% of women 00:36:06.660 |
So about 40% of patients are coming out of their screening 00:36:11.140 |
and somebody tells them, you are at high risk. 00:36:11.140 |
if half of the population is at high risk? 00:36:16.900 |
we cannot really provide very expensive solutions for them. 00:36:34.620 |
And the reason this whole density became this big deal, 00:36:46.260 |
And then it turns out that they already had cancer, 00:36:50.580 |
So they didn't have a way to know who is really at risk 00:36:54.420 |
and what is the likelihood that when the doctor tells you, 00:37:02.140 |
this maybe was the best piece of science that we had. 00:37:06.820 |
And it took quite a while, 15, 16 years, to make it federal law. 00:37:21.620 |
just because you are trained on a logical thing. 00:37:35.700 |
and predict the risk when you're training the machine 00:37:48.620 |
And really, it's not helping to bring in these new models. 00:37:53.620 |
And I would say it's not a matter of the algorithm. 00:37:56.700 |
Algorithms are already orders of magnitude better 00:38:04.380 |
How many hospitals do you need to run the experiment? 00:38:07.540 |
What, you know, all this mechanism of adoption 00:38:20.460 |
And again, I don't think it's an AI question. 00:38:22.740 |
We can work more and make the algorithm even better, 00:38:36.860 |
And coming back to your question about books, 00:38:42.980 |
It's called "An American Sickness" by Elisabeth Rosenthal. 00:38:47.980 |
And I got this book from my clinical collaborator, 00:38:53.100 |
And I said, I know everything that I need to know 00:38:56.020 |
but you know, every page doesn't fail to surprise me. 00:38:59.220 |
And I think that there is a lot of interesting 00:39:06.860 |
from computer science who are coming into this field 00:39:23.980 |
who do you think most likely would be successful 00:39:38.860 |
Who needs to be inspired to most likely lead to adoption? 00:39:48.260 |
but I think there are a lot of good people in the medical system 00:39:55.200 |
And I think a lot of power will come from us as consumers, 00:40:01.480 |
because we all are consumers or future consumers 00:40:11.500 |
in explaining the potential and not in the hype terms 00:40:15.540 |
and not saying that we have now cured all Alzheimer's, 00:40:19.500 |
these kind of articles which make these claims, 00:40:24.780 |
what this implementation does and how it changes the care. 00:40:30.020 |
it doesn't matter what kind of politician it is, 00:40:32.620 |
you know, we all are susceptible to these diseases. 00:40:41.060 |
and we're looking for a way to alleviate the suffering. 00:40:50.940 |
- So it sounds like the biggest problems are outside of AI 00:40:55.100 |
in terms of the biggest impact at this point. 00:41:00.420 |
in the application of ML to oncology in general? 00:41:03.780 |
So improving the detection or any other creative methods, 00:41:20.340 |
- Yeah, I just want to mention that besides detection, 00:41:24.820 |
and I think it's really an increasingly important area 00:41:33.100 |
- Because, you know, it's fine if you detect something early 00:41:43.860 |
And today, in all of drug design, ML is non-existent. 00:41:48.300 |
We don't have any drug that was developed by the ML model 00:41:52.980 |
or even not developed, but at least even knew 00:42:03.300 |
to generate molecules with desired properties 00:42:05.780 |
to do in silico screening is really a big open area. 00:42:14.940 |
primarily taking the ideas that were developed 00:42:17.260 |
for other areas and applying them with some adaptation, 00:42:20.460 |
the area of drug design is really technically interesting 00:42:39.820 |
And I think there are a lot of open questions in this area. 00:42:44.820 |
You know, we're already getting a lot of successes 00:42:48.820 |
even with the kind of the first generation of these models, 00:42:52.700 |
but there is much more new creative things that you can do. 00:42:59.300 |
the more powerful, the more interesting models 00:43:12.520 |
And some of these techniques are really unique 00:43:16.660 |
to let's say to graph generation and other things. 00:43:23.980 |
Graph generation or graphs, drug discovery in general, 00:43:33.340 |
Is this trying to predict different chemical reactions? 00:43:37.500 |
Or is it some kind of, what do graphs even represent 00:43:48.500 |
but let's say you're gonna talk about small molecules 00:43:55.020 |
The molecule is just a graph, where the node in the graph is an atom 00:44:23.740 |
to get certain biological activity of the compound. 00:44:33.020 |
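To make the graph view concrete, here is a small sketch, in plain Python with no chemistry toolkit, of a molecule as a graph whose nodes are atoms and whose edges are bonds. The specific molecule (ethanol, hydrogens omitted) and the toy degree computation are assumptions for illustration only.

```python
# Sketch: a molecule as a graph. Nodes are atoms (element labels),
# edges are bonds (with a bond order). Ethanol CH3-CH2-OH, hydrogens omitted.
atoms = ["C", "C", "O"]                 # node labels
bonds = [(0, 1, 1.0), (1, 2, 1.0)]      # (atom_i, atom_j, bond order)

# Adjacency list built from the bond list -- the structure a graph-based
# model would consume before embedding the molecule.
adjacency = {i: [] for i in range(len(atoms))}
for i, j, order in bonds:
    adjacency[i].append((j, order))
    adjacency[j].append((i, order))

# A toy graph statistic (each atom's degree); real models instead learn
# embeddings from this structure to predict properties such as solubility.
degrees = {i: len(neighbors) for i, neighbors in adjacency.items()}
print(adjacency)  # {0: [(1, 1.0)], 1: [(0, 1.0), (2, 1.0)], 2: [(1, 1.0)]}
print(degrees)    # {0: 1, 1: 2, 2: 1}
```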
You identify some compounds which have the right activity 00:44:39.260 |
and they're now trying to optimize this original hit 00:44:44.260 |
for the different properties that you want it to have, 00:44:46.340 |
maybe be soluble, you want to decrease toxicity, 00:45:12.620 |
- The screening is just checking them for a certain property. 00:45:15.100 |
- Like in the physical space, in the physical world, 00:45:18.780 |
that's doing some, that's actually running the reaction. 00:45:33.820 |
You run the screening, you identify potential, 00:45:37.660 |
potential good starts and then when the chemists come in 00:45:48.260 |
to get the desired profile in terms of all other properties? 00:45:53.260 |
So maybe how do I make it more bioactive and so on? 00:45:59.460 |
really is the one that determines the success of this design 00:46:04.460 |
because again, they have a lot of domain knowledge 00:46:09.300 |
of what works, how do you decrease the CCD and so on 00:46:15.020 |
So all the drugs that are currently in the FDA approved 00:46:19.780 |
drugs or even drugs that are in clinical trials, 00:46:31.980 |
and find the right one or adjust it to be the right ones. 00:46:40.500 |
- It's not necessarily that, it's really driven 00:46:44.260 |
by deep understanding, it's not like they just observe it. 00:46:53.140 |
So there is a lot of science that gets into it 00:47:03.900 |
- So they're quite effective at this design, obviously. 00:47:08.420 |
Like depending on how do you measure effective? 00:47:10.780 |
If you measure it in terms of cost, it's prohibitive. 00:47:15.780 |
we have lots of diseases for which we don't have any drugs 00:47:23.420 |
or neurodegenerative disease drugs that fail. 00:47:27.100 |
So there are lots of trials that fail in later stages 00:47:32.100 |
which is really catastrophic from the financial perspective. 00:47:35.140 |
So is it the effective, the most effective mechanism? 00:47:50.780 |
and what a deep understanding of the domain they have. 00:47:56.460 |
There is really a lot of science behind what they do. 00:47:59.620 |
But if you ask me, can machine learning change it? 00:48:07.260 |
cannot hold in their memory and understanding 00:48:15.460 |
- And the space of graphs is a totally new space. 00:48:22.100 |
for machine learning to explore, graph generation. 00:48:24.020 |
- Yeah, so there are a lot of things that you can do here. 00:48:31.660 |
was the tool that can predict properties of the molecules. 00:48:36.340 |
So you can just give the molecule and the property. 00:48:52.220 |
Now, when people started working in this area, 00:48:58.620 |
which is kind of handcrafted features of the molecule 00:49:03.020 |
and then you run, I don't know, feed forward neural network. 00:49:06.020 |
And what was interesting to see was that clearly, 00:49:08.540 |
this was not the most effective way to proceed. 00:49:11.060 |
And you need to have much more complex models 00:49:16.340 |
which can translate this graph into the embeddings 00:49:23.260 |
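As a hedged illustration of that direction, here is a minimal message-passing sketch in PyTorch that turns a small molecular graph into an embedding and reads out a single property score. The layer sizes, atom-type encoding, and the toy input graph are assumptions made for the example; this is not any specific published model.

```python
import torch
import torch.nn as nn

# Minimal message-passing sketch: each atom keeps a hidden vector, repeatedly
# mixes in its neighbors' vectors, and the summed molecule embedding feeds a
# small readout layer that predicts one scalar property.
class TinyMPNN(nn.Module):
    def __init__(self, num_atom_types: int = 10, hidden: int = 32, steps: int = 3):
        super().__init__()
        self.embed = nn.Embedding(num_atom_types, hidden)  # initial atom features
        self.message = nn.Linear(hidden, hidden)           # transform neighbor messages
        self.update = nn.GRUCell(hidden, hidden)           # update each atom's state
        self.readout = nn.Linear(hidden, 1)                # molecule embedding -> property
        self.steps = steps

    def forward(self, atom_types: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        h = self.embed(atom_types)                   # (num_atoms, hidden)
        for _ in range(self.steps):
            msg = adj @ torch.relu(self.message(h))  # aggregate messages from neighbors
            h = self.update(msg, h)                  # GRU-style node update
        mol_embedding = h.sum(dim=0)                 # pool atoms into one molecule vector
        return self.readout(mol_embedding)           # scalar property prediction

# Toy ethanol-like graph: three heavy atoms (C, C, O encoded as 0, 0, 1), two bonds.
atom_types = torch.tensor([0, 0, 1])
adj = torch.tensor([[0., 1., 0.],
                    [1., 0., 1.],
                    [0., 1., 0.]])
print(TinyMPNN()(atom_types, adj))  # untrained, so the value itself is meaningless
```

The same graph-to-embedding machinery is what the generation and molecule "translation" direction she mentions next builds on.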
Then another direction, which is kind of related 00:49:25.340 |
is not only to stop by looking at the embedding itself, 00:49:29.260 |
but actually modify it to produce better molecules. 00:49:32.860 |
So you can think about it as machine translation, 00:49:38.220 |
and then there is an improved version of molecule. 00:49:52.700 |
We already have seen that the property prediction 00:50:08.940 |
Speaking of machine translation and embeddings, 00:50:11.900 |
you have done a lot of really great research in NLP, 00:50:21.540 |
What ideas, problems, approaches were you working on? 00:50:26.500 |
Did you explore before this magic of deep learning 00:50:42.540 |
and at the time I could barely understand English. 00:50:53.460 |
where people took more kind of heavy linguistic approaches 00:50:56.140 |
for small domains and try to build up from there. 00:51:00.020 |
And then there were the first generation of papers 00:51:10.060 |
But I found it really fascinating that, you know, 00:51:28.340 |
And this was a standard of the field at the time. 00:51:32.060 |
In some ways, I mean, people maybe just started 00:51:37.860 |
but for many applications like summarization, 00:51:44.660 |
how the statistical approaches dominated the field. 00:51:48.300 |
And we've seen, you know, increased performance 00:52:11.580 |
through the whole proceedings to find one or two papers 00:52:14.540 |
which make some interesting linguistic references. 00:52:30.300 |
structured, hierarchical, representing language 00:52:34.300 |
in a way that's human explainable, understandable, 00:52:45.940 |
and it's okay to have a machine which performs a function. 00:52:50.180 |
Like when you're thinking about your calculator, correct? 00:52:53.220 |
Your calculator can do calculation very different 00:52:58.860 |
And this is fine if we can achieve certain tasks 00:53:04.460 |
it doesn't necessarily mean that it has to understand 00:53:11.260 |
because you have so many other sources of information 00:53:14.940 |
that are absent when you are training your system. 00:53:24.260 |
there were some papers on machine translation. 00:53:27.420 |
like people were trying really, really simple. 00:53:31.020 |
And the feeling, my feeling was that, you know, 00:53:36.220 |
it's like to fly in the moon and build a house there 00:53:42.580 |
I never could imagine that within, you know, 10 years, 00:53:56.220 |
in the sense that people for a very long time 00:54:03.180 |
That's why coming back to a question about biology, 00:54:06.140 |
that in linguistics, people try to go this way 00:54:13.500 |
and try to abstract it and to find the right representation. 00:54:42.620 |
it can do like really bizarre and funny things. 00:54:59.180 |
In your intuition, I mean, it's unknown, I suppose. 00:55:03.740 |
But as we start to creep towards romantic notions 00:55:14.220 |
and something that maybe to me or to us silly humans 00:55:23.500 |
with these neural networks or statistical methods? 00:55:27.180 |
- So I guess I am very much driven by the outcomes. 00:55:35.420 |
which would be satisfactory for us for different tasks? 00:55:40.420 |
Now, if you again look at machine translation systems, 00:55:43.100 |
which are, you know, trained on large amounts of data, 00:55:48.780 |
relatively to where they've been a few years ago. 00:56:06.580 |
we still don't really understand what we are doing. 00:56:11.820 |
and there is obviously a lot of progress in studying, 00:56:16.380 |
you know, in our brains when we process language 00:56:25.420 |
What does bother me is that, you know, again, 00:56:41.300 |
you know, I show them some examples of translation 00:56:43.540 |
from some newspaper in Hebrew or whatever, it was perfect. 00:56:47.260 |
And then I have a recipe that Tommi Jaakkola's sister 00:56:51.300 |
sent me a while ago and it was written in Finnish 00:56:59.260 |
You cannot understand anything, what it does. 00:57:01.460 |
It's not like some syntactic mistakes, it's just terrible. 00:57:04.300 |
And year after year I try it and it will translate, 00:57:07.020 |
and year after year it does this terrible work 00:57:10.980 |
are not a big part of the training repertoire. 00:57:18.020 |
that's a really clean, good way to look at it. 00:57:32.500 |
in the best possible formulation of the Turing test? 00:57:37.060 |
Which is, would you wanna have a conversation 00:58:01.100 |
which would enable it to have a conversation, 00:58:06.700 |
- So you think it's a problem of data, perhaps? 00:58:08.140 |
- I think in some ways it's a problem of data. 00:58:32.540 |
So there are many capacities in which it's doing very well. 00:58:35.180 |
And you can ask me, would you trust the machine 00:58:40.820 |
especially if we're talking about newspaper data 00:58:43.540 |
or other data, which is in the realm of its own training set, 00:58:47.880 |
But, you know, having conversations with the machine, 00:58:52.900 |
it's not something that I would choose to do. 00:58:59.420 |
and about all this kind of ELIZA conversations, 00:59:06.940 |
and they claim that it's like really humongous amount 00:59:09.520 |
of the local population, which like for hours 00:59:17.100 |
that there are some people who enjoy this conversation. 00:59:20.760 |
And you know, it brought to me another MIT story 00:59:26.940 |
I don't know if you're familiar with this story. 00:59:34.620 |
very trivial, like restating of what you said 00:59:48.180 |
And at the time there were no beautiful interfaces, 00:59:53.540 |
And Weizenbaum himself was so horrified by this phenomenon 00:59:56.900 |
that people can believe enough in the machine, 01:00:00.820 |
that the machine understands you, and you can complete the rest. 01:00:08.660 |
what this artificial intelligence can do to our brains. 01:00:22.620 |
that it delivers the good that we are trying to get. 01:00:27.220 |
I, by the way, I'm not horrified by that possibility, 01:00:34.780 |
human connection, whether it's through language 01:00:37.060 |
or through love, it seems like it's very amenable 01:00:49.340 |
Like you said, the secretaries who enjoy spending hours. 01:00:52.460 |
I would say I would describe most of our lives 01:00:55.020 |
as enjoying spending hours with those we love 01:01:02.780 |
So I'm not sure how much intelligence we exhibit 01:01:12.660 |
of what it means to pass the Turing test with language. 01:01:16.020 |
I think you're right in terms of conversation. 01:01:18.220 |
I think machine translation has very clear performance 01:01:24.420 |
What it means to have a fulfilling conversation 01:01:28.020 |
is very, very person dependent and context dependent 01:01:36.340 |
But in your view, what's a benchmark in natural language, 01:01:52.760 |
- I think it goes across specific application. 01:01:55.820 |
It's more about the ability to learn from few examples 01:02:01.460 |
and all these cases, because the way we publish 01:02:07.540 |
like naively we get 55, but now we had a few examples 01:02:12.500 |
None of these methods actually are realistically 01:02:23.540 |
and to move or to be autonomous in finding the data 01:02:28.940 |
that you need to learn, to be able to perform a new task 01:02:33.180 |
or new language, this is an area where I think 01:02:43.020 |
- Are you at all excited, curious by the possibility 01:02:48.500 |
Is this, 'cause you've been very, in your discussion, 01:02:52.540 |
so if we look at oncology, you're trying to use 01:03:09.860 |
that our civilization has dreamed about creating, 01:03:19.040 |
- So as you said yourself earlier, talking about, 01:03:25.220 |
how do you perceive our communications with each other, 01:03:28.980 |
that we're matching keywords and certain behaviors 01:03:37.220 |
relations with another person, you have separate 01:03:39.900 |
kind of measurements and outcomes inside your head 01:03:42.420 |
that determine what is the status of the relation. 01:03:49.600 |
Is it the fact that now we are gonna do the same way 01:03:51.860 |
as human is doing when we don't even understand 01:03:55.460 |
Or we now have an ability to deliver these outcomes, 01:04:01.260 |
not just to translate or just to answer questions, 01:04:03.940 |
but across many, many areas that we can achieve 01:04:09.740 |
with the ability to learn and do other things. 01:04:12.380 |
I think this is, and this we can actually measure 01:04:21.580 |
in my lifetime, at least so far what we've seen, 01:04:28.740 |
And I think it will be really exciting to see 01:04:39.340 |
there are machines which are improving their functionality. 01:04:41.860 |
Another one is to think about us with our brains, 01:04:44.940 |
which are imperfect, how they can be accelerated 01:04:49.060 |
by this technology as it becomes stronger and stronger. 01:05:02.940 |
- So there is this point that the patient gets 01:05:07.980 |
and all of a sudden they see life in a different way 01:05:16.460 |
So you can imagine this kind of computer augmented cognition 01:05:21.460 |
where it can bring you that now in the same way 01:05:44.980 |
and technology augmenting our intelligence as humans. 01:05:49.980 |
Yesterday, a company called Neuralink announced, 01:05:58.900 |
They demonstrated a brain-computer, brain-machine interface, 01:06:02.660 |
where there's like a sewing machine for the brain. 01:06:11.100 |
in terms of things that some people would say 01:06:20.340 |
a hope for that more direct interaction with the brain? 01:06:32.240 |
and I think there will be a lot of developments. 01:06:36.540 |
we are not aware of our feelings of motivation 01:06:41.420 |
Like let me give you a trivial example, our attention. 01:06:47.260 |
that it takes a while to a person to understand 01:06:52.180 |
who really have strong capacity to hold attention. 01:06:57.980 |
that they have problem to regulate their attention. 01:07:00.740 |
Imagine to yourself that you have like a cognitive aid 01:07:06.260 |
that your attention is now not on what you are doing, 01:07:10.580 |
you're now dreaming of what you're gonna do in the evening. 01:07:12.780 |
So even this kind of simple measurement things, 01:07:18.060 |
and I see it even in simple ways with myself. 01:07:35.860 |
"Who would ever care about some status in some app?" 01:07:41.620 |
you have to do a set number of points every month. 01:07:44.700 |
And not only do I do it every single month, 01:07:56.220 |
in two days, I did like some humongous amount of running 01:08:10.240 |
So you can already see that this direct measurement 01:08:20.460 |
can be expanded to many other areas of our life 01:08:31.200 |
imagine that the person who is generating the keywords, 01:08:44.020 |
Maybe it will be a really behavior-modifying moment. 01:09:00.860 |
now have other non-human entities helping us out. 01:09:17.600 |
What ideas do students struggle with the most 01:09:20.940 |
as they first enter this world of machine learning? 01:09:28.040 |
I started teaching a small machine learning class 01:09:32.880 |
in my big machine learning class with Tommi Jaakkola 01:09:42.900 |
more and more people at MIT want to take this class. 01:09:47.000 |
And while we designed it for computer science majors, 01:09:55.720 |
was not enabling them to do well in the class. 01:10:06.460 |
And that's why we actually started a new class 01:10:08.720 |
which we call machine learning from algorithms to modeling, 01:10:12.700 |
which emphasizes more the modeling aspects of it 01:10:16.820 |
and focuses on, it has majors and non-majors. 01:10:21.780 |
So we kind of try to extract the relevant parts 01:10:27.460 |
because the fact that we're teaching 20 classifiers 01:10:43.940 |
what different and exciting things they can do 01:10:51.100 |
Some of them are like matching their relations 01:10:53.740 |
and other things like variety of different applications. 01:10:55.780 |
- Everything is amenable to machine learning. 01:11:05.420 |
but it almost seems like everybody needs to learn 01:11:10.060 |
If you're 20 years old or if you're starting school, 01:11:21.860 |
So when you interacted with those non-majors, 01:11:24.980 |
is there skills that they were simply lacking at the time 01:11:32.020 |
and that they learned in high school and so on? 01:11:42.100 |
that there is a Python component in the class. 01:11:47.020 |
and the class is not really heavy on programming. 01:11:49.140 |
They primarily kind of add parts to the programs. 01:11:52.420 |
I think it was more of the mathematical barriers 01:11:55.420 |
and the class, again, with a design on the majors 01:11:58.220 |
was using the notation like big O for complexity 01:12:01.180 |
and others, people who come from different backgrounds 01:12:05.780 |
It's not necessarily very challenging notion, 01:12:11.460 |
So I think that kind of linear algebra and probability, 01:12:17.620 |
multivariate calculus are things that can help. 01:12:31.380 |
If they want to get into that field, what should they do? 01:12:34.520 |
Get into it and succeed as researchers and entrepreneurs. 01:12:50.140 |
and you can find online or in your school classes 01:12:54.780 |
which are more mathematical, more applied and so on. 01:13:04.500 |
and you can make many different types of contribution 01:13:09.580 |
And the second point, I think it's really important 01:13:13.660 |
to find some area which you really care about 01:13:31.300 |
and we should be doing it and we're still not doing it 01:13:39.660 |
- So you've been very successful in many directions in life 01:13:53.100 |
mention somewhere that researchers often get lost 01:13:56.680 |
This is per our original discussion with cancer and so on 01:14:32.780 |
and I didn't take humanities classes in my undergrad. 01:14:43.540 |
each one of us inside of them have their own set of, 01:14:48.220 |
you know, things that we believe are important 01:14:53.380 |
with achieving various goal, busy listening to others 01:14:56.260 |
and to kind of try to conform and to be part of the crowd 01:15:03.740 |
And, you know, we all should find some time to understand 01:15:14.100 |
and to make sure that while we are running 10,000 things, 01:15:28.500 |
And if I look over my time, when I was younger, 01:15:35.060 |
I was primarily driven by the external stimulus, 01:15:41.540 |
And now a lot of what I do is driven by really thinking 01:15:46.540 |
what is important for me to achieve independently 01:15:55.140 |
And, you know, I don't mind to be viewed in certain ways. 01:16:00.100 |
The most important thing for me is to be true to myself, 01:16:16.780 |
sometimes, you know, the vanity and the triviality