Dan Kokotov: Speech Recognition with AI and Humans | Lex Fridman Podcast #151
Chapters
0:00 Introduction
3:23 Dune
6:39 Rev
12:39 Translation
19:28 Gig economy
28:08 Automatic speech recognition
38:58 Create products that people love
47:08 The future of podcasts at Spotify
68:46 Book recommendations
70:08 Stories of our dystopian future
73:50 Movies about Stalin and Hitler
79:05 Interviewing Putin
85:02 Meaning of life
00:00:00.000 |
The following is a conversation with Dan Kokotov, 00:00:03.280 |
VP of Engineering at Rev.ai, which is by many metrics, 00:00:08.280 |
the best speech to text AI engine in the world. 00:00:12.380 |
Rev in general is a company that does captioning 00:00:15.440 |
and transcription of audio by humans and by AI. 00:00:20.020 |
I've been using their services for a couple of years now 00:00:22.680 |
and planning to use Rev to add both captions and transcripts 00:00:26.960 |
to some of the previous and future episodes of this podcast 00:00:33.160 |
the conversation or reference various parts of the episode, 00:00:36.480 |
since that's something that quite a few people requested. 00:00:45.480 |
so people can provide suggestions and improvements there. 00:01:04.960 |
And if you wish, click the sponsor links below 00:01:07.840 |
to get a discount and to support this podcast. 00:01:10.880 |
As a side note, let me say that I reached out to Dan 00:01:15.080 |
because I've been using and genuinely loving their service 00:01:22.480 |
I previously talked to the head of Adobe Research 00:01:33.160 |
Examples are Adobe Premiere for video editing, 00:01:38.680 |
AutoHotKey on Windows for automating keyboard 00:01:50.720 |
I just like talking to people who create things 00:01:56.480 |
the folks at Rev.ai offered to sponsor this podcast 00:02:02.720 |
This conversation is not sponsored by the guest. 00:02:10.020 |
that you cannot buy your way onto this podcast. 00:02:15.480 |
I wanted to bring this up to make a specific point 00:02:37.080 |
I will never take money for bringing a guest on. 00:02:44.920 |
or just genuinely love something they've created. 00:02:58.880 |
In general, no amount of money can buy my integrity. 00:03:06.800 |
If you enjoy this thing, subscribe on YouTube, 00:03:18.240 |
And now, here's my conversation with Dan Kokotov. 00:03:28.060 |
What's the greatest sci-fi novel of all time, 00:03:37.860 |
- The greatest sci-fi novel of all time is Dune, 00:03:41.220 |
and the second greatest is the Children of Dune, 00:03:44.140 |
and the third greatest is the God Emperor of Dune, so. 00:03:50.360 |
I mean, it's just an incredible world that he created. 00:03:53.980 |
And I don't know if you've read the book or not. 00:03:58.820 |
especially 'cause the new movie is coming out. 00:04:11.940 |
I guess you would call that real-time strategy. 00:04:14.340 |
- Right, right, I think I remember that game. 00:04:15.900 |
- Yeah, it was kind of awesome, '90s or something. 00:04:18.060 |
I think I played it, actually, when I was in Russia. 00:04:23.580 |
I think at the time that I used to live in Russia, 00:04:26.080 |
I think video games were at about the sophistication of Pong. 00:04:29.760 |
I think Pong was pretty much the greatest game 00:04:56.980 |
- So okay, but back to Dune, what do you get? 00:04:59.980 |
And by the way, the new movie I'm pretty interested in, 00:05:08.260 |
I don't know, so there's a David Lynch movie, Dune, 00:05:30.000 |
So in the fourth book, "God Emperor of Dune," 00:05:47.820 |
You probably saw the worms in the trailer, right? 00:05:55.100 |
and he like oppresses the people for a long time, right? 00:06:01.900 |
kind of a stagnation period in civilization, right? 00:06:05.580 |
But people have gotten too comfortable, right? 00:06:07.240 |
And so he kind of oppresses them so that they explode 00:06:14.240 |
and kind of renew the forward momentum of humanity, right? 00:06:17.800 |
And so to me, that's kind of fascinating, right? 00:06:19.780 |
You need a little bit of pressure and suffering, right, 00:06:22.880 |
to kind of make progress, not get too comfortable. 00:06:29.540 |
Maybe that's a bit of a cruel philosophy to take away, but. 00:06:47.560 |
that I do transcripts for this podcast and do captioning. 00:07:00.580 |
and it was always a pain in the ass, if I'm being honest. 00:07:21.080 |
when people take a problem and they just make it easy. 00:07:28.520 |
there's so many, it's like there's so many things in life 00:07:34.380 |
that you might not even be aware of that are painful, 00:07:37.740 |
and then Rev, you just like give the audio, give the video, 00:07:53.340 |
with the captions, all in a standardized format. 00:08:00.380 |
So I thought I'd, just for the hell of it, talk to you. 00:08:05.020 |
One other product, it just made my soul feel good. 00:08:12.980 |
is called iZotope RX, it's for audio editing. 00:08:25.940 |
and it just cleans everything up really nicely. 00:08:49.820 |
I mean, it just, I don't know, everything else sucked 00:08:53.900 |
for like voice-based cleanup that I've ever used. 00:09:00.100 |
I've used all kinds of other things with plugins, 00:09:04.700 |
you have to do it manually, here it just worked. 00:09:34.620 |
- Like, do you have the actual domain, or is it just-- 00:09:42.540 |
So we use Rev.ai to denote our ASR services, right? 00:09:50.620 |
- So it's like wordpress.com and wordpress.org, 00:09:53.180 |
they actually have separate brands that like, 00:09:55.860 |
I don't know if you're familiar with what those are. 00:09:58.420 |
They provide almost like a separate branch of-- 00:10:01.020 |
- A little bit, I think with that, it's like, 00:10:02.940 |
wordpress.org is kind of their open source, right? 00:10:09.500 |
- And with us, the differentiation is a little bit different, 00:10:17.420 |
I was gonna say, you know, like you were talking about, 00:10:35.180 |
Like our CEO, Jason, was an early employee of Upwork, 00:10:45.540 |
and he wanted to make the whole experience better, 00:10:50.140 |
at that time, Upwork was primarily programmers. 00:10:54.860 |
if you wanna hire someone to help you code a little site, 00:11:00.580 |
and you could like browse through a list of freelancers, 00:11:03.060 |
pick a programmer, have a contract with them, 00:11:09.740 |
because for you, you would kind of have to browse 00:11:16.140 |
well, is this guy good, or is somebody else better? 00:11:24.540 |
If you're an expert, you probably wouldn't be 00:11:29.820 |
So there's kind of like a lot of potential regret, right? 00:11:43.540 |
but kind of figuring out how can I make my profile 00:11:51.340 |
So like, Rob's idea was, let's remove the barrier, right? 00:11:55.340 |
We'll pick a few verticals that are fairly standardizable. 00:12:02.580 |
and then we added audio transcription a bit later. 00:12:08.660 |
We'll give you back the results as soon as possible. 00:12:15.580 |
then we made it shorter and shorter and shorter. 00:12:21.780 |
And we'll hide all the details from you, right? 00:12:45.420 |
Or is this like the focus now is translation, 00:12:50.180 |
- The focus now is language or speech services generally, 00:13:05.100 |
And so we wanted work that was done by people on a computer. 00:13:08.220 |
You know, we weren't trying to get into, you know, 00:13:13.220 |
And something that could be relatively standard, 00:13:17.100 |
So we could kind of present the simplified interface, right? 00:13:21.460 |
because each programming project is kind of unique, right? 00:13:24.740 |
We're looking for something that transcription is, 00:13:30.860 |
Translation is somewhat similar in that, you know, 00:13:43.980 |
We started with translation because we saw the need 00:13:48.340 |
and we picked up kind of a specialty of translation 00:13:53.340 |
where we would translate things like birth certificates, 00:14:01.540 |
And so they were fairly, even more well-defined 00:14:06.020 |
and easy to kind of tell if we did a good job. 00:14:08.260 |
- So you can literally charge per type of document? 00:14:21.500 |
- So now it looks like for audio transcription, right? 00:14:27.420 |
we don't really actually focus on that anymore. 00:14:30.580 |
But, you know, back when it was still a main business 00:14:41.980 |
So it would be both transcription and translation. 00:14:45.900 |
- I wanted to test the system to see how good it is, 00:14:56.700 |
- But now it's only in like the one direction, right? 00:15:02.900 |
- Got it, because I'm deeply curious about this. 00:15:07.860 |
when the economy, when the world opens up a little bit. 00:15:14.020 |
First of all, I'm allergic to the word brand. 00:15:17.980 |
I'm definitely not building any brands in Russia. 00:15:21.140 |
But I'm going to Paris to talk to the translators 00:15:26.660 |
There's this famous couple that does translation. 00:15:29.820 |
And I'm more and more thinking of how is it possible 00:15:34.820 |
to have a conversation with a Russian speaker? 00:15:37.860 |
'Cause I have just some number of famous Russian speakers 00:15:44.940 |
And my Russian is not strong enough to be witty and funny. 00:15:51.980 |
I'm an extra level of like awkward idiot in Russian, 00:16:06.540 |
Like if I, there's a guy named Grigori Perelman, 00:16:21.460 |
And then it would be like a, not to use a weird term, 00:16:28.100 |
where it's like a dance of, like I understand it one way, 00:16:50.820 |
'cause I think I could do a good job with it. 00:16:54.620 |
understanding the fundamentals of translation 00:16:59.300 |
So that's why I'm starting with the actual translators 00:17:14.180 |
So that's like a little bit of a baby project 00:17:30.420 |
not a hobby, but he had a job, like a day job, 00:17:42.620 |
First he was translating English poetry to Russian. 00:17:49.060 |
You kind of gain some small fame in that world anyways, 00:17:52.620 |
because recently this poet, Louise Glück, 00:17:59.620 |
she was awarded the Nobel Prize for literature. 00:18:14.060 |
And he kind of talked about some of the intricacies 00:18:18.060 |
So that's like an extra level of difficulty, right? 00:18:19.700 |
Because translating poetry is even more challenging 00:18:25.460 |
- Do you remember any experiences and challenges 00:18:28.500 |
to having to do the translation that stuck out to you? 00:18:34.380 |
- I mean, a lot of it I think is word choice, right? 00:18:41.620 |
Just there's inflections in Russian and genders 00:18:46.020 |
One of the reasons actually why machine translation 00:18:53.980 |
But then English has like a huge number of words, 00:18:58.500 |
So it's often difficult to find the right word 00:19:04.020 |
- Yeah, Russian language, they play with words much more. 00:19:07.620 |
So you were mentioning that Rev was kind of born 00:19:19.060 |
the freelancer marketplace idea better, right? 00:19:28.780 |
- Is there something else to the story of Rev, 00:19:32.700 |
Like what did it take to bring it to actually to life? 00:19:39.860 |
I mean, as often the case, it's with scaling it up, right? 00:19:44.020 |
And in this case, the scaling is kind of scaling 00:19:49.300 |
Rev is essentially a two-sided marketplace, right? 00:19:51.620 |
Because there's the customers and then there's the Revvers. 00:20:04.020 |
Takes longer to get your work done, things like that. 00:20:07.580 |
If there's too many, then Revvers have a bad experience 00:20:10.620 |
because they might log on to see what work is available 00:20:20.220 |
And that's like a problem we've been working on 00:20:26.020 |
- If you can kind of talk to this gig economy idea, 00:20:29.660 |
I did a bunch of different psychology experiments 00:20:33.540 |
I've asked to do different kinds of very tricky 00:20:36.300 |
computer vision annotation on Mechanical Turk 00:20:38.580 |
and it's connecting people in a more systematized way. 00:20:53.740 |
What do you think about this world of gig economies, 00:20:57.180 |
of there being a service that connects customers to workers 00:21:10.300 |
it could be scaled to like tens of thousands of people, 00:21:14.060 |
Is there something interesting about that world 00:21:18.260 |
- Yeah, well, we don't think of it as kind of gig economy, 00:21:34.180 |
but in work, it kind of sounds like it's frivolous. 00:21:39.180 |
To us, it's improving the nature of working from home 00:21:45.220 |
on your own time and on your own terms, right? 00:21:48.060 |
And kind of taking away geographical limitations 00:21:54.220 |
So, many of our freelancers are maybe work from home moms, 00:21:59.580 |
And they don't want the traditional nine to five job, 00:22:06.260 |
and decide like exactly how much to work and when to work. 00:22:22.500 |
And like, generally that wouldn't be compatible 00:22:25.020 |
before this new world, you kind of had to choose. 00:22:48.220 |
Most of them are in the US, that's the majority. 00:22:51.500 |
Yeah, because most of our work is audio transcription 00:22:54.900 |
and so you have to speak pretty good English. 00:23:03.220 |
And as far as like US, it's really all over the place. 00:23:10.460 |
where the management team will go to some place 00:23:19.140 |
You know, the most recent one we did is in Utah. 00:23:30.220 |
Like I said, you know, one category is, you know, 00:23:43.700 |
And this is one way for them to make a living. 00:24:03.420 |
So it really is a pretty wide variety of folks. 00:24:13.100 |
because like some are clearly like weirdly knowledgeable 00:24:25.260 |
at like capitalizing stuff, like technical terms, 00:24:35.460 |
the deep learning lectures or machine learning lectures 00:24:39.860 |
And it's funny, like a large number of them were like, 00:24:47.260 |
but they do a really good job at like, I don't know. 00:24:52.340 |
They will like do research, they will Google things, 00:24:54.980 |
you know, to kind of make sure they get it right. 00:24:59.060 |
it's actually part of the enjoyment of the work. 00:25:07.380 |
And I learned something while I'm transcribing, right? 00:25:11.740 |
So what's that captioning transcription process 00:25:16.180 |
Can you maybe speak to that to give people a sense, 00:25:18.940 |
like how much is automated, how much is manual? 00:25:28.380 |
a pretty good amount of time to give like our Revvers 00:25:37.100 |
they'll see a list of audios that need to be transcribed. 00:25:41.380 |
And we try to give them tools to pick specifically 00:25:44.460 |
So maybe some people like to do longer audios 00:25:52.300 |
Some people like to do audios in a particular subject 00:26:01.060 |
And then when they pick what they want to do, 00:26:04.580 |
we'll launch a specialized editor that we've built 00:26:07.460 |
to make transcription as efficient as possible. 00:26:12.340 |
So, you know, we have our machine learning model 00:26:18.500 |
And then our tools are optimized to help them correct that. 00:26:24.940 |
- Yeah, it depends on, you know, I would say the audio. 00:26:36.780 |
But if you imagine someone recorded a lecture, you know, 00:26:47.340 |
and there's maybe a lot of crosstalk and things like that, 00:26:57.620 |
- What would you say is the speed that you can possibly get? 00:27:05.220 |
As you're like listening, can you write as fast as- 00:27:10.420 |
It's actually a pretty, it's not an easy job. 00:27:12.940 |
You know, we actually encourage everyone at the company 00:27:19.020 |
And it's way harder than you might think it is, right? 00:27:24.060 |
Because people talk fast and people have accents 00:27:32.580 |
Like there's somebody, we're probably gonna use Rev 00:27:46.380 |
maybe two to three X, I would say, real time. 00:27:56.740 |
I could just imagine the Revvers working on this right now. 00:28:11.180 |
Can you speak to what is ASR, automatic speech recognition? 00:28:26.660 |
- Yeah, so ASR, automatic speech recognition, 00:28:28.780 |
it's a class of machine learning problem, right? 00:28:34.220 |
and transform it into a sequence of words, essentially. 00:28:40.540 |
And there's a variety of different approaches and techniques 00:28:47.100 |
So we think we have pretty much the world's best ASR 00:28:54.020 |
So there's different kinds of domains, right, for ASR. 00:28:56.940 |
Like one domain might be voice assistance, right? 00:29:00.220 |
So Siri, very different than what we're doing, right? 00:29:04.220 |
Because Siri, there's fairly limited vocabulary. 00:29:18.220 |
And Siri will also generally adapt to your voice 00:29:21.420 |
So for this kind of audio, we think we have the best. 00:29:29.420 |
it's maybe 14% word error rate on our test suite 00:29:35.340 |
So word error rate is like one way to measure 00:30:12.300 |
Substitutions is, you said Apple, but I said, 00:30:15.740 |
but the ASR thought it was Able, something like this. 00:30:18.440 |
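For concreteness, here is a minimal, generic sketch of how word error rate is usually computed: the word-level edit distance (substitutions, insertions, deletions) divided by the number of words in the reference. This is only an illustration of the standard metric, not Rev's evaluation code, and the example sentences are made up.

```python
# Rough sketch of the standard word error rate (WER) computation:
# WER = (substitutions + insertions + deletions) / words in the reference.
# Generic illustration only -- not Rev's internal evaluation code.

def word_error_rate(reference: str, hypothesis: str) -> float:
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    # Edit-distance table: d[i][j] = edits to turn ref[:i] into hyp[:j].
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i          # deletions
    for j in range(len(hyp) + 1):
        d[0][j] = j          # insertions
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + sub)  # substitution or match
    return d[len(ref)][len(hyp)] / max(len(ref), 1)

# Example: one substitution ("able" for "apple") in a four-word reference -> 0.25 (25% WER).
print(word_error_rate("I said apple today", "I said able today"))
```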
Human accuracy, most people think realistically, 00:30:33.380 |
when I upload videos often generates automatic captions. 00:30:39.740 |
from a tech perspective, are you trying to beat YouTube? 00:30:47.180 |
I mean, I don't know how seriously they take this task, 00:30:51.860 |
And they, you know, Google is probably up there 00:31:20.580 |
- Yeah, I mean, we measure ourselves against like Google, 00:31:23.140 |
Amazon, Microsoft, you know, some smaller competitors. 00:31:30.300 |
We try to compose it of a pretty representative set of 00:31:33.060 |
audios, maybe it's some podcasts, some videos, 00:31:36.380 |
some interviews, some lectures, things like that, right? 00:31:45.940 |
like you can actually just do the automated captioning. 00:31:49.300 |
So like, I guess it's like way cheaper, whatever it is, 00:31:55.660 |
- By the way, it used to be a dollar per minute 00:32:00.100 |
I think it's like a dollar 15 or something like that. 00:32:09.260 |
That was the other thing that was surprising to me. 00:32:10.940 |
It was actually like the cheapest thing you could, 00:32:26.900 |
I think there were services that you can get like similar 00:32:30.900 |
to Rev kind of feel to it, but it wasn't as automated. 00:32:35.820 |
Like the drag and drop, the entirety of the interface. 00:33:04.380 |
So I think I've, I stopped using this pipeline, 00:33:12.460 |
but it was causing me some issues on my side, 00:33:22.780 |
So like it closes the loop to where I don't have to go 00:33:36.300 |
- Right, you put in your Dropbox and you know, 00:33:41.900 |
- Depending on if you're in a rush, it just shows up, yeah. 00:33:53.460 |
but then I realized this is the programmer in me. 00:33:56.180 |
Like, dude, you don't need to automate everything 00:34:01.140 |
'Cause I wasn't doing enough captions to justify 00:34:07.820 |
- Yeah, I would say if you're doing so many interviews 00:34:15.500 |
Now you're talking about Elon Musk levels of business. 00:34:18.980 |
- But for sure we have like a variety of ways 00:34:24.180 |
You mentioned, I think it's through a company called Zapier, 00:34:26.220 |
which kind of can connect Dropbox to Rev and vice versa. 00:34:31.140 |
We have an API if you wanna really like customize it, 00:34:33.460 |
you know, if you wanna create the Lex Fridman, 00:34:42.300 |
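As a rough sketch of the kind of API customization Dan alludes to, here is how submitting audio to a speech-to-text service such as Rev.ai's asynchronous jobs API might look. The endpoint paths, field names, status values, token, and media URL below are assumptions from memory rather than anything stated in this conversation; check the current API documentation before relying on them.

```python
# Hypothetical sketch of driving an asynchronous speech-to-text REST API
# like Rev.ai's. Paths, fields, and statuses are assumptions -- verify
# against the real docs.
import time
import requests

API_TOKEN = "YOUR_ACCESS_TOKEN"                 # placeholder
HEADERS = {"Authorization": f"Bearer {API_TOKEN}"}
BASE = "https://api.rev.ai/speechtotext/v1"     # assumed base URL

# 1. Submit a job pointing at a publicly reachable audio file.
job = requests.post(f"{BASE}/jobs",
                    headers=HEADERS,
                    json={"media_url": "https://example.com/episode.mp3"}).json()

# 2. Poll until the job finishes (a webhook would be nicer in production).
while True:
    status = requests.get(f"{BASE}/jobs/{job['id']}", headers=HEADERS).json()
    if status.get("status") in ("transcribed", "failed"):
        break
    time.sleep(30)

# 3. Fetch the finished transcript as plain text.
transcript = requests.get(f"{BASE}/jobs/{job['id']}/transcript",
                          headers={**HEADERS, "Accept": "text/plain"})
print(transcript.text)
```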
So can you speak to the ASR a little bit more? 00:34:51.460 |
machine learning-wise, how hard is this problem? 00:34:59.340 |
- Yeah, well, the 3% error rate is definitely, 00:35:18.340 |
The more data you have and the higher quality of the data, 00:35:26.540 |
And we at Rev have kind of like the best data, like we have. 00:35:34.020 |
- Our business model is being paid to annotate the data. 00:35:39.140 |
- So it's kind of like a pretty magical flywheel. 00:35:42.900 |
- And so we've kind of like ridden this flywheel 00:35:47.060 |
And we think we're still kind of in the early stages 00:35:50.540 |
of figuring out all the parts of the flywheel to use, 00:35:53.100 |
you know, because we have the final transcripts 00:36:01.660 |
But we, in principle, also have all the edits 00:36:10.540 |
- We basically, that's something for us to figure out 00:36:14.380 |
we feel like we're only in the early stages, right? 00:36:16.300 |
- So the data is there, that'd be interesting, 00:36:23.380 |
I always remember we did a segmentation annotation 00:36:28.380 |
for driving data, so segmenting the scene, like visual data. 00:36:35.980 |
people were drawing polygons around different objects 00:36:40.060 |
And it feels like, it always felt like there was a lot 00:36:47.020 |
the kind of fixing of the polygons that they do. 00:36:49.460 |
Now there's a few papers written about how to draw polygons, 00:36:54.860 |
like with recurrent neural nets to try to learn 00:36:59.220 |
from the human clicking, but it was just like experimental, 00:37:04.380 |
you know, it was one of those like CVPR type papers 00:37:08.980 |
It didn't feel like people really tried to do it seriously. 00:37:13.100 |
And I wonder, I wonder if there's information 00:37:15.140 |
in the fixing that provides deeper set of signal 00:37:24.460 |
- The intuition is for sure there must be, right? 00:37:27.060 |
- And in all kinds of signals and how long you took 00:37:40.340 |
You mentioned Rev.ai, that's where you want to. 00:37:43.340 |
- Yeah, so Rev.ai is kind of our way of bringing this ASR 00:38:00.660 |
which was kind of ASR for the consumer, right? 00:38:04.580 |
but you want to pay, now it's 25 cents a minute, I think. 00:38:12.740 |
you get an editor and you can kind of fix it up yourself. 00:38:17.460 |
Then we started using ASR for human transcriptionists. 00:38:21.980 |
And then the kind of Rev.ai is the final step 00:38:23.460 |
of the journey, which is, we have this amazing engine. 00:38:28.860 |
What kind of new applications could be enabled 00:38:43.500 |
and kind of learning from what people do with it. 00:38:45.580 |
And we have ideas of our own as well, of course, 00:38:49.220 |
when AWS provided the building blocks, right? 00:38:53.940 |
and they try to make it easier to build those things, right? 00:38:59.180 |
- Although AWS kind of does a shitty job of like, 00:39:02.860 |
I'm continually surprised, like Mechanical Turk, 00:39:07.780 |
We're talking about like Rev.ai making me feel good. 00:39:11.140 |
Like when I first discovered Mechanical Turk, 00:39:27.740 |
Does nobody at Amazon wanna like seriously invest in it? 00:39:37.260 |
And it feels like they have a committee of like two people 00:39:43.980 |
like what are we gonna do with Mechanical Turk? 00:39:46.540 |
It's like two websites make me feel like this, 00:39:49.260 |
that and craigslist.org, whatever the hell it is. 00:39:55.940 |
- Well, craigslist basically hasn't been updated 00:40:01.860 |
like how big is the team working on Mechanical Turk? 00:40:09.460 |
- Yeah, well, if nothing else, they benefit from, 00:40:13.500 |
you know, the other teams like moving things forward, 00:40:19.740 |
we use Mechanical Turk for a couple of things as well, 00:40:25.700 |
- I think most people, the thing is most people 00:40:29.140 |
Like, so like we, for example, we use it through the API. 00:40:37.540 |
I don't even know what to, I mean, same criticism, 00:40:45.100 |
as long as we're ranting, my same criticism goes 00:40:50.940 |
like Google, for example, the API for the different services 00:40:58.980 |
Like, it's not so shitty, I should actually be, 00:41:10.900 |
The, you know, the documentation is pretty good. 00:41:14.340 |
Like most of the things that the API makes available 00:41:19.700 |
It's just that in the sense that it's accurate, 00:41:23.100 |
sometimes outdated, but like the degree of explanations 00:41:27.260 |
with examples is only covering, I would say like 50% 00:41:33.900 |
And it just feels a little bit like there's a lot 00:41:36.300 |
of natural questions that people would wanna ask 00:41:44.580 |
Like it's such a magical thing, like the Maps API, 00:42:05.860 |
I sometimes think about this for somebody who wants 00:42:15.820 |
You know, YouTube, the service is one of the most magical, 00:42:24.500 |
And yet they seem to be quite clueless on so many things 00:42:33.420 |
Like it feels like whatever the mechanism that you use 00:42:44.820 |
Like there's literally people that are like screaming, 00:42:51.060 |
There's like features that were like begged for, 00:42:56.940 |
like being able to upload multiple videos at the same time. 00:43:00.180 |
That was missing for a really, really long time. 00:43:03.940 |
Now, like there's probably things that I don't know, 00:43:08.020 |
which is maybe for that kind of huge infrastructure, 00:43:10.980 |
it's actually very difficult to build some of these features. 00:43:23.860 |
And it feels like the company doesn't give a damn about you. 00:43:30.020 |
That might have to do with just like small groups 00:43:32.540 |
working on these small features and these specific features. 00:43:35.940 |
And there's no overarching like dictator type of human 00:43:40.340 |
that says like, why the hell are we neglecting 00:43:43.940 |
Like there's people that we need to speak to the people 00:43:48.940 |
that like wanna love our product and they don't. 00:43:52.540 |
at some point you just get so fixated on the numbers. 00:43:54.900 |
And it's like, well, the numbers are pretty great. 00:44:01.980 |
- And you're not like the person that like build this thing. 00:44:05.860 |
You're just there, you came in as a product manager. 00:44:10.700 |
your mandate is like increase this number like 10%, right? 00:44:17.540 |
Like if you, this is, okay, if there's a lesson in this, 00:44:21.380 |
is don't reduce your company into a metric of like, 00:44:27.820 |
how much people watching the videos and so on, 00:44:31.020 |
and like convince yourself that everything is working 00:44:36.220 |
There's something, you have to have a vision. 00:44:49.260 |
is that they always should love your product. 00:44:52.620 |
and have that like creator's love for your own thing, right? 00:44:55.420 |
Like, and you're pained by, you know, these comments, right? 00:44:59.580 |
And probably like, Apple, I think did this generally 00:45:03.860 |
- They're well known for kind of keeping teams small, 00:45:10.380 |
like there's that book, "Creative Selection." 00:45:12.700 |
I don't know if you read it by an Apple engineer 00:45:18.300 |
because unlike most of these business books where it's, 00:45:21.420 |
you know, here's how Steve Jobs ran the company. 00:45:24.580 |
It's more like, here's how life was like for me, 00:45:29.020 |
and here what it was like to pitch Steve Jobs, you know, 00:45:31.660 |
on like, you know, I think it was in charge of like 00:45:36.860 |
And at Apple, like Steve Jobs reviewed everything. 00:45:41.140 |
to show my demos to Steve Jobs and, you know, 00:45:43.740 |
to change them because like Steve Jobs didn't like how, 00:45:46.580 |
you know, the shape of the little key was off 00:45:48.780 |
because the rounding of the corner was like not quite right 00:45:50.900 |
or something like this, but he was famously a stickler 00:45:58.660 |
- Yeah, Elon Musk does that similar kind of thing with Tesla, 00:46:07.660 |
And like, he talks to like the lowest level engineers. 00:46:24.980 |
- Yeah, and kind of try to build this platform 00:46:30.340 |
where all of your, you know, all of your meetings, 00:46:33.940 |
you know, they're as easily accessible as your notes, right? 00:46:42.780 |
Now that I'm like no longer a programmer, right? 00:46:48.100 |
that's less like my day is in meetings, right? 00:46:51.460 |
And, you know, pretty often I wanna like see what was said, 00:46:54.860 |
right, who said it, you know, what's the context. 00:47:03.220 |
were indexed, archived, you know, you could go back, 00:47:05.780 |
you could share a clip like really easily, right? 00:47:10.060 |
Like everything that's said converted to text 00:47:22.740 |
I mean, for me, I care about podcasts, right? 00:47:25.580 |
And one of the things that was, you know, I'm torn. 00:47:33.580 |
So I love them very much because they dream big 00:47:38.580 |
in terms of like, they wanna empower creators. 00:47:46.780 |
or something like that to start converting everything 00:47:55.180 |
Like one of the things that sucks with podcasts 00:48:04.460 |
Like you find, it's similar to what YouTube used to be like, 00:48:09.460 |
which is you basically find a creator that you enjoy 00:48:14.220 |
and you subscribe to them and like, you just, 00:48:19.700 |
But the search and discovery wasn't a big part of YouTube 00:48:28.500 |
like is the search and discovery is like non-existent. 00:48:36.420 |
which is like keywords in the titles of episodes. 00:48:48.580 |
because I was trying to, I'm trying to remember, 00:48:54.700 |
Maybe like some people have pretty good show notes. 00:48:56.780 |
So maybe you'll get lucky and you can find it, right? 00:49:05.300 |
- I mean, that's one of the things that I wanted to, 00:49:08.460 |
I mean, one of the reasons we're talking today 00:49:21.220 |
that there's enough money now to do a transcription 00:49:36.420 |
who are like graduate students in computer science 00:49:38.340 |
or graduate students in whatever the heck field. 00:49:43.140 |
like they enjoy podcasts when they're doing laundry 00:49:45.220 |
or whatever, but they wanna revisit the conversation 00:49:53.340 |
It's clear that they want to like analyze conversations. 00:49:56.780 |
So many people wrote to me about a transcript 00:50:08.260 |
they wanna write a blog post about your conversation. 00:50:15.500 |
on your conversation transcription privately. 00:50:18.340 |
They're doing it for themselves and then starting to pick, 00:50:21.780 |
but it's so much easier when you can actually do it 00:50:26.180 |
- Yeah, and you can like embed a little thing, 00:50:30.500 |
You can go listen to like this clip from the section. 00:50:35.940 |
I'll probably on the website create like a place 00:50:52.740 |
that are complete falsifying, which I'm fine with. 00:50:57.740 |
Like I've had this conversation with a friend of mine, 00:51:07.100 |
as I've been reading "The Rise and Fall of the Third Reich" 00:51:11.820 |
And we brought up Hitler and he made some kind of comment 00:51:16.360 |
where like we should be able to forgive Hitler. 00:51:19.580 |
And, you know, like we were talking about forgiveness 00:51:26.760 |
Like even, you know, for people who are Holocaust survivors, 00:51:43.340 |
but it might be a worthwhile pursuit psychologically. 00:51:50.860 |
I think people should go back and listen to it. 00:51:55.680 |
all these articles written about like MMA fight, 00:52:01.920 |
- No, like, well, no, they were somewhat accurate. 00:52:07.120 |
They said, thinks that if Hitler came back to life 00:52:14.440 |
Like they kind of, it's kind of accurate-ish, 00:52:18.520 |
but it, the headline made it sound a lot worse 00:52:29.740 |
I wanna almost make it easier for those journalists 00:52:32.680 |
and make it easier for people who actually care 00:52:34.880 |
about the conversation to go and look and see. 00:52:42.960 |
like the audio that makes it difficult to go, 00:52:53.200 |
I think some of it, you know, I'm interested in creating 00:53:05.160 |
I do dream that like, I'm not in the loop anymore, 00:53:24.920 |
- Like from the tool makers and the podcast creators, 00:53:35.800 |
Here's how you include a transcript into a podcast, right? 00:53:40.680 |
And actually just yesterday I saw this company called 00:53:48.320 |
They proposed a spec, an extension to their RSS format 00:53:59.080 |
there's one client they mentioned that will support it, 00:54:02.160 |
but imagine like more clients support it, right? 00:54:04.040 |
So any podcast you could go and see the transcripts, right? 00:54:29.280 |
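To make that idea concrete, here is a hedged sketch of what attaching a transcript to an episode inside the feed itself could look like, loosely modeled on the podcast:transcript tag from the Podcast Index / Podcasting 2.0 namespace proposal referenced here. The namespace URI, attribute names, and URLs are assumptions for illustration, not the company's actual spec.

```python
# Hedged sketch: adding a transcript reference to a podcast RSS item,
# roughly in the spirit of the proposed "podcast:transcript" extension.
# Namespace URI and attributes are assumptions -- consult the real spec.
import xml.etree.ElementTree as ET

PODCAST_NS = "https://podcastindex.org/namespace/1.0"   # assumed namespace URI
ET.register_namespace("podcast", PODCAST_NS)

item = ET.Element("item")
ET.SubElement(item, "title").text = "Episode 151: Speech Recognition"
ET.SubElement(item, "enclosure",
              {"url": "https://example.com/ep151.mp3", "type": "audio/mpeg"})

# The transcript is just a URL plus a MIME type; a player that understands
# the extension can fetch it and display text alongside the audio.
ET.SubElement(item, f"{{{PODCAST_NS}}}transcript",
              {"url": "https://example.com/ep151.srt", "type": "application/srt"})

print(ET.tostring(item, encoding="unicode"))
```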
This is where I'd like to say like F Spotify. 00:54:41.320 |
there's this ecosystem of different podcast players 00:54:59.600 |
I've been kind of nervous about the whole thing, 00:55:06.000 |
which is very surprising that they were able to-- 00:55:20.180 |
I, you know, and Spotify gave him $100 million for that. 00:55:40.960 |
I have a very different relationship with money. 00:55:46.120 |
I believe in the pirate radio aspect of podcasting, 00:55:52.320 |
- The open source spirit, it just doesn't seem right. 00:55:57.320 |
because so many people care about Joe Rogan's program, 00:56:00.480 |
they're gonna hold Spotify's feet to the fire. 00:56:05.040 |
what Joe told me is the reason he likes working with Spotify 00:56:10.040 |
is that they're like ride or die together, right? 00:56:19.160 |
So that's why they're not actually telling him what to do, 00:56:36.800 |
And that process can actually be very fruitful. 00:56:43.400 |
YouTube generally, no matter how big the creator, 00:57:07.600 |
So, and especially with somebody like Joe Rogan, 00:57:15.080 |
not as a person who might revolutionize the nature of news 00:57:34.440 |
So, you know, a lot of people talk about this. 00:57:37.960 |
It's a hard place to be for YouTube actually, 00:57:40.600 |
is figuring out with the search and discovery process 00:57:49.040 |
and which conspiracy theories represent dangerous untruths 00:57:53.360 |
and which conspiracy theories are like vanilla untruths. 00:58:09.720 |
Of too much, you know, too much not censorship. 00:58:14.640 |
I mean, censorship is usually government censorship, 00:58:17.960 |
but still, yeah, putting yourself in a position of arbiter 00:58:25.420 |
Like, it's like, well, you know, like no Nazis, right? 00:58:29.100 |
But, you know, yes, I mean, no one likes Nazis. 00:58:37.400 |
- Yeah, and then, you know, of course everybody, 00:58:42.240 |
And then there's like, so you start getting Sam Harris. 00:58:45.720 |
I don't know if you know who that is, wasted, in my opinion, 00:58:51.640 |
Now, I spoke with Jack before on this podcast, 00:58:55.680 |
But Sam brought up, Sam Harris does not like Donald Trump. 00:59:08.960 |
how can you not ban Donald Trump from Twitter? 00:59:20.520 |
think that the current president of the United States 00:59:24.920 |
And it's like, okay, so if that's even on the table 00:59:28.060 |
as a conversation, then everything's on the table 00:59:37.040 |
I'm with you, I think that censorship is bad, 00:59:44.000 |
if you're the kind of person that's gonna be convinced, 00:59:49.460 |
that, I don't know, our government's been taken over 00:59:53.160 |
by aliens, it's unlikely that, like, you know, 00:59:56.120 |
you'll be returned to sanity simply because, you know, 00:59:59.000 |
that video is not available on YouTube, right? 01:00:02.860 |
I tend to believe in the intelligence of people 01:00:07.000 |
But I also do think it's the responsibility of platforms 01:00:24.280 |
And I think philosophically, I think about that a lot. 01:00:50.700 |
Like, you can't really put that in the constitution 01:00:56.100 |
But I feel like platforms have a role to be like, 01:01:00.860 |
Maybe do the carrot, like encourage people to be nicer 01:01:06.820 |
But I think it's an interesting machine learning problem. 01:01:12.040 |
- Machine, yeah, machine learning for niceness. 01:01:16.640 |
- Responsible AI, I mean, it is a thing for sure. 01:01:20.160 |
- Jack Dorsey kind of talks about it as a vision for Twitter 01:01:23.760 |
is how do we increase the health of conversations? 01:01:30.800 |
which is one of the reasons I am in part considering 01:01:42.000 |
people are kind of driven by rage and outrage maybe 01:01:53.080 |
these companies are judged by engagement, right? 01:01:56.040 |
- In the short term, but this goes to the metrics thing 01:01:59.360 |
I do believe, I have a fundamental belief that 01:02:03.480 |
if you have a metric of long-term happiness of your users, 01:02:11.480 |
and both like intellectual, emotional health of your users, 01:02:18.800 |
like you should be able to optimize for that. 01:02:21.360 |
You don't need to necessarily optimize for engagement. 01:02:26.360 |
- Yeah, no, I mean, I generally agree with you, 01:02:31.960 |
trust from Wall Street to be able to carry out 01:02:36.840 |
- This is what I believe the Steve Jobs character 01:02:41.920 |
you basically have to be so good at your job. 01:02:48.680 |
and all the investors hostage by saying like, 01:03:06.680 |
- There's kind of a reason why like a third name 01:03:10.840 |
Like there's maybe a handful of other people, 01:03:15.480 |
like people say that I'm like a fan of Elon Musk. 01:03:27.640 |
that like we can get to Mars, you know, in 10 years, right? 01:03:32.480 |
- And it's kind of making it happen, which is like. 01:03:37.480 |
- It's kind of gone like that kind of like spirit, right? 01:03:42.280 |
You know, like we can get to the moon in 10 years 01:03:45.720 |
- Yeah, especially in this time of so much kind of 01:03:50.720 |
existential dread that people are going through 01:03:56.680 |
that just keep going out there now with humans. 01:04:03.240 |
I mean, it gives you a reason to wake up in the morning 01:04:13.200 |
Well, let me ask you this, the worst possible question, 01:04:17.160 |
which is, so you're like at the core, you're a programmer, 01:04:21.400 |
you're an engineer, but now you made the unfortunate choice 01:04:30.760 |
of basically moving away from the low level work 01:04:35.160 |
and becoming a manager, becoming an executive, 01:04:38.120 |
having meetings, what's that transition been like? 01:04:43.120 |
- It's been interesting, it's been a journey. 01:04:49.320 |
Because as a kid, I just remember this like incredible 01:04:54.320 |
amazement at being able to write a program, right? 01:04:57.400 |
And something comes to life that kind of didn't exist before. 01:05:01.280 |
I don't think you have that in like many other fields. 01:05:03.960 |
Like you have that with some other kinds of engineering, 01:05:10.720 |
But with a computer, you can literally imagine 01:05:21.360 |
- Do you remember like first program you wrote 01:05:23.240 |
or maybe the first program that like made you fall in love 01:05:29.440 |
It's probably like trying to write one of those games 01:05:31.880 |
in BASIC, you know, like emulate the snake game or whatever. 01:05:35.400 |
I don't remember to be honest, but I enjoyed like, 01:05:40.000 |
being a programmer is just the creation process. 01:05:41.840 |
And it's a little bit different when you're not the one 01:05:46.200 |
And, you know, another aspect to it I would say is, 01:05:54.200 |
it's kind of very easy to know when you're doing a good job, 01:06:00.400 |
You can kind of see like you trying to make something 01:06:05.560 |
And when you're a manager, you know, it's more diffuse, 01:06:09.720 |
Like, well, you hope, you know, you're motivating your team 01:06:12.760 |
and making them more productive and inspiring them, right? 01:06:15.920 |
But it's not like you get some kind of like dopamine signal 01:06:18.920 |
because you like completed X lines of code, you know, today. 01:06:22.440 |
So kind of like you missed that dopamine rush a little bit 01:06:38.200 |
Like a ripple effect of somebody else's dopamine rush. 01:06:44.560 |
- So are there pain points and challenges you had to overcome 01:06:50.760 |
in going from a programmer to becoming a manager? 01:06:58.360 |
I don't know, humans are difficult to understand, you know? 01:07:03.680 |
like trying to understand other people's motivations 01:07:08.240 |
It's difficult, maybe like never really know, right? 01:07:21.560 |
I found that like some people I could like scream at 01:07:35.960 |
And there's some people that I had to nonstop compliment 01:07:39.840 |
because like they're so already self-critical, 01:07:51.520 |
because they're already criticizing themselves. 01:07:59.960 |
like how that, the complete difference in people. 01:08:04.120 |
- Definitely people will respond to different motivations 01:08:13.720 |
which for some reason now the name escapes me, 01:08:16.080 |
about management, "First, Break All the Rules." 01:08:20.920 |
It's a book that we generally like ask a lot of 01:08:31.120 |
Which is, don't like have some standard template, 01:08:34.480 |
like here's how I tell this person to do this 01:08:38.560 |
or the other thing, here's how I get feedback, 01:08:42.800 |
you have to try to understand what drives them 01:08:48.920 |
I don't know if you can answer this question, 01:08:52.480 |
which is, are there books, technical, fiction, 01:08:58.560 |
or had an impact on your life that you would recommend? 01:09:01.360 |
You already mentioned "Dune," like all of the "Dune." 01:09:06.760 |
but anyway, so yeah, all of the "Dune" is good. 01:09:09.800 |
- I mean, yeah, can you just slow little tangent on that? 01:09:16.320 |
Like, do you recommend people start with the first one 01:10:09.000 |
So one is "Brave New World" by Aldous Huxley. 01:10:13.440 |
And it's kind of incredible how prescient he was 01:10:25.200 |
You kind of see a genetic sorting in this book, 01:10:35.800 |
Like you can kind of see it in a slightly similar way today 01:10:39.080 |
where, well, one of the problems with society 01:10:42.120 |
is people are kind of genetically sorting a little bit. 01:10:49.040 |
between people of similar kind of intellectual level 01:11:01.720 |
he illustrated what that could be like in the extreme. 01:11:05.880 |
- Different versions of it on social media as well. 01:11:28.240 |
- Yeah, it's a slightly different view of the future, right? 01:11:33.660 |
Speaking of, not a book, but my favorite kind of 01:11:39.920 |
dystopian science fiction is a movie called "Brazil," 01:11:44.160 |
- I've heard of it and I know I need to watch it, 01:11:46.360 |
but yeah, 'cause it's in, is it in English or no? 01:12:05.720 |
but no one is kind of like willing to challenge it. 01:12:10.040 |
It kind of strikes me as like a very plausible future 01:12:13.680 |
of like, you know, what authoritarians might look like. 01:12:21.880 |
It's just kind of like this badly functioning, you know, 01:12:30.080 |
- Yeah, that's one funny thing that stands out to me 01:12:33.520 |
is in what is this, authoritarian dystopian stuff, 01:12:44.480 |
government is almost always exceptionally competent. 01:12:55.480 |
Like, you know, you use it whether it's good or evil, 01:13:01.840 |
where much more realistically is incompetence, 01:13:06.440 |
and that incompetence is itself has consequences 01:13:13.200 |
Like, bureaucracy has a very boring way of being evil. 01:13:21.400 |
HBO show "Chernobyl," it's a really good story 01:13:34.280 |
in any one particular place, but more just like the-- 01:13:43.240 |
that people unwilling to take responsibility for things, 01:13:46.040 |
and just kind of like this laziness resulting in evil. 01:14:02.440 |
about, you know, Hitler and, you know, so on. 01:14:11.200 |
but like, I just feel like there's not enough good movies, 01:14:21.480 |
but even Hitler, there's a movie called "Downfall" 01:14:35.600 |
there's not good movies about the evil of Stalin. 01:14:40.720 |
I actually, so I agree with you on "Inglourious Basterds", 01:14:44.560 |
because I felt like kind of the stylizing of it, right? 01:14:50.040 |
The whole like Tarantino kind of Tarantinoism, 01:14:56.160 |
and made it seem like unserious a little bit. 01:15:02.280 |
Maybe it's because it's a comedy to begin with, 01:15:03.880 |
so it's not like I'm expecting, you know, seriousness, 01:15:13.360 |
I mean, it was funny, so maybe it does make light of it, 01:15:15.320 |
but it, some degree, it's probably like this, right? 01:15:18.240 |
Like a bunch of kind of people that are like, oh shit, 01:15:25.480 |
it was so close to like what probably was reality, 01:15:35.520 |
to where I think an observer might think that this is not, 01:15:48.840 |
I mean, I guess it was too close to reality for me. 01:15:53.840 |
- The kind of banality of like what were eventually 01:15:59.520 |
But like, yeah, they're just a bunch of people 01:16:12.520 |
I think it's Gary Oldman, I may be making that up, 01:16:16.320 |
like he was nominated for an Oscar or something. 01:16:18.040 |
So I like, I love these movies about these humans 01:16:24.880 |
the HBO show that there's not enough movies about Russia 01:16:35.720 |
but the fact that some British dude that like did comedy, 01:16:39.400 |
I feel like he did like "Hangover" or some shit like that. 01:16:51.960 |
and then got it so accurate, like poetically. 01:17:20.240 |
and in World War II itself, like Stalingrad and so on. 01:17:23.800 |
Like, I feel like that story needs to be told. 01:17:30.160 |
And to me, it's so much more fascinating than Hitler 01:17:32.880 |
'cause Hitler is like a caricature of evil almost 01:17:44.080 |
that something like that is possible ever again. 01:17:47.640 |
Stalin to me represents something that is possible. 01:17:52.640 |
Like the so interesting, like the bureaucracy of it, 01:17:56.800 |
it's so fascinating that it potentially might be happening 01:18:01.240 |
in the world now, like that we're not aware of, 01:18:08.320 |
and like the possible things that could be happening 01:18:13.160 |
I don't know, there's a lot of possibilities there, 01:18:18.360 |
I guess the archives should be maybe more open nowadays. 01:18:20.480 |
I mean, for a long time, they just, we didn't know, right? 01:18:25.960 |
- Well, there's a, I don't know if you know him, 01:18:29.520 |
He is a historian of Stalin that I spoke to on this podcast. 01:18:46.240 |
like he knows Stalin better than Stalin knew himself. 01:19:09.800 |
English biography of Putin, I need to read some Russians. 01:19:30.720 |
- But I actually haven't even thought about that. 01:19:40.200 |
but I try not to think about questions until last minute. 01:19:48.680 |
And so that's why I'm soaking in a lot of stuff, 01:20:07.000 |
he's much closer to like mob morality, which is like. 01:20:18.160 |
It's a little bit like, you know, Hannibal, right? 01:20:21.360 |
Like if you ever watched the show Hannibal, right? 01:20:34.080 |
who's a character like extreme empath, right? 01:20:36.320 |
So in the way he like catches all these killers, 01:20:49.440 |
spending half your time in the head of evil people, right? 01:20:54.240 |
- I mean, I definitely try to do that with other, 01:21:12.240 |
- What's his name, Chris Lattner, who's a Google, 01:21:12.240 |
He's one of the most legit engineers I've talked with. 01:21:23.400 |
and one of the, he gives me private advice a lot, 01:21:26.280 |
and he said for this podcast, I should like interview, 01:21:34.640 |
because that gives you much more freedom to do stuff. 01:21:38.200 |
Like, so his idea, which I think I agree with Chris, 01:21:50.440 |
And it's kinda, I think there's a safe place for that. 01:21:53.960 |
There's certainly a hunger for that nuanced conversation, 01:21:56.680 |
I think, amongst people, where like on social media, 01:22:00.440 |
you get canceled for anything slightly tense, 01:22:08.440 |
And it's like demystifies it a little bit, right? 01:22:11.600 |
- There is a person behind all of these things. 01:22:15.120 |
- And that's the cool thing about podcasting, 01:22:19.320 |
that it's very different than a clickbait journalism. 01:22:24.120 |
It's like the opposite, that there's a hunger for that. 01:22:29.480 |
how many people do you even see face to face anymore? 01:22:33.320 |
It's like not that many people, like in my day to day, 01:22:36.080 |
aside from my own family, that like I sit across. 01:22:52.280 |
There's somebody who is just smoked some weed, 01:22:58.480 |
I guarantee you that we'll write in the comments right now 01:23:00.760 |
that yes, I'm in St. Petersburg, I'm in Moscow, I'm whatever. 01:23:05.040 |
And we're in their head and they have a friendship with us. 01:23:10.040 |
I'm the same way, I'm a huge fan of podcasting. 01:23:15.600 |
I mean, it's a weird one way human connection. 01:23:25.800 |
I've been a friend with Joe Rogan for 10 years, but one way. 01:23:29.000 |
- Yeah, from this way, from the St. Petersburg way. 01:23:35.000 |
I mean, now it's like two way, but it's still surreal. 01:23:53.080 |
- Yeah, we evolved over millions of years, right? 01:23:57.440 |
To be very fine tuned to things like that, right? 01:24:09.360 |
you had a good relationship with the rest of your tribe 01:24:18.840 |
- Yeah, but it's weird that the tribe is different now. 01:24:22.600 |
Like you could have a one way connection with Joe Rogan 01:24:26.040 |
as opposed to the tribe of your physical vicinity. 01:24:30.560 |
- But that's why it works with the podcasting, 01:24:33.360 |
but it's the opposite of what happens on Twitter, right? 01:24:35.960 |
Because all those nuances are removed, right? 01:24:42.320 |
You're connecting with like an abstraction, right? 01:24:55.040 |
or dehumanize them, which is much harder to do 01:24:59.160 |
Because you realize it's a real person behind the voice. 01:25:19.360 |
Like why are we, descendants of apes, even on this planet? 01:25:28.120 |
I think I don't allow myself to think about it too often, 01:25:35.800 |
But in some ways, I guess, the meaning of life 01:25:39.080 |
is kind of contributing to this kind of weird thing 01:25:45.320 |
Like it's in a way, you can think of humanity 01:25:47.640 |
as like a living and evolving organism, right? 01:25:52.520 |
but just by existing, by having our own unique set 01:25:57.320 |
And maybe that means like creating something great, 01:26:04.640 |
are unique and different and seeing like, you know, 01:26:11.040 |
I mean, if you're not a religious person, right, 01:26:13.200 |
which I guess I'm not, that's the meaning of life. 01:26:27.280 |
I mean, it's even just actually what you said 01:26:39.040 |
I don't know what that is in there, but that seems, 01:26:41.580 |
that's probably some version of like reproduction 01:26:49.800 |
But like creating that HTML button has echoes 01:26:57.800 |
- Right, well, I mean, if you're a religious person, 01:27:04.440 |
Well, I mean, I guess part of that is the drive 01:27:11.760 |
- Yeah, that HTML button is the creation in God's image. 01:27:14.840 |
- So maybe hopefully it'll be something a little more-- 01:27:20.960 |
- Yeah, maybe some JavaScript, some React and so on. 01:27:25.400 |
But no, I mean, I think that's what differentiates us 01:27:42.000 |
This is actually a little bit of an experiment, 01:27:45.080 |
allowing me to sort of fanboy over some of the things 01:27:48.520 |
I love, so thanks for wasting your time with me today. 01:27:53.120 |
Thanks for having me on and giving me a chance 01:28:00.720 |
with Dan Kokotov and thank you to our sponsors, 01:28:19.340 |
click the sponsor links below to get a discount 01:28:29.800 |
"The limits of my language means the limits of my world." 01:28:33.840 |
Thank you for listening and hope to see you next time.