YouTube Algorithm Basics (Cristos Goodrow, VP Engineering at Google) | AI Podcast Clips
Chapters
0:00 YouTube Algorithm Basics
6:25 YouTube recommendation system
8:55 Quality of videos
12:31 Like, dislike, subscribe
15:41 Content analysis
19:34 Collaborative Filtering
22:42 Game the Algorithm
25:22 Helping the Algorithm
27:35 User Experience
29:02 Value
32:20 YouTube Algorithm
35:07 A/B Experiments
- Maybe the basics of the quote unquote YouTube algorithm. 00:00:08.960 |
to make recommendations for what to watch next? 00:00:11.160 |
And it's from a machine learning perspective, 00:00:32.080 |
Even I've observed that it's improved quite a bit. 00:00:39.680 |
YouTube uses the best technology we can get from Google 00:00:50.120 |
And of course, the very first things that one thinks about 00:00:53.960 |
is, okay, well, does the word occur in the title? 00:01:04.080 |
where we're mostly trying to do some syntactic match 00:01:16.040 |
For instance, maybe is this video watched a lot 00:01:28.640 |
make sure that that document would be retrieved 00:01:44.120 |
And probably the first real attempt to do that well 00:01:57.260 |
- Can you describe what collaborative filtering is? 00:02:02.060 |
is we observe which videos get watched close together 00:02:15.060 |
where the videos that get watched close together 00:02:18.340 |
by the most people are sort of very close to one another 00:02:34.300 |
that basically represents videos that are very similar 00:02:46.820 |
that are in the same language together, for instance. 00:02:49.540 |
And we didn't even have to think about language. 00:02:54.660 |
And it puts all the videos that are about sports together, 00:02:57.340 |
and it puts most of the music videos together, 00:02:59.380 |
and it puts all of these sorts of videos together 00:03:07.700 |
- So that already cleans up a lot of the problem. 00:03:23.260 |
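A minimal sketch of the co-watch idea described above, and not YouTube's actual code: count how often pairs of videos are watched close together in the same viewing sessions, and use those counts as a crude "related videos" signal. The session data and video IDs below are invented for illustration.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical watch sessions: each inner list is a set of videos one user
# watched close together in time.
sessions = [
    ["baklava_recipe_tr", "turkish_coffee_tr", "istanbul_vlog_tr"],
    ["baklava_recipe_tr", "turkish_coffee_tr"],
    ["ml_lecture_1", "ml_lecture_2", "backprop_explained"],
]

# Count how often each pair of videos appears in the same session.
co_watch = defaultdict(int)
for session in sessions:
    for a, b in combinations(sorted(set(session)), 2):
        co_watch[(a, b)] += 1

def related(video_id, top_k=5):
    """Videos most often co-watched with `video_id` -- a crude 'related' list."""
    scores = {}
    for (a, b), count in co_watch.items():
        if video_id == a:
            scores[b] = count
        elif video_id == b:
            scores[a] = count
    return sorted(scores, key=scores.get, reverse=True)[:top_k]

print(related("baklava_recipe_tr"))  # ['turkish_coffee_tr', 'istanbul_vlog_tr']
```

Nothing in this sketch mentions language, yet Turkish videos end up "related" to other Turkish videos simply because the same viewers watch them together, which is the effect the baklava anecdote below illustrates.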
I was talking to someone who was trying to propose 00:03:39.980 |
based on the idea that YouTube could not possibly be good 00:03:44.700 |
at recommending videos well to people who are bilingual. 00:03:54.300 |
and I said, "Well, can you give me an example 00:03:55.820 |
"of what problem do you think we have on YouTube 00:03:59.380 |
And so she said, "Well, I'm a researcher in the US, 00:04:11.940 |
and then looked at the Watch Next suggestions, 00:04:17.420 |
"YouTube must think that I speak only English." 00:04:20.340 |
And so she said, "Now, I'm actually originally from Turkey, 00:04:26.740 |
"I really like to watch videos that are in Turkish." 00:04:29.420 |
And so she searched for a video about making the baklava, 00:04:35.380 |
and the Watch Next recommendations were in Turkish. 00:04:37.500 |
And she just couldn't believe how this was possible. 00:04:46.620 |
And it's just sort of an outcome of this related graph 00:04:50.660 |
that's created through collaborative filtering. 00:05:02.740 |
to discover what individual people wanna watch next. 00:05:22.020 |
And it's a fascinating picture of who I am, actually. 00:05:30.980 |
a summary of who I am as a person on the internet, to me. 00:05:41.460 |
you know, that's actually quite revealing and interesting. 00:06:10.540 |
played around with the idea of giving a map to people? 00:06:15.460 |
Sort of, as opposed to just using this information 00:06:32.420 |
to see what it is that you've been watching on YouTube. 00:06:57.240 |
the way the recommendation system of YouTube sees a user 00:07:06.360 |
And so you can think of yourself or any user on YouTube 00:07:11.360 |
as kind of like a DNA strand of all your videos, right? 00:07:26.400 |
And so, now, once you think of it as a vector 00:07:35.000 |
which other vectors are close to me and to my vector? 00:07:42.920 |
that we generate some diverse recommendations 00:07:49.920 |
with respect to the videos they've watched on YouTube, 00:07:58.840 |
That could be an opportunity to make a good recommendation. 00:08:03.840 |
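A minimal sketch, with invented embeddings, of the "user as a vector" framing above: summarize a user by averaging the embeddings of the videos they have watched, then compare vectors with cosine similarity to find nearby users or candidate videos. Real systems learn these embeddings and use far more dimensions; everything here is illustrative.

```python
import numpy as np

# Hypothetical 4-dimensional video embeddings (real systems learn many more dims).
video_embeddings = {
    "chess_opening": np.array([0.9, 0.1, 0.0, 0.0]),
    "chess_endgame": np.array([0.8, 0.2, 0.0, 0.1]),
    "guitar_lesson": np.array([0.0, 0.9, 0.1, 0.0]),
    "physics_talk":  np.array([0.1, 0.0, 0.9, 0.2]),
}

def user_vector(watch_history):
    """Summarize a user as the mean embedding of the videos they've watched."""
    return np.mean([video_embeddings[v] for v in watch_history], axis=0)

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

me = user_vector(["chess_opening", "chess_endgame"])
other = user_vector(["chess_endgame", "physics_talk"])
print(cosine(me, other))  # how close another user's "strand" of videos is to mine

# Rank unseen videos by how close they sit to my vector.
candidates = ["guitar_lesson", "physics_talk"]
print(sorted(candidates, key=lambda v: cosine(me, video_embeddings[v]), reverse=True))
```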
I'm gonna ask for things that are impossible, 00:08:05.360 |
but I would love to cluster the human beings. 00:08:09.000 |
Like, I would love to know who has similar trajectories as me 00:08:12.680 |
'cause you probably would wanna hang out, right? 00:08:18.360 |
some of the most fascinating people I find on YouTube 00:08:20.720 |
have like no followers, and I start following them, 00:08:48.120 |
So the measure of quality, is it just something, 00:08:52.280 |
yeah, how do you know that something is good? 00:09:05.800 |
In that realm, quality news or quality journalism 00:09:10.800 |
relies on having a journalism department, right? 00:09:29.200 |
quality has a lot to do with the authoritativeness 00:09:35.920 |
Now, if you think about the other end of the spectrum, 00:09:43.000 |
Or what is the highest quality Minecraft video, right? 00:09:47.640 |
That might be the one that people enjoy watching the most 00:09:53.360 |
Or it might be the one that when we ask people 00:10:03.560 |
And so we, especially in the realm of entertainment, 00:10:08.560 |
have been trying to get at better and better measures 00:10:18.520 |
And we started with, well, the first approximation 00:10:25.160 |
But we both know that things can get a lot of views 00:10:33.080 |
especially if people are clicking on something 00:10:49.480 |
the time that someone spends watching a video 00:10:53.440 |
is related to the value that they get from that video. 00:10:58.600 |
but it has something to say about how much value they get. 00:11:08.400 |
clicking through channels on television late at night 00:11:17.560 |
are you glad that you watched that show on TV last night? 00:11:21.760 |
I'd say, yeah, I wish I would have gone to bed 00:11:24.080 |
or read a book or almost anything else, really. 00:11:27.100 |
And so that's why some people got the idea a few years ago 00:11:34.720 |
And so we get feedback data from those surveys 00:11:39.720 |
and then use that in the machine learning system 00:11:56.040 |
what are the signals from a machine learning perspective 00:11:59.560 |
So you mentioned just clicking on the video views, 00:12:02.280 |
the time watched, maybe the relative time watched, 00:12:14.640 |
And then the one I wasn't actually quite aware of, 00:12:20.080 |
is a survey afterwards, which is a brilliant idea. 00:12:31.280 |
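To make that list of signals concrete, here is a toy value estimate (the weights and the formula are assumptions, not YouTube's) combining clicks, absolute and relative watch time, likes, shares, and the post-watch survey response just mentioned.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class WatchEvent:
    clicked: bool
    watch_seconds: float
    video_length_seconds: float
    liked: bool
    shared: bool
    survey_stars: Optional[int]  # 1-5 if the viewer answered the survey, else None

def value_estimate(event: WatchEvent) -> float:
    """Toy weighted blend of the signals discussed above; all weights are made up."""
    relative_watch = event.watch_seconds / max(event.video_length_seconds, 1.0)
    score = 0.5 if event.clicked else 0.0
    score += 2.0 * relative_watch           # time watched, normalized by video length
    score += 1.0 if event.liked else 0.0
    score += 1.5 if event.shared else 0.0
    if event.survey_stars is not None:      # explicit satisfaction beats proxies
        score += 3.0 * (event.survey_stars - 3) / 2
    return score

print(value_estimate(WatchEvent(True, 600, 900, liked=True, shared=False, survey_stars=5)))
```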
- Well, you mentioned commenting, also sharing the video. 00:12:38.720 |
- Within YouTube or outside of YouTube as well? 00:12:44.080 |
- Yeah, like and dislike, how important is that? 00:13:16.280 |
We would ask some users who didn't subscribe very much, 00:13:32.880 |
like actually it doesn't cost anything, it's free. 00:13:35.040 |
It just helps us know that you are very interested 00:13:48.480 |
And we say, "Well, why did you subscribe to this 00:13:51.720 |
if you weren't really interested in any more videos 00:13:59.600 |
and I just wanted to kind of give him a high five." 00:14:04.840 |
I actually subscribe to channels where I just, 00:14:18.940 |
Even though I may never actually want to click 00:14:24.380 |
And it's maybe outside of my interest area and so on, 00:14:28.580 |
which is probably the wrong way to use the subscribe button. 00:14:35.980 |
with all the space of people that see the subscribe button 00:14:45.980 |
We're not gonna pay attention to what you've done." 00:14:51.220 |
in which all the different people in the world 00:14:53.000 |
use the subscribe button or the like and the dislike button. 00:14:57.100 |
- So in terms of signals of machine learning, 00:14:59.800 |
using for the search and for the recommendation, 00:15:05.820 |
so like metadata, like text data that people provide, 00:15:12.960 |
So maybe you can speak to the value of those things 00:15:16.460 |
in search and also this incredible fascinating area 00:15:23.600 |
trying to understand what's happening in the video. 00:15:28.180 |
in the machine learning computer vision world, 00:15:34.980 |
how much are you playing with that currently? 00:15:38.160 |
of being able to analyze the content of the video itself? 00:15:52.180 |
our ability to do it well is still somewhat crude. 00:16:04.020 |
we can probably tell you that people are playing soccer. 00:16:06.820 |
We probably can't tell whether it's Manchester United 00:16:14.700 |
So these things are kind of difficult and using them, 00:16:20.540 |
So for instance, we use that kind of information 00:16:23.740 |
to understand and inform these clusters that I talked about. 00:16:27.660 |
And also maybe to add some words like soccer, for instance, 00:16:33.020 |
to the video if it doesn't occur in the title 00:16:43.180 |
is please help us out with the title and the description. 00:17:01.740 |
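One way to picture the content-analysis step described above, offered only as a hedged sketch: a video classifier emits coarse labels with confidences, and labels it is confident about (like "soccer") get added to the searchable metadata when the creator's title and description don't already mention them. The classifier output, threshold, and function are all assumptions.

```python
# Hypothetical classifier output: coarse labels with confidence scores.
# As noted above, "soccer" is far easier to detect than "Manchester United".
inferred_labels = {"soccer": 0.93, "stadium": 0.81, "manchester united": 0.35}

def enrich_metadata(title, description, labels, threshold=0.8):
    """Return confidently detected labels that the creator's own text omits."""
    text = f"{title} {description}".lower()
    return [label for label, conf in labels.items()
            if conf >= threshold and label not in text]

print(enrich_metadata("Amazing goals compilation",
                      "best moments from Saturday's match",
                      inferred_labels))
# ['soccer', 'stadium'] -- terms that search and clustering could now match on
```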
but if you typed World of Warcraft in search, 00:17:16.060 |
- Being literal on the internet is actually very uncool, 00:17:26.980 |
I mean, there's a humor to just being indirect, 00:17:33.820 |
machine learning algorithms want you to be literal. 00:17:42.060 |
And in some sense, that gets away from wit and humor, 00:17:52.460 |
the content of the description, the actual text, 00:18:02.380 |
- That's right, and I would go further and say that 00:18:25.980 |
well, maybe this isn't, somehow search made a mistake. 00:18:37.380 |
They get a clue that it's what they're looking for 00:18:46.980 |
So I think from the algorithm perspective, yes, 00:18:51.700 |
and saw a video with the title simply winning, 00:18:57.020 |
and the thumbnail has a sad orc or something, 00:19:15.220 |
that this is indeed a World of Warcraft video, 00:19:17.620 |
that would have created the most beautiful experience. 00:19:20.140 |
I think in terms of just the wit and the humor 00:19:22.700 |
and the curiosity that we human beings naturally have. 00:19:27.980 |
it's really hard for the algorithm to figure out 00:20:00.740 |
the collaborative filtering part of the system 00:20:03.460 |
starts by the same user watching videos together, right? 00:20:08.460 |
So the way that they're probably gonna do that 00:20:20.900 |
what degree for collaborative filtering in general 00:20:36.160 |
how many people are just like watching the same thing 00:20:43.780 |
and then help the other ant in the ant colony 00:20:43.780 |
people come to YouTube with some certain amount of intent. 00:21:02.460 |
to the extent to which they try to satisfy that intent, 00:21:10.600 |
on kind of a faithful amount of behavior, right? 00:21:14.960 |
And there are people who try to trick us, right? 00:21:25.600 |
but they're trying to get that association made 00:21:33.440 |
to that sort of attempt at gaming the systems. 00:21:38.600 |
there's a lot of people that in a positive way, perhaps, 00:22:05.000 |
where he basically describes that it seems what works 00:22:08.280 |
is to create a high quality video, really good video, 00:22:11.520 |
where people would want to watch it once they click on it, 00:22:16.960 |
to get them to click on it in the first place. 00:22:25.800 |
And you will enjoy my videos once you click on them." 00:22:28.360 |
So in what sense do you see this kind of click-bait style 00:22:33.360 |
attempt to manipulate, to get people in the door, 00:22:39.360 |
or play with the algorithm or game the algorithm? 00:22:46.720 |
but even if you were to take the algorithm out of it 00:22:56.640 |
about which one to put at the top or the bottom, 00:23:03.000 |
And I'll tell you the same thing that I told Derek is, 00:23:08.520 |
and they have two kinds of books on them, science books. 00:23:11.920 |
I have my math books from when I was a student 00:23:20.520 |
They're all yellow, they're all from Springer, 00:23:33.000 |
and they all have very interesting covers, right? 00:23:35.280 |
And they have provocative titles and things like that. 00:23:39.480 |
I mean, I wouldn't say that they're click-baity 00:23:47.560 |
but, you know, that's just a decision you have to make. 00:23:52.600 |
Like the people who write "Classical Recursion Theory" 00:23:55.800 |
by Piergiorgio Odifreddi, he was fine with the yellow title 00:24:07.200 |
understand that they need to have a compelling cover 00:24:41.200 |
And so for the users who are offended by that, 00:24:45.840 |
we will then depress or suppress those videos. 00:24:51.440 |
there's also another signal where users can say, 00:25:05.560 |
like there's certain videos that just cut me the wrong way. 00:25:09.960 |
It's like, I don't want to, I don't want this. 00:25:14.120 |
To be like, I don't, that's not, that's not for me. 00:25:23.200 |
Right, we don't want to make a recommendation 00:25:29.520 |
that particular one makes me feel good as a user in general 00:25:34.680 |
'Cause I feel like I'm helping the algorithm. 00:25:36.960 |
My interactions on YouTube don't always feel like 00:25:42.320 |
Like for example, Tesla and Autopilot and Elon Musk 00:25:51.040 |
that they're helping the algorithm of the Tesla vehicle. 00:25:58.920 |
that you're helping the algorithm get smarter. 00:26:07.320 |
They're all together creating a beautiful thing. 00:26:13.760 |
This conversation is reminding me of that, but. 00:26:21.680 |
I'm not sure I really thought about it that way, 00:26:27.200 |
- It's an interesting question of personalization 00:26:30.240 |
that I feel like when I click like on a video, 00:26:40.280 |
It would make me personally, people are different, 00:26:44.200 |
also the YouTube algorithm broadly say something. 00:26:47.800 |
Like there's a, I don't know if that's human nature, 00:26:55.280 |
You want to help it get smarter and smarter and smarter 00:27:08.840 |
And I'm not sure how many people share that feeling. 00:27:11.600 |
That could be just a machine learning feeling. 00:27:13.480 |
But on that point, how much personalization is there 00:27:22.040 |
So is it kind of all really boiling down to clustering? 00:27:27.040 |
Like if I'm in your clusters to me and so on, 00:27:38.160 |
So your experience will be quite a bit different 00:27:42.440 |
from anybody else's who's watching that same video, 00:27:47.600 |
And the reason is that we found that users 00:27:57.160 |
Sometimes they want to keep watching more on that topic 00:28:06.600 |
and they're ready to move on to something else. 00:28:08.520 |
And so the question is, well, what is the something else? 00:28:12.280 |
And one of the first things one can imagine is, 00:28:15.800 |
well, maybe something else is the latest video 00:28:18.880 |
from some channel to which you've subscribed. 00:28:26.200 |
And even if it's not something that you subscribe to, 00:28:34.160 |
And so even the watch next, as well as the homepage, 00:28:46.880 |
What does success look like in terms of the algorithm 00:28:49.360 |
creating a great long-term experience for a user? 00:28:52.800 |
Or put another way, if you look at the videos 00:28:59.280 |
how do you know the algorithm succeeded for me? 00:29:10.400 |
- So just the number of hours is a powerful indicator. 00:29:20.040 |
So that's probably the most simple indicator. 00:29:28.600 |
There's a lot of other things that they could do. 00:29:44.040 |
And so that's why we survey them and ask them, 00:30:00.680 |
is this the best video you've ever seen on YouTube? 00:30:20.480 |
but that Ray Dalio has this video on the economic machine. 00:30:26.240 |
I forget what it's called, but it's a 30 minute video. 00:30:38.040 |
of how at least the American economic system works. 00:30:42.320 |
And I was just, I wanted to click on something 00:30:50.480 |
I mean, the views and the likes reflect its quality, 00:30:54.920 |
but I was almost upset that I haven't found it earlier 00:31:01.720 |
that this is the best video I've ever watched. 00:31:08.000 |
the best experiences were every single video. 00:31:26.840 |
that's one of the most beautiful and ambitious, 00:31:32.120 |
So you've mentioned kind of the YouTube algorithm 00:31:39.240 |
It's potentially sort of more than a million lines of code. 00:31:54.240 |
on top of patches of heuristics and human experts 00:32:13.960 |
of what is the YouTube algorithm at this point? 00:32:20.820 |
- Well, we don't usually think about it as the algorithm 00:32:29.320 |
The other thing that I think people don't understand 00:32:31.920 |
is that what you might refer to as the YouTube algorithm 00:32:36.960 |
from outside of YouTube is actually a bunch of code 00:32:47.660 |
of all the people who come to YouTube every day. 00:32:49.800 |
- So the people part of the code, essentially. 00:32:52.500 |
Like if there were no people who came to YouTube tomorrow, 00:32:54.860 |
then the algorithm wouldn't work anymore, right? 00:33:01.820 |
well, the algorithm does this, the algorithm does that, 00:33:09.640 |
and the algorithm is mostly just keeping track 00:33:12.540 |
of what the viewers do and then reacting to those things 00:33:23.460 |
that the recommendation system and the search system 00:33:26.660 |
and probably many machine learning systems evolve 00:33:41.820 |
Well, how about the most popular ones, right? 00:33:57.220 |
that can actually learn what to do in different situations 00:34:00.740 |
based on some observations of those situations in the past. 00:34:04.660 |
And you keep chipping away at these heuristics over time. 00:34:08.620 |
And so I think that just like with diversity, 00:34:13.120 |
I think the first diversity measure we took was, 00:34:20.420 |
It's a pretty simple heuristic to encourage diversity, 00:34:25.700 |
Who needs to see four, five, six videos in a row 00:34:53.940 |
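A minimal sketch of the kind of simple diversity heuristic described here, before such rules get replaced by learned models: cap how many consecutive recommendations can come from the same channel, deferring the rest further down the list. The function, its inputs, and the cap are illustrative assumptions.

```python
def diversify(ranked_videos, max_run_per_channel=2):
    """Toy re-ranking rule: break up long runs of videos from one channel.

    `ranked_videos` is a list of (video_id, channel_id) pairs, best first.
    Videos that would extend a too-long run are deferred to the end.
    """
    kept, deferred = [], []
    for video in ranked_videos:
        run = 0
        for prev in reversed(kept):          # length of the current same-channel run
            if prev[1] == video[1]:
                run += 1
            else:
                break
        (deferred if run >= max_run_per_channel else kept).append(video)
    return kept + deferred

ranked = [("v1", "chA"), ("v2", "chA"), ("v3", "chA"), ("v4", "chB"), ("v5", "chA")]
print(diversify(ranked))
# [('v1', 'chA'), ('v2', 'chA'), ('v4', 'chB'), ('v5', 'chA'), ('v3', 'chA')]
```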
of the kind of A/B testing that this idea was a good one, 00:34:58.880 |
How do you measure that and across which time scale, 00:35:09.540 |
And so just about every single change we make to YouTube, 00:35:13.860 |
we do it only after we've run an A/B experiment. 00:35:13.860 |
for ultimately does this improve the experience for viewers? 00:35:47.060 |
because we can see certain things go up and down. 00:35:50.100 |
So for instance, if we noticed in the experiment, 00:35:53.860 |
people are dismissing videos less frequently, 00:35:57.540 |
or they're saying that they're more satisfied, 00:36:01.940 |
they're giving more videos five stars after they watch them, 00:36:09.460 |
that it's improving the situation for viewers. 00:36:31.100 |
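As a rough illustration of the A/B discipline being described, and not YouTube's actual experiment tooling, the sketch below compares a single satisfaction proxy, post-watch survey stars, between a control and a treatment group with a simple z-test. Real launch decisions weigh many metrics, including dismissals and watch time, over longer horizons; the data and the 1.96 threshold here are illustrative.

```python
import random
from math import sqrt
from statistics import mean, stdev

random.seed(0)

# Hypothetical post-watch survey scores (1-5 stars) from the two experiment arms.
control   = [random.gauss(3.60, 1.0) for _ in range(5000)]  # current ranking
treatment = [random.gauss(3.66, 1.0) for _ in range(5000)]  # candidate change

def two_sample_z(a, b):
    """Approximate z-statistic for the difference in mean satisfaction."""
    se = sqrt(stdev(a) ** 2 / len(a) + stdev(b) ** 2 / len(b))
    return (mean(b) - mean(a)) / se

z = two_sample_z(control, treatment)
print(f"lift = {mean(treatment) - mean(control):+.3f} stars, z = {z:.2f}")

# A launch rule might require a clearly positive satisfaction lift
# (alongside the other viewer metrics mentioned above, not modeled here).
print("ship" if z > 1.96 else "hold: no clear improvement for viewers")
```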
So we get a sense for what the impact is gonna be, 00:36:35.740 |
but also to the different channels and all of that.