
YouTube Algorithm Basics (Cristos Goodrow, VP Engineering at Google) | AI Podcast Clips


Chapters

0:00 YouTube Algorithm Basics
6:25 YouTube recommendation system
8:55 Quality of videos
12:31 Like, dislike, subscribe
15:41 Content analysis
19:34 Collaborative Filtering
22:42 Game the Algorithm
25:22 Helping the Algorithm
27:35 User Experience
29:02 Value
32:20 YouTube Algorithm
35:07 A/B Experiments

Whisper Transcript | Transcript Only Page

00:00:00.000 | - Maybe the basics of the quote unquote YouTube algorithm.
00:00:05.000 | What does the YouTube algorithm look at
00:00:08.960 | to make recommendations for what to watch next?
00:00:11.160 | And it's from a machine learning perspective,
00:00:13.940 | or when you search for a particular term,
00:00:17.680 | how does it know what to show you next?
00:00:19.880 | 'Cause it seems to, at least for me,
00:00:22.280 | do an incredible job of both.
00:00:24.800 | - Well, that's kind of you to say.
00:00:26.600 | It didn't used to do a very good job.
00:00:28.680 | (laughing)
00:00:30.120 | But it's gotten better over the years.
00:00:32.080 | Even I observe that it's improved quite a bit.
00:00:34.760 | Those are two different situations.
00:00:37.160 | Like when you search for something,
00:00:39.680 | YouTube uses the best technology we can get from Google
00:00:44.600 | to make sure that the YouTube search system
00:00:48.240 | finds what someone's looking for.
00:00:50.120 | And of course, the very first things that one thinks about
00:00:53.960 | is, okay, well, does the word occur in the title?
00:00:58.240 | For instance.
00:00:59.160 | But there are much more sophisticated things
00:01:04.080 | where we're mostly trying to do some syntactic match
00:01:08.400 | or maybe a semantic match based on words
00:01:12.280 | that we can add to the document itself.
00:01:16.040 | For instance, maybe is this video watched a lot
00:01:21.040 | after this query?
00:01:22.200 | That's something that we can observe.
00:01:26.400 | And then as a result,
00:01:28.640 | make sure that that document would be retrieved
00:01:32.400 | for that query.
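
The "watched a lot after this query" idea described above can be sketched roughly as follows. This is only an illustrative toy with a made-up log format and threshold, not YouTube's actual indexing pipeline:

```python
from collections import Counter, defaultdict

# Hypothetical log of (search query, video id watched shortly after that query).
search_watch_log = [
    ("world of warcraft", "vid_478"),
    ("world of warcraft", "vid_478"),
    ("world of warcraft", "vid_478"),
    ("baklava recipe", "vid_901"),
]

MIN_COUNT = 2  # assumed threshold before trusting the query-video association


def expand_documents(log, index_text):
    """Append frequent post-query terms to each video's indexed text."""
    counts = Counter(log)
    expanded = defaultdict(str, index_text)
    for (query, video_id), n in counts.items():
        if n >= MIN_COUNT and query not in expanded[video_id]:
            expanded[video_id] = (expanded[video_id] + " " + query).strip()
    return dict(expanded)


index_text = {"vid_478": "match 478 a team vs b team", "vid_901": "how to make baklava"}
print(expand_documents(search_watch_log, index_text))
# "vid_478" can now be retrieved for "world of warcraft" even though the
# uploader never put those words in the title or description.
```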
00:01:33.440 | Now, when you talk about what kind of videos
00:01:37.280 | would be recommended to watch next,
00:01:40.600 | that's something, again,
00:01:42.500 | we've been working on for many years.
00:01:44.120 | And probably the first real attempt to do that well
00:01:49.120 | was to use collaborative filtering.
00:01:56.420 | So you--
00:01:57.260 | - Can you describe what collaborative filtering is?
00:01:59.020 | - Sure, it's just, basically what we do
00:02:02.060 | is we observe which videos get watched close together
00:02:07.060 | by the same person.
00:02:08.320 | And if you observe that,
00:02:12.340 | and if you can imagine creating a graph
00:02:15.060 | where the videos that get watched close together
00:02:18.340 | by the most people are sort of very close to one another
00:02:21.060 | in this graph,
00:02:21.900 | and videos that don't frequently get watched
00:02:23.940 | close together by the same person
00:02:26.140 | or the same people are far apart,
00:02:28.740 | then you end up with this graph
00:02:32.380 | that we call the related graph
00:02:34.300 | that basically represents videos that are very similar
00:02:38.220 | or related in some way.
00:02:40.260 | And what's amazing about that
00:02:43.380 | is that it puts all the videos
00:02:46.820 | that are in the same language together, for instance.
00:02:49.540 | And we didn't even have to think about language.
00:02:53.020 | It just does it, right?
00:02:54.660 | And it puts all the videos that are about sports together,
00:02:57.340 | and it puts most of the music videos together,
00:02:59.380 | and it puts all of these sorts of videos together
00:03:01.900 | just because that's sort of the way
00:03:04.980 | the people using YouTube behave.
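
A rough sketch of that co-watch idea, with invented session data and nothing resembling YouTube's production system: count how often pairs of videos appear in the same viewer's sessions and treat the highest-count pairs as edges of a "related" graph.

```python
from collections import Counter
from itertools import combinations

# Hypothetical watch sessions: each inner list is one viewer's videos, in order.
sessions = [
    ["turkish_baklava", "turkish_borek", "ml_lecture_1"],
    ["ml_lecture_1", "ml_lecture_2"],
    ["turkish_baklava", "turkish_borek"],
    ["ml_lecture_1", "ml_lecture_2"],
]

# Count co-watches: every unordered pair of distinct videos in the same session.
pair_counts = Counter()
for session in sessions:
    for a, b in combinations(sorted(set(session)), 2):
        pair_counts[(a, b)] += 1


def related(video, k=3):
    """Videos most frequently co-watched with `video`: its neighbors in the related graph."""
    neighbors = Counter()
    for (a, b), n in pair_counts.items():
        if a == video:
            neighbors[b] += n
        elif b == video:
            neighbors[a] += n
    return neighbors.most_common(k)


print(related("turkish_baklava"))
# The Turkish cooking videos end up next to each other purely from co-watch
# behavior; language never appears as an explicit feature anywhere above.
```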
00:03:07.700 | - So that already cleans up a lot of the problem.
00:03:12.060 | It takes care of the lowest hanging fruit,
00:03:14.900 | which happens to be a huge one
00:03:17.180 | of just managing these millions of videos.
00:03:20.260 | - That's right.
00:03:21.220 | I remember a few years ago,
00:03:23.260 | I was talking to someone who was trying to propose
00:03:28.140 | that we do a research project
00:03:30.900 | concerning people who are bilingual.
00:03:35.580 | And this person was making this proposal
00:03:39.980 | based on the idea that YouTube could not possibly be good
00:03:44.700 | at recommending videos well to people who are bilingual.
00:03:49.580 | And so she was telling me about this,
00:03:54.300 | and I said, "Well, can you give me an example
00:03:55.820 | "of what problem do you think we have on YouTube
00:03:58.260 | "with the recommendations?"
00:03:59.380 | And so she said, "Well, I'm a researcher in the US,
00:04:04.380 | "and when I'm looking for academic topics,
00:04:06.700 | "I wanna see them in English."
00:04:09.460 | And so she searched for one, found a video,
00:04:11.940 | and then looked at the Watch Next suggestions,
00:04:14.180 | and they were all in English.
00:04:16.140 | And so she said, "Oh, I see.
00:04:17.420 | "YouTube must think that I speak only English."
00:04:20.340 | And so she said, "Now, I'm actually originally from Turkey,
00:04:23.580 | "and sometimes when I'm cooking,
00:04:24.980 | "let's say I wanna make some baklava,
00:04:26.740 | "I really like to watch videos that are in Turkish."
00:04:29.420 | And so she searched for a video about making the baklava,
00:04:32.780 | and then selected it, and it was in Turkish,
00:04:35.380 | and the Watch Next recommendations were in Turkish.
00:04:37.500 | And she just couldn't believe how this was possible.
00:04:41.300 | And how is it that you know
00:04:43.420 | that I speak both these two languages
00:04:45.420 | and put all the videos together?
00:04:46.620 | And it's just sort of an outcome of this related graph
00:04:50.660 | that's created through collaborative filtering.
00:04:53.460 | - So for me, one of my huge interests
00:04:55.300 | is just human psychology, right?
00:04:56.700 | And that's such a powerful platform
00:05:00.380 | on which to utilize human psychology
00:05:02.740 | to discover what individual people wanna watch next.
00:05:06.700 | But it would also be just fascinating to me.
00:05:08.900 | You know, Google Search has ability
00:05:14.260 | to look at your own history.
00:05:16.540 | And I've done that before,
00:05:18.660 | just what I've searched, through the years,
00:05:20.820 | for many, many years.
00:05:22.020 | And it's a fascinating picture of who I am, actually.
00:05:25.260 | And I don't think anyone's ever summarized,
00:05:29.060 | I personally would love that,
00:05:30.980 | a summary of who I am as a person on the internet, to me.
00:05:35.460 | Because I think it reveals,
00:05:36.900 | I think it puts a mirror to me or to others,
00:05:41.460 | you know, that's actually quite revealing and interesting.
00:05:44.460 | You know, just maybe the number of,
00:05:48.140 | it's a joke, but not really,
00:05:50.340 | is the number of cat videos I've watched.
00:05:52.660 | Or videos of people falling, you know,
00:05:54.620 | stuff that's absurd, that kind of stuff.
00:05:58.620 | It's really interesting.
00:05:59.700 | And of course, it's really good
00:06:00.780 | for the machine learning aspect to show,
00:06:05.300 | to figure out what to show next.
00:06:06.420 | But it's interesting.
00:06:08.300 | Hey, have you just, as a tangent,
00:06:10.540 | played around with the idea of giving a map to people?
00:06:15.460 | Sort of, as opposed to just using this information
00:06:18.860 | to show what's next,
00:06:20.260 | showing them, here are the clusters
00:06:22.740 | you've loved over the years, kind of thing.
00:06:25.100 | - Well, we do provide the history
00:06:27.140 | of all the videos that you've watched.
00:06:28.660 | - Yes.
00:06:29.500 | - So you can definitely search through that
00:06:30.940 | and look through it and search through it
00:06:32.420 | to see what it is that you've been watching on YouTube.
00:06:34.980 | We have actually, in various times,
00:06:40.280 | experimented with this sort of cluster idea,
00:06:43.520 | finding ways to demonstrate or show people
00:06:46.160 | what topics they've been interested in
00:06:49.480 | or what clusters they've watched from.
00:06:52.040 | It's interesting that you bring this up
00:06:53.840 | because in some sense,
00:06:57.240 | the way the recommendation system of YouTube sees a user
00:07:01.600 | is exactly as the history of all the videos
00:07:04.640 | they've watched on YouTube.
00:07:06.360 | And so you can think of yourself or any user on YouTube
00:07:11.360 | as kind of like a DNA strand of all your videos, right?
00:07:17.160 | That sort of represents you.
00:07:22.200 | You can also think of it as maybe a vector
00:07:24.040 | in the space of all the videos on YouTube.
00:07:26.400 | And so, now, once you think of it as a vector
00:07:31.280 | in the space of all the videos on YouTube,
00:07:32.920 | then you can start to say, okay, well,
00:07:35.000 | which other vectors are close to me and to my vector?
00:07:40.000 | And that's one of the ways
00:07:42.920 | that we generate some diverse recommendations
00:07:45.200 | is because you're like, okay, well,
00:07:47.520 | these people seem to be close
00:07:49.920 | with respect to the videos they've watched on YouTube,
00:07:52.080 | but here's a topic or a video
00:07:54.840 | that one of them has watched and enjoyed,
00:07:57.040 | but the other one hasn't.
00:07:58.840 | That could be an opportunity to make a good recommendation.
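
A minimal sketch of that framing, under the simplifying assumption that a user can be reduced to the set of videos they've watched (a real system would use learned embeddings rather than raw sets): find the nearest other "vector" and surface something they watched that you haven't.

```python
import math

# Hypothetical watch histories; a user is just the set of videos they've watched.
users = {
    "you":   {"ml_lecture_1", "chess_openings", "ray_dalio_econ"},
    "alice": {"ml_lecture_1", "chess_openings", "go_endgames"},
    "bob":   {"cat_video", "prank_compilation"},
}


def cosine(a, b):
    """Cosine similarity between two users viewed as binary vectors over videos."""
    if not a or not b:
        return 0.0
    return len(a & b) / (math.sqrt(len(a)) * math.sqrt(len(b)))


def diverse_recommendation(user):
    history = users[user]
    # nearest neighbor in "video space"
    neighbor = max((u for u in users if u != user), key=lambda u: cosine(history, users[u]))
    # something the neighbor watched that this user hasn't seen yet
    return sorted(users[neighbor] - history)


print(diverse_recommendation("you"))  # ['go_endgames'], borrowed from the nearest neighbor
```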
00:08:02.320 | - I got to tell you, I mean, I know,
00:08:03.840 | I'm gonna ask for things that are impossible,
00:08:05.360 | but I would love to cluster the human beings.
00:08:09.000 | Like, I would love to know who has similar trajectories as me
00:08:12.680 | 'cause you probably would wanna hang out, right?
00:08:15.120 | There's a social aspect there.
00:08:17.360 | Like, actually finding,
00:08:18.360 | some of the most fascinating people I find on YouTube
00:08:20.720 | have like no followers, and I start following them,
00:08:23.320 | and they create incredible content.
00:08:25.960 | And on that topic, I just love to ask,
00:08:28.840 | there's some videos that just blow my mind
00:08:31.040 | in terms of quality and depth,
00:08:34.240 | and just in every regard are amazing videos,
00:08:37.920 | and they have like 57 views.
00:08:40.880 | Okay, how do you get videos of quality
00:08:45.880 | to be seen by many eyes?
00:08:48.120 | So the measure of quality, is it just something,
00:08:52.280 | yeah, how do you know that something is good?
00:08:55.100 | - Well, I mean, I think it depends initially
00:08:56.920 | on what sort of video we're talking about.
00:08:59.920 | So in the realm of, let's say,
00:09:03.480 | you mentioned politics and news.
00:09:05.800 | In that realm, quality news or quality journalism
00:09:10.800 | relies on having a journalism department, right?
00:09:17.480 | Like you have to have actual journalists
00:09:20.080 | and fact checkers and people like that.
00:09:22.000 | And so in that situation, and in others,
00:09:26.340 | maybe science or in medicine,
00:09:29.200 | quality has a lot to do with the authoritativeness
00:09:32.200 | and the credibility and the expertise
00:09:34.280 | of the people who make the video.
00:09:35.920 | Now, if you think about the other end of the spectrum,
00:09:39.640 | what is the highest quality prank video?
00:09:43.000 | Or what is the highest quality Minecraft video, right?
00:09:47.640 | That might be the one that people enjoy watching the most
00:09:52.080 | and watch to the end.
00:09:53.360 | Or it might be the one that when we ask people
00:09:58.760 | the next day after they watched it,
00:10:01.680 | were they satisfied with it?
00:10:03.560 | And so we, especially in the realm of entertainment,
00:10:08.560 | have been trying to get at better and better measures
00:10:12.160 | of quality or satisfaction or enrichment
00:10:17.160 | since I came to YouTube.
00:10:18.520 | And we started with, well, the first approximation
00:10:22.480 | is the one that gets more views.
00:10:25.160 | But we both know that things can get a lot of views
00:10:30.040 | and not really be that high quality,
00:10:33.080 | especially if people are clicking on something
00:10:35.120 | and then immediately realizing
00:10:36.960 | that it's not that great and abandoning it.
00:10:39.240 | And that's why we moved from views
00:10:42.840 | to thinking about the amount of time
00:10:44.640 | people spend watching it,
00:10:46.460 | with the premise that in some sense,
00:10:49.480 | the time that someone spends watching a video
00:10:53.440 | is related to the value that they get from that video.
00:10:56.920 | It may not be perfectly related,
00:10:58.600 | but it has something to say about how much value they get.
00:11:02.160 | But even that's not good enough, right?
00:11:04.860 | Because I myself have spent time
00:11:08.400 | clicking through channels on television late at night
00:11:11.040 | and ended up watching "Under Siege 2"
00:11:14.000 | for some reason I don't know.
00:11:15.800 | And if you were to ask me the next day,
00:11:17.560 | are you glad that you watched that show on TV last night?
00:11:21.760 | I'd say, yeah, I wish I would have gone to bed
00:11:24.080 | or read a book or almost anything else, really.
00:11:27.100 | And so that's why some people got the idea a few years ago
00:11:32.720 | to try to survey users afterwards.
00:11:34.720 | And so we get feedback data from those surveys
00:11:39.720 | and then use that in the machine learning system
00:11:43.280 | to try to not just predict
00:11:44.520 | what you're gonna click on right now,
00:11:46.700 | what you might watch for a while,
00:11:48.540 | but what when we ask you tomorrow,
00:11:50.960 | you'll give four or five stars to.
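
A simplified, hypothetical scoring sketch of that progression, from predicted clicks to predicted watch time to predicted next-day survey satisfaction. The weights and field names are invented for illustration, not YouTube's:

```python
from dataclasses import dataclass


@dataclass
class CandidateVideo:
    video_id: str
    p_click: float           # predicted probability of a click
    expected_minutes: float  # predicted watch time if clicked
    p_satisfied: float       # predicted probability of a 4-5 star survey answer


def rank(candidates, w_time=0.3, w_satisfied=3.0):
    # Clicks alone over-reward clickbait; watch time helps; the survey-based
    # satisfaction term pushes toward "glad I watched it" the next day.
    def score(c):
        return c.p_click * (1.0 + w_time * c.expected_minutes + w_satisfied * c.p_satisfied)
    return sorted(candidates, key=score, reverse=True)


candidates = [
    CandidateVideo("clickbait", p_click=0.9, expected_minutes=0.5, p_satisfied=0.1),
    CandidateVideo("econ_explainer", p_click=0.4, expected_minutes=25.0, p_satisfied=0.9),
]
print([c.video_id for c in rank(candidates)])  # ['econ_explainer', 'clickbait']
```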
00:11:53.320 | - So just to summarize,
00:11:56.040 | what are the signals from a machine learning perspective
00:11:58.440 | that a user can provide?
00:11:59.560 | So you mentioned just clicking on the video, views,
00:12:02.280 | the time watched, maybe the relative time watched,
00:12:05.360 | the clicking like and dislike on the video,
00:12:10.360 | maybe commenting on the video.
00:12:12.880 | - All of those things.
00:12:13.800 | - All of those things.
00:12:14.640 | And then the one I wasn't actually quite aware of,
00:12:18.120 | even though I might've engaged in it,
00:12:20.080 | is a survey afterwards, which is a brilliant idea.
00:12:23.960 | Is there other signals?
00:12:25.700 | I mean, that's already a really rich space
00:12:28.560 | of signals to learn from.
00:12:29.980 | Is there something else?
00:12:31.280 | - Well, you mentioned commenting, also sharing the video.
00:12:35.320 | If you think it's worthy to be shared
00:12:37.760 | with someone else you know.
00:12:38.720 | - Within YouTube or outside of YouTube as well?
00:12:40.680 | - Either.
00:12:41.600 | Let's see, you mentioned like, dislike.
00:12:44.080 | - Yeah, like and dislike, how important is that?
00:12:46.720 | - It's very important, right?
00:12:48.680 | It's predictive of satisfaction.
00:12:52.240 | But it's not perfectly predictive.
00:12:56.080 | Subscribe, if you subscribe to the channel
00:12:59.340 | of the person who made the video,
00:13:00.860 | then that also is a piece of information
00:13:04.040 | and it signals satisfaction.
00:13:06.520 | Although, over the years, we've learned
00:13:10.040 | that people have a wide range of attitudes
00:13:13.040 | about what it means to subscribe.
00:13:16.280 | We would ask some users who didn't subscribe very much,
00:13:20.420 | but they watched a lot from a few channels,
00:13:24.200 | we'd say, "Well, why didn't you subscribe?"
00:13:25.720 | And they would say, "Well, I can't afford
00:13:27.600 | to pay for anything."
00:13:28.640 | And we tried to let them understand,
00:13:32.880 | like actually it doesn't cost anything, it's free.
00:13:35.040 | It just helps us know that you are very interested
00:13:37.880 | in this creator.
00:13:39.460 | But then we've asked other people
00:13:41.840 | who subscribe to many things
00:13:44.340 | and don't really watch any of the videos
00:13:47.640 | from those channels.
00:13:48.480 | And we say, "Well, why did you subscribe to this
00:13:51.720 | if you weren't really interested in any more videos
00:13:54.920 | from that channel?"
00:13:55.760 | And they might tell us, "Well, I just,
00:13:58.080 | I thought the person did a great job
00:13:59.600 | and I just wanted to kind of give him a high five."
00:14:01.320 | - Yeah. - Right?
00:14:02.520 | And so--
00:14:03.500 | - Yeah, that's where I sit.
00:14:04.840 | I actually subscribe to channels where I just,
00:14:08.400 | this person is amazing.
00:14:10.760 | I like this person.
00:14:12.480 | But then I like this person,
00:14:14.580 | I really wanna support them.
00:14:15.980 | That's how I click subscribe.
00:14:18.940 | Even though I may never actually want to click
00:14:21.120 | on their videos when they're releasing it,
00:14:22.940 | I just love what they're doing.
00:14:24.380 | And it's maybe outside of my interest area and so on,
00:14:28.580 | which is probably the wrong way to use the subscribe button.
00:14:31.100 | But I just wanna say congrats.
00:14:32.780 | This is great work.
00:14:33.820 | (both laughing)
00:14:34.940 | - Well, I mean-- - So you have to deal
00:14:35.980 | with the whole space of people who see the subscribe button
00:14:38.700 | totally differently.
00:14:39.540 | - That's right.
00:14:40.360 | - So we can't just close our eyes and say,
00:14:43.960 | "Sorry, you're using it wrong.
00:14:45.980 | We're not gonna pay attention to what you've done."
00:14:49.540 | We need to embrace all the ways
00:14:51.220 | in which all the different people in the world
00:14:53.000 | use the subscribe button or the like and the dislike button.
00:14:57.100 | - So in terms of the signals machine learning is
00:14:59.800 | using for the search and for the recommendations,
00:15:04.700 | you've mentioned titles,
00:15:05.820 | so like metadata, like text data that people provide,
00:15:08.240 | description and title, and maybe keywords.
00:15:12.960 | So maybe you can speak to the value of those things
00:15:16.460 | in search and also this incredible fascinating area
00:15:20.860 | of the content itself.
00:15:22.160 | So the video content itself,
00:15:23.600 | trying to understand what's happening in the video.
00:15:25.860 | So YouTube will release a data set that,
00:15:28.180 | in the machine learning computer vision world,
00:15:30.260 | this is just an exciting space.
00:15:32.540 | How much is that currently,
00:15:34.980 | how much are you playing with that currently?
00:15:36.620 | How much is your hope for the future
00:15:38.160 | of being able to analyze the content of the video itself?
00:15:41.740 | - Well, we have been working on that also
00:15:43.840 | since I came to YouTube.
00:15:45.580 | - Analyzing the content.
00:15:46.540 | - Analyzing the content of the video, right?
00:15:48.740 | And what I can tell you is that
00:15:52.180 | our ability to do it well is still somewhat crude.
00:15:57.220 | We can tell if it's a music video,
00:16:01.740 | we can tell if it's a sports video,
00:16:04.020 | we can probably tell you that people are playing soccer.
00:16:06.820 | We probably can't tell whether it's Manchester United
00:16:12.600 | or my daughter's soccer team.
00:16:14.700 | So these things are kind of difficult and using them,
00:16:18.860 | we can use them in some ways.
00:16:20.540 | So for instance, we use that kind of information
00:16:23.740 | to understand and inform these clusters that I talked about.
00:16:27.660 | And also maybe to add some words like soccer, for instance,
00:16:33.020 | to the video if it doesn't occur in the title
00:16:35.300 | or the description, which is remarkable
00:16:37.260 | that often it doesn't.
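
A tiny sketch of that enrichment step, assuming a hypothetical classifier that emits coarse labels with confidences; the function and threshold are illustrative only:

```python
def enrich_metadata(video, predicted_labels, min_confidence=0.8):
    """Append confident classifier labels missing from the title and description."""
    text = (video["title"] + " " + video["description"]).lower()
    video["extra_terms"] = [label for label, conf in predicted_labels
                            if conf >= min_confidence and label not in text]
    return video


video = {"title": "Saturday match highlights", "description": "Filmed at the local field"}
# Hypothetical classifier output: coarse labels only ("soccer", not which team).
labels = [("soccer", 0.93), ("sports", 0.97), ("manchester united", 0.31)]
print(enrich_metadata(video, labels))
# extra_terms: ['soccer', 'sports'] -- the low-confidence fine-grained guess stays out.
```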
00:16:38.400 | One of the things that I ask creators to do
00:16:43.180 | is please help us out with the title and the description.
00:16:46.220 | For instance, we were a few years ago
00:16:51.700 | having a live stream of some competition
00:16:55.100 | for World of Warcraft on YouTube.
00:16:57.100 | And it was a very important competition,
00:17:01.740 | but if you typed World of Warcraft in search,
00:17:03.540 | you wouldn't find it.
00:17:04.740 | - World of Warcraft wasn't in the title?
00:17:06.820 | - World of Warcraft wasn't in the title.
00:17:08.500 | It was match 478, A team versus B team,
00:17:12.580 | and World of Warcraft wasn't in the title.
00:17:14.700 | Just like, come on, give me--
00:17:16.060 | - Being literal on the internet is actually very uncool,
00:17:20.580 | which is the problem.
00:17:21.820 | - Oh, is that right?
00:17:23.300 | - Well, I mean, in some sense,
00:17:25.780 | well, some of the greatest videos,
00:17:26.980 | I mean, there's a humor to just being indirect,
00:17:29.420 | being witty and so on, and actually being,
00:17:33.820 | machine learning algorithms want you to be literal.
00:17:37.580 | You just wanna say what's in the thing,
00:17:39.940 | be very, very simple.
00:17:42.060 | And in some sense, that gets away from wit and humor,
00:17:45.460 | so you have to play with both.
00:17:46.960 | But you're saying that for now,
00:17:49.780 | sort of the content of the title,
00:17:52.460 | the content of the description, the actual text,
00:17:55.200 | is one of the best ways to,
00:17:58.980 | for the algorithm to find your video
00:18:01.020 | and put them in the right cluster.
00:18:02.380 | - That's right, and I would go further and say that
00:18:05.620 | if you want people, human beings,
00:18:08.220 | to select your video in search,
00:18:10.820 | then it helps to have, let's say,
00:18:12.620 | World of Warcraft in the title,
00:18:14.060 | because why would a person,
00:18:17.140 | if they're looking at a bunch,
00:18:18.140 | they type World of Warcraft,
00:18:19.420 | and they have a bunch of videos,
00:18:20.500 | all of whom say World of Warcraft,
00:18:22.300 | except the one that you uploaded,
00:18:24.380 | well, even the person is gonna think,
00:18:25.980 | well, maybe this isn't, somehow search made a mistake.
00:18:28.620 | This isn't really about World of Warcraft.
00:18:30.820 | So it's important,
00:18:32.340 | not just for the machine learning systems,
00:18:33.980 | but also for the people who might be looking
00:18:36.540 | for this sort of thing.
00:18:37.380 | They get a clue that it's what they're looking for
00:18:41.460 | by seeing that same thing prominently
00:18:44.180 | in the title of the video.
00:18:45.640 | - Okay, let me push back on that.
00:18:46.980 | So I think from the algorithm perspective, yes,
00:18:49.100 | but if they typed in World of Warcraft
00:18:51.700 | and saw a video with the title simply winning,
00:18:57.020 | and the thumbnail has a sad orc or something,
00:19:02.020 | I don't know, right?
00:19:03.740 | Like, I think that's much,
00:19:06.740 | it gets your curiosity up.
00:19:11.020 | And then if they could trust
00:19:12.620 | that the algorithm was smart enough
00:19:14.020 | to figure out somehow
00:19:15.220 | that this is indeed a World of Warcraft video,
00:19:17.620 | that would have created the most beautiful experience.
00:19:20.140 | I think in terms of just the wit and the humor
00:19:22.700 | and the curiosity that we human beings naturally have.
00:19:25.500 | But you're saying, I mean,
00:19:26.860 | realistically speaking,
00:19:27.980 | it's really hard for the algorithm to figure out
00:19:30.500 | that the content of that video
00:19:32.060 | will be a World of Warcraft video.
00:19:33.980 | - And you have to accept
00:19:34.820 | that some people are gonna skip it.
00:19:36.700 | - Yeah.
00:19:37.540 | - Right?
00:19:38.380 | I mean, and so you're right.
00:19:40.300 | The people who don't skip it and select it
00:19:42.900 | are gonna be delighted.
00:19:44.660 | But other people might say,
00:19:47.460 | yeah, this is not what I was looking for.
00:19:49.260 | - And making stuff discoverable,
00:19:51.460 | I think is what you're really working on
00:19:55.340 | and hoping, so yeah.
00:19:56.420 | So from your perspective,
00:19:59.740 | put stuff in the title and the description.
00:19:59.740 | - And remember,
00:20:00.740 | the collaborative filtering part of the system
00:20:03.460 | starts by the same user watching videos together, right?
00:20:08.460 | So the way that they're probably gonna do that
00:20:11.980 | is by searching for them.
00:20:13.380 | - That's a fascinating aspect of it.
00:20:14.820 | It's like ant colonies.
00:20:15.860 | That's how they find stuff.
00:20:17.220 | So, I mean,
00:20:20.900 | what degree for collaborative filtering in general
00:20:24.080 | is one curious ant,
00:20:26.140 | one curious user essential?
00:20:28.380 | So just the person who is more willing
00:20:30.500 | to click on random videos
00:20:32.180 | and sort of explore these cluster spaces.
00:20:34.660 | In your sense,
00:20:36.160 | how many people are just like watching the same thing
00:20:38.480 | over and over and over and over?
00:20:39.680 | And how many are just like the explorers
00:20:42.140 | that just kind of like click on stuff
00:20:43.780 | and then help the other ant in the ant's colony
00:20:47.660 | discover the cool stuff?
00:20:49.540 | Do you have a sense of that at all?
00:20:50.540 | - I really don't think I have a sense
00:20:52.840 | of the relative sizes of those groups,
00:20:54.960 | but I would say that,
00:20:56.920 | people come to YouTube with some certain amount of intent.
00:21:00.200 | And as long as they,
00:21:02.460 | to the extent to which they try to satisfy that intent,
00:21:06.880 | that certainly helps our systems, right?
00:21:08.740 | Because our systems rely
00:21:10.600 | on kind of a faithful amount of behavior, right?
00:21:14.960 | And there are people who try to trick us, right?
00:21:18.320 | There are people and machines
00:21:20.080 | that try to associate videos together
00:21:23.760 | that really don't belong together,
00:21:25.600 | but they're trying to get that association made
00:21:28.400 | because it's profitable for them.
00:21:30.560 | And so we have to always be resilient
00:21:33.440 | to that sort of attempt at gaming the systems.
00:21:37.000 | - So speaking to that,
00:21:38.600 | there's a lot of people that in a positive way, perhaps,
00:21:41.200 | I don't know, I don't like it,
00:21:42.880 | but like to want to try to game the system,
00:21:45.920 | to get more attention.
00:21:46.760 | Everybody, creators in a positive sense
00:21:49.320 | want to get attention, right?
00:21:50.840 | So how do you work in this space
00:21:54.160 | when people create more and more
00:21:56.840 | sort of click-baity titles and thumbnails?
00:22:02.520 | Veritasium's Derek has made a video
00:22:05.000 | where he basically describes that it seems what works
00:22:08.280 | is to create a high quality video, really good video,
00:22:11.520 | where people would want to watch it once they click on it,
00:22:14.080 | but have click-baity titles and thumbnails
00:22:16.960 | to get them to click on it in the first place.
00:22:19.280 | And he's saying, "I'm embracing this fact
00:22:21.040 | and I'm just going to keep doing it.
00:22:22.720 | And I hope you forgive me for doing it.
00:22:25.800 | And you will enjoy my videos once you click on them."
00:22:28.360 | So in what sense do you see this kind of click-bait style
00:22:33.360 | attempt to manipulate, to get people in the door,
00:22:38.040 | to manipulate the algorithm
00:22:39.360 | or play with the algorithm or game the algorithm?
00:22:42.840 | - I think that you can look at it
00:22:44.600 | as an attempt to game the algorithm,
00:22:46.720 | but even if you were to take the algorithm out of it
00:22:51.000 | and just say, "Okay, well, all these videos
00:22:52.640 | happen to be lined up,
00:22:54.440 | which the algorithm didn't make any decision
00:22:56.640 | about which one to put at the top or the bottom,
00:22:59.000 | but they're all lined up there.
00:23:00.160 | Which one are the people going to choose?"
00:23:03.000 | And I'll tell you the same thing that I told Derek is,
00:23:05.920 | you know, I have a bookshelf
00:23:08.520 | and it has two kinds of science books on it.
00:23:11.920 | I have my math books from when I was a student
00:23:16.280 | and they all look identical
00:23:18.160 | except for the titles on the covers.
00:23:20.520 | They're all yellow, they're all from Springer,
00:23:22.880 | and they're every single one of them,
00:23:24.320 | the cover is totally the same.
00:23:26.600 | - Yes.
00:23:27.600 | - Right? - Yeah.
00:23:28.720 | - On the other hand,
00:23:29.560 | I have other more pop science type books
00:23:33.000 | and they all have very interesting covers, right?
00:23:35.280 | And they have provocative titles and things like that.
00:23:39.480 | I mean, I wouldn't say that they're click-baity
00:23:41.480 | because they are indeed good books.
00:23:44.720 | And I don't think that they cross any line,
00:23:47.560 | but, you know, that's just a decision you have to make.
00:23:51.760 | Right?
00:23:52.600 | Like Piergiorgio Odifreddi, who wrote
00:23:55.800 | "Classical Recursion Theory," was fine with the yellow title
00:23:58.960 | and nothing more.
00:24:01.560 | Whereas I think other people
00:24:02.760 | who wrote a more popular type book
00:24:07.200 | understand that they need to have a compelling cover
00:24:11.320 | and a compelling title.
00:24:12.880 | And, you know, I don't think
00:24:15.920 | there's anything really wrong with that.
00:24:17.480 | We do take steps to make sure
00:24:20.240 | that there is a line that you don't cross.
00:24:23.880 | And if you go too far,
00:24:25.720 | maybe your thumbnail is especially racy
00:24:27.880 | or, you know, it's all caps
00:24:31.200 | with too many exclamation points.
00:24:33.000 | We observe that users are kind of,
00:24:38.000 | you know, sometimes offended by that.
00:24:41.200 | And so for the users who are offended by that,
00:24:45.840 | we will then depress or suppress those videos.
00:24:50.400 | - And which reminds me,
00:24:51.440 | there's also another signal where users can say,
00:24:54.560 | I don't know if it was recently added,
00:24:55.840 | but I really enjoy it.
00:24:57.280 | Just saying, something like,
00:24:59.800 | I don't want to see this video anymore
00:25:01.720 | or something like,
00:25:02.760 | like this is a,
00:25:05.560 | like there's certain videos that just cut me the wrong way.
00:25:08.520 | Like just jump out at me.
00:25:09.960 | It's like, I don't want to, I don't want this.
00:25:11.400 | And it feels really good to clean that up.
00:25:14.120 | To be like, I don't, that's not, that's not for me.
00:25:17.560 | I don't know.
00:25:18.400 | I think that might've been recently added,
00:25:19.600 | but that's also a really strong signal.
00:25:21.800 | - Yes, absolutely.
00:25:23.200 | Right, we don't want to make a recommendation
00:25:25.480 | that people are unhappy with.
00:25:28.680 | - And that makes me,
00:25:29.520 | that particular one makes me feel good as a user in general
00:25:32.920 | and as a machine learning person.
00:25:34.680 | 'Cause I feel like I'm helping the algorithm.
00:25:36.960 | My interactions on YouTube don't always feel like
00:25:39.360 | I'm helping the algorithm.
00:25:40.360 | Like I'm not reminded of that fact.
00:25:42.320 | Like for example, Tesla and Autopilot and Elon Musk
00:25:47.400 | create a feeling for their customers,
00:25:50.040 | for people that own Teslas,
00:25:51.040 | that they're helping the algorithm of the Tesla vehicle.
00:25:53.360 | Like they're all really proud
00:25:55.080 | that they're helping the fleet learn.
00:25:56.640 | I think YouTube doesn't always remind people
00:25:58.920 | that you're helping the algorithm get smarter.
00:26:01.800 | And for me, I love that idea.
00:26:03.880 | Like we're all collaboratively,
00:26:05.840 | like Wikipedia gives that sense.
00:26:07.320 | They're all together creating a beautiful thing.
00:26:11.000 | YouTube doesn't always remind me of that.
00:26:13.760 | This conversation is reminding me of that, but.
00:26:17.880 | - Well, that's a good tip.
00:26:18.720 | We should keep that fact in mind
00:26:20.320 | when we design these features.
00:26:21.680 | I'm not sure I really thought about it that way,
00:26:24.400 | but that's a very interesting perspective.
00:26:27.200 | - It's an interesting question of personalization
00:26:30.240 | that I feel like when I click like on a video,
00:26:34.560 | I'm just improving my experience.
00:26:37.120 | It would be great.
00:26:40.280 | It would make me personally, people are different,
00:26:42.280 | but make me feel great if I was helping
00:26:44.200 | also the YouTube algorithm more broadly.
00:26:46.960 | You know what I'm saying?
00:26:47.800 | Like there's a, I don't know if that's human nature,
00:26:50.320 | but you want the products you love,
00:26:53.680 | and I certainly love YouTube.
00:26:55.280 | You want to help it get smarter and smarter and smarter
00:26:58.480 | 'cause there's some kind of coupling
00:27:00.120 | between our lives together being better.
00:27:04.080 | If YouTube is better, then
00:27:05.520 | my life will be better.
00:27:06.480 | And there's that kind of reasoning.
00:27:08.000 | Not sure what that is.
00:27:08.840 | And I'm not sure how many people share that feeling.
00:27:11.600 | That could be just a machine learning feeling.
00:27:13.480 | But on that point, how much personalization is there
00:27:18.160 | in terms of next video recommendations?
00:27:22.040 | So is it kind of all really boiling down to clustering?
00:27:27.040 | Like if I'm in your clusters to me and so on,
00:27:30.200 | and that kind of thing,
00:27:31.840 | or how much is personalized to me
00:27:34.000 | the individual completely?
00:27:35.320 | - It's very, very personalized.
00:27:38.160 | So your experience will be quite a bit different
00:27:42.440 | from anybody else's who's watching that same video,
00:27:45.760 | at least when they're logged in.
00:27:47.600 | And the reason is that we found that users
00:27:52.600 | often want two different kinds of things
00:27:55.560 | when they're watching a video.
00:27:57.160 | Sometimes they want to keep watching more on that topic
00:28:02.000 | or more in that genre.
00:28:04.320 | And other times they just are done
00:28:06.600 | and they're ready to move on to something else.
00:28:08.520 | And so the question is, well, what is the something else?
00:28:12.280 | And one of the first things one can imagine is,
00:28:15.800 | well, maybe something else is the latest video
00:28:18.880 | from some channel to which you've subscribed.
00:28:21.520 | And that's gonna be very different for you
00:28:24.440 | than it is for me, right?
00:28:26.200 | And even if it's not something that you subscribe to,
00:28:29.320 | it's something that you watch a lot.
00:28:30.560 | And again, that'll be very different
00:28:32.240 | on a person by person basis.
00:28:34.160 | And so even the watch next, as well as the homepage,
00:28:39.160 | of course, is quite personalized.
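
A hedged sketch of those two "watch next" intents, with made-up data and an arbitrary interleaving rule standing in for whatever the real system actually does:

```python
def watch_next(current_topic, related_by_topic, subscriptions, latest_uploads, n=4):
    """Mix 'more on this topic' with 'something else' (latest from channels you follow)."""
    more_like_this = related_by_topic.get(current_topic, [])
    something_else = [latest_uploads[ch] for ch in subscriptions if ch in latest_uploads]
    # Interleave the two candidate sources so both intents are represented.
    mixed, i = [], 0
    while len(mixed) < n and (i < len(more_like_this) or i < len(something_else)):
        if i < len(more_like_this):
            mixed.append(more_like_this[i])
        if i < len(something_else) and len(mixed) < n:
            mixed.append(something_else[i])
        i += 1
    return mixed


related_by_topic = {"chess": ["chess_endgames", "chess_openings"]}
latest_uploads = {"channel_a": "channel_a_newest", "channel_b": "channel_b_newest"}
print(watch_next("chess", related_by_topic, ["channel_a", "channel_b"], latest_uploads))
# ['chess_endgames', 'channel_a_newest', 'chess_openings', 'channel_b_newest']
```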
00:28:41.800 | - So what, we mentioned some of the signals,
00:28:45.560 | but what does success look like?
00:28:46.880 | What does success look like in terms of the algorithm
00:28:49.360 | creating a great long-term experience for a user?
00:28:52.800 | Or put another way, if you look at the videos
00:28:56.960 | I've watched this month,
00:28:59.280 | how do you know the algorithm succeeded for me?
00:29:01.640 | - I think, first of all,
00:29:04.240 | if you come back and watch more YouTube,
00:29:07.200 | then that's one indication
00:29:08.520 | that you found some value from it.
00:29:10.400 | - So just the number of hours is a powerful indicator.
00:29:13.200 | - Well, I mean, not the hours themselves,
00:29:15.040 | but the fact that you return on another day.
00:29:20.040 | So that's probably the most simple indicator.
00:29:25.520 | People don't come back to things
00:29:26.920 | that they don't find value in, right?
00:29:28.600 | There's a lot of other things that they could do.
00:29:31.600 | But like I said, I mean,
00:29:33.240 | ideally we would like everybody to feel
00:29:35.440 | that YouTube enriches their lives
00:29:37.840 | and that every video they watched
00:29:39.680 | is the best one they've ever watched
00:29:41.880 | since they've started watching YouTube.
00:29:44.040 | And so that's why we survey them and ask them,
00:29:48.440 | like, is this one to five stars?
00:29:52.280 | And so our version of success is
00:29:54.560 | every time someone takes that survey,
00:29:57.160 | they say it's five stars.
00:29:59.040 | And if we ask them,
00:30:00.680 | is this the best video you've ever seen on YouTube?
00:30:02.960 | They say yes, every single time.
00:30:05.200 | So it's hard to imagine
00:30:07.560 | that we would actually achieve that.
00:30:08.960 | Maybe asymptotically we would get there,
00:30:11.120 | but that would be what we think success is.
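
A small sketch of the two success signals just described, coming back on another day and high survey ratings, computed over an invented log format:

```python
from datetime import date

watch_log = [  # (user, date watched) -- invented
    ("u1", date(2020, 3, 1)), ("u1", date(2020, 3, 2)),
    ("u2", date(2020, 3, 1)),
]
survey_log = [("u1", 5), ("u1", 4), ("u2", 2)]  # (user, stars) -- invented


def returned_on_another_day(log):
    """Share of users who watched on more than one distinct day."""
    days = {}
    for user, d in log:
        days.setdefault(user, set()).add(d)
    return sum(1 for ds in days.values() if len(ds) > 1) / len(days)


def satisfied_share(surveys, threshold=4):
    """Share of surveyed watches rated at or above `threshold` stars."""
    return sum(1 for _, stars in surveys if stars >= threshold) / len(surveys)


print(f"came back on another day: {returned_on_another_day(watch_log):.0%}")  # 50%
print(f"watches rated 4+ stars:   {satisfied_share(survey_log):.0%}")         # 67%
```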
00:30:15.680 | - It's funny, I've recently said somewhere,
00:30:19.040 | I don't know, maybe tweeted,
00:30:20.480 | but that Ray Dalio has this video on the economic machine.
00:30:26.240 | I forget what it's called, but it's a 30 minute video.
00:30:28.480 | And I said, it's the greatest video
00:30:30.240 | I've ever watched on YouTube.
00:30:31.840 | It's like, I watched the whole thing
00:30:34.200 | and my mind was blown.
00:30:35.360 | It's a very crisp, clean description
00:30:38.040 | of how at least the American economic system works.
00:30:40.760 | It's a beautiful video.
00:30:42.320 | And I was just, I wanted to click on something
00:30:44.800 | to say this is the best thing.
00:30:46.880 | This is the best thing ever, please let me,
00:30:48.720 | I can't believe I discovered it.
00:30:50.480 | I mean, the views and the likes reflect its quality,
00:30:54.920 | but I was almost upset that I haven't found it earlier
00:30:57.720 | and wanted to find other things like it.
00:31:00.400 | I don't think I've ever felt
00:31:01.720 | that this is the best video I've ever watched.
00:31:04.280 | And that was that.
00:31:05.320 | And to me, the ultimate utopia,
00:31:08.000 | the best experiences were every single video.
00:31:10.800 | Where I don't see any of the videos I regret
00:31:12.800 | and every single video I watch is one
00:31:14.800 | that actually helps me grow,
00:31:16.760 | helps me enjoy life, be happy and so on.
00:31:22.480 | - Well, so that's a heck of a,
00:31:26.840 | that's one of the most beautiful and ambitious,
00:31:30.320 | I think, machine learning tasks.
00:31:32.120 | So you've mentioned kind of the YouTube algorithm
00:31:34.840 | that isn't E equals MC squared.
00:31:37.680 | It's not a single equation.
00:31:39.240 | It's potentially sort of more than a million lines of code.
00:31:43.180 | Sort of, is it more akin to what autonomous,
00:31:49.600 | successful autonomous vehicles today are,
00:31:51.400 | which is they're just basically patches
00:31:54.240 | on top of patches of heuristics and human experts
00:31:58.040 | really tuning the algorithm
00:32:00.760 | and have some machine learning modules?
00:32:03.500 | Or is it becoming more and more
00:32:05.440 | a giant machine learning system
00:32:08.280 | with humans just doing a little bit
00:32:10.080 | of tweaking here and there?
00:32:11.200 | What's your sense?
00:32:12.440 | First of all, do you even have a sense
00:32:13.960 | of what is the YouTube algorithm at this point?
00:32:16.280 | And however much you do have a sense,
00:32:19.160 | what does it look like?
00:32:20.820 | - Well, we don't usually think about it as the algorithm
00:32:24.280 | because it's a bunch of systems
00:32:26.680 | that work on different services.
00:32:29.320 | The other thing that I think people don't understand
00:32:31.920 | is that what you might refer to as the YouTube algorithm
00:32:36.960 | from outside of YouTube is actually a bunch of code
00:32:41.960 | and machine learning systems and heuristics,
00:32:44.920 | but that's married with the behavior
00:32:47.660 | of all the people who come to YouTube every day.
00:32:49.800 | - So the people are part of the code, essentially.
00:32:51.660 | - Exactly, right?
00:32:52.500 | Like if there were no people who came to YouTube tomorrow,
00:32:54.860 | then the algorithm wouldn't work anymore, right?
00:32:57.580 | So that's a critical part of the algorithm.
00:33:00.520 | And so when people talk about,
00:33:01.820 | well, the algorithm does this, the algorithm does that,
00:33:04.500 | it's sometimes hard to understand,
00:33:06.140 | well, it could be the viewers are doing that
00:33:09.640 | and the algorithm is mostly just keeping track
00:33:12.540 | of what the viewers do and then reacting to those things
00:33:18.140 | in sort of more fine-grained situations.
00:33:20.660 | And I think that this is the way
00:33:23.460 | that the recommendation system and the search system
00:33:26.660 | and probably many machine learning systems evolve
00:33:29.560 | is you start trying to solve a problem
00:33:33.380 | and the first way to solve a problem
00:33:34.940 | is often with a simple heuristic, right?
00:33:38.660 | And you wanna say,
00:33:40.500 | what are the videos we're gonna recommend?
00:33:41.820 | Well, how about the most popular ones, right?
00:33:44.620 | And that's where you start.
00:33:47.520 | And over time, you collect some data
00:33:51.660 | and you refine your situation
00:33:53.200 | so that you're making less heuristics
00:33:55.000 | and you're building a system
00:33:57.220 | that can actually learn what to do in different situations
00:34:00.740 | based on some observations of those situations in the past.
00:34:04.660 | And you keep chipping away at these heuristics over time.
00:34:08.620 | And so I think that just like with diversity,
00:34:13.120 | I think the first diversity measure we took was,
00:34:16.180 | okay, not more than three videos in a row
00:34:18.900 | from the same channel, right?
00:34:20.420 | It's a pretty simple heuristic to encourage diversity,
00:34:24.140 | but it worked, right?
00:34:25.700 | Who needs to see four, five, six videos in a row
00:34:28.220 | from the same channel?
00:34:29.400 | And over time, we try to chip away at that
00:34:33.580 | and make it more fine-grained
00:34:35.220 | and basically have it remove the heuristics
00:34:39.540 | in favor of something that can react
00:34:42.780 | to individuals and individual situations.
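
The early heuristic mentioned above, no more than three in a row from the same channel, is simple enough to show literally. This is an illustrative post-processing pass with assumed field names, not the actual implementation:

```python
def enforce_channel_diversity(ranked, max_run=3):
    """Defer any video that would make more than `max_run` in a row from one channel."""
    result, deferred = [], []
    for video in ranked:
        run = result[-max_run:]
        if len(run) == max_run and all(v["channel"] == video["channel"] for v in run):
            deferred.append(video)  # push it further down rather than drop it
        else:
            result.append(video)
    return result + deferred


ranked = [{"id": i, "channel": "A"} for i in range(5)] + [{"id": 5, "channel": "B"}]
print([v["channel"] for v in enforce_channel_diversity(ranked)])
# ['A', 'A', 'A', 'B', 'A', 'A'] -- the fourth consecutive 'A' gets deferred.
```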
00:34:46.340 | - So how do you, you mentioned,
00:34:48.020 | you know, we know that something worked.
00:34:51.620 | How do you get a sense when decisions
00:34:53.940 | of the kind of A/B testing that this idea was a good one,
00:34:57.420 | this was not so good?
00:34:58.880 | How do you measure that and across which time scale,
00:35:03.820 | across how many users, that kind of thing?
00:35:07.260 | - Well, you mentioned that A/B experiments.
00:35:09.540 | And so just about every single change we make to YouTube,
00:35:13.860 | we do it only after we've run an A/B experiment.
00:35:18.700 | And so in those experiments,
00:35:21.500 | which run from one week to months,
00:35:25.360 | we measure hundreds, literally hundreds
00:35:30.000 | of different variables and measure changes
00:35:33.940 | with confidence intervals in all of them.
00:35:35.960 | Because we really are trying to get a sense
00:35:38.500 | for ultimately does this improve the experience for viewers?
00:35:43.260 | That's the question we're trying to answer.
00:35:45.180 | And an experiment is one way
00:35:47.060 | because we can see certain things go up and down.
00:35:50.100 | So for instance, if we noticed in the experiment,
00:35:53.860 | people are dismissing videos less frequently,
00:35:57.540 | or they're saying that they're more satisfied,
00:36:01.940 | they're giving more videos five stars after they watch them,
00:36:04.420 | then those would be indications
00:36:06.260 | that the experiment is successful,
00:36:09.460 | that it's improving the situation for viewers.
00:36:11.760 | But we can also look at other things,
00:36:14.540 | like we might do user studies
00:36:17.500 | where we invite some people in and ask them,
00:36:19.780 | like, what do you think about this?
00:36:21.060 | What do you think about that?
00:36:22.060 | How do you feel about this?
00:36:23.420 | And other various kinds of user research.
00:36:26.860 | But ultimately, before we launch something,
00:36:29.380 | we're gonna wanna run an experiment.
00:36:31.100 | So we get a sense for what the impact is gonna be,
00:36:34.300 | not just to the viewers,
00:36:35.740 | but also to the different channels and all of that.
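
A back-of-the-envelope sketch of what one of those hundreds of measured variables might look like: the difference in mean survey score between control and treatment, with a 95% normal-approximation confidence interval, over fabricated data.

```python
import math
import random

random.seed(0)
control = [random.gauss(3.8, 1.0) for _ in range(5000)]    # simulated survey stars
treatment = [random.gauss(3.9, 1.0) for _ in range(5000)]


def mean_and_se(xs):
    """Sample mean and standard error of the mean."""
    m = sum(xs) / len(xs)
    var = sum((x - m) ** 2 for x in xs) / (len(xs) - 1)
    return m, math.sqrt(var / len(xs))


mean_c, se_c = mean_and_se(control)
mean_t, se_t = mean_and_se(treatment)
diff = mean_t - mean_c
se_diff = math.sqrt(se_c ** 2 + se_t ** 2)
low, high = diff - 1.96 * se_diff, diff + 1.96 * se_diff
print(f"delta = {diff:.3f}, 95% CI = ({low:.3f}, {high:.3f})")
# Simplified launch reasoning: ship only if the intervals for the metrics that
# matter exclude zero in the good direction, alongside user studies.
```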