Gustav Soderstrom: Spotify | Lex Fridman Podcast #29
Chapters
0:00
3:29 Purpose of Music
21:15 Technical Challenge in Reducing Latency
23:35 Video Content
26:04 How Do You Grow a User Base
27:55 The Access Model versus the Ownership Model
34:15 Can Playlist Be Used as Data
36:21 Collaborative Filtering
46:23 Anchor
64:06 Discover Weekly
72:11 Deep Embedding
79:28 Smart Speakers
94:42 Spotify Model
95:10 Business Model
103:35 VR
00:00:00.000 |
The following is a conversation with Gustav Söderström. 00:00:03.920 |
He's the Chief Research and Development Officer at Spotify, 00:00:07.280 |
leading their product design, data technology, and engineering teams. 00:00:11.200 |
As I've said before, in my research and in life in general, 00:00:15.280 |
I love music, listening to it and creating it, 00:00:18.720 |
and using technology, especially personalization through machine learning, 00:00:23.600 |
to enrich the music discovery and listening experience. 00:00:27.920 |
That is what Spotify has been doing for years, continually innovating, 00:00:31.920 |
defining how we experience music as a society in a digital age. 00:00:36.080 |
That's what Gustav and I talk about among many other topics, 00:00:39.280 |
including our shared appreciation of the movie "True Romance," 00:00:43.280 |
in my view, one of the great movies of all time. 00:00:49.360 |
If you enjoy it, subscribe on YouTube, give it five stars on iTunes, support on 00:00:53.760 |
Patreon, or simply connect with me on Twitter at Lex Fridman, spelled F-R-I-D-M-A-N. 00:01:01.280 |
And now, here's my conversation with Gustav Söderström. 00:01:06.400 |
Spotify has over 50 million songs in its catalog, so 00:01:11.200 |
let me ask the all-important question. I feel like you're the right person to ask. 00:01:16.320 |
What is the definitive greatest song of all time? 00:01:20.960 |
It varies for me, personally. So you can't speak definitively for everyone? 00:01:27.600 |
I wouldn't believe very much in machine learning, 00:01:30.560 |
if I did, right? Because then everyone would have the same taste. 00:01:34.160 |
So for you, what is... you have to pick. What is the song? 00:01:38.320 |
All right, so it's pretty easy for me. There is this song called 00:01:42.480 |
"You're So Cool" by Hans Zimmer, soundtrack to "True Romance." 00:01:47.360 |
It was a movie that made a big impression on me, and it's kind of been 00:01:51.200 |
following me through my life. Actually, I had it play at my wedding. I sat with the 00:01:56.560 |
organist and helped him play it on an organ, which 00:01:59.200 |
was a pretty interesting experience. That is probably my, 00:02:03.760 |
I would say, top three movie of all time. Yeah, this is an incredible movie. 00:02:08.560 |
And it came out during my formative years, and 00:02:12.160 |
as I've discovered in music, you shape your music taste 00:02:15.520 |
during those years. So it definitely affected me quite a bit. 00:02:18.560 |
Did it affect you in any other kind of way? Well, the movie itself affected me 00:02:23.040 |
back then. It was a big part of culture. I didn't really adopt any characters 00:02:27.040 |
from the movie, but it was a great story of love, 00:02:31.120 |
fantastic actors, and really, I didn't even know who Hans Zimmer was at 00:02:40.560 |
that song has followed me, and the movie actually has followed me throughout my life. 00:02:44.480 |
That was Quentin Tarantino, actually, I think, director or 00:02:48.000 |
producer or whatever. So it's not "Stairway to Heaven" or "Bohemian 00:02:51.520 |
Rhapsody". Those are great. They're not my personal 00:02:54.400 |
favorites, but I've realized that people have different tastes, and 00:02:57.920 |
that's a big part of what we do. Well, for me, I would have to 00:03:01.440 |
stick with "Stairway to Heaven". So, 35,000 years ago, I looked this up on 00:03:08.320 |
Wikipedia. Flute-like instruments started being used in caves 00:03:11.760 |
as part of hunting rituals, in primitive cultural gatherings, things like that. 00:03:16.240 |
This is the birth of music. Since then, we had a few folks, Beethoven, 00:03:21.120 |
Elvis, Beatles, Justin Bieber, of course, Drake. 00:03:26.320 |
So, in your view, let's start high-level philosophical. What is the 00:03:30.640 |
purpose of music on this planet of ours? I think music has many different 00:03:38.240 |
purposes. I think there's certainly a big purpose, which is the same as 00:03:44.160 |
much of entertainment, which is escapism, and to be able to live in some sort of 00:03:50.960 |
other mental state for a while. But I also think you have the opposite 00:03:54.320 |
of escaping, which is to help you focus on something you are actually doing. 00:04:01.440 |
tune the brain to the activities that they are actually doing. 00:04:06.720 |
And it's kind of like, in one sense, maybe it's the rawest signal. If you 00:04:12.960 |
think about the brain as neural networks, it's maybe the most efficient hack we 00:04:16.400 |
can do to actually actively tune it into some state that you want to be in. You 00:04:20.640 |
can do it in other ways. You can tell stories to put people in a certain mood. 00:04:23.840 |
But music is probably very effective to get you to a certain mood very fast. 00:04:28.640 |
You know, there's a social component historically to music, where 00:04:32.720 |
people listen to music together. I was just thinking about this, that 00:04:36.480 |
to me, and you mentioned machine learning, but to me 00:04:48.640 |
Almost nobody knows the kind of things I have in my library, 00:04:52.800 |
except people who are really close to me, and they really only know 00:04:56.000 |
a certain percentage. There's some weird stuff that I'm almost probably 00:04:59.360 |
embarrassed by. It's called the guilty pleasures, right? 00:05:02.480 |
Everyone has that. The guilty pleasures, yeah. 00:05:04.560 |
Hopefully they're not too bad. For me, it's personal. Do you think of 00:05:09.280 |
music as something that's social or as something that's personal? 00:05:14.960 |
Or does it vary? I think it's the same answer, that 00:05:21.360 |
you use it for both. We've thought a lot about this 00:05:24.960 |
during these 10 years at Spotify, obviously. In one sense, as you said, music 00:05:29.120 |
is incredibly social. You go to concerts and so forth. 00:05:33.440 |
On the other hand, it is your escape, and everyone 00:05:38.400 |
has these things that are very personal to them. 00:05:48.400 |
Most people claim that they have a friend or two that they are heavily 00:05:51.360 |
inspired by, and that they listen to. I actually think 00:05:54.640 |
music is very social, but in a smaller group setting, it's an 00:05:58.960 |
intimate relationship. It's not something that you 00:06:04.480 |
necessarily share broadly. Now, at concerts, you can argue you do, 00:06:08.000 |
but then you've gathered a lot of people that you have something in common with. 00:06:11.920 |
I think this broadcast sharing of music is something we 00:06:20.960 |
it turns out that people aren't super interested in 00:06:25.120 |
what their friends listen to. They're interested in 00:06:29.840 |
understanding if they have something in common, perhaps, with a friend, but not 00:06:36.960 |
interesting. I was just thinking of it this morning, 00:06:39.840 |
listening to Spotify. I really have a pretty intimate 00:06:43.920 |
relationship with Spotify, with my playlists. I've had them for 00:06:50.160 |
many years now, and they've grown with me together. 00:06:53.840 |
There's an intimate relationship you have with a library of music 00:06:58.640 |
that you've developed, and we'll talk about different ways we can play with that. 00:07:05.920 |
give a history of music listening from your perspective, from before the 00:07:12.400 |
internet and after the internet, and just kind of everything leading up 00:07:16.480 |
to streaming with Spotify and so on? I'll try. It could be a 100-year podcast. 00:07:21.920 |
I'll try to do a brief version. There are some things that 00:07:25.840 |
I think are very interesting during the history of music, which is that 00:07:30.000 |
before recorded music, to be able to enjoy music, you actually had to be 00:07:34.240 |
where the music was produced, because you couldn't record it 00:07:37.280 |
and time shift it. Creation and consumption had to happen at the same 00:07:40.320 |
time, basically concerts. So you either had to get to the 00:07:44.720 |
nearest village to listen to music, and while that was 00:07:48.400 |
cumbersome and it severely limited the distribution of music, 00:07:52.400 |
it also had some different qualities, which was that 00:07:55.280 |
the creator could always interact with the audience. It was always live. 00:07:59.280 |
And also there was no time cap on the music. So I think it's not a coincidence 00:08:03.120 |
that these early classical works, they're much longer than 00:08:06.800 |
the three minutes. The three minutes came in as a 00:08:10.400 |
restriction of the first wax disc that could only contain 00:08:17.200 |
actually the recorded music severely limited the 00:08:21.440 |
or put constraints, I won't say limit, I mean constraints are often good, but it 00:08:26.160 |
format. So you kind of said like instead of doing these opus 00:08:29.760 |
on like many, you know, tens of minutes or something, 00:08:33.120 |
now you get three and a half minutes because then you're out of wax on this 00:08:41.920 |
Just on that point real quick, without the mass 00:08:46.320 |
scale distribution, there's a scarcity component 00:08:50.160 |
where you kind of look forward to it. We had that, it's like the Netflix 00:08:56.720 |
versus HBO Game of Thrones, you like wait for the event because you 00:09:00.960 |
can't really listen to it. So you like look forward to it and then 00:09:04.400 |
it's, you derive perhaps more pleasure because 00:09:07.680 |
it's more rare for you to listen to a particular piece. 00:09:10.560 |
You think there's value to that scarcity? Yeah, 00:09:13.600 |
I think that that is definitely a thing and there's always this 00:09:16.880 |
component of if you have something in infinite amounts, will you value it 00:09:21.120 |
as much? Probably not. Humanity is always seeking some, 00:09:26.080 |
is relative, so you're always seeking something you didn't have and when you 00:09:30.720 |
I think that's probably true, but I think that's why concerts exist, so you can 00:09:34.640 |
actually have both. But I think net, if you couldn't listen 00:09:38.640 |
to music in your car driving, that'd be worse, 00:09:42.960 |
that cost would be bigger than the benefit of the anticipation I think 00:09:46.560 |
that you would have. So yeah, it started with live concerts, 00:09:51.760 |
then it's being able to, you know, the phonograph was 00:09:56.720 |
invented, right? You start to be able to record music. 00:10:00.480 |
Exactly, so then you got this massive distribution that made it possible 00:10:04.240 |
to create two things, I think. First of all, cultural 00:10:12.800 |
access to, you know, for a new kind of artist. So you started to have these 00:10:17.600 |
phenomena like the Beatles and Elvis and so forth, that were really 00:10:21.120 |
a function of distribution, I think, obviously of talent and innovation, but 00:10:24.800 |
there was also a technical component. And of course the next big innovation to 00:10:28.400 |
come along was radio, broadcast radio. And I think radio is interesting 00:10:37.200 |
started as an information medium for news and 00:10:41.440 |
then radio needed to find something to fill the time with so that they could 00:10:45.360 |
honestly play more ads and make more money, and music was 00:10:48.800 |
free. So then you had this massive distribution where you could program to 00:10:52.720 |
people. I think those things, that ecosystem, 00:10:56.000 |
is what created the ability for hits. But it was also a very broadcast 00:11:01.760 |
medium, so you would tend to get these massive, 00:11:04.400 |
massive hits, but maybe not such a long tail. 00:11:08.400 |
In terms of choice, of everybody listening to the same stuff. 00:11:11.520 |
Yeah, and as you said, I think there are some social benefits to that. 00:11:15.760 |
I think, for example, there's a high statistical chance that if I talk 00:11:19.680 |
about the latest episode of Game of Thrones, we have something to talk about 00:11:22.800 |
just statistically. In the age of individual choice, maybe some of that 00:11:32.320 |
shared cultural components, but I also obviously love personalization. 00:11:40.800 |
maybe Napster, well first of all, there's like mp3s, 00:11:44.000 |
there's like tape, CDs. There was a digitalization of music with a CD, really. It was 00:11:50.720 |
digital. And so they were files, but basically boxed software, 00:11:55.440 |
to use a software analogy. And then you could start downloading these files. 00:12:00.880 |
And I think there are two interesting things that happened. Back to 00:12:04.240 |
music used to be longer before it was constrained by the distribution medium. 00:12:08.960 |
I don't think that was a coincidence. And then really the only music genre to have 00:12:13.200 |
developed mostly after music was a file again on the 00:12:16.720 |
internet is EDM. And EDM is often much longer than the 00:12:20.080 |
traditional music. I think it's interesting to think 00:12:23.600 |
about the fact that music is no longer constrained in 00:12:26.960 |
minutes per song or something. It's a legacy of an old distribution 00:12:31.360 |
technology. And you see some of this new music that 00:12:33.840 |
breaks the format. Not so much as I would have expected actually by now, 00:12:37.680 |
but it still happens. So first of all, I don't really know what EDM is. 00:12:45.520 |
was one of the biggest in this genre. So the main constraint was time: something 00:12:51.040 |
like a three-, four-, five-minute song. So you could have songs that were eight 00:12:54.960 |
minutes, ten minutes and so forth. Because it started as a digital 00:12:59.840 |
product that you downloaded. So you didn't have 00:13:02.560 |
this constraint anymore. So I think it's something really 00:13:06.400 |
interesting that I don't think has fully happened yet. 00:13:09.280 |
We're kind of jumping ahead a little bit to where we are. But I think there's 00:13:12.960 |
tons of formal innovation in music that should happen now. That couldn't 00:13:18.720 |
happen when you needed to really adhere to the distribution constraints. 00:13:22.080 |
If you didn't adhere to that, you would get no distribution. 00:13:29.280 |
she made a full iPad app as an album. That was very expensive. 00:13:33.920 |
Even though the app still has great distribution, 00:13:37.120 |
she gets nowhere near the distribution versus staying within the three minute 00:13:44.080 |
digital inside these streaming services, there is 00:13:46.800 |
the opportunity to change the format again and allow creators to be much more 00:13:51.520 |
creative without limiting their distribution ability. 00:13:55.120 |
That's interesting, and you're right. It's surprising that we don't see that 00:14:00.000 |
taken advantage of more often. It's almost like the constraints of the 00:14:04.320 |
distribution from the 50s and 60s have molded the culture to where we 00:14:09.520 |
want the three-to-five-minute song more than anything else. 00:14:14.240 |
So we want the song as consumers and as artists. 00:14:18.720 |
Because I write a lot of music and I never even thought about writing 00:14:26.400 |
It's really interesting that those constraints. Because all your 00:14:29.600 |
training data has been three and a half minute songs. 00:14:36.320 |
led to then MP3s. Yeah, so I think you had this file then 00:14:41.440 |
that was distributed physically. But then you had the components of digital 00:14:45.840 |
distribution. And then the internet happened. 00:14:48.400 |
And there was this vacuum where you had a format that could be digitally shipped, 00:14:52.560 |
but there was no business model. And then all these pirate networks 00:15:01.600 |
which was one of the biggest. And I think from a consumer 00:15:06.640 |
point of view, which kind of leads up to the inception of 00:15:10.160 |
Spotify from a consumer point of view, consumers for the first time had this 00:15:15.120 |
access model to music where they could, without kind of any 00:15:20.160 |
marginal cost, they could try different tracks. 00:15:25.760 |
You could use music in new ways. There was no marginal cost. 00:15:28.880 |
And that was a fantastic consumer experience to have access to all the 00:15:31.680 |
music ever made. I think was fantastic. But it was also horrible for artists 00:15:36.400 |
because there was no business model around it. So they didn't make any money. 00:15:39.680 |
So the user need almost drove the user interface before there was a 00:15:45.280 |
business model. And then there were these download 00:15:47.760 |
stores that allowed you to download files, which was a solution, but it didn't 00:15:53.600 |
solve the access problem. There was still a marginal cost of 99 00:15:56.880 |
cents to try one more track. And I think that that heavily limits how 00:16:00.720 |
you listen to music. The example I always give is 00:16:05.040 |
in Spotify, a huge amount of people listen to music while they sleep, while 00:16:08.960 |
they go to sleep and while they sleep. If that cost you 99 cents per three 00:16:13.200 |
minutes, you probably wouldn't do that. And you would be much less 00:16:16.480 |
adventurous if there was a real dollar cost to exploring music. 00:16:19.280 |
So the access model is interesting in that it changes your music behavior. 00:16:22.880 |
You can be, you can take much more risk because there's no marginal cost to it. 00:16:28.240 |
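As a rough worked example of the marginal-cost point above, the arithmetic can be sketched in a few lines of Python. The 99-cent price and three-minute track length come from the conversation; the eight-hour night, the thirty-night month, and the function name `per_track_cost` are assumptions made only for illustration.

```python
def per_track_cost(hours_listened: float,
                   price_per_track: float = 0.99,
                   minutes_per_track: float = 3.0) -> float:
    """Cost of continuous listening when every track is bought individually."""
    tracks = hours_listened * 60 / minutes_per_track
    return round(tracks * price_per_track, 2)


# Assumed scenario: music plays for one eight-hour night of sleep.
night = per_track_cost(8)
# Assumed scenario: the same habit repeated for thirty nights.
month = round(night * 30, 2)

print(f"one night: ${night:.2f}, thirty nights: ${month:.2f}")
```

Under these assumptions a single night runs to 160 tracks, which is why a per-track price discourages this kind of listening while a flat access model does not.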
Maybe let me linger on piracy for a second because I find, 00:16:31.680 |
especially coming from Russia, piracy is something that's very interesting. 00:16:39.120 |
not me, of course, ever, but I have friends who've partaken in piracy 00:16:47.040 |
of music, software, TV shows, sporting events. And usually to me what 00:16:54.880 |
that shows is not that they can actually pay the 00:17:01.920 |
They're choosing the best experience. So what to me piracy shows is a business 00:17:08.160 |
opportunity in all these domains. And that's where I think you're right. 00:17:12.480 |
Spotify stepped in, is basically piracy was an experience. You can 00:17:17.760 |
explore, find music you like, and actually the interface of piracy is 00:17:23.920 |
horrible because it's, I mean, it's bad metadata. 00:17:28.000 |
Yeah, bad metadata, long download times, all kinds of stuff. 00:17:31.040 |
And what Spotify does is basically first rewards artists and second 00:17:37.520 |
makes the experience of exploring music much better. I mean, the same is true, 00:17:41.920 |
I think, for movies and so on. Piracy reveals, 00:17:45.600 |
in the software space, for example, I'm a huge user and fan of Adobe products 00:17:50.560 |
and there was much more incentive to pirate Adobe products 00:17:55.360 |
before they went to a monthly subscription plan. 00:17:58.400 |
And now all of the said friends that used to pirate Adobe products that I 00:18:07.280 |
monthly subscription. I think you're right. I think it's a sign 00:18:12.960 |
and that sometimes there's a product market fit 00:18:17.760 |
before there's a business model fit in product development. I think that's 00:18:22.640 |
a sign of it. In Sweden, I think it was a bit of both. 00:18:25.840 |
There was a culture where we even had a political party called 00:18:30.880 |
the Pirate Party and this was during the time when 00:18:34.240 |
people said that information should be free. It was somehow wrong to 00:18:41.600 |
felt that artists should probably make money somehow else 00:18:45.280 |
and concerts or something. So at least in Sweden, it was part 00:18:48.640 |
really social acceptance, even at the political level. 00:18:55.040 |
with free, which I don't think actually could have happened 00:18:59.360 |
anywhere else in the world. The music industry needed to be 00:19:02.480 |
doing bad enough to take that risk and Sweden was like the perfect testing 00:19:06.480 |
ground. It had government funded high bandwidth, low 00:19:10.000 |
latency broadband, which meant that the product would work 00:19:13.440 |
and there was also no music revenue anyway. So they were kind of 00:19:16.880 |
like, I don't think this is going to work but 00:19:19.120 |
why not? So this product is one that I don't think could have happened in 00:19:23.200 |
America, the world's largest music market, for example. 00:19:25.840 |
So how do you compete with free? Because that's an interesting world 00:19:29.520 |
of the internet where most people don't like to pay for things. 00:19:34.400 |
So Spotify steps in and tries to, yes, compete with free. 00:19:39.040 |
How do you do it? So I think two things. One is 00:19:42.320 |
people are starting to pay for things on the internet. I think 00:19:45.760 |
one way to think about it was that advertising was the first 00:19:49.680 |
business model because no one would put a credit card on the internet. Transactional 00:19:52.960 |
with Amazon was the second and maybe subscription is the third 00:19:56.080 |
and if you look offline, subscription is the biggest of those. 00:19:59.600 |
So that may still happen. I think people are starting to pay but definitely back 00:20:02.720 |
then we needed to compete with free and the first thing you need to do is 00:20:06.400 |
obviously to lower the price to free and then you need to be better somehow 00:20:12.240 |
and the way that Spotify was better was on the user experience, on the 00:20:16.160 |
actual performance, the latency of, you know, even if you had 00:20:22.880 |
high bandwidth broadband, it would still take you 30 seconds to a minute to 00:20:29.120 |
download one of these tracks. So the Spotify experience of starting 00:20:32.320 |
within the perceptual limit of immediacy, about 250 milliseconds, 00:20:36.480 |
meant that the whole trick was it felt as if you had downloaded all of PirateBay. 00:20:41.680 |
It was on your hard drive. It was that fast even though it wasn't 00:20:45.280 |
and it was still free but somehow you were actually still 00:20:49.040 |
being a legal citizen. That was the trick that Spotify managed to 00:20:56.880 |
say this or write this and I was surprised that I wasn't aware of it 00:21:00.560 |
because I just took it for granted. You know, whenever an awesome thing 00:21:03.760 |
comes along you're just like, "Oh, of course it has to be this way. 00:21:07.360 |
That's exactly right." That it felt like the entire world's libraries at my 00:21:11.120 |
fingertips because of that latency being reduced. 00:21:15.440 |
What was the technical challenge in reducing the latency? 00:21:18.640 |
So there was a group of really, really talented engineers. 00:21:23.360 |
One of them called Ludvig Strigeus. He wrote the... 00:21:26.640 |
actually from Gothenburg. He wrote the initial... 00:21:30.720 |
the uTorrent client, which is kind of an interesting backstory to Spotify. 00:21:34.320 |
You know, that we have one of the top developers from 00:21:38.000 |
BitTorrent clients as well. So he wrote uTorrent, the world's smallest 00:21:41.600 |
BitTorrent client. And then he was acquired very early by 00:21:47.520 |
Daniel and Martin, who founded Spotify. And they actually sold the uTorrent 00:21:51.920 |
client to BitTorrent but kept Ludvig. So Spotify had a lot of experience 00:21:57.360 |
within peer-to-peer networking. So the original 00:22:01.600 |
innovation was a distribution innovation, where Spotify built an 00:22:05.440 |
end-to-end media distribution system up until only a few years ago. We actually 00:22:08.800 |
hosted all the music ourselves. So we had both the server side and the 00:22:12.000 |
client and that meant that we could do things such as having 00:22:18.560 |
on the client side, because back then the world was mostly desktop. 00:22:22.000 |
But we could also do things like hack the TCP protocols, 00:22:25.600 |
things like Nagle's algorithm for kind of exponential back-off 00:22:29.440 |
or ramp up and just go full throttle and optimize for latency 00:22:33.520 |
at the cost of bandwidth. And all of this end-to-end control meant that we 00:22:41.520 |
change. These days we actually are on GCP. We don't host our own 00:22:47.280 |
stuff and everyone is really fast these days. So that was the initial 00:22:50.240 |
competitive advantage. But then obviously you have to move on over time. 00:22:53.440 |
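The latency-over-bandwidth trade described above can be illustrated with a standard socket option. Nagle's algorithm coalesces small writes into fewer packets, which saves bandwidth but can delay data; setting `TCP_NODELAY` turns it off. This is a minimal Python sketch of that one knob, not Spotify's client code, which the conversation does not describe. The helper name `make_low_latency_socket` is hypothetical.

```python
import socket


def make_low_latency_socket() -> socket.socket:
    """Create a TCP socket with Nagle's algorithm disabled (illustrative helper)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # TCP_NODELAY=1 sends small writes immediately instead of batching them,
    # trading some bandwidth efficiency for lower latency.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock


sock = make_low_latency_socket()
# Read the option back to confirm it took effect (non-zero means enabled).
nodelay = sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY)
sock.close()
print("Nagle disabled:", nodelay != 0)
```

Disabling Nagle is only one of the tunings the speaker alludes to. Changing how quickly a connection ramps up its sending rate needs control of both ends of the connection, which is the end-to-end ownership he describes.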
And that was over 10 years ago, right? That was in 2008. The product 00:22:58.160 |
was launched in Sweden. It was in a beta, I think, 2007. And it was on the desktop, 00:23:02.400 |
right? So it was desktop only. There's no phone. 00:23:05.120 |
There was no phone. The iPhone came out in 2008, 00:23:09.440 |
but the App Store came out one year later, I think. So the writing was on the 00:23:13.120 |
wall, but there was no phone yet. You've mentioned that people would 00:23:18.560 |
use Spotify to discover the songs they like and then they would 00:23:21.520 |
torrent those songs so they can copy it to their phone. 00:23:26.400 |
Just hilarious. Exactly. Not torrent, pirate. 00:23:30.400 |
Seriously, piracy does seem to be like a good guide for business models. 00:23:36.480 |
Video content. As far as I know, Spotify doesn't have video content. 00:23:40.560 |
Well, we do have music videos and we do have videos on the 00:23:44.480 |
service, but the way we think about ourselves is that we're an audio 00:23:48.560 |
service and we think that if you look at the amount of 00:23:52.480 |
time that people spend on audio, it's actually very similar to the amount of 00:23:56.320 |
time that people spend on video. So the opportunity should be equally 00:24:01.040 |
big, but today it's not at all valued. Video is valued much higher. So we 00:24:05.280 |
think it's basically completely undervalued. We think of 00:24:08.480 |
ourselves as an audio service, but within that audio service, I think 00:24:14.880 |
when you're discovering an artist, you probably do want to see them 00:24:18.320 |
and understand who they are, to understand their identity. 00:24:20.880 |
You won't see that video every time. No, 90% of the time the phone is going to be 00:24:23.920 |
in your pocket. For podcasters, you use video. I think 00:24:27.520 |
that can make a ton of sense. So we do have video, but we're an audio 00:24:30.240 |
service where, think of it as we call it internally 00:24:33.600 |
backgroundable video. Video that is helpful, but isn't 00:24:37.360 |
the driver of the narrative. I think also if we look at 00:24:42.560 |
YouTube, the way people, there's quite a few folks who 00:24:46.160 |
listen to music on YouTube. So in some sense, YouTube is a bit of a competitor 00:24:51.360 |
to Spotify, which is very strange to me that people use YouTube to listen 00:24:56.800 |
to music. They play essentially the music videos, 00:25:00.000 |
right, but don't watch the videos and put it in their pocket. 00:25:03.360 |
Well, I think it's similar to what, strangely, maybe it's similar to 00:25:14.320 |
YouTube, for historical reasons, has a lot of music videos. 00:25:20.720 |
So people use YouTube for a lot of the discovery part of the process, I 00:25:24.480 |
think. But then it's not a really good sort of 00:25:27.280 |
"MP3 player" because it doesn't even background. Then you have to keep 00:25:30.640 |
the app in the foreground. So it's not a good consumption tool, 00:25:34.480 |
but it's a decently good discovery tool. I mean, I think YouTube is a fantastic 00:25:37.520 |
product and I use it for all kinds of purposes. 00:25:40.320 |
That's true. If I were to admit something, I do use YouTube a little bit 00:25:44.160 |
for the discovery, to assist in the discovery process of songs. 00:25:47.280 |
And then if I like it, I'll add it to Spotify. 00:25:51.040 |
But that's OK. That's OK with us. OK, so sorry, we're jumping around a little bit. 00:25:59.040 |
Napster, you look at the early days of Spotify. 00:26:02.480 |
How do you, one fascinating point is, how do you grow a user base? 00:26:10.400 |
I saw the initial sketches that look terrible. 00:26:14.240 |
How do you grow a user base from a few folks to 00:26:17.760 |
millions? I think there are a bunch of tactical answers. 00:26:22.240 |
So first of all, I think you need a great product. I don't think you take a bad 00:26:30.080 |
So you need a great product. But sorry to interrupt, but it's a totally new way to 00:26:33.760 |
listen to music, too. So it's not just... Did people realize immediately that 00:26:37.280 |
Spotify is a great product? I think they did. So back to the point of 00:26:41.280 |
piracy, it was a totally new way to listen to music legally. 00:26:45.760 |
But people had been used to the access model in Sweden 00:26:48.960 |
and the rest of the world for a long time through piracy. So one way to think 00:26:51.520 |
about Spotify, it was just legal and fast piracy. 00:26:54.720 |
And so people have been using it for a long time. So they weren't alien to it. 00:26:59.040 |
They didn't really understand how it could be legal because it would seem too 00:27:02.240 |
fast and too good to be true. Which I think is a great product 00:27:05.040 |
proposition if you can be too good to be true. 00:27:08.080 |
But what I saw again and again was people showing each other, clicking the 00:27:11.440 |
song, showing how fast it started and saying, "Can you believe this?" 00:27:14.080 |
So I really think it was about speed. Then we also had an invite 00:27:20.320 |
program that was really meant for scaling because we hosted our own 00:27:24.240 |
servers. We needed to control scaling. But that built a lot of expectation and 00:27:29.920 |
I don't want to say hype because hype implies that it was 00:27:33.280 |
that it wasn't true. Excitement around the product. And we've 00:27:40.880 |
We also built up an invite-only program first. So lots of tactics. 00:27:44.960 |
But I think you need a great product that solves some problem. 00:27:48.640 |
And basically the key innovation, there was technology, but on a meta 00:27:54.400 |
level, the innovation was really the access model versus the ownership model. 00:27:58.000 |
And that was tricky. A lot of people said that they 00:28:02.400 |
wanted to own their music. They would never kind of rent it or 00:28:06.720 |
borrow it. But I think the fact that we had a free 00:28:08.880 |
tier, which meant that you get to keep this music for life as well, 00:28:13.280 |
helped quite a lot. So this is an interesting psychological point 00:28:17.120 |
that maybe you can speak to. It was a big shift for me. 00:28:20.880 |
It's almost like I had to go to therapy for this. 00:28:25.280 |
I think I would describe my early listening experience, and I think a lot 00:28:30.000 |
of my friends do, is basically hoarding music. It's you're 00:28:33.680 |
like slowly, one song by one song or maybe albums, gathering a collection 00:28:38.720 |
of music that you love. And you own it. It's like often, 00:28:42.880 |
especially with CDs or tape, you like physically had it. 00:28:46.720 |
And what Spotify, what I had to come to grips with, it was kind of 00:28:50.880 |
liberating actually, is to throw away all the music. 00:28:55.600 |
I've had this therapy session with lots of people. 00:28:59.040 |
And I think the mental trick is, so actually we've seen the user data when 00:29:03.120 |
Spotify started, a lot of people did the exact same thing. They started hoarding 00:29:07.040 |
as if the music would disappear, right? Almost the equivalent of downloading. 00:29:11.440 |
And so, you know, we had these playlists that had limits of like 00:29:15.520 |
a few hundred thousand tracks, and we figured no one will ever. Well, they do. 00:29:19.120 |
Hundreds and hundreds and hundreds of thousands of tracks. And to this day, 00:29:22.800 |
you know, some people want to actually save, quote unquote, and play the entire 00:29:26.960 |
catalog. But I think that the therapy session goes 00:29:30.080 |
something like, instead of throwing away your music, 00:29:35.120 |
if you took your files and you stored them in a locker 00:29:38.160 |
at Google, it'd be a streaming service. It's just that in that locker, you have 00:29:42.320 |
all the world's music now for free. So instead of giving away your music, you 00:29:45.280 |
got all the music. It's yours. You could think of it as 00:29:48.480 |
having a copy of the world's catalog there forever. So you actually got 00:29:52.080 |
more music instead of less. It's just that you just took that hard 00:29:56.800 |
disk and you sent it to someone who stored it for you. And once 00:30:00.480 |
you go through that mental journey of like, still my files, they're just over 00:30:03.360 |
there, and I just have 40 million of them, 50 00:30:05.440 |
million of them or something now. Then people are like, okay, that's good. 00:30:09.040 |
The problem is, I think, because you paid us a subscription, 00:30:13.280 |
if we hadn't had the free tier where you would feel like, even if I don't want to 00:30:16.160 |
pay anymore, I still get to keep them. You keep your 00:30:18.960 |
playlist forever. They don't disappear even though you stop paying. 00:30:21.600 |
I think that was really important. If we would have started as, 00:30:25.520 |
you know, you can put in all this time, but if you stop paying, you lose all your 00:30:30.560 |
challenge and was the big challenge for a lot of our competitors. That's another 00:30:33.920 |
reason why I think the free tier is really important. That people need to 00:30:37.280 |
feel the security that the work they put in, it will never disappear, even if they 00:30:40.800 |
decide not to pay. I like it how you put the work you put in. 00:30:44.560 |
I actually stopped even thinking of it that way. I just, 00:30:46.800 |
actually Spotify taught me to just enjoy music. 00:30:49.920 |
That's great. As opposed to what I was doing before, which is like 00:30:55.280 |
in an unhealthy way, hoarding music. Which I found that because I was doing 00:31:00.720 |
that, I was listening to a small selection of 00:31:03.760 |
songs way too much to where I was getting sick of them. 00:31:07.520 |
Whereas Spotify, the more liberating kind of approach is I was just enjoying. 00:31:11.680 |
Of course, I listened to "Stairway to Heaven" over and over, but 00:31:14.800 |
because of the extra variety, I don't get as sick of them. 00:31:22.400 |
Spotify has, maybe you can correct me, but over 50 million songs, 00:31:26.160 |
tracks and over 3 billion playlists. So, 50 million songs and 3 billion 00:31:34.640 |
playlists. 60 times more playlists. What do you make of that? 00:31:43.600 |
from a statistician or machine learning point of view, 00:31:48.320 |
you have all these, if you want to think about reinforcement learning, you 00:31:52.080 |
have this state space of all the tracks and you can 00:31:58.000 |
I think of these as like people helping themselves and each other 00:32:05.200 |
creating interesting vectors through this space of tracks. 00:32:08.720 |
Then it's not so surprising that across many tens of millions of 00:32:12.960 |
atomic units, there will be billions of paths 00:32:16.160 |
that make sense. We're probably quite far away from 00:32:20.400 |
having found all of them. So, kind of our job now 00:32:23.680 |
is users, when Spotify started, it was really 00:32:27.280 |
a search box that was for the time pretty powerful. Then 00:32:30.960 |
I like to refer to this programming language called playlisting, 00:32:34.400 |
where if you, as you probably were pretty good at music, 00:32:37.440 |
you knew your new releases, you knew your back catalog, you knew your "Stairway 00:32:40.320 |
to Heaven", you could create a soundtrack for 00:32:42.320 |
yourself using this playlisting tool that's like meta programming language for 00:32:45.280 |
music to soundtrack your life. People who were 00:32:48.800 |
good at music, it's back to how do you scale the product. 00:32:51.520 |
For people who are good at music, that wasn't actually enough. If you had the 00:32:54.880 |
catalog and a good search tool, you can create your own sessions, you 00:32:57.840 |
could create a really good soundtrack for your entire life. 00:33:01.760 |
Probably perfectly personalized because you did it yourself. 00:33:05.280 |
But the problem was most people, many people aren't that good at music, they 00:33:08.320 |
just can't spend the time. Even if you're very good at music, it's 00:33:10.960 |
gonna be hard to keep up. So what we did to try to scale this was to 00:33:16.560 |
essentially try to build, you can think of them as agents, that 00:33:20.000 |
this friend that some people had that helped them navigate this music 00:33:23.760 |
catalog, that's what we're trying to do for you. 00:33:26.160 |
But also there is something like 200 million active users on Spotify. 00:33:35.040 |
So there, okay, so from the machine learning perspective, 00:33:39.760 |
you have these 200 million people plus, they're creating, it's really 00:33:46.400 |
interesting to think of playlists as, I mean, I don't know if you meant it 00:33:52.880 |
that way, but it's almost like a programming language. It's 00:33:56.320 |
or at least a trace of exploration of those individual agents, 00:34:01.760 |
the listeners. And you have all these new tracks coming in. So it's a 00:34:07.520 |
fascinating space that is ripe for machine learning. 00:34:12.720 |
So is there, is it possible, how can playlists be used as data 00:34:19.120 |
in terms of machine learning and to help Spotify organize the music? 00:34:25.120 |
So we found in our data, not surprising, that people who playlisted a lot, 00:34:31.200 |
they retained much better, they had a great experience. And so our first 00:34:34.560 |
attempt was to playlist for users. And so we acquired 00:34:38.320 |
this company called Tunigo of editors and professional playlisters 00:34:42.880 |
and kind of leveraged the maximum of human intelligence 00:34:50.560 |
through the track space for people. And that broadened the product. 00:34:55.920 |
Then the obvious next, and we used statistical means 00:34:59.440 |
where they could see when they created a playlist, how did that playlist 00:35:03.120 |
perform? They could see skips of the songs, they could see how the 00:35:05.920 |
songs perform, and they manually iterated the playlist to maximize 00:35:09.680 |
performance for a large group of people. But there 00:35:12.800 |
were never enough editors to playlist for you personally. So the promise of 00:35:16.720 |
machine learning was to go from kind of group personalization 00:35:19.760 |
using editors and tools and statistics to individualization. And then what's so 00:35:25.520 |
interesting about the three billion playlists we have is, 00:35:29.360 |
we ended, the truth is we lucked out. This was not an a priori strategy, as is 00:35:33.760 |
often the case. It looks really smart in hindsight, but 00:35:36.720 |
it was dumb luck. We looked at these playlists and 00:35:41.840 |
we had some people in the company, a person named Erik Bernhardsson, 00:35:45.520 |
who was really good at machine learning already back then, in like 2007, 00:35:52.560 |
filtering and so forth. But we realized that what this is, is 00:35:58.320 |
people are grouping tracks for themselves that have some semantic 00:36:01.440 |
meaning to them. And then they actually label it with a 00:36:05.200 |
playlist name as well. So in a sense, people were grouping 00:36:08.800 |
tracks along semantic dimensions and labeling them. 00:36:12.080 |
And so could you use that information to find that 00:36:15.840 |
latent embedding? And so we started playing around with 00:36:21.760 |
collaborative filtering and we saw tremendous success with it. 00:36:30.240 |
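To make the idea concrete, here is a minimal sketch of item-to-item collaborative filtering on playlist data. It is an illustration only, not Spotify's actual system: the function names, the toy playlists, and the choice of cosine similarity over co-occurrence counts are all assumptions. The intuition is the one described above: tracks that people keep grouping into the same playlists are treated as similar.

```python
from collections import defaultdict
from itertools import combinations
from math import sqrt


def track_similarities(playlists):
    """Cosine similarity between tracks, based on playlist co-occurrence."""
    count = defaultdict(int)  # track -> number of playlists containing it
    co = defaultdict(int)     # (a, b) -> number of playlists containing both
    for tracks in playlists:
        unique = sorted(set(tracks))  # ignore duplicates within one playlist
        for t in unique:
            count[t] += 1
        for a, b in combinations(unique, 2):
            co[(a, b)] += 1

    sims = defaultdict(dict)
    for (a, b), n in co.items():
        s = n / sqrt(count[a] * count[b])
        sims[a][b] = s
        sims[b][a] = s
    return sims


def similar_tracks(sims, track, k=3):
    """Return up to k tracks most similar to `track`, best first."""
    ranked = sorted(sims.get(track, {}).items(), key=lambda kv: (-kv[1], kv[0]))
    return [t for t, _ in ranked[:k]]


# Hypothetical toy data: each playlist is a list of track identifiers.
playlists = [
    ["stairway", "kashmir", "black_dog"],
    ["stairway", "kashmir"],
    ["clair_de_lune", "gymnopedie"],
    ["kashmir", "black_dog", "gymnopedie"],
]
sims = track_similarities(playlists)
print(similar_tracks(sims, "stairway", k=2))
```

No audio is inspected here; the similarity comes entirely from how listeners grouped the tracks, which is why coverage depends on who playlisted.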
dimensions. And if you think about it, it's not surprising at all. 00:36:33.760 |
It'd be quite surprising if playlists were actually random, if they had no 00:36:38.240 |
semantic meaning. For most people, they group these 00:36:41.040 |
tracks for some reason. So we just happened across this 00:36:46.960 |
these tens of millions of tracks and grouped them along 00:36:50.240 |
different semantic vectors. And the semantics being outside the 00:36:54.640 |
individual users, so it's some kind of universal. 00:36:57.360 |
There's a universal embedding that holds across 00:37:05.120 |
the embeddings you find are going to be reflective of the people who playlisted. 00:37:08.640 |
So if you have a lot of indie lovers who playlist, 00:37:12.000 |
your embedding is going to perform better there. But what we found was that, 00:37:20.560 |
They were very powerful. And we had, it was interesting because 00:37:25.600 |
I think that the people who playlisted the most initially 00:37:28.800 |
were the so-called music aficionados who were really into music. And they often 00:37:37.520 |
geared towards a certain type of music. And so what surprised us, if you look at 00:37:41.760 |
the problem from the outside, you might expect that the algorithms 00:37:46.320 |
would start performing best with mainstreamers first because 00:37:49.120 |
it somehow feels like an easier problem to solve mainstream taste 00:37:52.400 |
than really particular taste. It was the complete opposite for us. 00:37:56.240 |
The recommendations performed fantastically for people who saw 00:37:58.960 |
themselves as having very unique taste. That's probably 00:38:02.720 |
because all of them playlisted and they didn't perform so well for 00:38:05.920 |
mainstreamers. They actually thought they were a bit too 00:38:08.400 |
particular and unorthodox. So we had the complete 00:38:12.000 |
opposite of what we expected. Success within the hardest problem first 00:38:15.440 |
and then had to try to scale to more mainstream recommendations. 00:38:19.040 |
So you've also acquired The Echo Nest, which analyzes song data. 00:38:25.840 |
So in your view, maybe you can talk about, so what kind of data is there from a 00:38:31.680 |
machine learning perspective? There's a huge amount, we're 00:38:35.920 |
talking about playlisting and just user data of what people are 00:38:40.000 |
listening to, the playlist they're constructing 00:38:43.280 |
and so on. And then there's the actual data within a song. 00:38:51.200 |
waveforms. How do you mix the two? How much value is there in each? To me 00:38:57.680 |
it seems like it's a romantic notion that the song 00:39:03.760 |
itself would contain useful information. But if I were to guess, 00:39:07.680 |
user data would be much more powerful. Like playlists would be much more 00:39:11.120 |
powerful. Yeah, so we use both. Our biggest success 00:39:16.160 |
initially was with playlist data without understanding 00:39:20.480 |
anything about the structure of the song. But when we acquired The Echo Nest, they had 00:39:24.320 |
the inverse problem. They actually didn't have any 00:39:27.520 |
play data. They were just a provider of recommendations, but they 00:39:30.560 |
didn't actually have any play data. So they looked at the structure of 00:39:34.560 |
songs sonically and they looked at Wikipedia for 00:39:38.880 |
cultural references and so forth, right? And did a lot of NLU and so forth. So we 00:39:51.200 |
content-based. So you can think of it as we were user-based and they were 00:39:54.080 |
content-based in their recommendations. And we combined those two. And for some 00:39:59.600 |
play data, obviously you have to try to go by 00:40:03.360 |
either who the artist is or the sonic information in the song or what 00:40:08.800 |
it's similar to. So there's definitely value in both and 00:40:11.920 |
we do a lot in both. But I would say yes, the user data captures things that 00:40:17.280 |
have to do with culture in the greater society 00:40:19.760 |
that you would never see in the content itself. 00:40:23.520 |
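One way to picture combining the two signals is a weighted blend that leans on content-based scores when a track has little or no play data and shifts toward user-based scores as plays accumulate. This is a hedged sketch under assumed names and an assumed linear weighting; the conversation does not describe how the combination is actually done.

```python
def hybrid_score(collab_score, content_score, play_count, full_trust_plays=100):
    """Blend a user-based and a content-based score for one track.

    collab_score:     score from play/playlist data, or None if there is none
    content_score:    score from audio or text analysis of the track itself
    play_count:       how much play data exists for the track
    full_trust_plays: hypothetical threshold at which the user-based
                      score is trusted completely
    """
    if collab_score is None:
        # Cold start: no play data, so fall back to content only.
        return content_score
    weight = min(play_count / full_trust_plays, 1.0)
    return weight * collab_score + (1.0 - weight) * content_score


# A brand-new track with no plays relies on its content score.
print(hybrid_score(None, 0.7, play_count=0))
# A well-played track relies on user behavior.
print(hybrid_score(0.9, 0.1, play_count=100))
```

The linear ramp is only one possible design; the point is that the content side covers the cases where user data is missing.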
But that said, we have seen, we have a research lab in 00:40:31.360 |
kind of machine learning on the creator side. What it can do for creators, not 00:40:34.080 |
just for the consumers. But where we looked at how does the 00:40:37.840 |
structure of a song actually affect the listening behavior? And it turns out 00:40:41.600 |
that there is a lot of, we can predict things 00:40:44.560 |
like skips based on the song itself. We could 00:40:48.800 |
say that maybe you should move that chorus a bit 00:40:52.560 |
There is a lot of latent structure in the music, which is not surprising 00:40:56.080 |
because it is some sort of mind hack. So there should be structure. That's 00:40:59.920 |
probably what we respond to. You just blew my mind actually 00:41:03.200 |
from the creator perspective. So that's a really interesting topic 00:41:07.520 |
that probably most creators aren't taking advantage of. 00:41:15.600 |
folks, YouTubers, who are like obsessed with this idea of 00:41:22.960 |
what do I do to make sure people keep watching 00:41:26.880 |
the video? And they like look at the analytics of which point do people turn 00:41:31.200 |
it off and so on. First of all, I don't think that's 00:41:34.240 |
healthy because you can do it a little too much. 00:41:38.320 |
But it is a really powerful tool for helping the creative process. 00:41:42.960 |
You just made me realize you could do the same thing for 00:41:46.240 |
creation of music. So is that something you've looked into? 00:41:50.240 |
Can you speak to how much opportunity there is for that? 00:41:58.720 |
and I thought it was fantastic and I reacted to the same thing where he said 00:42:04.960 |
immediately watched the feedback, where the drop-off was and then responded to 00:42:08.160 |
that in the afternoon. Which is quite different from how 00:42:12.000 |
people make podcasts for example. I mean the feedback loop is almost 00:42:15.520 |
non-existent. So if we back out one level, I think 00:42:20.800 |
actually both for music and podcasts, which we also 00:42:24.000 |
do at Spotify, I think there's a tremendous opportunity 00:42:27.200 |
just for the creation workflow. I think it's really interesting speaking 00:42:31.600 |
to you, because you're a musician, a developer 00:42:34.640 |
and a podcaster. If you think about those three 00:42:37.120 |
different roles, if you make the leap as a musician, 00:42:41.840 |
if you think about it as a software tool chain, really, 00:42:45.840 |
your DAW with the stems, that's the IDE, right? That's where you work in source 00:42:54.000 |
Then you sit around and you play with that and when you're happy you compile 00:42:56.560 |
that thing into some sort of AAC or MP3 or something. 00:43:00.400 |
You do that because you get distribution. There are so many run times for that MP3 00:43:05.440 |
you kind of compile this executable and you ship it out in kind of an old-fashioned 00:43:09.200 |
boxed software analogy. And then you hope for the 00:43:18.160 |
you would never do that. First you go on GitHub and you collaborate with other 00:43:21.120 |
creators. And then you think it'd be crazy to 00:43:24.400 |
just ship one version of your software without doing an A/B test, 00:43:31.440 |
Exactly. And then you would look at the feedback loops and try to optimize 00:43:35.040 |
that thing, right? So I think if you think of it as a very 00:43:38.480 |
specific software tool chain, it looks quite arcane. 00:43:43.360 |
The tools that a music creator has versus what a software developer has. 00:43:47.440 |
So that's kind of how we think about it. Why wouldn't a 00:43:52.000 |
music creator have something like GitHub where you could collaborate 00:43:55.520 |
much more easily? So we bought this company called Soundtrap, 00:43:59.120 |
which has a kind of Google Docs for music approach, 00:44:02.960 |
where you can collaborate with other people on the kind of source code format 00:44:06.480 |
with stems. And I think introducing things like 00:44:09.600 |
AI tools there to help you as you're creating music, 00:44:14.000 |
both in helping you put accompaniment to your music, 00:44:21.360 |
like drums or something, help you master and mix automatically, 00:44:27.200 |
help you understand how this track will perform. Exactly what you would expect 00:44:30.880 |
as a software developer. I think it makes a lot of sense. And I 00:44:34.000 |
think the same goes for a podcaster. I think podcasters will expect to 00:44:37.920 |
have the same kind of feedback loop that Zirosh has. 00:44:40.480 |
Like, why wouldn't you? Maybe it's not healthy, but... 00:44:44.480 |
Sorry, I wanted to criticize the fact that you can overdo it. 00:44:48.080 |
Because a lot of the... And we're in a new era 00:44:56.640 |
And therefore, what people say, you become a slave to the YouTube algorithm. 00:45:06.880 |
as opposed to, say, if you're creating a song, 00:45:10.160 |
becoming too obsessed about the intro riff to the song that keeps people 00:45:16.160 |
listening, versus actually the entirety of the creation process. 00:45:19.280 |
It's a balance. But the fact that there's zero... 00:45:22.240 |
I mean, you're blowing my mind right now, because you're 00:45:25.520 |
completely right that there's no signal whatsoever, 00:45:28.960 |
there's no feedback whatsoever on the creation process in music or podcasting, 00:45:34.240 |
almost at all. And are you saying that Spotify is hoping to help create tools 00:45:41.680 |
to... Not tools, but... - No, tools, actually. - Actually tools for creators. 00:45:47.200 |
- Absolutely. So we have... We've made some acquisitions the last few years 00:45:52.400 |
around music creation. This company called Soundtrap, which is a 00:45:55.520 |
digital audio workstation, but that is browser-based. 00:45:59.040 |
And their focus was really the Google Docs approach, where you can collaborate 00:46:02.000 |
with people much more easily than you could in previous tools. So we 00:46:06.400 |
have some of these tools that we're working with that we want to make 00:46:08.800 |
accessible, and then we can connect it with our 00:46:12.240 |
consumption data. We can create this feedback loop where 00:46:15.280 |
we could help you understand, we could help you 00:46:18.560 |
create and help you understand how you will perform. We also 00:46:22.320 |
acquired this other company within podcasting called Anchor, which is one of 00:46:25.600 |
the biggest podcasting tools, mobile-focused, so really focused on 00:46:30.000 |
simple creation or easy access to creation. But that also 00:46:34.000 |
gives us this feedback loop. And even before that, we 00:46:38.240 |
invested in something called Spotify for Artists and Spotify for Podcasters, 00:46:43.440 |
which is an app that you can download, you can verify that you are that creator. 00:46:47.200 |
And then you get things that software developers have had for 00:46:52.720 |
years. You can see where, if you look at your podcast, for example, 00:46:56.000 |
on Spotify or a song that you release, you can see 00:46:59.200 |
how it's performing, which cities it's performing in, who's listening to it, 00:47:02.480 |
what's the demographic breakdown. So similar in the sense that you can 00:47:07.040 |
understand how you're actually doing on the platform. 00:47:10.400 |
So we definitely want to build tools. I think you also interviewed the 00:47:15.520 |
head of research for Adobe, and I think that's an, 00:47:19.520 |
back to Photoshop that you like, I think that's an interesting analogy as 00:47:25.840 |
innovative in helping photographers and artists, and I think 00:47:33.200 |
for music creators, where you could get AI assistance, for example, as you're 00:47:37.040 |
creating music, as you can do with Adobe, where you can, 00:47:41.200 |
I want a sky over here, and you can get help creating that sky. 00:47:48.000 |
doesn't have is a distribution for the content you create. 00:47:55.680 |
if I, you know, whatever creation I make in Photoshop or Premiere, 00:48:01.440 |
I can't get like immediate feedback like I can on YouTube, for example, about 00:48:05.920 |
the way people are responding. And if Spotify is creating those tools, 00:48:13.680 |
But let's talk a little about podcasts. So I have trouble talking to one 00:48:24.480 |
hard to fathom, but on average, 60 to 100,000 00:48:29.280 |
people will listen to this episode. Okay, so it's intimidating. 00:48:34.160 |
Yeah, it's intimidating. So I host it on Blubrry. 00:48:38.800 |
I don't know if I'm pronouncing that correctly, actually. It looks like most 00:48:42.400 |
people listen to it on Apple Podcasts, Castbox, and Pocket Casts, and only about 00:48:47.200 |
a thousand listen on Spotify. Just my podcast, right? 00:48:59.920 |
dominate this? So Spotify is relatively new into this. 00:49:10.800 |
How serious is Spotify about podcasting? Do you see a time where everybody would 00:49:15.440 |
listen to, you know, probably a huge amount of people, 00:49:18.480 |
majority perhaps, listen to music on Spotify? Do you see a 00:49:23.040 |
time when the same is true for podcasting? Well, I certainly hope so. 00:49:28.560 |
That is our mission. Our mission as a company is actually to 00:49:31.840 |
enable a million creators to live off of their art and a billion people be 00:49:35.200 |
inspired by it. And what I think is interesting about that mission is 00:49:38.320 |
it actually puts the creators first, even though it started as a consumer-focused 00:49:42.240 |
company, and it says to be able to live off of 00:49:44.480 |
their art, not just make some money off of their art as well. 00:49:55.520 |
we kind of expanded our mission from being music to being 00:49:58.880 |
audio a while back. And that's not so much because 00:50:05.920 |
we think we made that decision. We think that decision was 00:50:10.000 |
made for us. We think the world made that decision. Whether we like it or not, 00:50:14.960 |
when you put in your headphones, you're going to make a choice between 00:50:18.960 |
music and a new episode of your podcast or something else. 00:50:25.440 |
We're in that world whether we like it or not. And that's how radio works. 00:50:28.960 |
So we decided that we think it's about audio. 00:50:32.320 |
You can see the rise of audiobooks and so forth. We think audio is this great 00:50:35.600 |
opportunity. So we decided to enter it. And obviously 00:50:40.720 |
Apple and Apple Podcasts is absolutely dominating 00:50:44.240 |
in podcasting. And we didn't have a single podcast 00:50:47.840 |
only like two years ago. What we did though was 00:50:51.440 |
we looked at this and said, "Can we bring something to this?" 00:50:56.640 |
We want to do this, but back to the original Spotify, we had to do 00:51:00.240 |
something that consumers actually value to be able to do this. And the reason 00:51:05.600 |
we've gone from not existing at all to being, by 00:51:08.080 |
quite a wide margin, the second largest in podcast 00:51:12.320 |
consumption, still a wide gap to iTunes, but we're growing quite fast. 00:51:17.120 |
I think it's because when we looked at the consumer problem, 00:51:21.040 |
people said surprisingly that they wanted their podcasts and 00:51:24.560 |
music in the same application. So what we did was we took a little 00:51:29.040 |
bit of a different approach where we said instead of building a separate 00:51:31.360 |
podcast app, we thought, "Is there a consumer problem to solve 00:51:34.960 |
here because the others are very successful already?" 00:51:37.280 |
And we thought there was in making a more seamless experience 00:51:40.480 |
where you can have your podcast and your music in the same application. 00:51:45.120 |
Because we think it's audio to you and that has been successful and 00:51:48.640 |
that meant that we actually had 200 million people to 00:51:54.000 |
So I think we have a good chance because we're taking a different approach than 00:51:57.520 |
the competition. And back to the other thing I mentioned 00:52:04.000 |
end-to-end flow, I think there's a tremendous amount of 00:52:06.960 |
innovation to do around podcasts as a format. 00:52:09.920 |
When we have creation tools and consumption, I think we could 00:52:13.840 |
start improving what podcasting is. I mean podcast is this 00:52:17.440 |
opaque, big, like one- or two-hour file that you're streaming, which it really 00:52:25.920 |
it's not interactive, there's no feedback loops, nothing like that. 00:52:28.960 |
So I think if we're gonna win it's gonna have to be because we build a better 00:52:32.080 |
product for creators and for consumers. So we'll 00:52:36.000 |
see, but it's certainly our goal. We have a long way to go. 00:52:39.120 |
Well the creators part is really exciting. You already got me 00:52:42.240 |
hooked there. It's the only stats I have. Blubrry just recently added the stats 00:52:51.440 |
And that's like a huge improvement, but that's still 00:52:56.080 |
nowhere near where you could possibly go in terms of statistics. You just download 00:52:59.440 |
the Spotify for Podcasters app and verify, and 00:53:01.600 |
then you'll know where people dropped out in this episode. Oh wow, okay. 00:53:05.520 |
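As an illustration of what "where people dropped out" could look like as data, the sketch below turns per-listener stop times into a retention curve and finds the interval with the largest loss. The input format, function names, and numbers are hypothetical; this is not the actual Spotify for Podcasters implementation.

```python
def retention_curve(stop_times, episode_length, step=60):
    """Fraction of listeners still listening at each checkpoint.

    stop_times:     seconds into the episode at which each listener stopped
                    (assumed non-empty)
    episode_length: episode duration in seconds
    step:           spacing between checkpoints in seconds
    """
    total = len(stop_times)
    curve = []
    for t in range(0, episode_length + 1, step):
        still_listening = sum(1 for s in stop_times if s >= t)
        curve.append((t, still_listening / total))
    return curve


def biggest_drop(curve):
    """Return (start, end, loss) for the interval that lost the most listeners."""
    (t0, r0), (t1, r1) = max(
        zip(curve, curve[1:]), key=lambda pair: pair[0][1] - pair[1][1]
    )
    return t0, t1, r0 - r1


# Hypothetical stop times for eight listeners of a 300-second episode.
stops = [30, 45, 50, 200, 300, 300, 300, 300]
curve = retention_curve(stops, episode_length=300, step=60)
print(curve)
print(biggest_drop(curve))
```

In this toy data the steepest loss is in the first minute, which is the kind of signal a creator could act on.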
The moment I started talking, okay. I might be depressed by this. 00:53:10.160 |
But okay, so one other question. The original Spotify for music, 00:53:18.320 |
and I have a question about podcasting in this line, is 00:53:21.760 |
the idea of albums. I have music aficionados, friends who are 00:53:32.880 |
albums, listening to entire albums of an artist. 00:53:36.480 |
Correct me if I'm wrong, but I feel like Spotify has helped 00:53:44.320 |
So you create your own albums. It's kind of the way, at least I've 00:53:48.720 |
experienced music and I really enjoy it that way. 00:53:51.760 |
One of the things that was missing in podcasting for me, 00:53:55.600 |
I don't know if it's missing. I don't know. It's an open question for me. 00:53:59.200 |
But the way I listen to podcasts is the way I would listen to albums. 00:54:02.720 |
So I take Joe Rogan Experience, and that's an album. 00:54:06.240 |
And I listen, you know, I put that on, and I listen one episode after the next, 00:54:11.600 |
then there's a sequence and so on. Is there room for 00:54:17.120 |
doing what you did for music, doing what Spotify did for music, 00:54:20.720 |
but creating playlists, sort of this kind of playlisting idea of 00:54:26.080 |
breaking apart from podcasting, from individual podcasts and creating 00:54:30.480 |
kind of this interplay? Or have you thought about 00:54:34.640 |
that space? It's a great question. So I think in 00:54:38.480 |
music, you're right. Basically, you bought an album. So it was like you bought 00:54:42.400 |
a small catalog of like 10 tracks, right? It was, again, it was actually a 00:54:46.000 |
lot of consumption. You think it's about what you like, 00:54:49.600 |
but it's based on the business model. Right. So you paid for this 10-track 00:54:53.680 |
service, and then you listen to that for a while. And then when everything was 00:54:57.120 |
flat-priced, you tended to listen differently. 00:55:00.080 |
Now, so I think the album is still tremendously important. That's 00:55:03.120 |
why we have it. And you can save albums and so forth. And 00:55:05.440 |
you have a huge amount of people who really listen according to albums. 00:55:08.480 |
And I like that because it is a creator format. You can tell a longer story 00:55:12.320 |
over several tracks. And so some people listen to just one track. Some people 00:55:16.240 |
actually want to hear that whole story. Now, in podcast, I think 00:55:22.560 |
it's different. You can argue that podcasts might be more like shows on 00:55:26.480 |
Netflix. You have like a full season of Narcos, 00:55:30.000 |
and you're probably not going to do like one episode of Narcos and then one of 00:55:33.040 |
House of Cards. There's a narrative there, and you 00:55:38.240 |
love the cast and you love these characters. So I think people will 00:55:46.000 |
listen to those shows. I do think you follow a bunch of shows at the same 00:55:48.880 |
time. So there's certainly an opportunity to bring you the latest episode of 00:55:52.640 |
whatever the five, six, ten things that you're into. 00:56:00.400 |
specific hosts and love those hosts for a long time because I think there's 00:56:12.880 |
audience is actually sitting here right between us. 00:56:15.440 |
Whereas if you look at something on TV, the audio actually would come from, 00:56:18.960 |
you would sit over there, and the audio would come to you from both of us as if 00:56:22.160 |
you were watching, not as you were part of the conversation. 00:56:24.800 |
So my experience is having listened to podcasts like yours 00:56:28.000 |
and Joe Rogan, I feel like I know all of these people. They have no idea 00:56:32.080 |
who I am, but I feel like I've listened to so many hours of them. 00:56:35.040 |
It's very different from me watching a TV show or an interview. 00:56:39.440 |
So I think you kind of fall in love with people 00:56:43.040 |
and experience it in a different way. So I think 00:56:46.560 |
shows and hosts are going to be very important. I don't think 00:56:49.760 |
that's going to go away into some sort of thing where 00:56:51.920 |
you don't even know who you're listening to. I don't think that's going 00:56:54.000 |
to happen. What I do think is, I think there's a 00:57:00.400 |
because the catalog is growing quite quickly. 00:57:04.000 |
And I think podcasts is only a few, like five, six hundred thousand shows 00:57:10.400 |
right now. If you look back to YouTube, that's 00:57:12.800 |
another analogy of creators. No one really knows if you would lift 00:57:16.720 |
the lid on YouTube, but it's probably billions 00:57:19.120 |
of episodes. And so I think the podcast catalog will probably grow 00:57:23.520 |
tremendously because the creation tools are getting easier. 00:57:27.040 |
And then you're going to have this discovery opportunity that I think is 00:57:30.800 |
really big. So a lot of people tell me that they love their shows, 00:57:34.800 |
but discovery in podcasts kind of sucks. It's really hard to get into a new show. 00:57:38.800 |
They're usually quite long. It's a big time investment. So I think there's 00:57:45.600 |
Yeah, for sure. A hundred percent. And even the dumbest, 00:57:49.520 |
there's so many low-hanging fruit, too. For example, 00:57:58.480 |
to try out a podcast. Exactly. Because most podcasts don't have an order to 00:58:06.400 |
And sorry to say, some episodes are better than others. 00:58:12.640 |
So some episodes of Joe Rogan are better than others. And it's 00:58:16.480 |
nice to know which you should listen to to try it out. 00:58:20.400 |
And there's, as far as I know, almost no information 00:58:24.400 |
in terms of like upvotes on how good an episode is. 00:58:32.080 |
it's kind of like music. There isn't one answer. People use music for different 00:58:35.600 |
things. And there's actually many different types of music. There's workout 00:58:38.080 |
music and there's classical piano music and focus music and 00:58:41.200 |
so forth. I think the same with podcasts. Some podcasts are sequential. 00:58:45.360 |
They're supposed to be listened to in order. It's actually 00:58:49.760 |
telling a narrative. Some podcasts are one topic, kind of like 00:58:54.800 |
yours, but different guests. So you could jump in anywhere. 00:58:57.280 |
Some podcasts actually have completely different topics. And for those podcasts, 00:59:00.560 |
it might be that we should recommend one episode 00:59:04.560 |
because it's about AI from someone. But then they talk about 00:59:08.480 |
something that you're not interested in the rest of the episodes. 00:59:10.880 |
So I think what we're spending a lot of time on now is just first 00:59:14.560 |
understanding the domain and creating kind of the knowledge graph 00:59:18.400 |
of how do these objects relate and how do people consume. And I think we'll find 00:59:22.960 |
that it's going to be different. I'm excited. 00:59:27.440 |
Spotify is the first people I'm aware of that are 00:59:32.320 |
trying to do this for podcasting. Podcasting has been like a wild west 00:59:36.800 |
up until now. It's been a very... We want to be very careful though because it's 00:59:41.360 |
been a very good wild west. I think it's this fragile 00:59:49.360 |
don't barge in and say like, "Oh, we're gonna 00:59:52.080 |
internetize this thing." And you have to think about the 00:59:55.760 |
creators. You have to understand how they get distribution today, who 00:59:59.920 |
listens to how they make money today, try to make sure that their 01:00:06.080 |
I think it's back to doing something, improving their products 01:00:09.440 |
like feedback loops and distribution. So jumping back into terms of this 01:00:15.760 |
fascinating world of recommender system and listening to music and using 01:00:19.920 |
machine learning to analyze things, do you think it's 01:00:23.600 |
better to... What currently, correct me if I'm wrong, 01:00:28.240 |
but currently Spotify lets people pick what they listen to for the 01:00:32.720 |
most part. There's a discovery process but you kind of 01:00:35.520 |
organize playlists. Is it better to let people pick what they listen to 01:00:40.800 |
or recommend what they should listen to? Something like Stations by Spotify that 01:00:46.320 |
I saw that you're playing around with. Maybe you can tell me what's the status 01:00:50.480 |
of that. This is a Pandora style app that just kind of... 01:00:54.400 |
As opposed to you select the music you listen to, it kind of 01:00:58.800 |
feeds you the music you listen to. What's the status of Stations by Spotify? 01:01:04.080 |
What's its future? The story of Spotify as we have grown 01:01:09.600 |
to different audiences. Stations is another one of those where 01:01:15.360 |
the question is, some people want to be very specific. They actually want to hear 01:01:19.040 |
"Stairway to Heaven" right now. That needs to be very easy to do. 01:01:24.000 |
Some people or even the same person at some point might say 01:01:27.840 |
"I want to feel upbeat" or "I want to feel happy" or 01:01:31.520 |
"I want songs to sing in the car". So they put in 01:01:34.640 |
the information at a very different level and then we need to 01:01:37.760 |
translate that into what that means musically. So Stations is a test to 01:01:42.800 |
create like a consumption input vector that is much simpler where you can just 01:01:45.920 |
tune it a little bit and see if that increases the overall 01:01:49.360 |
reach. But we're trying to kind of serve the entire gamut of super advanced so-called 01:01:59.520 |
they love listening to music but it's not their number one priority in life. 01:02:03.040 |
They're not going to sit and follow every new release from every new 01:02:05.600 |
artist. They need to be able to influence music 01:02:12.640 |
can think of it as different products and I think when 01:02:14.880 |
one of the interesting things to answer your question on 01:02:19.440 |
if it's better to let the user choose or to play, I think the answer is 01:02:23.760 |
the challenge when machine learning kind of came along 01:02:27.840 |
there was a lot of thinking about what does product development mean 01:02:31.520 |
in a machine learning context. People like Andrew Ng for example 01:02:36.560 |
when he went to Baidu he started doing a lot of practical machine learning, went 01:02:39.600 |
from academia and he thought a lot about this and he 01:02:42.640 |
had this notion that a product manager, designer, an engineer, they used to 01:02:46.320 |
work around this wireframe. Kind of describe what the product should look 01:02:49.280 |
like or something to talk about. When you're doing like a chatbot or a 01:02:52.320 |
playlist, what are you going to say? Like it should be good. 01:02:55.520 |
That's not a good product description. So how do you do that and he came up 01:02:58.880 |
with this notion that the test set is the new wireframe. The 01:03:03.520 |
job of the product manager is to source a good test set that is 01:03:06.080 |
representative of what, like if you say like I want a playlist 01:03:09.120 |
that is songs to sing in the car. The job of the product manager is to go 01:03:12.880 |
and source like a good test set of what that means. 01:03:15.440 |
Then you can work with engineering to have algorithms to try to produce that 01:03:22.080 |
structure product development for a machine learning age and what we 01:03:27.040 |
discovered was that a lot of it is actually in the expectation 01:03:35.280 |
Let's say that if you set the expectation with the user that this 01:03:42.640 |
you're actually setting the expectation that most of what we show you will not 01:03:45.760 |
be relevant. When you're in the discovery process 01:03:48.080 |
you're going to accept that actually if you find one gem every 01:03:51.760 |
Monday that you totally love, you're probably going to be happy. 01:03:55.200 |
Even though statistically, meaning one out of ten is terrible or one out of 20 01:03:59.600 |
is terrible, from a user point of view, because the setting was discovery, it's 01:04:02.400 |
fine. Can I say to interrupt real quick, I just 01:04:05.840 |
actually learned about Discover Weekly which is a Spotify, 01:04:10.560 |
I don't know, it's a feature of Spotify that shows you 01:04:13.760 |
cool songs to listen to. Maybe I can do issue tracking, I couldn't 01:04:18.480 |
find it on my Spotify app. It's in your library. It's in the library, 01:04:22.640 |
it's in the list of libraries because I was like whoa this is cool I 01:04:25.120 |
didn't know this existed and I tried to find it. 01:04:27.440 |
I will show it to you and feedback to our product team. 01:04:38.800 |
basically that you're going to discover new songs. 01:04:45.440 |
the recommendations you do but we have another product called 01:04:50.400 |
Daily Mix which kind of implies that these are only going to be your 01:04:53.600 |
favorites. So if you have one out of ten that is 01:04:56.240 |
good and nine out of ten that doesn't work for you, 01:04:58.320 |
you're going to think it's a horrible product. So actually a lot of the product 01:05:00.640 |
development we learned over the years is about 01:05:02.800 |
setting the right expectations. So for Daily Mix, you know algorithmically 01:05:07.440 |
we would pick among things that feel very safe in 01:05:10.240 |
your taste space. With Discover Weekly we go kind of wild 01:05:13.360 |
because the expectation is most of this is not gonna. So a lot of 01:05:19.200 |
a lot of should you let the user pick or not it depends. 01:05:23.040 |
We have some products where the whole point is that the user can click play 01:05:26.320 |
put the phone in the pocket and it should be really good music for like 01:05:29.280 |
an hour. We have other products where you probably need to say like no 01:05:37.120 |
I see, that makes sense. And then the radio product, the Stations product, is 01:05:40.480 |
one of these like click play put in your pocket for hours. 01:05:43.440 |
That's really interesting so you're thinking of different test sets 01:05:47.120 |
for different users and trying to create products that sort of optimize 01:05:53.760 |
optimize for those test sets that represent a specific set of users. 01:05:58.560 |
Yes I think one thing that I think is interesting is 01:06:03.680 |
we invested quite heavily in editorial in people creating playlists 01:06:07.920 |
using statistical data and that was successful for us and then we also 01:06:11.600 |
invested in machine learning and for the longest time you know within 01:06:16.240 |
Spotify and within the rest of the industry there was always this 01:06:18.640 |
narrative of humans versus the machine. Algo versus editorial and editors 01:06:24.160 |
would say like well if I had that data if I could see your 01:06:27.600 |
playlisting history and I made a choice for you I would have 01:06:30.320 |
made a better choice and they would have because they 01:06:32.960 |
understand they're much smarter than these algorithms. The human is 01:06:35.760 |
incredibly smart compared to our algorithms. They can take culture 01:06:39.760 |
into account and so forth. The problem is that they can't make 200 01:06:43.440 |
million decisions you know per hour for every user that 01:06:47.360 |
logs in so the algo may be not as sophisticated but much more 01:06:53.440 |
contradiction but then a few years ago we started 01:06:57.280 |
focusing on this kind of human in the loop thinking around machine learning 01:07:01.280 |
and we actually coined an internal term for it called algotorial 01:07:05.200 |
the combination of algorithms and editors where 01:07:08.800 |
if we take a concrete example you think of the editor 01:07:12.480 |
this paid expert that we have that's really good at something like 01:07:17.920 |
soul, hip-hop, EDM, something, right? They are true experts, known in the industry, 01:07:23.520 |
so they have all the cultural knowledge you think of them as the product manager 01:07:27.520 |
and you say that let's say that you want to create a 01:07:31.920 |
you think that there's a there's a product need in the world for something 01:07:35.040 |
like songs to sing in the car or songs to sing in the shower 01:07:37.360 |
I'm taking that example because it exists people love to scream 01:07:40.720 |
songs in the car when they drive right yeah so you want to create that product 01:07:44.640 |
then you have this product manager who's a musical expert 01:07:47.520 |
they create they come up with a concept like I think this is a missing thing in 01:07:51.040 |
humanity like a playlist called songs in the car 01:07:54.720 |
they create the the framing the image the title 01:07:58.480 |
and they create a test set of they create a group of songs like a few 01:08:01.920 |
thousand songs out of the catalog that they manually 01:08:04.320 |
curate that are known songs that are great to sing in the car 01:08:08.080 |
and they can take like True Romance into account they understand things that our 01:08:11.440 |
algorithms do not at all so they have this huge set of tracks 01:08:15.120 |
then when we deliver that to you we look at your taste vectors and you 01:08:19.200 |
get the 20 tracks that are songs to sing in the car in your taste 01:08:26.320 |
editorial input in the same process if that makes sense yeah it makes 01:08:30.960 |
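The "algotorial" flow described here (editors curate a large pool, then each listener gets the tracks from that pool closest to their taste) could be sketched roughly as follows. This is an illustration under stated assumptions, not Spotify's implementation: the two-dimensional vectors, the track names, and the use of cosine similarity are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def personalize(pool, taste_vector, n=20):
    """Return the n track IDs from the editor-curated pool closest to the listener's taste."""
    ranked = sorted(pool, key=lambda tid: cosine(pool[tid], taste_vector), reverse=True)
    return ranked[:n]


# Hypothetical editor-curated pool: track ID -> embedding (rock-ness, rap-ness).
editor_pool = {
    "power_ballad": [0.9, 0.1],
    "pop_anthem": [0.6, 0.6],
    "rap_classic": [0.1, 0.9],
}

# Hypothetical taste vector for a listener who leans towards rock.
listener_taste = [1.0, 0.2]

picks = personalize(editor_pool, listener_taste, n=2)
print(picks)
```

The editor decides what is eligible at all; the algorithm only orders that pool per listener, which is the division of labor the conversation describes.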
total sense and I have several questions around that this is a this is like 01:08:35.280 |
fascinating okay so first it is a little bit surprising to me 01:08:40.640 |
that the world expert humans are outperforming machines 01:08:47.120 |
at specifying songs to sing in the car so maybe you could talk to that a 01:08:54.160 |
little bit I don't know if you can put it into words but 01:09:00.800 |
uh of do you really uh I guess what I'm trying to ask is there 01:09:06.160 |
how difficult is it to encode the cultural references 01:09:13.840 |
all all those things together can machine learning really not do that 01:09:17.920 |
I mean I think machine learning is great at replicating patterns 01:09:22.640 |
if you have the patterns but if you try to write with me a spec of what songs 01:09:26.960 |
greatest song to sing in the car definition is is it is it loud does it 01:09:31.200 |
have many choruses should it have been in movies it's 01:09:33.920 |
it quickly gets incredibly complicated right yeah 01:09:36.960 |
and and a lot of it may not be in the structure of the song or the title it 01:09:41.120 |
could be cultural references because you know it was a history so so the 01:09:45.920 |
definition problems quickly get and I think that was the that 01:09:49.520 |
was the insight of Andrew Ng when he said the job of the product 01:09:54.640 |
that algorithms don't and then define what that looks like and then you have 01:09:59.120 |
something to train towards right then you have kind of the test set 01:10:02.720 |
and then so so today the editors create this pool of tracks and then we 01:10:06.400 |
personalize you could easily imagine that once you have this set you could 01:10:09.920 |
have some automatic exploration of the rest of the catalog 01:10:12.480 |
because then you understand what it is and then the other side of it when 01:10:16.000 |
machine learning does help is this taste vector how hard is it to 01:10:21.440 |
construct a vector that represents the things an 01:10:26.080 |
individual human likes this human preference so you can 01:10:31.600 |
you know music isn't like it's not like amazon 01:10:35.520 |
like things you usually buy music seems more amorphous like it's this 01:10:41.200 |
thing that's hard to specify like what what is well you know if you look at my 01:10:46.320 |
playlist what is the music that I love it's harder 01:10:49.360 |
it seems to be uh much more difficult to specify concretely 01:10:57.200 |
it is very hard in the sense that you need a lot of data 01:11:00.720 |
and I think what we found was that so it's not 01:11:04.400 |
so it's not a stationary problem it changes over time 01:11:07.840 |
um and so we've gone through the journey of if if um 01:11:14.240 |
you've done a lot of computer vision obviously I've done a bunch of computer 01:11:17.280 |
vision in my past and we started kind of with the 01:11:19.840 |
handcrafted heuristics for you know this is kind of in the music 01:11:24.880 |
this is this and if you consume this you probably like this 01:11:27.520 |
so we we have we started there and we have some of that still 01:11:31.280 |
then what was interesting about the playlist data was that you could find 01:11:34.240 |
these latent things that wouldn't necessarily even make sense to 01:11:42.960 |
things that that wouldn't have appeared kind of mechanistically either in the 01:11:56.080 |
I think the core assumption is that there are patterns 01:12:01.120 |
in in almost everything and if there are patterns 01:12:05.040 |
these these embedding techniques are getting better and better now now 01:12:08.400 |
as everyone else we're also using kind of deep embeddings where you can 01:12:12.880 |
encode binary values and and so forth um and and what I think is 01:12:17.520 |
interesting is is this process to try to find things 01:12:24.480 |
actually have have guessed so it is very hard in a in a in an 01:12:28.880 |
engineering sense to find the right dimensions it's an 01:12:31.760 |
incredible scalability problem to do for hundreds of millions of users and to 01:12:44.880 |
the fact that you try to find some principal components or something like 01:12:47.920 |
that dimensionality reduction and so forth so 01:12:52.000 |
is very very hard and it's a it's a huge engineering challenge but fortunately we 01:12:56.800 |
have some amazing both research and engineering teams in 01:13:00.320 |
in this space yeah I guess the the question is all 01:13:05.280 |
I mean it's similar I deal with it with an autonomous vehicle space is the 01:13:14.160 |
basically the question is of edge cases uh so embedding probably works 01:13:22.720 |
not probably but I would imagine works well in a lot of cases 01:13:27.760 |
so there's a bunch of questions that arise then so do 01:13:31.280 |
song preferences does your taste vector depend on 01:13:34.720 |
context like mood right so there's different moods and 01:13:41.200 |
absolutely so how does that take in it is it is it possible to take that as a 01:13:47.600 |
consideration or do you just leave that as a interface 01:13:51.600 |
problem that allows the user to just control it 01:13:54.000 |
so when I'm looking for a workout music I kind of specify it by 01:13:58.320 |
choosing certain playlists doing certain search yeah 01:14:01.520 |
so that's a great point it's back to the product development 01:14:04.800 |
you could try to spend a few years trying to predict which mood you're in 01:14:08.560 |
automatically when you open Spotify or you create a tab which is happy and 01:14:12.240 |
sad right and you're going to be right 100% of the time with one click 01:14:15.600 |
now it's probably much better to let the user tell you if they're happy or sad 01:14:19.440 |
or if they want to work out on the other hand if your user interface becomes 2000 01:14:25.760 |
no one will use the product so then you have to get better 01:14:28.560 |
so it's this thing where I think maybe it was 01:14:32.480 |
I don't remember who coined it but it's called fault tolerant UIs right you build a UI 01:14:35.760 |
that is tolerant to being wrong and then you can be much less right in 01:14:40.400 |
your in your in your algorithms so we you know 01:14:44.160 |
we've had to learn a lot of that building the right UI that 01:14:50.240 |
and and and a great discovery there which is which was by the teams during 01:14:55.120 |
uh one of our hack days was this thing of taking discovery packaging it 01:14:59.600 |
into a playlist and saying that these are new tracks 01:15:03.840 |
that we think you might like based on this and setting the right expectation 01:15:07.280 |
made it made it a great product so I think we 01:15:10.080 |
have this benefit that for example Tesla doesn't have that we can we can 01:15:15.440 |
we can change the expectation we can we can build a fault tolerant 01:15:18.320 |
setting it's very hard to be fault tolerant when you're driving at a 01:15:21.200 |
you know 100 miles per hour or something and and we we have the luxury of 01:15:26.160 |
being able to say that of being wrong if we have the right 01:15:29.680 |
UI which gives us different abilities to take more risk so I actually think 01:15:34.720 |
the self-driving problem is is much harder oh yeah 01:15:38.400 |
for sure it's much less fun because people die exactly 01:15:45.200 |
and since Spotify uh it's such a more fun problem because 01:15:51.280 |
failure will I mean failure is beautiful in a way it leads to exploration so it's 01:15:56.640 |
it's a really fun reinforcement learning problem the worst case scenario is you 01:15:59.760 |
get these WTF tweets like how the hell did I get this this song 01:16:03.280 |
which is which is a lot better than the self-driving failure 01:16:07.040 |
so what's the feedback that a user what's the signal 01:16:12.080 |
that a user provides into the system so the the you mentioned skipping 01:16:19.360 |
what is like the strongest signal is uh you didn't mention clicking like 01:16:24.800 |
so so we have a few signals that are important obviously 01:16:28.240 |
playing playing through so so one of the benefits of music actually even compared 01:16:32.880 |
to podcast or or movies is the object itself is really 01:16:37.760 |
only about three minutes so you get a lot of chances to recommend 01:16:41.360 |
and the feedback loop is is every three minutes instead of every 01:16:50.880 |
and so you can see if people played through or if the which is you know the 01:16:53.760 |
inverse of skip really that's an important signal on the other 01:16:57.040 |
hand much of the consumption happens when your phone is in your pocket maybe 01:17:00.480 |
you're running or driving or you're playing on a speaker 01:17:03.040 |
and so you not skipping doesn't mean that you love that song it might be that 01:17:06.240 |
it wasn't bad enough that you would walk up and skip so it's a noisy signal 01:17:10.560 |
then then we have the equivalent of the like which is you saved it to your 01:17:15.440 |
collection and then we have the more explicit signal of 01:17:20.640 |
playlisting like you took the time to create a playlist you put it in there 01:17:24.000 |
there's a very little small chance that if you took 01:17:27.520 |
all that trouble this is not a really important track to you 01:17:30.480 |
and then we understand also what other tracks it relates to so we have 01:17:34.800 |
we have the playlisting we have the like and then we have the listening or skip 01:17:39.120 |
and and you have to have very different approaches to all of them because at 01:17:42.720 |
different levels of of noise one one is very voluminous but 01:17:46.080 |
noisy and the other is rare but you can you can probably trust it yeah 01:17:50.720 |
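One way to picture the point about signals with different noise levels is a weighted sum in which frequent but noisy signals (play-through, skip) count for little and rare but deliberate ones (save, playlist add) count for a lot. The sketch below is only an illustration: the signal names and all weights are hypothetical and are not taken from the conversation.

```python
from collections import defaultdict

# Hypothetical weights: voluminous, noisy signals get small weights,
# rare, deliberate signals get large ones.
SIGNAL_WEIGHTS = {
    "played_through": 0.25,
    "skipped": -0.25,
    "saved": 1.0,
    "playlisted": 2.0,
}


def track_affinity(events):
    """Sum the weighted signals per track from (track_id, signal) events."""
    scores = defaultdict(float)
    for track_id, signal in events:
        scores[track_id] += SIGNAL_WEIGHTS[signal]
    return dict(scores)


# Four passive play-throughs of one track versus a single playlist add of another.
events = [("background_track", "played_through")] * 4 + [("favorite_track", "playlisted")]
affinity = track_affinity(events)
print(affinity)
```

With these made-up weights, one deliberate playlist add outweighs four passive play-throughs, which mirrors the "phone in your pocket" caveat about not skipping.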
it's interesting because uh i i think between those signals captures 01:17:54.960 |
all the information you'd want to capture i mean there's a feeling 01:17:58.800 |
a shallow feeling for me that there's sometimes i'll hear a song that's like 01:18:02.320 |
yes this is you know this is the right song for 01:18:05.040 |
the moment but there's really no way to express 01:18:08.160 |
that fact except by listening through it all the way 01:18:11.680 |
yeah and maybe playing it again at that time or something yeah 01:18:15.280 |
there's no need for a button that says this was the best song could have heard 01:18:19.680 |
at this moment well we're playing around with that with 01:18:22.480 |
kind of the thumbs up concept saying like i really like this 01:18:25.200 |
just kind of talking to the algorithm it's unclear if that's 01:18:28.720 |
the best way for humans to interact maybe it is maybe they should think of 01:18:32.160 |
Spotify as a person an agent sitting there trying to serve you and you can 01:18:35.920 |
say like bad Spotify good Spotify right now the 01:18:39.360 |
analogy we've had is more you shouldn't think of of us we should 01:18:43.280 |
be invisible and the feedback is if you save it 01:18:46.640 |
kind of you work for yourself you do a playlist because you think is great and 01:18:49.920 |
we can learn from that it's kind of back to back to Tesla how 01:18:53.680 |
they kind of have this shadow mode they sit and watch you drive 01:18:56.800 |
we kind of took the same analogy we sit and watch what you playlist 01:19:00.400 |
and then maybe we can we can offer you an autopilot where you can take over for 01:19:03.360 |
a while or something like that and then back off if you say like that's 01:19:06.720 |
not that's not good enough but but i think it's interesting to figure 01:19:09.840 |
out what your mental model is if Spotify is an AI that you talk to 01:19:15.200 |
which i think might be a bit too abstract for for many 01:19:18.880 |
consumers or if you still think of it as it's my music app 01:19:22.560 |
but it's just more helpful and depends on the device it's 01:19:31.040 |
so I have a lot of the Spotify listening I do is on 01:19:35.360 |
things that on devices I can talk to whether it's from Amazon Google or 01:19:42.320 |
devices how do you think of it differently than 01:19:44.800 |
on the phone or on the desktop there are a few things to say about the 01:19:50.960 |
first of all it's incredibly exciting they're growing like 01:19:53.360 |
crazy especially here in the in the in the u.s 01:19:57.360 |
and it's solving a consumer need that i think is 01:20:08.400 |
just remote interactivity you can control this thing from from from across 01:20:11.840 |
the room and it may feel like a small thing but 01:20:14.720 |
it turns out that friction matters to consumers being 01:20:17.920 |
able to say play pause and so forth from across 01:20:20.960 |
the room is is very powerful so basically you made you made the 01:20:29.040 |
what we see in our data is that the number one use case for these speakers 01:20:33.600 |
is music music and podcast so fortunately for us it's been important 01:20:39.200 |
to these companies to have those use case covered so they 01:20:42.720 |
want Spotify on these we have very good relationships with 01:20:45.520 |
with them and we're seeing we're seeing tremendous 01:20:50.000 |
success with them what what i think it's interesting about them is 01:20:55.200 |
it's already working we we we kind of had this epiphany 01:21:01.360 |
many years ago back when we started using Sonos if you went through all the 01:21:05.280 |
trouble of setting up your Sonos system you had this magical experience where 01:21:08.800 |
you had all the music ever made in your living room and and we we we 01:21:13.440 |
made this assumption that the the home everyone used to have a CD 01:21:16.720 |
player at home but they never managed to get their files 01:21:19.440 |
working in the home having this network attached storage was too cumbersome for 01:21:22.880 |
most consumers so we made the assumption that the home 01:21:25.840 |
would skip from the CD all the way to the streaming box 01:21:29.040 |
where where you would get you would buy the stereo and have all the music built 01:21:32.000 |
in that took longer than we thought but with the voice speakers that was the 01:21:35.040 |
unlocking that made kind of the connected speaker 01:21:38.480 |
happen in the home so so it really it really exploded and 01:21:43.600 |
we saw this engagement that we predicted would happen 01:21:47.040 |
what i think is interesting though is where it's going from now 01:21:50.320 |
right now you think of them as voice speakers but i think if you look at 01:21:54.480 |
uh Google I/O for example they just added a camera 01:21:58.480 |
to it where you know when the alarm goes off instead of saying 01:22:06.320 |
so i think they're going to think more of it as a 01:22:09.440 |
as an agent or as a as an assistant truly an assistant and an assistant that 01:22:14.320 |
can see you it's going to be much more effective than 01:22:16.880 |
than a blind assistant so i think these things will morph and we won't 01:22:20.160 |
necessarily think of them as quote-unquote voice speakers anymore 01:22:26.320 |
interactive access to the internet in the home 01:22:30.080 |
but i still think that the biggest use case for those will be 01:22:34.240 |
will be audio so for that reason we're investing heavily in it 01:22:37.600 |
and we built our own NLU stack to be able to the the challenge here is 01:22:43.680 |
how do you innovate in that world it's it's it lowers friction for consumers 01:22:47.280 |
but it's also much more constrained there you have no pixels to play with 01:22:53.280 |
vocabulary that is the interface so we started 01:22:56.880 |
investing and playing around quite a lot with that trying to understand 01:22:59.680 |
what the future will be of you speaking and gesturing and 01:23:03.200 |
waving at your music and actually uh you're actually nudging 01:23:06.880 |
closer to the autonomous vehicle space because from everything i've seen the 01:23:11.520 |
level of frustration people experience upon failure 01:23:14.640 |
of natural language understanding is much higher 01:23:17.760 |
than failure in other contexts people get frustrated really fast 01:23:21.680 |
so if you screw that experience up even just a little bit they give up really 01:23:26.240 |
quickly yeah and i think you see that in the data 01:23:29.680 |
while while it's tremendously successful the most common interactions are play 01:23:38.320 |
you compare it to taking up your phone unlocking it bringing up the app and 01:23:41.200 |
skipping clicking skip yeah it was it was much 01:23:44.400 |
lower friction but then uh for for longer more 01:23:48.160 |
complicated things like can you find me that song 01:23:50.640 |
people still bring up their phone and search and then play it on their speaker 01:23:53.360 |
so we tried again to build a fault tolerant UI where for the more for the 01:23:57.280 |
more complicated things you can still pick up your phone have 01:24:00.400 |
powerful full keyboard search and then try to optimize for where there 01:24:04.880 |
is actually lower friction and try to it's it's kind of like the 01:24:08.160 |
Tesla Autopilot thing you have to be at the level where 01:24:11.440 |
you're helpful if you're too smart and just in the way people are going to get 01:24:15.360 |
frustrated and first of all I'm not obsessed with 01:24:18.480 |
Stairway to Heaven it's just a good song but let me mention that as a use case 01:24:22.320 |
because it's an interesting one i've literally told 01:24:26.000 |
one of i don't want to say the name of the speaker because it'll when people 01:24:29.120 |
are listening to it it'll make their speaker go off but I talk to the 01:24:32.560 |
speaker and I say play Stairway to Heaven and every time 01:24:37.840 |
it like not every time but a large percentage of the time plays the wrong 01:24:41.200 |
Stairway to Heaven it plays like some cover of the and 01:24:47.040 |
that part of the experience i actually wonder from a business perspective does 01:24:56.320 |
it seems like the NLU the the natural language stuff 01:24:59.840 |
is controlled by the speaker and then Spotify stays at a layer below that 01:25:04.720 |
it's a good and complicated question some of which is 01:25:08.800 |
dependent on the on the partner so it's hard to comment on the on the specifics 01:25:19.680 |
personalization I mean we know which Stairway to Heaven 01:25:22.400 |
and and the truth is maybe for for one person it is exactly the cover that they 01:25:28.880 |
plays i i think we i think we default to the right version but 01:25:32.880 |
but you actually want to be able to do the cover for the person that just played 01:25:35.840 |
the cover 50 times or Spotify is just going to seem stupid 01:25:39.440 |
so you want to be able to leverage the personalization but you have this stack 01:25:43.040 |
where where you have the the ASR and this thing called the N-best list of 01:25:47.600 |
the N best guesses here and then the personalization comes in at the 01:25:50.960 |
end you actually want the personalization to be here when you're 01:25:53.280 |
guessing about what they actually meant so we're working with these partners um 01:25:57.840 |
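The idea of applying personalization while guessing, rather than only at the end of the stack, can be sketched as re-ranking the recognizer's list of best guesses with the listener's own history. This is a simplified illustration under assumptions: the candidates are taken to be already resolved to track IDs, and the scores, the `weight` parameter, and the log-scaled play-count bonus are all hypothetical.

```python
import math


def rerank_n_best(n_best, play_counts, weight=0.1):
    """Pick the best candidate after adding a bonus for tracks the listener has played.

    n_best: list of (track_id, recognizer_score) pairs, higher score is better.
    play_counts: mapping of track_id -> number of times this listener played it.
    weight: hypothetical knob for how much listening history may override the recognizer.
    """
    def combined(item):
        track_id, recognizer_score = item
        return recognizer_score + weight * math.log1p(play_counts.get(track_id, 0))

    return max(n_best, key=combined)[0]


# Hypothetical guesses for the utterance "play Stairway to Heaven".
n_best = [("stairway_original", 0.60), ("stairway_cover", 0.55)]

# A listener with no history gets the recognizer's default.
default_pick = rerank_n_best(n_best, play_counts={})

# A listener who has played the cover 50 times gets the cover.
personal_pick = rerank_n_best(n_best, play_counts={"stairway_cover": 50})
print(default_pick, personal_pick)
```

The log scaling is a design choice made for this sketch so that a handful of plays nudges the ranking while heavy repeat listening can flip it.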
and it's a complicated it's a complicated thing where 01:26:02.240 |
you want to you want to be able so first of all you want to be very careful with 01:26:05.920 |
your users data you don't want to share your users data without their permission 01:26:09.200 |
but you want to share some data so that their experience gets better 01:26:12.240 |
um so that these partners can understand enough but not too much and so forth 01:26:16.400 |
so it's really the the trick is that it's like a business 01:26:20.720 |
driven relationship where you're doing product development across companies 01:26:25.840 |
complicated but this is exactly why we built our own 01:26:29.360 |
NLU so that we actually can make personalized guesses because this is the 01:26:34.160 |
biggest frustration from a user point of view they don't 01:26:39.280 |
and business deals they're like how hard can it be i've told this thing 01:26:42.720 |
50 times this version and still it plays the wrong thing it can't it can't be 01:26:48.800 |
the user the user is not going to understand the 01:26:51.280 |
complications of business we have to solve it let's talk 01:26:55.280 |
about sort of a complicated subject that i myself i'm quite 01:27:11.920 |
2018 over 11 billion dollars were paid to rights holders 01:27:17.200 |
so and further distributed to artists from Spotify 01:27:21.280 |
so a lot of money is being paid to artists first of all 01:27:25.520 |
the whole time as a consumer for me when I look at Spotify 01:27:29.680 |
I'm not sure I'm remembering correctly but I think you said exactly how I feel 01:27:38.400 |
when I started using Spotify I assumed you guys would go bankrupt in like a 01:27:47.360 |
it's like this is amazing uh so one question i have is sort of the 01:27:52.960 |
bigger question how do you make money in this complicated world 01:27:56.320 |
how do you deal with the relationship with record labels who 01:28:02.400 |
are complicated uh these big you're essentially in have the task 01:28:09.440 |
of herding cats but like rich and powerful cats 01:28:16.080 |
and also have the task of paying artists enough and paying 01:28:20.080 |
those labels enough and still making money in the internet space where people 01:28:24.320 |
are not willing to pay hundreds of dollars a month so how do 01:28:29.200 |
you navigate the space how do you navigate that's a beautiful 01:28:32.160 |
description herding rich cats yeah i've never heard that before 01:28:39.760 |
certainly actually betting against spotify has been statistically a very 01:28:44.080 |
smart thing to do just looking at the at the line of roadkill in music 01:28:48.640 |
streaming services um it's it's kind of i think if i had 01:28:54.160 |
understood the complexity when i joined spotify 01:28:57.440 |
unfortunately fortunately i didn't know enough about 01:29:00.800 |
the the music industry to understand the complexities because then i would have 01:29:03.920 |
made a more rational guess that it wouldn't work 01:29:11.200 |
there have been a few distinct challenges i think as i said one of the 01:29:15.760 |
things that made it work at all was that sweden and the nordics 01:29:19.040 |
was a lost market so um there were you know there was there was no risk 01:29:23.760 |
for labels to try this i don't think it would have worked if 01:29:27.680 |
if the market was uh was healthy so so that was the initial condition then 01:29:34.480 |
then we had this tremendous challenge with the model itself so 01:29:38.400 |
now most people were pirating but for the people who bought a download or a cd 01:29:43.600 |
the artists would get all the revenue for all the future plays 01:29:47.600 |
then right so you got it all up front whereas the streaming model was like 01:29:51.360 |
almost nothing day one almost nothing day two 01:29:53.440 |
and then at some point this curve of incremental revenue 01:29:57.600 |
would intersect with your day one payment and that took a long time to 01:30:01.520 |
play out before before um the music labels they understood 01:30:06.080 |
that but on the artist side it took a lot of time to understand that actually 01:30:09.920 |
if i have a big hit that is going to be played for for for many years this is a 01:30:13.040 |
much better model because i get paid based on how much 01:30:16.000 |
people use the product not how much they thought they would use 01:30:18.880 |
it day one or so forth so it was a complicated model to get 01:30:23.040 |
across and but time helped with that right and 01:30:28.720 |
actually are bigger again then you know it's gone through this 01:30:31.760 |
incredible dip and now they're back up and so we're 01:30:36.960 |
part of that um so there have been distinct problems 01:30:45.200 |
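The upfront-versus-streaming comparison described above can be made concrete with a small sketch. It assumes a constant streaming income per period, and the function names and figures are hypothetical illustrations rather than Spotify's actual payout terms.

```python
import math


def cumulative_streaming(revenue_per_period: float, periods: int) -> float:
    """Total streaming revenue after a number of periods, assuming a flat rate."""
    return revenue_per_period * periods


def breakeven_period(upfront_payment: float, revenue_per_period: float) -> int:
    """First period in which cumulative streaming revenue reaches the upfront payment.

    Hypothetical model: a sale pays everything on day one, while streaming pays
    a small constant amount per period for as long as people keep listening.
    """
    if revenue_per_period <= 0:
        raise ValueError("revenue_per_period must be positive")
    return math.ceil(upfront_payment / revenue_per_period)


# Illustrative numbers only: a 10.0 upfront sale versus 0.5 per period of streaming.
print(breakeven_period(10.0, 0.5))
print(cumulative_streaming(0.5, 40))
```

Under these assumed numbers the two curves cross at period 20, and a song that keeps being played after that point earns more under streaming than it would have from the one-time sale.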
we have taken the painful approach some of our competition at the time they kind 01:30:52.640 |
companies and said if we just if we just ignore the rights 01:30:55.840 |
we get really big really fast we're going to be too big for the 01:30:59.600 |
for the labels to kind of too big to fail they're not going to kill us we 01:31:03.040 |
didn't take that approach we went legal from day one 01:31:06.080 |
and we we negotiated and negotiated and negotiated it was very slow it's very 01:31:09.680 |
frustrating we were angry at seeing other companies 01:31:12.240 |
taking shortcuts and seeming to get away with it 01:31:14.720 |
it was this this this game theory thing where over many rounds of playing the 01:31:18.800 |
game this would be the right strategy and even 01:31:24.720 |
at times during renegotiations there is this there is this weird trust 01:31:29.200 |
where we have been honest and fair we've never screwed them they've never 01:31:34.960 |
screwed us it's tenuous but there's this trust and like they know 01:31:42.800 |
if people do not want to listen to music and want to pay for it 01:31:45.360 |
spotify has no business model so we actually are incredibly 01:31:48.640 |
aligned right other companies not to be tennis but other companies have other 01:31:53.200 |
business models where even if they made no money from music 01:31:56.800 |
they'd still be profitable companies but spotify won't so and i think the 01:32:00.400 |
industry sees that we are actually aligned business-wise 01:32:05.200 |
so there is this this trust that allows us to 01:32:11.920 |
um you know taking risks the free model itself 01:32:15.840 |
was an incredible risk for the music industry to take that they should get 01:32:19.680 |
credit for now some of it was that they had nothing to lose in sweden but 01:32:22.400 |
frankly a lot of the labels also took risk and so i 01:32:26.160 |
think we built up that trust with it with the i think uh herding 01:32:30.000 |
cats sounds a bit what's the word it sounds 01:32:35.920 |
no every cat mattered they're all beautiful and very important 01:32:39.360 |
exactly they've taken a lot of risks and certainly it's been frustrating a lot of 01:32:46.400 |
playing it's it's game theory if you play the 01:32:49.360 |
if you play the game many times then you can have the statistical outcome that 01:32:53.760 |
you bet on and it feels very painful when you're in the middle of that 01:32:57.040 |
thing i mean there's risk there's trust there's relationships 01:33:00.560 |
from uh just having read the biography of steve jobs 01:33:05.040 |
similar kind of relationships were discussed in itunes 01:33:08.400 |
the idea of selling a song for a dollar was very uncomfortable 01:33:12.080 |
for labels and exactly and there was no it was the same kind of thing it was 01:33:20.160 |
relationships that had to be built and uh it's really a terrifyingly 01:33:24.560 |
difficult process that apple could go through a 01:33:28.720 |
little bit because they could afford for that process to fail for 01:33:32.960 |
spotify it seems terrifying because uh you can't initially i think a lot of it 01:33:39.600 |
comes out comes down to you know honestly daniel and his tenacity 01:33:43.120 |
in in negotiating which seems like an impossible 01:33:45.920 |
it's a fun task because you know he was completely unknown and so forth but 01:33:50.880 |
maybe that was also the reason that that it worked 01:33:59.280 |
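The repeated-game argument made in this exchange can be sketched as an iterated prisoner's dilemma. This is a minimal illustration under assumptions: the payoff values are the conventional textbook ones, not figures from the conversation, and the strategy names are hypothetical stand-ins for "take shortcuts" versus "negotiate and build trust".

```python
from typing import Callable, List, Tuple

# (move_a, move_b) -> (payoff_a, payoff_b); "C" = cooperate, "D" = defect.
# Conventional textbook payoffs, assumed for illustration.
PAYOFFS = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}

Strategy = Callable[[List[str]], str]


def always_defect(opponent_history: List[str]) -> str:
    """Take the short-term gain every round."""
    return "D"


def tit_for_tat(opponent_history: List[str]) -> str:
    """Cooperate first, then mirror the opponent's previous move."""
    return opponent_history[-1] if opponent_history else "C"


def play(strategy_a: Strategy, strategy_b: Strategy, rounds: int) -> Tuple[int, int]:
    """Play the game repeatedly and return each side's total payoff."""
    history_a: List[str] = []
    history_b: List[str] = []
    total_a = total_b = 0
    for _ in range(rounds):
        move_a = strategy_a(history_b)
        move_b = strategy_b(history_a)
        pay_a, pay_b = PAYOFFS[(move_a, move_b)]
        total_a += pay_a
        total_b += pay_b
        history_a.append(move_a)
        history_b.append(move_b)
    return total_a, total_b


print(play(tit_for_tat, tit_for_tat, 10))      # both sides cooperate throughout
print(play(always_defect, always_defect, 10))  # both sides defect throughout
print(play(tit_for_tat, always_defect, 10))    # defector wins round one only
```

With these assumed payoffs, two cooperating players each end well ahead of two defectors over ten rounds, which is the sense in which playing many rounds can favor the slower, trust-based strategy.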
yeah i think game theory is probably the best way to think about it you could 01:34:03.360 |
straight go straight for this like nash equilibrium that 01:34:06.320 |
someone is going to defect or or you play many times you try to actually 01:34:10.400 |
go for the top left the cooperation cell is there any magical reason why 01:34:16.880 |
spotify seems to have won this so a lot of people have tried to do 01:34:22.080 |
what spotify tried to do and spotify has come out well so the 01:34:26.320 |
answer is that there's no magical reason because i don't believe in magic 01:34:33.520 |
and i think some of them are that people have 01:34:41.440 |
the actual the actual spotify model is very complicated they've looked at the 01:34:45.360 |
premium model and said it seems like you can you can 01:34:48.800 |
charge 9.99 for music and people are going to pay but that's 01:34:52.400 |
not what happened actually when we launched the original mobile product 01:34:55.520 |
everyone said they would never pay what happened was they started on the on 01:34:59.280 |
the free product and then their engagement grew so much 01:35:02.640 |
that eventually they said maybe it is worth 9.99 right 01:35:06.720 |
it's uh it's your propensity to pay grows with your engagement 01:35:10.080 |
so we have this super complicated business model where you operate two 01:35:13.600 |
different business model advertising and premium at the same time 01:35:16.800 |
and i think that is hard to replicate i have i struggle to think of other 01:35:20.240 |
companies that run large-scale advertising and 01:35:22.960 |
subscription products at the same time so i think the business model is 01:35:29.040 |
think it is and and so some people went after just the premium part without the 01:35:33.360 |
free part and ran into a wall where no one wanted 01:35:39.440 |
music should be free just ads which doesn't give you enough revenue and 01:35:42.800 |
doesn't work for the music industry so i think that combination is um it's 01:35:46.960 |
kind of opaque from the outside so maybe i shouldn't say it here and 01:35:49.920 |
reveal the secret but that that turns out to be harder to 01:35:53.200 |
replicate than you would think so there's a lot of 01:35:57.120 |
brilliant business strategy here brilliance or luck probably more luck 01:36:02.480 |
but it doesn't really matter it looks brilliant in retrospect 01:36:05.520 |
let's call it brilliant yeah when the books are written it'll be brilliant 01:36:10.480 |
you've uh mentioned that your philosophy is to embrace change 01:36:16.560 |
so how will the music streaming and music listening world change over the 01:36:25.600 |
far future what do you think i think that music and 01:36:30.480 |
for that matter audio podcasts audio books i think it's 01:36:34.960 |
one of the few core human needs i think it there is no good reason to 01:36:39.440 |
me why it shouldn't be at the scale of something like 01:36:42.320 |
messaging or social networking i don't think it's a niche thing 01:36:46.000 |
to listen to music or news or something so i think scale is obviously one of the 01:36:49.680 |
things that i really hope for i think i hope that it's going to be billions of 01:36:54.080 |
users i hope eventually everyone in the world gets access to all 01:36:57.200 |
the world's music ever made so obviously i think it's going to be a 01:37:00.320 |
much bigger business otherwise we we wouldn't be betting this big 01:37:04.000 |
uh now if you if you look more at how it is consumed what i'm hoping is back 01:37:15.920 |
where i think i sometimes uh internally i make this analogy to 01:37:21.840 |
to text messaging text messaging was also based on 01:37:26.800 |
standards in the in the era of mobile carriers you had the sms 01:37:30.640 |
the 140 character 120 character sms and it was great because everyone 01:37:36.160 |
agreed on the standard so as a consumer you got a lot of distribution and 01:37:39.280 |
interoperability but it was a very constrained format 01:37:42.560 |
and and when the industry wanted to add pictures to that format to do the mms 01:37:46.320 |
i looked it up and i think it took from the late 80s to early 2000s this is like 01:37:52.640 |
into that now once that entire value chain of 01:37:58.080 |
creation and consumption got wrapped in one software stack 01:38:04.560 |
like the first week they added disappearing messages like then two 01:38:07.680 |
weeks later they added stories like the pace of 01:38:12.160 |
and you can you can you can affect both creation and consumption 01:38:16.000 |
i think it's going to be rapid so with these streaming services we now for the 01:38:19.360 |
first time in history have enough i hope people on one of these 01:38:24.560 |
services actually whether it's spotify or amazon or apple or youtube 01:38:28.000 |
and hopefully enough creators that you can actually start working 01:38:31.440 |
with the format again and and that excites me 01:38:33.760 |
i think being able to change these constraints from 100 years 01:38:37.120 |
that could really that could really do something interesting i don't i really 01:38:40.640 |
hope it's not just going to be the iteration on on the same thing for the 01:38:45.360 |
next 10 to 20 years as well yeah changing the creation of music a 01:38:49.520 |
creation of audio creation of podcast is a really fascinating possibility i 01:38:54.640 |
myself don't understand what it is about podcasts that's so 01:38:58.560 |
intimate it just is i listen to a lot of podcasts i think 01:39:07.120 |
for connection that people do feel like they're connected 01:39:11.520 |
to when they listen i don't understand what the psychology of that is 01:39:15.840 |
but in this world that is becoming more and more disconnected 01:39:20.720 |
it feels like this is fulfilling a certain kind of need 01:39:24.800 |
and uh empowering the creator as opposed to just the listener 01:39:29.200 |
it's really interesting that's a this i'm really excited that you're working 01:39:34.000 |
on this yeah i think one of the things that is inspiring for our teams to work 01:39:36.960 |
on podcast is exactly that whether you think like i 01:39:40.720 |
like i probably do that it's something biological about 01:39:44.320 |
perceiving to be in the middle of the conversation that makes you listen in a 01:39:47.120 |
different way it doesn't really matter people seem to 01:39:49.280 |
perceive it differently and uh there was this narrative for a long 01:39:52.480 |
time that you know if you look at video everything kind of in the foreground it 01:39:56.480 |
got shorter and shorter and shorter because of financial pressures and 01:40:00.000 |
monetization and so forth and eventually at the end there's always 01:40:06.320 |
something and and uh i'm really i feel really good 01:40:14.080 |
interpreted that as people have no attention span anymore 01:40:16.880 |
they don't want to listen to things they're not interested in deeper stories 01:40:21.040 |
like you know people are people are getting dumber but then podcast came 01:40:24.160 |
along and it's almost like no no the need still existed 01:40:27.440 |
once but maybe maybe it was the fact that you're not prepared to look at your 01:40:31.520 |
phone like this for two hours but if you can drive at the same time it 01:40:37.280 |
and they want to hear like the more complicated version so to me that is 01:40:40.720 |
very inspiring that that podcast is actually long form it 01:40:43.840 |
gives me a lot of hope for for humanity that people seem really 01:40:47.440 |
interested in hearing deeper more complicated conversations 01:40:50.640 |
this is uh i don't understand it it's fascinating so the majority 01:40:55.920 |
of listeners for this podcast listen to the whole thing this whole conversation we've been 01:40:59.920 |
talking for an hour and 45 minutes and somebody will i mean 01:41:04.640 |
most people will be listening to these words i'm speaking right now you 01:41:07.520 |
wouldn't have thought that 10 years ago with where the world seemed 01:41:10.720 |
to go that's very positive i think that's really exciting and 01:41:14.160 |
empowering the creator in there is is really exciting 01:41:18.400 |
last question you also have a passion for just 01:41:22.000 |
mobile in general how do you see the smartphone world 01:41:27.200 |
this the digital space of uh of smartphones and just everything 01:41:34.320 |
that's on the move whether it's uh internet of things and 01:41:44.800 |
is that computing might be moving out of these 01:41:49.840 |
multi-purpose devices the computer we had in the phone 01:41:53.440 |
into specific you know specific purpose devices and you know it will be ambient 01:42:02.160 |
shout something at someone and there's always like one of these speakers close 01:42:04.960 |
enough and so you start behaving differently it's as 01:42:09.520 |
if you have the internet ambient ambiently around you and you can 01:42:15.680 |
so i think computing will kind of get more integrated and we 01:42:20.480 |
won't necessarily think of it as as connected to a device in the same 01:42:24.880 |
thing in the same way that we do today i don't know the the path to that maybe 01:42:30.240 |
we used to have these desktop computers and then we partially replaced that 01:42:34.800 |
with the with the laptops and left you know we 01:42:37.440 |
had desktop at home and at work and then we got these phones and we started 01:42:40.640 |
leaving the the laptop at home for a while and maybe the 01:42:44.560 |
maybe for stretches of time you're going to start using the watch and you can 01:42:47.280 |
leave your your phone at home like for a run or 01:42:49.680 |
something and you know we're on this progressive path where 01:42:54.720 |
you i think what what is happening with the voice 01:43:01.760 |
interaction paradigm that doesn't require as large physical devices so i 01:43:07.200 |
definitely think there's a future where you can have your your airpods and and 01:43:12.160 |
your watch and you can do a lot of computing and 01:43:16.720 |
i i don't think it's going to be this binary thing i think it's going to be 01:43:20.720 |
like many of us still have a laptop we just use it less 01:43:24.000 |
and so you shift your your consumption over and 01:43:28.400 |
i don't know about ar glasses and so forth i'm excited about i spent a lot 01:43:33.200 |
of time in that area but i still think it's quite far away 01:43:35.760 |
ar vr all yes vr is is happening and working i think the the recent 01:43:41.360 |
oculus quest is quite impressive i think ar is further away at least that type of 01:43:50.800 |
your phone or watch or glasses understanding where you are and maybe 01:43:54.560 |
what you're looking at and being able to give you audio cues about that or you 01:43:57.200 |
can say like what is this and it tells you what it is that i 01:44:01.200 |
think might happen you know you use your your watch or your glasses as a as a 01:44:06.320 |
mouse pointer on reality i think it might be a while before i 01:44:09.600 |
might be wrong i hope i'm wrong but i think it might be a while before we walk 01:44:14.480 |
project things i agree with you there's a it's actually really difficult when you 01:44:18.160 |
have to understand the physical world enough to 01:44:22.400 |
uh project onto it well i lied about the last question uh 01:44:28.240 |
because i just thought of audio and my favorite topic which is the 01:44:37.760 |
whether it's part of spotify or not we'll have 01:44:41.760 |
i don't know if you've seen the movie her absolutely 01:44:45.680 |
and uh their audio is the primary form of interaction 01:44:51.360 |
and the connection with another entity that you can actually have a 01:44:57.680 |
based on voice alone audio alone do you how far do you think that's possible 01:45:02.640 |
first of all based on audio alone to fall in love with somebody 01:45:05.360 |
somebody or well yeah let's go with somebody just 01:45:08.640 |
have a relationship based on audio alone and second 01:45:12.480 |
question to that can we create an artificial intelligence system 01:45:24.080 |
personal answer uh speaking for me as a person 01:45:28.400 |
the answer is quite unequivocally yes on on both i think what we just said 01:45:33.840 |
about podcasts and the feeling of being in the middle of 01:45:36.480 |
a conversation if you could have an assistant where 01:45:41.360 |
and we just said that feels like a very personal setting so if you walk around 01:45:44.720 |
with these headphones and this thing you're speaking 01:45:47.120 |
with this thing all of the time that feels like it's in your brain i 01:45:50.720 |
think it's it's going to be much easier to fall in 01:45:53.520 |
love with than something that would be on your screen 01:45:55.440 |
i think that's entirely possible and then from the you can probably answer 01:45:59.120 |
this better than me but from the concept of if it's going to be 01:46:02.640 |
possible to build a machine that that can achieve 01:46:06.720 |
that i think whether you whether you think of it as a if you can 01:46:10.480 |
fake it the philosophical zombie that simulates it enough or it somehow 01:46:14.640 |
actually is i think there's it's only question if you if you ask 01:46:19.040 |
me about time i'd have a different answer but if you say given 01:46:22.320 |
some half infinite time absolutely i think it's just 01:46:26.560 |
atoms and arrangement of information well i personally think that love is a 01:46:31.760 |
lot simpler than people think so we started with true romance and 01:46:36.560 |
ended in love i don't see a better place to end beautiful 01:46:40.320 |
gustav thanks so much for talking today thank you so much it was a lot of fun