
A Brief History Of How Giant Internet Companies Print Coin | Deep Questions With Cal Newport


Chapters

0:00 Cal's intro
3:55 The Old Model
8:34 Trying to make money from online content
10:05 The Network model
14:46 Externalities
15:02 The Loop

Transcript

All right, segment number one, deep dive. I call this deep dive, loops, networks, and links. What I'm gonna talk about here is something you probably never knew you should care about, which is distributed curation of user generated content online. That is the most boring title you have probably heard.

This is why most people don't think about it, but it is actually a subject that is incredibly important for understanding the dynamics of the current internet economy. So let me start with the backdrop to this discussion, which is this idea that right now, if we look at the internet economy, there are companies making a huge amount of money monetizing content generated by users, largely unpaid users generating content, making lots of money.

And I'm underscoring the word lots here. We're talking about some of the biggest corporations in the world right now are using this model. So it's not just, here's a nice niche where some people made some money. Monetizing user generated content on the internet has become a massive industry. Here's the thing, that is new.

That is relatively new. That's less than 20 years old, this idea that you could make a lot of money off of this type of content. So, quick beats of the timeline leading up to this current state of our economy. Before the web, before the consumer facing web became available in the 1990s, there was basically no way to make a lot of money off user generated content.

The model was, if you're a media company, pay a small number of talented people to create content to be consumed by a lot of consumers. It's not user generated content, it's highly paid professional generated content. This created various economies. So if you were trying to reach a very broad audience, like you're a national television network, there could be a lot of competition for this talent because you wanted the best television writer, you wanted the best actor, and they could be really highly paid.

But this model didn't require the superstar economics because of localization. This is why you had conglomerates like Gannett become really, really big because they found that was a good model to buy up local papers. Local papers used to be a lucrative model. You pay people who are as good as anyone else who's writing for your particular market, and then you can make money off of everyone who lives in that market.

All right, Web 1.0 comes along, '96, '97, '98. Now it is possible for individuals to create content that can be consumed by anyone else. We gotta emphasize that this was a major transformation in the history of media production, that now anyone could produce content that could be consumed by anyone else.

We're talking, of course, almost primarily about written content here. This did not change the main media economics yet. It was too hard to do. It was technically demanding. You had to hand code HTML often to try to put stuff online. It didn't really look that good. Most of the leveraging of the first web revolution was actually by media companies that already were using the old model, small number of highly paid writers serving lots of people, to reduce distribution cost.

So you didn't get brand new creators. You got existing creators like Time Magazine realizing if we release on the web, it's cheaper than printing things on paper. Then we get Web 2.0. This is the major turning point in the economics of content. Web 2.0, which happens once we get to the new millennium, is where we made it easier for people to publish information on the web.

Now, instead of hand coding HTML, you can type into a box and click Submit or click Post. For the old school web geeks among you, there are small technical innovations that were critical here, like AJAX, asynchronous JavaScript, that made it possible for you to send information from a website to a server, have that server update the website without having to reload the whole page.

These were the little innovations that made Web 2.0 possible. Now it was easy enough that almost anyone could generate content. This made it possible to generate a ton of content. But before we could mine this new resource, this information resource, into massive companies and into massive monetizations, curation had to be solved.

And this is what I wanna talk about, is the evolution of curation once Web 2.0 came along. 'Cause it turns out having a lot of people generating content does you no good if you're trying to make money. Selling content does you no good if you can't select for your audience stuff they actually want to see.

So this goes back to the title of my deep dive: loops, networks, and links. These are the three dominant models of curation of user generated content that emerged. That's in reverse chronological order. Links were first, then the network model, then the loop model. I wanna walk through these three models briefly, how they work and their advantages and disadvantages.

And I think this will help clarify a lot of what we see going on. All right, so the first effective model for curating user generated content in the Web 2.0 era was the link model. When I say link, I'm talking about hyperlinks. This is how the blogosphere worked during those early years of this content production revolution.

It is a distributed curation method that is very human. It is based on human webs of trust that are augmented with digital technology. So here's roughly how it works. If I'm gonna enter the world of blogs and websites that are linked to each other, I'm gonna enter it in a place where I have a pre-established trust relationship.

Okay, I know this person. This person has a foot in traditional media maybe. I've seen their newspaper column. They write books I care about. Friends of mine have really pushed: this is the smart person that you need to read. Okay, so I enter into this web of trust relationships through an entryway of pre-existing trust.

I then see who the people I already trust are linking to. If sufficient people link to a new source of information, a new blog or a new website, and that website has sufficient aesthetic capital that it's trustworthy, it's not a weird gray background website with animated GIFs of eagles and what have you, then that will enter into my web of trust.

I will trust that too, and I will begin to consume that content. Now that new site, when it's linking to something else, again, will help convey trust into these new targets and help expand that web. So ultimately this is humans building trust, and then using that trust to expand where you get your content from.
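The link model described above can be sketched as a small graph computation. This is only an illustrative toy, not how any real blog reader worked: the `expand_trust` function, the `threshold` of two trusted inbound links, and the site names are all assumptions made up for the example.

```python
def expand_trust(links, trusted, threshold=2):
    """Grow a reader's web of trust.

    links:     dict mapping a site to the sites it links to (hypothetical data)
    trusted:   the reader's starting set of pre-trusted sites
    threshold: assumed number of already-trusted sites that must link to a
               new site before the reader starts trusting it
    """
    trusted = set(trusted)
    changed = True
    while changed:
        changed = False
        # Count how many trusted sites link to each not-yet-trusted site.
        counts = {}
        for source in trusted:
            for target in links.get(source, ()):
                if target not in trusted:
                    counts[target] = counts.get(target, 0) + 1
        # Admit every site with enough trusted inbound links, then repeat.
        for target, n in counts.items():
            if n >= threshold:
                trusted.add(target)
                changed = True
    return trusted


links = {
    "columnist": ["blog_a", "blog_b"],
    "author": ["blog_a", "blog_b"],
    "blog_a": ["blog_c"],
    "blog_b": ["blog_c"],
    "fringe": ["fringe_2", "blog_a"],  # links out, but nobody trusted links in
}
web = expand_trust(links, {"columnist", "author"})
```

In this toy graph the fringe site can link to whatever it likes, but it never enters the reader's web because no trusted site links to it.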

This was remarkably effective. It actually works really well, if you actually stick with it, at excavating really interesting quality sources of information that you might not have otherwise had access to, and more importantly, filtering out the weirdness, 'cause it's very difficult in this model to get into someone's web of trust.

So weird conspiratorial work, blatant misinformation, just general emotional outrage and ickiness could not gain a lot of traction in the link model of curation because it would never get the entry point. It is very hard, in other words, to see something like QAnon gain a lot of traction in, let's say, the 2006 online ecosystem because for one of these blogs, if you're one of these initial somewhat eccentric QAnon conspiracy theorists, how is that gonna get into a web of trust that's gonna intersect mine?

It probably won't. It probably won't. And so it worked pretty well. The disadvantages were two. One, it was hard to monetize this type of world of user-generated information. The blogosphere was famously hard to monetize, both for large networks and individual content creators. There just wasn't the model there. So that was an issue.

You had the individual writers. It was hard to aggregate them. You can look at Nick Denton and Gawker. There's a lot of interesting oral history on trying to make money off that model. It was difficult. And two, it's hard work. So if I wanna consume content, I have to do a lot of work.

You actually have to spend a lot of time online. You have to see, build trust, expand this web of trust. This takes a lot of time surfing, following links, being exposed. So it was biased towards heavy tech users. If you wanted to create content, it was even harder. It was very difficult to gain a foothold.

So this was the flip side of the filtering and curation being very effective. It allowed a lot more voices than existed in the world of newspapers, TV, and radio only. But it was really hard, if you're starting from scratch, to gain access to these webs of trust. I mean, I remember in the early days of my blog at calnewport.com, when I used to focus only on student advice, I specifically remember being very frustrated when I would see links from more established, trusted blogs (Lifehacker was one that comes to mind) to other student advice sources that I thought I was better than.

It's like, look, I publish these books. My advice is better. Why aren't they linking to me? It's 'cause it's a very slow moving system. Eventually I gathered enough trust to get linked to a lot by those types of sources, but it could take years.

So it was not very exciting for content creators. It was very difficult, very difficult. All right, that led to model number two. We'll call this the network model. Facebook was the innovator here. They figured out, okay, if we have a social network where we make it easy for anyone to create content, so now we can greatly increase the pool of possible content out there by having these very slick web interfaces, so you don't have to worry about what you look like.

You don't have to worry about doing the hard effort of gaining aesthetic capital to convince people that you're someone legitimate. Everyone looks the same. Take that off the table. You don't have to set up a blog, WordPress account somewhere. Take that off the table. You just sign up for this account, click these buttons.

It looks great. And if we can get people, they realize, to do the work on their own of teaching us who their friends are, we can leverage that underlying social graph to do the curation. So now, instead of people having to do the individual hard work of being on the internet a lot and following links and building up this web of trust, you can have a newsfeed.

The newsfeed will fill in with what's interesting to you. And what Facebook realized is, well, if we see what your self-declared friends are interested in, we will guess you're probably interested in that too. We can use friend relationships, plus a little bit of magic secret sauce to try to keep out redundant information and keep things fresh, to curate stuff for you based off of your friend relationships.

And that worked out really well. So Facebook innovated that model. Instagram followed up that model, but with images. And it was actually really successful. So now everyone can be involved in producing content and you can get a pretty well curated stream of stuff that's interesting to you without having to do too much work.
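A rough sketch of friend-graph curation as described above, under heavy assumptions: the function name `friend_feed`, the data shapes, and the scoring rule (one point per friend who liked a post, skipping posts already seen) are invented for illustration and are not Facebook's actual ranking.

```python
from collections import Counter


def friend_feed(user, friends, likes, seen=frozenset(), k=3):
    """Return up to k post ids ranked by how many of the user's friends liked them.

    friends: dict user -> list of self-declared friends (hypothetical data)
    likes:   dict user -> list of post ids that user engaged with
    seen:    post ids to skip, standing in for the "keep things fresh" step
    """
    score = Counter()
    for friend in friends.get(user, ()):
        for post in likes.get(friend, ()):
            if post not in seen:
                score[post] += 1
    # Highest score first; ties broken by post id so the order is stable.
    ranked = sorted(score.items(), key=lambda kv: (-kv[1], kv[0]))
    return [post for post, _ in ranked[:k]]


friends = {"me": ["ann", "bob", "cy"]}
likes = {
    "ann": ["p1", "p2"],
    "bob": ["p2", "p3"],
    "cy": ["p2", "p1"],
    "stranger": ["p9"],  # not a friend, so never considered
}
feed = friend_feed("me", friends, likes, seen={"p3"})
```

The user does no curation work here beyond declaring friends; the feed falls out of what those friends already did.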

So it lowered the bar. Twitter had a twist on this model. This is the retweet model. Facebook eventually copied this model as well where they added their share button. The retweet model says, let's make it really easy for you to share a piece of content with everyone you're directly connected to.

And then those people who you're directly connected to that really liked the content will do the same thing. Now, if you model this out mathematically, what you see is that the most compelling content on the network at any one point can dramatically, with dramatic speed, spread to huge swaths of the network.
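One way to model this out is a simple cascade simulation. This is a minimal sketch under assumed conditions, not Twitter's mechanics: the ring-shaped follower graph, the single `share_prob` standing in for how compelling a post is, and the name `simulate_cascade` are all hypothetical.

```python
import random


def simulate_cascade(followers, origin, share_prob, rng):
    """Return the set of users who end up seeing a post.

    followers:  dict user -> list of users who follow them (hypothetical graph)
    share_prob: assumed chance that a user who sees the post reshares it
    rng:        a random.Random instance, so runs can be made repeatable
    """
    seen = set()
    frontier = [origin]  # users whose share has not been delivered yet
    while frontier:
        sharer = frontier.pop()
        for user in followers.get(sharer, ()):
            if user in seen:
                continue  # each user makes one retweet-or-not decision
            seen.add(user)
            if user != origin and rng.random() < share_prob:
                frontier.append(user)
    return seen


# Toy network: 1000 users on a ring, each followed by the next 10 users.
n = 1000
followers = {i: [(i + j) % n for j in range(1, 11)] for i in range(n)}

dull = simulate_cascade(followers, 0, 0.0, random.Random(1))
viral = simulate_cascade(followers, 0, 1.0, random.Random(1))
middling = simulate_cascade(followers, 0, 0.2, random.Random(1))
```

At the extremes the behavior is easy to check: a post nobody reshares reaches only the author's 10 followers, while a post everyone reshares reaches all 1000 users. In between, reach depends sharply on whether each share tends to produce, on average, more than one further share.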

So this was an even more dynamic and aggressive source of distributed curation. And it became the core of Twitter's success. You carefully set up who you follow. You do the work of propagating stuff you like with this low friction retweet, or in Facebook's case, share. And the resulting fierce viral dynamics will become an incredibly effective distributed selection mechanism for things that will engage people.

And that's why Twitter became so powerful. You know, Facebook was interesting. You see what your friends are up to and sharing, but Twitter, man, it would come out of left field with things. It was almost magical in the trends it would unearth. And that was all distributed. That's not a super clever algorithm.

That's not HAL 9000 sitting somewhere learning about the human psyche. It's a hundred million users making hundreds of retweet-or-not decisions every day. So that's the network model. Homogenize interfaces and leverage these networks, these social networks, to help curate the content created within these closed garden networks.

Again, advantages, much easier to use. Many more people could be involved. You can make a lot more money off it. Very easy to monetize because these networks work within closed gardens. Disadvantage, you homogenize all the aesthetics of the content and the curation becomes obfuscated. You just get this feed of stuff that's interesting and all looks the same.

Now, suddenly the QAnon, the proverbial QAnon conspiracy theorist who would never be able to enter the web of trust in 2005 can easily spread and gain traction in 2015. Because all content looks the same. Curation is happening more behind the scenes. It's not based off of these more natural, deeply human trust relationships.

Other disadvantage of course is the viral dynamics, especially the retweet share dynamics, led to a lot of unexpected externalities: tribalism, outrage culture, mob swarms, heavy feedback influence on, for example, media outlets, where then you have reporters so fearful of the fierce pushback that can happen overnight because of these fierce dynamics that they really start to tailor what they say or don't say.

Then you get the balkanization of media coverage itself. And there's all of these externalities that no one could have guessed. Twitter was not Dr. No with his cat on his island off of Jamaica with an evil plot to bring down democracy.

They just wanted people to spend time on their service. These were all unexpected side effects. All right, moving quickly now, model number three is the loop. This is personified I think best by TikTok. So now what we do with the loop is you basically take the human out of the equation and you use simple, but devastatingly effective machine learning loops to just select for you as an individual from the whole pool of potential content, what to show you.

No shares required, no retweets required, no you going through and telling the network who your friends are, none of that's required. And again, to the technicalities of this, what really happens with these machine learning loops is that all of the content is embedded in some sort of multidimensional statistical space.

It then feeds you items from this space. It looks at how long you watch each video to try to assess your preference towards that particular region. This gives it some weighted cores in this multidimensional space that it can then weight its selections of future videos by what's gonna be closer to one of these cores, blah, blah, blah, nerd, nerd, nerd, math, math, math.
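The loop just described can be sketched in a few lines. This is a deliberately simplified toy and not TikTok's system: it assumes each video already has a two-dimensional embedding, it keeps a single watch-time-weighted taste vector rather than several weighted cores, and the names `update_taste`, `next_video`, and the catalog entries are made up.

```python
import math


def update_taste(taste, weight, item_vec, watch_fraction):
    """Fold one viewing into a running, watch-time-weighted mean of item vectors.

    watch_fraction: assumed signal in [0, 1], the share of the video watched.
    Returns the new taste vector and the new total weight.
    """
    new_weight = weight + watch_fraction
    if new_weight == 0:
        return taste, weight  # nothing watched yet, nothing to learn
    new_taste = [
        (t * weight + v * watch_fraction) / new_weight
        for t, v in zip(taste, item_vec)
    ]
    return new_taste, new_weight


def next_video(taste, catalog, watched):
    """Pick the unwatched video whose embedding is closest to the taste vector."""
    candidates = [name for name in catalog if name not in watched]
    return min(candidates, key=lambda name: math.dist(taste, catalog[name]))


# Hypothetical embeddings: first axis "cooking", second axis "cars".
catalog = {
    "cooking_1": [1.0, 0.0],
    "cooking_2": [0.9, 0.1],
    "cars_1": [0.0, 1.0],
    "cars_2": [0.1, 0.9],
}

taste, weight = [0.0, 0.0], 0.0
watched = set()
for name, fraction in [("cooking_1", 1.0), ("cars_1", 0.1)]:
    taste, weight = update_taste(taste, weight, catalog[name], fraction)
    watched.add(name)

pick = next_video(taste, catalog, watched)
```

After one cooking video watched to the end and one car video abandoned almost immediately, the taste vector sits near the cooking region, so the next pick is the other cooking video. No friends, shares, or follows are involved anywhere in the loop.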

It works eerily well. You start watching videos, scrolling up and down, it gathers that data, do this for half a day, and it seems like TikTok knows you better than the people who are closest to you. So it was an incredibly effective way of doing this. Of course, services like YouTube do something similar, but YouTube is more complicated.

It has to serve many different purposes. It doesn't purify this model nearly as well as TikTok, which is just this model purified. Videos, full screen, swipe when you're done, we'll send you the next, that's it. And when you purify this model, you saw it was probably one of the most effective curation methods we've ever seen.

So again, the advantages, no social graph needed, anyone can compete in this space. You just need a reasonable pool of content and a machine learning loop, and you can be titillating people in a very effective way. Disadvantages, this is like the fentanyl of distraction. It's too purified. It can take over your whole life.

It is distraction now completely purified of even any attempt to connect it to community relationships or being up on the news or exposure to interesting people. It is just, let's go straight to the brainstem and inject that chemical. So all humanity is now being stripped out of the curation loop.

So we started with 2005, rich humanity, but hard to monetize, hard to use. 2015, now you have this sort of, we're exploiting human things like our friendship networks and our retweet decisions that produce this, let's call this, we're gonna use a drug metaphor, kind of cocaine of distraction. This is Twitter, this is Facebook, this is Instagram.

And then we get to TikTok and we purify down, get the human out of the loop altogether, purify the curation down to its strongest form. And we are living in a tent city, drooling out of the side of our mouth, waiting to overdose. All right, so that is the history of distributed curation of user generated content.

Two takeaways. Once we understand this, a lot of the recent history of the internet economy makes sense. So here, for example, are two practical takeaways this framework can help you come up with. One, we lost something special when we left the link-based curation. Now I understand we can't go back to a world where the only type of user generated content is curated in a link-based manner.

We can't go back to a pure 2005 blogosphere world, but couldn't we add this world back to what we have today? Isn't there a market out there for this more human web, a trust-based, slower, harder, but better quality connection, better quality information, really effective filtering of the weird and the conspiratorial and the stuff based on loose foundations?

Isn't there some sort of revivification of the blog that at least the sort of expert class or sub expert class could be participating in? Maybe podcasts are doing this, but there's not a lot of, we don't have the same links. Anyways, I think that's interesting. Two, once you understand distributed curation, you see that it is difficult to fix the negative side effects of in particular the network and loop-based curation models through human intervention.

We're mixing two different things here. So if you think you can go in and solve the negatives of the, let's say, Twitter-based retweet, fierce viral dynamics by having humans in the loop, trying to kick people off of Twitter, good luck. These are two completely different types of dynamics going on, you're mixing and matching.

Same thing with TikTok, this fiercely effective machine learning loop. What you're gonna do when you don't like all the outcomes of that is, like, have a human come in and try to intervene? You're mixing two different modalities; it doesn't work. If you want to get away from the negative side effects of these distributed curation models, you have to actually change the cultural zeitgeist to push people onto other sources of interaction, other sources of distraction, other sources of engagement.

I don't know that you can come in and fix something so cybernetically effective as the TikTok machine learning loop or Twitter retweets with a board of safety. What we need to do is convince people that they shouldn't really be on Twitter that much, or at all. But anyways, two takeaways just to show you that once you have these frameworks, you can actually make some useful conclusions.

(upbeat music)