So there are a lot of concerns and excitements and confusions surrounding our current moment in artificial intelligence technology. Perhaps one of the most fundamental of these concerns is this idea that, in our quest to train increasingly bigger and more capable AI systems, we might accidentally create something smarter than we expected.
I want to address this particular concern, out of the many concerns surrounding AI, in my capacity as a computer science professor and one of the founding members of Georgetown's Center for Digital Ethics. I have been thinking a lot, from a technological perspective, about this idea of runaway or unexpected intelligence in AI systems.
I have some new ideas I want to preview right here. These are in rough form, but I think they're interesting, and they will hopefully give you a more precise, and hopefully comforting, way of thinking about the possibility of AI getting smarter than we hoped. All right, so where I want to start here is the fear.
Okay, so one way to think of the fear that I want to address is what I call the alien mind fear, that we are training these systems, most popularly as captured by large language model systems like the GPT family systems, and we don't know how they work. They're big.
They sit in big data centers and do stuff for months and hundreds of millions of dollars of compute cycles. We get this thing afterwards and we engage with it and say, "What can this thing do now?" And so we are creating these minds. We don't understand how they're going to work.
That's what sets up this fear of these minds getting too smart. I want to trace some of the origins of this particular fear. I'm going to load up on the screen here for people who are watching instead of just listening, an influential article along these lines. This comes from the New York Times in March of 2023.
So this was pretty soon after ChatGPT made its big late 2022 debut. The title of this opinion piece is, "You can have the blue pill or the red pill, and we're out of blue pills." This is co-authored by Yuval Harari, who you know from his book Sapiens, as well as Tristan Harris, who you know as the sort of whistleblower on social media who now runs a nonprofit dealing with the harms of technology, and Aza Raskin, who works with Tristan at his nonprofit.
So this was essentially a call saying we need to be worried about what we're building with these large language model systems like ChatGPT. There is a particular quote in here that I want to pull out, and I'll read it off of my page here. "We have summoned an alien intelligence.
We don't know much about it, except that it is extremely powerful and offers us bedazzling gifts but could also hack the foundations of our civilization." So we get this alien terminology, this notion that we don't really know how this thing works, and so we don't really know what this thing might be capable of.
Let me give you another example of this thinking. This is an academic paper that was from this same time. This is April 2023. This is coming out of Microsoft Research. I wrote about this paper, actually, in a New Yorker piece that I wrote earlier this year about AI. But the title of this paper is important.
This was very influential: "Sparks of Artificial General Intelligence: Early Experiments with GPT-4." The whole structure of this paper is that these researchers, because they're at Microsoft, had early access to GPT-4 before it was publicly released, and they ran intelligence tests on it, tests that had originally been developed for humans.
They were running these intelligence tests on GPT-4 and were really surprised by how well it did. So these sparks of AGI, these sparks of artificial general intelligence, the theme of this is, "My God, these things are smarter than we thought. They can do reasoning. These machines are becoming rapidly powerful." So it was sort of, "Hey, if you were worried about GPT-3.5, the original ChatGPT language model that they were writing about in the New York Times op-ed, wait until you see what's coming next." There's a general rational extrapolation to make here.
The original GPT worried people, like Yuval Harari. This new one, GPT-4, seemed even better. We keep extrapolating this curve to GPT-5, GPT-6. It's going to keep getting more capable in ways that are unexpected and surprising. It's very rational to imagine this extrapolation bringing these alien minds to abilities where our safety is at stake.
We're uncomfortable about how smart they are. They can do things we don't even really understand. This is the origin of this fear that these things are going to get smarter than we hoped. I had a conversation with a friend of mine about this who's really interested in AI, has been reading a lot about it.
The way he conceptualized this is he just said, "Look, we're going to keep building bigger models. One of these days, probably pretty soon, as we keep 10x-ing the size of these models, we are going to be very uncomfortable with what we build." All right, so that is our setup for the concern.
To address this concern, the first thing I want to do is start with a strong but narrow observation. A large language model in isolation can never be understood to be a mind. All right, so let's be really clear. I'm being very precise about this, okay? And what I'm saying here is actually very narrow.
If we actually take a specific large language model like GPT-4, that by itself, even if we make it bigger, even if we train it on many more things, cannot by itself be something that we imagine as an alien mind with which we have to contend like we might another mind.
The reason is, in isolation, what a large language model does is it takes an input. This information moves forward through layers. It's fully feed-forward. And out of the other end comes a token, which is a part of a word. In reality, it's a probability distribution over tokens, but whatever, a part of a word comes out the other end.
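To make that one narrow point concrete, here's a minimal sketch in Python of what a single pass produces. The model object and its forward call are stand-ins of my own, not any particular library's API:

```python
import random

def generate_one_token(model, input_tokens):
    # The model maps the input to a probability for every token in its vocabulary.
    probs = model.forward(input_tokens)       # e.g. {"rook": 0.41, "pawn": 0.22, ...}
    tokens, weights = zip(*probs.items())
    # One token (a word or a piece of a word) is sampled from that distribution.
    return random.choices(tokens, weights=weights, k=1)[0]
```

That single sampled token is the entire output of one pass through the network.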
That's all a language model can do. Now how it generates what token to spit out next can have a huge amount of sophistication, right? It's difficult to come up with the proper analogies for describing this. But I think a somewhat reductive but useful way for understanding how these tokens are produced is the following analogy that I used in a New Yorker piece from a few months ago.
You can imagine what happens is when you have your input, which is like the prompt or the prompt plus the part of the answer you've generated already, as this is going through the large language model, it can come up with candidates for like the next word or part of a word to output next, right?
Like, okay, that's not too hard to do. This is known as n-gram prediction in some sense, except here it's a little bit more sophisticated, because with self-attention, it can look at multiple parts of the input to figure out what should come next. But it's not too hard to be like, this is kind of the pool of grammatically correct, semantically correct next words that we could output.
How do we figure out which of those things to output to actually match what's being asked or what's actually being discussed in the prompt? Well, this is where these models go through something like complex pattern recognition. I like to use the metaphor of a massive checklist, a checklist that has billions of possible properties on it.
This is a discussion of chess. We're in the middle of producing moves for a chess game. This is like a middle of a chess game move that's being produced. This is a discussion of ancient history. This is a discussion of Rome. This is a discussion of buildings, right? Whatever.
Huge checklist. As the input goes through these recognizers, the model is sort of figuring out what we're in the middle of talking about. And then you can imagine, again, this is a rough analogy, that you have these really complex rule books. It looks at the combination of different properties that describes what we're talking about.
The rule books are combinatorial. They combine these properties to say, okay, given this combination of properties of what we're talking about, which of these possible grammatically correct next words or tokens we could output makes the most sense, right? And then it's combining: okay, it's a chess game, and here are the recent chess moves, and we're supposed to be describing a middle-game move.
And the rules might say, these are the legal moves given the current situation. So of the different things we could output here that look like a move in a chess game, these are actually legal moves, so let's choose one of these, right? So you have possible next words, you have a checklist of properties, you have combinatorial combinations of those properties with rules that then help influence which of these correct words to output next.
And all of this happens in this feed-forward manner. Those checklists and the rules in particular can be incredibly complicated. The rules can use novel combinations of properties, combinations that were never seen in the training data, and that's how you can produce output with these models that doesn't directly match anything that was ever solved before.
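Just to illustrate the analogy, and this is only a cartoon of the metaphor, not how a transformer is actually implemented, you could picture the checklist and rule book like this:

```python
# A cartoon of the checklist-and-rule-book metaphor, not how a real model works.

def detect_properties(context):
    # The "checklist": which of billions of possible properties describe the input?
    checklist = {
        "is_chess_discussion": "chess" in context,
        "is_midgame_move": "midgame" in context,
        "is_ancient_history": "Rome" in context,
    }
    return {name for name, present in checklist.items() if present}

def choose_next_word(candidates, properties):
    # The "rule book": combinations of properties influence which candidate wins.
    if {"is_chess_discussion", "is_midgame_move"} <= properties:
        legal = [w for w in candidates if w in {"Nf3", "Qxd5", "O-O"}]
        if legal:
            return legal[0]
    return candidates[0]

print(choose_next_word(["Nf3", "banana"], detect_properties("a chess midgame")))  # prints Nf3
```

The real thing is learned rather than hand-written and vastly more subtle, but the shape of the claim is the same: properties in, one word out.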
So there's this nice generalization. This is all very sophisticated. This is all very impressive. But in the end, this is still, you can imagine it like a giant metal machine with dials and gears, and you're turning this big crank and hundreds of thousands of gears are all cranking and turning.
And at the very end, at the far end of the machine, there's a dial of letters. These dials turn to spell out one word. Like in the end, that's what's happening. A word or a piece of the word is what comes out on the other side after you've turned these dials for a long time.
It can be a very complicated apparatus, but in the end, what it does is spit out a word or a piece of a word. All right. So it doesn't matter how big you make this thing. It spits out parts of words. No matter how sophisticated its pattern recognizers and combinatorial rule generators are, it's a word-spitter-outer.
Hey, it's Cal. I wanted to interrupt briefly to say that if you're enjoying this video, then you need to check out my new book, Slow Productivity, The Lost Art of Accomplishment Without Burnout. This is like the Bible for most of the ideas we talk about here in these videos.
You can get a free excerpt at calnewport.com/slow. I know you're going to like it. Check it out. Now let's get back to the video. Okay. That's true. But where things get interesting, as people like to tell me when I talk to them, is when you begin to combine this really, really sophisticated word generator with control layers, something that sits outside of and works with the language model. That's really where everything interesting happens.
Okay. This is what I want to better understand. It's by better understanding the control logic that we place outside of the language models that we get a better understanding of the possible capabilities of artificial intelligence, because it's the combined system, language model plus control logic, that becomes more interesting. Because what can control logic do?
It can do two things. First, it chooses what to activate the model with, what input to give it. Second, it can actuate in the real world or the digital world based on what the model says. So it's the control logic that can put input into the model and then take the output of the model and actuate on it, like take action.
Do something on the internet, move a physical thing. So it's that control logic, with its activation and actuation capability, that, when combined with a language model, which again is just a word generator, is when these systems begin to get interesting. So something I've been doing recently is thinking about the evolution of control logic that can be appended to generative AI systems like large language models.
And I want to go through like what we know right now. I'm going to draw this on the screen. For people who are watching instead of just listening, you can watch me draw this on the screen and see my beautiful handwriting. All right. So there's different layers to this.
I'll actually draw this out. So we'll start with down here. I'm going to call this layer zero. Oh man, Jesse, my handwriting is only getting worse. People are like, "Oh my God, Cal's having a stroke." No, I just have really bad handwriting. All right. So layer zero control logic is actually what we got right away with the basic chatbots like ChatGPT.
So I'm going to label this, for example, ChatGPT. So layer zero control logic basically just implements what's known as autoregression. A large language model spits out a single word or part of a word, but when you type a query into ChatGPT, you don't want just a one-word answer.
You want a whole response. So there's a basic, what I'm calling layer zero, control logic that takes your prompt, submits it to the underlying large language model, and gets the answer of the language model, which is a single word or part of a word that expands the input in a reasonable way.
It appends it to the input. So now the input is your original prompt plus the first word of the answer. It then feeds this slightly longer input into a fresh copy of the model, which generates the next word of the answer. The control logic adds that and submits the slightly longer input to the model again.
And it sort of keeps doing this until it judges this to be a complete answer. And then it returns that answer to you, the user typing into the ChatGPT interface, right? That's called autoregression. That's how we just repeatedly keep using the same language model to get very long answers, right?
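Here's a rough sketch of that layer zero autoregression loop. The generate_one_token function is the hypothetical single-token call from the earlier sketch, and the stopping condition is simplified:

```python
# A minimal sketch of layer zero control logic: autoregression.
# "model" is the same inert token generator every time; the loop is the control logic.

def answer(model, prompt, max_tokens=500):
    text = prompt
    for _ in range(max_tokens):
        token = generate_one_token(model, text)  # the model emits one word or word piece
        if token == "<end>":                     # model signals the answer is complete
            break
        text += token                            # append it and feed the longer input back in
    return text[len(prompt):]                    # return just the generated answer
```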
So this is control logic. The model by itself can just spit out one thing. We add some logic, and now we can spit out big answers. Another thing that we got in early versions and contemporary versions of chatbots is that layer zero control logic might also append previous parts of your conversation to the prompt, right?
So you know how when you're using ChatGPT or you're using Claude or something like this, or Perplexity, you can sort of ask a follow-up question, right? So there's a little bit of control logic here where what it's really doing is it's not just submitting your follow-up question by itself to the language model.
Remember, the language models have no memory. It's the exact same snapshot of this model, trained whenever it was trained, that's used for every word generated. What the control logic will do is take your follow-up question, but then also take all of the conversation before that and paste that whole thing into the input, right?
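And the follow-up-question trick is even simpler. A sketch, with the formatting details made up for illustration:

```python
# Layer zero control logic for chat: paste the whole conversation back in every time.

def chat_turn(model, history, follow_up):
    # history is a list of (speaker, text) pairs from earlier in the conversation
    transcript = "\n".join(f"{speaker}: {text}" for speaker, text in history)
    prompt = transcript + "\nUser: " + follow_up + "\nAssistant:"
    reply = answer(model, prompt)                # same autoregression loop as above
    history.append(("User", follow_up))
    history.append(("Assistant", reply))
    return reply
```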
So this is simple logic, but it makes the token generators useful. All right. So we already have some control logic in even the most basic generative AI tools. All right. Now let's go up to what I'm going to call layer one. All right. So with layer one, now we get two things we didn't have in layer zero.
We're still taking input from a user, like you're typing some sort of prompt, but now what you typed in might go through a substantial transformation by the control logic before it's passed on to the actual language model.
The other key part of layer one is that there's actuation. So it might also take some actions on behalf of you or the language model based on the output of the language model. Instead of just sending text back to the user, it might actually go and take some other action. All right.
So an example of this would be the web-enabled chatbots like Google's Gemini, right? So with Google's Gemini, you can ask it a question where it's going to do a contemporary web search, like stuff that's on the internet now, not just what it was trained on when they created the original model. It can actually look at stuff on the web and then give you an answer based on stuff it actually found contemporaneously on the web.
This is layer one control logic. What's really happening here is that when you ask something like Gemini or Perplexity a question requiring a current web search, the control logic, before the language model is ever involved, actually just goes and does a Google-style search and finds the relevant articles.
It then takes the text of these articles and puts it together into a really long prompt, which it then submits to the language model. I'm simplifying this, but this is basically what's going on. So the language model doesn't necessarily know about the specific articles in advance. It wasn't trained on them, but it gets a really long prompt, written by the control logic, that might say something like, please look at the following text that's pasted in this prompt and summarize from it an answer to the following question, which is then your original question.
And then below it is, you know, 5,000 words of web results, right? So the prompt that's actually being submitted under the covers to the language model here is not what you typed in. It's a much bigger, substantially transformed prompt, right?
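Here's a sketch of that layer one transformation. The search call and the prompt wording are made up for illustration, but the shape is the point:

```python
# Layer one control logic, sketched: transform the user's question into a much
# bigger prompt by pasting in search results, then hand that to the language model.

def web_answer(model, question, search_engine):
    results = search_engine.search(question, num_results=5)   # hypothetical search call
    pasted = "\n\n".join(r.text for r in results)
    prompt = (
        "Please look at the following text and use it to answer this question.\n"
        f"Question: {question}\n\nText:\n{pasted}\n\nAnswer:"
    )
    return answer(model, prompt)   # the model never "browsed" anything; the control logic did
```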
We also see actuation. Consider OpenAI's original plugins, these things you can turn on in GPT-4 that can do things like generate a picture for you, book airline flights, or show you the schedules of airlines. You can talk to it about these things. In the new Microsoft Copilot integrations, you can have the model take action on your behalf in tools like Microsoft Excel or Microsoft Word. So there's actual action happening in the software world based on the model. This is also being done by the control logic, right?
So you're saying, help me find a flight to, you know, this place at this time. The control logic, before we ever get to a language model, might make some queries of a flight booking service. Or what it might do is actually create a prompt to give to the language model that says, hey, please take this flight request and summarize it in the following format for me: flight date, destination, and so on.
The language model then returns to the control logic a better, more consistently formatted version of the query you originally had. Now the control logic, which can understand this really well, can format a request, talk over the internet to a flight booking service, and get the results. Then it can pass those results to the language model and say, okay, take these flight results and please write a summary of these in polite English.
And then it returns that to you. And so what you see as the user is like, okay, I asked about flights and then I got back like a really nice response, like here's your various options for flights. And then maybe you say, hey, can you book this flight for me?
The control logic takes that and says, hey, can you take this request from the user and, again, put it into this really precise format, you know, flight number and so on. And the language model does that. And now the control logic can take that and talk over the internet to the flight booking service and make the booking on your behalf.
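And here's a sketch of that actuation pattern. The booking service API and the JSON field names are hypothetical; the point is the back-and-forth of asking the model to reformat and then letting the control logic act:

```python
# Layer one actuation, sketched: the model reformats, the control logic acts.

import json

def book_flight(model, user_request, booking_service):
    # 1. Ask the model to turn the messy natural-language request into a precise format.
    extraction_prompt = (
        "Rewrite this flight request as JSON with keys origin, destination, date:\n"
        + user_request
    )
    details = json.loads(answer(model, extraction_prompt))

    # 2. The control logic, not the model, talks to the booking service.
    options = booking_service.search(details["origin"], details["destination"], details["date"])

    # 3. Ask the model to describe the results back to the user in polite English.
    summary_prompt = "Summarize these flight options politely:\n" + json.dumps(options)
    return answer(model, summary_prompt)
```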
So this is the sort of actuation that happens at our current level of plugins. Same thing if you're asking Microsoft Copilot to do something, build a table in Microsoft Word or something like this: it's taking your request, it's asking the language model to essentially reformat your request into something much more systematic and canonical, and then the control logic talks to Microsoft Word.
These language models are just giant tables of numbers in a data warehouse somewhere being run on GPUs. They don't talk to Microsoft Word on your computer; the control logic does. So that's layer one control logic. Now we have substantial transformation of your prompts and some actuation on the responses.
Okay. All right. So now we move up and things begin to get more interesting. Layer two is where the action is right now. I've been writing some about this for the New Yorker among other places. So in layer two, we now have the control logic able to keep state and make complex planning decisions.
So it's going to be highly interactive with the language model, perhaps making many, many queries to the language model en route to trying to execute whatever the original request is. So this is where things start to get interesting. A less well-known but illustrative example of this is that Meta put out this bot called Cicero, which I've talked about on the show before.
Cicero can play Diplomacy, the strategy war game, very well. The way Cicero works is we can actually think about it as a large language model combined with layer two control logic. So Diplomacy is a board game, but it has lots of interpersonal negotiation with the other players.
The way this Diplomacy-playing AI system works is that the control logic uses the language model to take the conversations happening with the other players and explain to the control program, in a very consistent, systematic way, what's being proposed by the various players, in a form the control program understands without having to be a natural language processor itself.
Then the control program simulates lots of possible moves: what if we did this, right? And what it's really doing here is simulating possibilities. If this player is lying to us, but these two are honest, and we do this, what would happen? What if that player isn't lying but is being honest, which move would be best?
What if they're all being honest? It kind of figures out all these possibilities for what's really happening to figure out what play gives it its best chance of being successful. And then it tells the language model, okay, here's what we want to do now. Please like talk to this player.
Give me a message to this player that would be convincing enough to get them to do the action we want them to do. And the language model actually generates the text that the control logic then sends. So in Cicero, we have much more complicated control logic, where now we're simulating moves, we're simulating the minds of other people.
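A toy sketch of that kind of move simulation. None of this is Meta's actual code, just the shape of hand-written control logic scoring moves under different hypotheses:

```python
# A toy sketch of layer two control logic in the style of Cicero: plain, hand-coded
# simulation over hypotheses about the other players. Not Meta's actual code.

from itertools import product

def best_move(candidate_moves, players, payoff):
    best, best_score = None, float("-inf")
    for move in candidate_moves:
        score = 0.0
        # Consider every combination of "this player is lying / this player is honest".
        for honesty in product([True, False], repeat=len(players)):
            score += payoff(move, dict(zip(players, honesty)))
        if score > best_score:
            best, best_score = move, score
    return best   # the language model is only used later, to phrase the messages
```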
The logic might have multiple queries of the language model to actually implement a turn. We also see this in Devin. So Devin AI is one of these agent-based systems built to do complicated computer programming tasks. The way it works is you give a more complicated computer programming task to Devin, and it has control logic that's going to continually talk to a language model to generate code, but it can actually keep track of the fact that there are multiple steps to this task that we're trying to do.
We're now on step one. We need code that does this. All right, let me get some code from the language model that we think does this. Let me test this code. Does it actually do this? Okay, great. Now we're on to step two of this task. Okay, we need code that integrates this into this system.
Let me ask the language model for that code. So again, the control logic is keeping track of a complex plan and using the language model for the actual production of the specific code that solves a specific request.
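Here's a sketch of that layer two pattern. Devin's internals aren't public, so this is just the general shape of stateful, hand-coded control logic calling a language model step by step:

```python
# Layer two control logic, sketched: the control program keeps the plan and the state;
# the language model only produces candidate code for each step. Not Devin's real internals.

def run_coding_task(model, steps, run_tests, max_attempts=3):
    completed = []
    for step in steps:                        # the plan lives in the control logic, not the model
        for _ in range(max_attempts):
            code = answer(model, f"Write Python code that does the following:\n{step}")
            if run_tests(step, code):         # the control logic checks the model's output
                completed.append((step, code))
                break
        else:
            raise RuntimeError(f"Could not complete step: {step}")
    return completed
```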
A language model by itself can't keep track of a long-term plan like this, and it can't simulate novel futures, because again, it's just a token generator. The control logic can. So that's layer two. This is where a lot of the energy is in AI right now, these sorts of complex control layers. The layer that doesn't exist yet, but that we speculate about, I call layer three.
And this is where we get closer to something like a general intelligence. So I'll put AGI here, and I'm going to put a question mark, because it's unclear exactly how close we can get to this. But now, hypothetically, we'd have a very complicated control logic that keeps track of intention and state and an understanding of the world.
It might be interacting with many different generative models and recognizers. So it has a language model to help understand the world of language and produce texts, but it might have other types of models as well. If this was a fully actuated, like robotic artificial general intelligence, you would have something like visual recognizers that really can do a good job of saying, here's what we're seeing in the world around us.
It might have some sort of social intention recognizer that's just trained to take recent conversations and try to understand what people's intents are. And then you have all of this being orchestrated by some master control logic that's trying to keep a sort of stateful existence and interaction in the world for some sort of simulated agent.
So that's how you get to something like artificial general intelligence. So, here's the critical observation. In all of these layers, the control logic is not self-trained. The control logic, unlike a language model, is not something where we just turn on the switch and it looks at a lot of data and trains itself and then we have to say, how does this thing work?
I don't know. At least in the layers that exist so far, layers zero through two, the control logics are hand-coded by humans. We know exactly what they do, right? Here's something interesting about Cicero. In the game Diplomacy, one of the big strategies that's common is lying, right?
You make an alliance with another player, but you're backstabbing them and you have a secret alliance with another player. That is very common. The developers of Cicero were uncomfortable with having their computer program lie to real people. So they said, okay, even though other players do that, our player, Cicero, will not lie.
That's really easy to do, because the control logic that simulates moves is not some emergent thing they don't understand. They coded it themselves. It's a simulator that simulates moves. They just don't consider moves with lies. So we have this reality about the control plus generative AI combination.
We have this reality that, at least so far, the control is just hand-coded by people to do what we want it to do. The intelligence of the language model in these cases, no matter how sophisticated its checklists and rules get at producing tokens through very, very sophisticated digital contemplation, cannot control the control logic.
It can't break through and control the control logic. It can just generate tokens. The control logic we build. We don't want it to lie? Then it doesn't lie. We don't want it to produce smarter versions of itself? Then we just don't have that coded into the control logic.
It's actually relatively straightforward. We have this with plugins. With the plugins, there's a lot of control over these things: okay, we have gotten a request, we've asked the LLM for a formatted request to book a flight. Let's just check it, because we're not going to spend more than this much money, and we're not going to fly to places that aren't on this list of places we think are appropriate to fly to, or whatever it is.
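And those checks are just ordinary code. A sketch of what a spending guard in this kind of control logic might look like, with made-up limits:

```python
# Guard rails in the control logic are just hand-written checks like these.

MAX_PRICE = 1500
ALLOWED_DESTINATIONS = {"BOS", "ORD", "DCA"}   # illustrative list

def approve_booking(option):
    # The language model can suggest whatever it likes; the control logic decides.
    if option["price"] > MAX_PRICE:
        return False
    if option["destination"] not in ALLOWED_DESTINATIONS:
        return False
    return True
```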
The control logic is just programmed right there. So I think we've extrapolated the emergent, hard to interpret reality of generative models to these full systems. But the control logic in these systems right now is not at all difficult to understand because we're creating them. All right. There's a couple of caveats here.
One, this doesn't mean that we have nothing to be practically concerned about. The biggest practical concern, especially about layer two or below artificial intelligence systems of this architecture, is exceptions, right? Our control logic didn't think to worry about a particular possibility. We didn't put in the right checks, and something that is practically damaging happens.
What do I mean by that? Well, for example, we're doing flight booking and our control logic doesn't have a check that says make sure the flight doesn't cost more than X and don't book it if it costs more than that. We forgot to put that check in and the LLM gives us a first class flight on Emirates that costs $20,000 or something.
It's like, whoops, we spent a lot of money, right? Or we have a Devin-type setup where it's giving us a program to run, and we don't have a check that says make sure it doesn't use more than a certain amount of computational resources, and that program actually is a giant, resource-consuming infinite loop, and it uses a hundred thousand dollars of Amazon cloud time before anyone realizes what's going on, right?
So that's certainly a problem. Your control logic doesn't check for the right things, and you can have excessive behaviors, sure, but that's a very different thing than the system itself somehow being smarter than we expected or taking intentional actions that we don't expect. Those exceptions are what we need to be worried about.
Two, in theory, when we get to these really complicated control layers, one could imagine hand-coding control logic that we completely understand that is working with LLMs to produce computer code for a better control logic. And maybe then you could get the sort of runaway superintelligence scenario of Nick Bostrom.
But here's the thing. A, we're nowhere close to knowing how to do that, how to write a control program that can talk to a coding machine like an LLM and get a better version of the control program. There's a lot of CS to be done there that, quite frankly, no one's really working on.
And B, there's no reason to do that. That won't accidentally happen. You would have to build a system to do that and then start executing the new program. And so maybe we just don't build those types of systems.
I call this whole way of thinking about things, and I'll use a different color text here, IAI, right, lowercase i, capital AI, for intentional artificial intelligence. The idea being that there can be tons of intention in the control logic, even if we can't interpret very well the generative models, like the language models, that these control logics use.
And we should really lean into the control we have in the control logics; that is how we keep predictability in what these systems actually do. There might actually be a legislative implication here, one way or the other: making sure that we do not develop a legal doctrine that says AI systems are unpredictable.
As in, it's not your fault as the developer of an AI system for what it does once actuated. Instead, we say it is your fault; you're liable. That would put a lot of emphasis on these control layers. We really want to be careful here. And exactly what we put in these control layers matters, especially once there's actuation. This is on us.
And so we got to be really careful. The language model can be as smart as we want, but we're gonna be very careful on the actions that our control logic is willing to take on its behalf. Anyways, this is super nerdy. But I do think it is interesting. And I do think it is important that we separate the emergent, hard to predict, uninterpretable intelligence of self trained generative models, we separate that from the control logics that use them.
The control logics aren't that complicated. We are building them. This is where the actuation happens. This is where the activation happens. If we go back to our analogy of the giant machine, the Babbage style machine of meshing gears and dials, that when you turn it, great sophistication happens inside the machine.
And at the very end, a word comes out on these dials on the other end of this massive, city-block-sized machine. We're not afraid of a machine like that in that analogy. We do worry about what the people who are running the machine do with it. So that's where we should keep our focus: on the people who are actually running the machine. What they do should be constrained.
Don't let them spend money without constraint. Don't let them fire missiles without constraint. Don't let the control logic have full access to all computational resources. Don't let the control logic be able to install an improved version automatically of its own control logic. We code the control logic, we can tell it what to do and what not to do.
And let's just make it clear: whatever people do with this big system, you are liable. The whole system you build, you're liable for it. So you'll be very careful about who you let in, in this metaphor, to actually turn those cranks and take action on the other end.
So that's IAI, that's intentional AI. This is early thinking, just putting it out there for comment, but hopefully it defuses a little bit of the sort of incipient idea that GPT-6 or 7 is going to become HAL. That's not the way this actually works. What do you think, Jesse, is that sufficiently nerdy?
That was solid for our return to highly technical topics. What do you think the comments will be for those that think the other way, that don't necessarily agree with you? It's interesting, you know, when I first pointed out in my article last year that the language model is just a feed-forward network.
It has no state, it has no recursion, it has no interactivity. All it can do is generate a token. So this is not a mind in any sort of self-aware way. What a lot of people came back to me with is, yeah, yeah, but, and back then they were talking about Auto-GPT, which was one of these very early layer two control logics.
Yeah, but people are writing programs that sort of keep track of things outside of the language model, and they talk back to the language model, and that's where the sophistication is going to come out. So in some sense, I'm reacting to that. Look at this. By the way, I'm looking at our screen here.
Let's correct this. Look what I did. That should be a three. I wrote layer two twice. Sorry, for those who are listening, I realized that for all the precision of my speech, I wrote two instead of three. So, you know, I think that defuses that. I think some of the more philosophical thinkers, who tackle these issues of superintelligence from an abstract, logical perspective, I think their main response will be, yeah, but all it takes is one person to write layer three control logic that says write a better control logic program and then install it, replace myself with that program, and that's what could allow a sort of runaway scenario.
But I think that's a very hard problem. We don't know how to write a control program like that. If we think of the language model like a coder, we can tell it to write code that does something very constrained: write this function, write that function. It's a very hard problem to work with a language model to produce a whole different type of control program, right?
It's a hard problem, and there's no reason to write that program. And again, it's just a very hard problem. We don't even know if it's possible to write a significantly smarter control program, or whether the control program is limited by the intelligence of what the language model can produce.
We don't have any great reason to believe that a language model trained on a bunch of existing code, where what it does is predict code that matches the type of things it has seen, can produce code that is somehow better than any code a human has ever produced. We don't know that a language model can do that.
What it does is it's been trained to try to expand text based on the structures it's seen in the text it was trained on. So we don't know that, even with the right control program. So I think that whole thing is more messy than people think, and we're nowhere near there.
No one's working on it. What I care about mainly is layer zero through two, and layer zero through two, we're in control here. Nothing gets out of control. I think it's very hypothetical to think about like a control layer that's trying to write a better control layer. It's just unclear what you can even do.
Eventually the control layer's value is capped by what the language model can do, and the language model can only do so much. And, you know, there are a lot of interesting debates at layer three, but they're also very speculative right now. They're not things we're going to stumble into in the next six months or so.
And you went to the OpenAI headquarters like a year ago, right? Yeah, I've been there. Yeah, in the mission district. Did you guys talk about any of this stuff? No, they're not worried about this stuff. They're worried about just the practicality of how do you actually have a product that a hundred million people use around the world.
That's just like a very complicated software problem. And just figuring out all the different things they have to worry about, like there's copyright law in this country that like affects this in a way, and it's just, you know, it's just a practical problem. Like OpenAI, it's not based on my visit, but based on just listening to interviews with Sam Altman recently, they care more right now I think about, for example, getting smaller models that can fit on a phone and can be much more responsive.
I think they see a future in which their models can be a very effective voice interface to software. That's a really effective future. It's very practical what the companies are thinking about. It's more the philosophers and the p(doom) crowd in San Francisco who are thinking about mad-scientist scenarios like recursive self-improvement.
But anyways, it's just important: the control is not emergent; the control we code. And that's why I think the core tenet of IAI is, if you produce a piece of software, you're responsible for its actuation. And that's what's going to keep you very careful about your control layers, what you allow them to do or not do, no matter how smart the language model is that they're talking to.
And again, I keep coming back to the language model is inert. The control logic can autoregressively keep calling it to get tokens out of it, but it is inert. The language model is not an intelligence that can sort of take over. It's just the giant collection of gears and dials that if you turn long enough, a word comes out the other side.
I like your IAI. IAI. IAI. Easy to say, right? Yeah. IAI. It's like zoktok.com. Hopefully zoktok.com gets in some IAI. Oh man, I keep things difficult. All right. We got some good questions. A lot of them are very techie, so we'll kind of keep this nerd thing going. But first I want to briefly talk about one of our sponsors.
I thought it was appropriate after a discussion of AI to talk about one of our very first sponsors who has integrated language model-based AI in a very interesting way into its product. And that is our friends at Grammarly, right? Grammarly, quite simply, is an AI writing partner that helps you not only get your work done faster, but communicate more clearly.
96% of Grammarly users report that Grammarly helps them craft more impactful writing. It works across over 500,000 apps and websites, right? So when you subscribe and use Grammarly, it's there for wherever you're already doing your writing, in your word processor, in your email client. Grammarly is there to help you make that writing better.
The ways it can do this now continue to expand. So it's not just, hey, you said this wrong, or here's a more grammatically correct way to say it. It can now do sophisticated things, for example, like tone detection. Hey, what's the tone of this? Can you rewrite what I just wrote to sound more professional, to sound more friendly, to sound less friendly, right?
It can help you get the tone just right. It can help you now with suggestions. Can you give me some ideas for this? Can you take this outline and write like a draft of a summary of these points, right? So it can generate, not just correct or rewrite, but generate in ways that as you get more used to it, helps you.
And again, the key thing with Grammarly is that it integrates into all these other tools. You're not just over at some separate website typing into a chat interface; Grammarly is where you're already doing your writing. It's the gold standard of responsible AI in the sense that they have, for 15 years, had best-in-class communication tools trusted by tens of millions of professionals and IT departments.
It is a secure AI writing partner that's going to help you or your team make your point better, write more clearly, and get your work done faster. So get AI writing support that works where you work. Sign up and download for free at Grammarly.com/podcast. That's G-R-A-M-M-A-R-L-Y dot com slash podcast. Easier said, done. All right?
Now, wherever it is you happen to be doing this work, you want to feel comfortable while you do it. So I want to talk about our friends at Rhone, and in particular their commuter collection, the most comfortable, breathable, and flexible set of products known to man. All right, here's the thing.
The commuter collection has clothing for any situation, from the world's most comfortable pants to dress shirts to quarter-zips and polos, so you never have to worry about what to wear when you have the commuter collection. Its four-way stretch fabric makes it very flexible, it's lightweight, and it's very breathable.
It even has GoldFusion anti-odor technology and wrinkle-release technology, so you can travel with it; as you wear it, the wrinkles go away. It's very useful if you have an active day or you're on a trip. You can just throw this on and be speaking at conferences or teaching classes, or on, let's say, a four-month book tour, which I know well.
And having the commuter collection means you're going to look good, but it's going to be breathable, you're not going to overheat, it's not going to look wrinkled, and it's going to whisk away the odor. It's going to look fantastic if you are active. Just great clothes. So the commuter collection can get you through any workday and straight into whatever comes next.
Head to rhone.com/cal and use that promo code CAL to save 20% off your entire order. That's 20% off your entire order when you head to rhone.com/cal and use that code CAL when you check out. It's time to find your corner office comfort. All right, Jesse, let's do some questions.
Hi, first question is from Bernie. You often give advice on methods to consume news. With the advent of ChatGPT and other tools, should I be worried about the spread of disinformation on a grand scale? If so, how should I manage this? Yeah, this is a common concern. So when people are trying to say what we should be worried about with these large language models that are good at generating text, one of the big concerns is that you could use them to generate misinformation, right?
Generate text that's false, but that people might believe, and of course it could then also be used for disinformation, where you're doing that for a particular purpose: I want to influence the way people think about something. I have two takes on this. I think in the general sense, I'm not as worried, and let me explain why.
What do you need for, let's just call them high-impact negative information events? Well, you need a combination of two things. You need a tool that is really good at engendering viral spread of information that hits just the right combination of stickiness.
And you need a pool of this sort of available negative information that's potentially viral. So you have this big pool, and then a selection algorithm on that pool that can find the thing that clicks and let it really spread. What allows us to be in our current age of widespread mis- or disinformation is that there's a lot of information out there.
And because, in particular, of social media curation algorithms, which are engagement-focused, this tool exists that's basically surveying this pool of potentially viral information and can take this negative information and spread it everywhere, right? That's what makes our current moment different than, say, 25 years ago, when the viral spread of information was hard.
So there could be a lot of people with thoughts that are either mal-intended or just wrong without them realizing it, you know, hey, I think the earth is flat, but it was hard to spread them, right? But when we added the viral-spread potential of recommendation algorithms in the social media world, we got this current moment where mis- or disinformation has the potential of spreading really wide.
All right. So what does generative AI change in this equation? It makes the pool of available bad information bigger. It is easier to generate information about whatever you want. For most topics we care about, that doesn't matter, right? Because what matters is only whether AI can create content in this pool that is stickier than the stickiest stuff that's already there.
There's only so many things that can spread and have a big impact. Right. And it's going to be the stickiest, the perfectly calibrated things that get identified by these recommendation algorithms. If large language models are just generating a lot of mediocre, bad information, that doesn't really change the equation much.
Probably the stickiest stuff, the stuff that's going to spread best into the small number of slots each of our attentions has to be impacted, is going to be very carefully crafted by people who really have a sense of, this is going to work. And we already have enough of that.
And most of our slots for ideas that can impact us are filled. The exception to this would be very niche topics, where that pool of potential bad information is empty because the topic is so niche; there's just no information about it. That's the case where language models could come into play, because if that pool is empty, because it's a very specific topic, like this election in this particular county, it's not something that people are writing a lot about.
Now someone can come in who otherwise maybe before, because they didn't have like the language skills, wouldn't be able to produce any text that could get virally spread here, could use a language model to produce it. The stickiest things spread, but if the pool is empty, almost anything reasonable you produce has the capability of being sticky.
So that's the impact I see most immediately of mis- and disinformation and large language models: hyper-targeted mis- or disinformation. When it comes to big things like a national election, or the way we're thinking about a pandemic, or conspiracies about major figures, or something like this, there's already a bunch of information, and adding more mediocre bad information is not going to change the equation.
But in these narrow instances, that's where we have to be more wary about it. Unfortunately, like the right solution here is probably the same solution that we've been promoting for the last 15 years, which is increased internet literacy. Just we keep having to update what by default we trust or don't trust.
We have to keep updating that sort of sophisticated understanding of information. But again, it's not changing significantly what's possible. It's just simplifying the act of producing the sort of bad information of which a lot already exists. All right, what do we got next?
- Next question is from Alyssa. Is the lack of good measurement and evaluation for AI systems a major problem? Many AI companies use vague phrases like improved capabilities to describe how their models differ from one version to the next. As most tech companies don't publish detailed release notes, how do I know what changes?
- Yeah, I mean, it's a problem right now in this current age, where what is happening is an arms race for these mega Oracle models. This is not, however, the long-term business model of these AI companies. So the mega Oracle models, think of this as the ChatGPT model.
Think about this as the Claude model, where you have an Oracle that you talk to through a chatbot about anything, and you ask it to do anything and it can do whatever you ask. And so we build these huge models: GPT-3 went to GPT-3.5, which went to GPT-4, which went to GPT-4o or whatever they're calling it now.
And you are absolutely right, Alyssa, it's not always really clear what's different, what can this do that the other model can't. Often that's discovered, like, I don't know, we trained this on 5X more parameters. Now let's just go mess around with it and see what it does better than the last one.
So the release notes are sort of emergently created in a distributed fashion over time. But it's not the future of these companies, because it's not very profitable to have these massive models. Right now the biggest models are like a trillion parameters, and Sam Altman's talking about a potential 10-trillion-parameter model.
This is something that's going to cost on the order of multiple hundreds of millions of dollars to train. These models are not profitable. They're computationally very expensive to train. They're computationally very expensive to run, right? Using a trillion- or 10-trillion-parameter model to, you know, do a summary of a page that you got on a Google search is like having a Bugatti supercar to drop your kids off at school five blocks away. It's way over-provisioned and it's costing a lot of money.
It's a lot of computational resources, it's expensive. What they want, of course, is smaller customized models to do specific things. We're seeing this move. GitHub Copilot's a great example. Computer programmers have an interface to a language model built right into their integrated development environments. So they can just right there where they're coding, ask for a code to be finished or another function to be added or ask it, what is the library that does this?
And it will come back like, this is the library you mean, and here's the description. It's integrated right there. Microsoft Copilot, which again is confusingly named in an overlapping way, is trying to do something similar with Microsoft Office tools. You kind of have this universal chat interface to ask for actuated help with their Microsoft Office tools.
Can you create a table for this? Can you reformat this? And it's going to work back and forth using layer one control with those products. So it's gonna be more of this. Again, OpenAI has this dream of having a better, like a voice interface to lots of different things.
Apple Intelligence, which they've just added to their products, is, you know, using ChatGPT as a backend to more directly deal with specific things you're doing on your phone. Like, can you take a recording of this phone conversation I just had, get a transcript of it, summarize that transcript, and email it to me?
So this is where these tools are going to get more interesting when they're doing specific, what I call actuated behavior. So they're actually like taking action on your behalf, you know, in typically the digital world. Now release notes will be more relevant. What can this now do? Okay, it can summarize phone calls, it can produce computer code, it can help me do formatting queries on my Microsoft Word documents.
So I think as these models get more specialized and actuated and integrated into specific things we're already doing in our digital lives, the capabilities will be much more clearly enumerated. This current era, where we all go to chat.openai.com and ask what this thing can do now, is really the equivalent of a car company having a Formula One racer.
They're not planning to sell Formula One racers to a lot of people. But if they have a really good Formula One race car, people think about them as being a really good car company and so then they buy the car that's actually meant for their daily life. And so I think that's what these big models are right now.
The bespoke models, their capabilities I think will be more clearly enumerated. And that's where we're going to begin to see more disruptions. I mean, notice we're at the year-and-a-half mark of the ChatGPT breakthrough, and there hasn't been a lot of major disruption. The chat interface to a large language model, it's really cool what it can do, but right away they were talking about imminent disruptions to major industries.
And we're still playing this game of like, well, I heard about this company over here who their neighbor's cousin replaced six of their customer service representatives. Like we're sort of still in that sort of passing along like a small number of examples. Because I don't think these models are in the final form in which they're going to have their key disruption.
They haven't found their, if we can use a biological metaphor, the viral vector that's actually able to propagate really effectively. So stay tuned. But that's the future of these models. And I think their capabilities will be much more clearly enumerated when we're actually using them much more integrated into our daily workflow.
I didn't know there were two Copilots. Yeah. So Microsoft is calling their Microsoft Office integration Copilot as well. So it's very confusing. It is confusing. Yeah. All right. Next question is from Frank. Is the development of AI the biggest thing that happened in technology since the internet? Maybe. And we'll see.
We'll see. I mean, what are the disruptions of the last 40 years? Personal computing, number one, because that's what actually made computing capable of being integrated into our daily lives. Next was the internet, democratized information and information flows, made that basically free. That's a really big deal. After that came mobile computing slash the rise of a mobile computing assisted digital attention economy.
So this idea that the computing was portable and that the main use, the main economic engine of these portable computing devices would be monetizing attention, hugely disruptive on just like the day-to-day pattern of what our life is like. AI is the next big one. The other big one that's lurking, of course, I think is augmented reality and the rise of virtual screens over actual physical screens that you hold in real life.
That's going to be less disruptive for our everyday life because that's going to be simulating something we're doing now in a way that's better for the companies. But the whole goal will be just to kind of take what we're doing now and make it virtual. But that's going to be hugely economically disruptive because so much of the hardware technology market is based on building very sleek individual physical devices.
So I think that and AI are vying to be the next biggest disruption. How big will it be compared to those prior disruptions? There's a huge spectrum here, right? On one end of the spectrum, there are places where it becomes a part of our daily life where it wasn't there before.
Like basically, maybe like email, right? Email really changed the patterns of work, but didn't really change what work was. On the other end of the spectrum, it could be much more comprehensive, maybe something like personal computing, which just sort of changed how the economy operated. You know, pre-computers, after computers fundamentally just changed the way that we interact with like the world and the world of information.
It could be anywhere on the spectrum. Of course, there are the off-spectrum options as well, like, no, no, it comes alive and it's so smart that it either takes over the world or it just takes over all work and we all just live on UBI. I tend to call those off-spectrum because of what I talked about in the deep dive.
Like we just, I don't see us having the control logic to do that yet. So I think really the spectrum is like personal computer on one end, email on the other. I don't really know where it's going to fall, but I do go back to saying the current form factor, I think we have to admit this, the current form factor of generative AI talking to a chat interface through a web or phone app has been largely a failure to cause the disruption that people predicted.
It has not changed most people's lives. There are heavy users of it who like it, but it really has a novelty feel. They'll get into detail about these really specific ways they use it, like I'm getting ideas for my articles, I'm having these interactions with it, but it really does have that sort of early internet novelty feel, where you had the Mosaic browser and you're like, this is really cool, but most people aren't using it yet.
It's going to have to be another form factor before we see its full disruptive potential. And I think we do have to admit most things have not been changed. We're very impressed by it, but we're not impressed by its footprint on our daily life yet. So I guess this is like a dot, dot, dot, stay tuned.
Unless your students just use it to pass in papers, right? Maybe. So look, I have a New Yorker article I'm writing on that topic that's still in editing. So stay tuned for that. But the picture of what's happening with students and paper writing and AI, that's also more complicated than people think.
What's going on there might not be what you really think, but I'll hold that discussion until my next New Yorker piece on this comes out. All right. Next question is from Dipta. How do I balance a 30 day declutter with my overall technology use? I'm a freelance remote worker that uses Slack, online search, stuff like that.
All right. So Dipta, when talking about the 30-day declutter, is referencing an idea from my book, Digital Minimalism, where I suggest spending 30 days not using optional personal technologies, getting reacquainted with what you care about and other activities that are valuable, and then, at the end, only adding back things for which you have a really clear case for their value.
But Dipta is mentioning work stuff here, right? She's a freelance worker who uses Slack, online search, et cetera. My book Digital Minimalism, which has the declutter, is a book about technology in your personal life. It's not about technology at work; for that, there's Deep Work, A World Without Email, and Slow Productivity.
Those books really tackle the impact of technology on the workplace and what to do about it. So digital knowledge work is one of the main topics that I'm known for. It's why I'm often cast, I think, somewhat incorrectly as a productivity expert. I'm much more of a like, how do we actually do work and not drown and hate our jobs in a world of digital technology?
And it looks like productivity advice, but it's really like survival advice. How do we do work in an age of email and Slack without going insane? Digital minimalism is not about that. That was my book where I said, hey, I acknowledged there's this other thing going on, which is like, we're looking at our phones all the time in work, outside of work, unrelated to our work.
We're on social media all the time. We're watching videos all the time. Why are we doing this? What do we do about it? So the digital declutter is about what to do with the technology in your personal life. When it comes to the communication technologies in your work life, read A World Without Email, read Slow Productivity, read Deep Work.
That's sort of a separate issue. So I'll just use that as a roadmap for people who are struggling with the promises and peril of technology. Use my minimalism book for like the phone, the stuff you're doing on your phone that's unrelated to your work. My other books will be more useful for what's happening in your professional life.
That often gets mixed up, Jesse, actually. I think in part because the symptoms are similar. I look at social media on my phone all the time more than I want to. I look at email on my computer at work all the time more than I want to. These feel like similar problems and the symptoms are similar.
I am distracted in some sort of abstract way from things that are more important, but the causes and responses are different. But you're looking at your phone too much and social media too much because these massive, massive attention economy conglomerates are producing apps to try to generate exactly that response from you to monetize your attention.
You're looking at your email so much, not because someone makes money if you look at your email more often, but because we have evolved this hyperactive hive mind style of on-demand digital aided collaboration in the workplace, which is very convenient in the moment, but just fries our brain in the long term.
We have to keep checking our email because we have 15 ongoing back and forth timely conversations and the only way to keep those balls flying in the air is to make sure I see each response in time to respond in time so that things can keep unfolding in a timely fashion.
It's a completely different cause and therefore the responses are different. So if you want to not be so caught up in the attention economy in your phone and in your personal life, well, the responses there have a lot to do with like personal autonomy, figuring out what's valuable, making decisions about what you use and don't use.
In the workplace, it's all about replacing this collaboration style with other collaboration styles that are less communication dependent. So it's similar symptoms, but very different causes and responses. Little-known fact, Jesse: I sold Digital Minimalism and A World Without Email together.
It was a two-book deal: I was going to write one and then the other. It went to auction, and we talked to a bunch of editors about it. One of the editors made a point that was interesting, but I think it gets to this issue.
He's like, this is the same thing. We're just looking at stuff too much in our digital lives. This should be one book. These two things should be combined. And I was really clear: no, they shouldn't, because that actually confuses the matter. They already seem so similar, but they're so different.
Yeah. A World Without Email and Slow Productivity are such different books from Digital Minimalism. The causes are so different and the responses are so different that they can't be one book. It's like two fully separate issues. The only commonality is they involve screens and they involve looking at the screens too much.
Yeah. And so I was like, I think you're wrong about that. And we kept those books separate. Other little-known fact about that: it was originally supposed to be the other order. A World Without Email was supposed to be the direct follow-up to Deep Work, but the issues in Digital Minimalism became so pressing so quickly that I said, no, no, no.
I've got to write that book first. And so the reason A World Without Email did not directly follow Deep Work is that in 2017 and '18, these issues surrounding our phones and social media and mobile really took off. When you were writing Deep Work, did you know you were going to write A World Without Email, or did it just kind of happen?
No, I just wrote Deep Work. Yeah. And then after I wrote Deep Work, I was thinking about what to write next. And the very next idea I had was A World Without Email. And it was basically a response to the question of, well, why is it so hard to do deep work?
Yeah. Right. In the book Deep Work, I don't get too much into it. I was like, we know it's technology. We know we're distracted all the time. I'm not going to get into why we're in this place. I just want to emphasize that focus is diminishing, but it's important, and here's how you can train it.
And then I got more into it after that book was written. Why did we get here? And it was a pretty complicated question, right? Like why did we get to this place where, uh, we check email 150 times a day? Yeah. It's a long book. Who thought this was a good idea?
Right. So it was its own sort of epic investigation. Yeah. Yeah. I really liked that book. Yeah, it didn't sell the same as Digital Minimalism or Deep Work, because it's less of a just let me give you this image of a lifestyle that you can shift to right now.
It's much more critical. It's much more, how did we end up in this place? Is this really a problem? It's much more of like social professional commentary. I mean, it has solutions, but they're more systemic. There's no easy thing you can do as an individual. I think intellectually it's a very important book and it's had influence that way, but it's hard to make a book like that be like a million copy seller.
Atomic Habits. It's not Atomic Habits. Atomic Habits is easier to read than A World Without Email. I will say that with confidence. Let's see what we got here. We've got another question. Ooh, it's a Slow Productivity Corner question. It is. Do we play the music before we ask the question or do we play the music after?
I forgot. Usually we play it twice. All right. Before and after. Let's get the before. All right. What do we got? Hi, this question is from Hanzo. I work at a large tech company as a software engineer and I'm starting to feel really overwhelmed by the number of projects getting thrown at us.
How do I convince my team that we should say no to more projects when everyone has their own agenda, for example, pushing for their next promotion? Well, okay. So this is a great question for the corner, because the whole point of the Slow Productivity Corner segment is that we ask a question that's relevant to my book, Slow Productivity, which, as we announced at the beginning of the show, is the number one business book of 2024 so far, as chosen by the Amazon editors.
This is appropriate because I have an answer that comes straight from the book. So in chapter three of Slow Productivity, where I talk about the principle of doing fewer things, I have a case study that I think is very relevant to what your team should consider, Hanzo.
So this case study comes from the technology group at the Broad Institute, the joint Harvard and MIT Broad Institute in Cambridge, Massachusetts. This is like a large sort of interdisciplinary genomics research institute that has all these sequencing machines. But I give a profile of a team that worked at this institute.
These were not biologists. It's not the IT team, but it's a team whose job is to build tech tools that other scientists and people in the institute need. So you come to this team and you're like, hey, could you build us a tool to do this?
It's a bunch of programmers, and they'll say, let's do this, let's build that. They had a very similar problem to what you're describing, Hanzo. All these ideas would come up. Some of them would be their own. Some of them would be suggested by other stakeholders, you know, other scientists or teams in the institute.
And they'd be like, okay, let's work on this. You do this. I'll do this. Well, can you do this as well? And people were getting overloaded with all these projects, and things were getting gummed up, right? I mean, the classic idea from this chapter of the book is that if you're working on too many things at the same time, nothing makes progress.
You put too many logs down the river, you get a logjam. None of them make it to the mill. So they were having this problem. So what they did is they went to a relatively simple pull-based, Agile-inspired workload management system, where whenever an idea came up, here's a project we should do,
they put it on an index card and they put it on the wall, and they had a whole section of the wall for things we should work on, or at least consider working on. Then they had a column on the wall for each of the programmers. The things that each programmer was working on were put under their name.
So now you have like a really clear workload management thing happening. If you had four or five cards under your name, they're like, this is crazy. We don't want you doing four or five things. That's impossible. You're going to log jam. You should just do one or two things at a time.
And when you're done, we can decide as a team, okay, there's now space here for us to pull something new onto this person's column. And as a team, you could look at this big collection on the wall of stuff that you've identified or has been proposed to you and say, which of these things is most important.
Equally important here as well is during this process of selecting what you're going to work on next, because everyone is here, it's a good time to say, well, what do I need to get this done? And you can talk to the people right there. I'm going to need this from you.
I'm going to need that from you. When are we going to do this? You sort of build your contract for execution. So one of the things they did here is, okay, so first of all, this prevented overload. Each individual person can only have a couple of things in their column.
So you didn't have people working on too many things at once. So you got rid of the logjam problem. But number two, and this reminds me of your question, Hanzo, they noted that this also made it easier for them to, over time, weed out projects that they might've at some point been excited about but were no longer excited about.
And the way they did it was they would say, this thing has been sitting over here in this pile of things we could work on. This has been sitting over there for months, and we're consistently not pulling it onto someone's plate. Let's take it off the wall. And so this allowed them to get past that trap of momentary enthusiasm.
Like, this sounds awesome. We got to do this. We have those enthusiasms all the time, but here, that would just put something on the wall. And if it didn't get pulled over after a month or so, they would take it off the wall. So they had a way of sort of filtering through which projects they should actually work on.
Anyways, this prevented overload. This is almost always the answer here. We need transparent workload management. We can't just push things on people's plates in an obfuscated way and just sort of try to get as much done as possible. We need to know what needs to be done. Things need to exist separate from individuals' obligations.
And then we need to be very clear about how many things each individual should work on at the same time. So, Hanzo, you need some version of this sort of vaguely Kanban, Agile-style, pull-based workload management system. It could be very simple, like I talk about. Read the case study in chapter three of Slow Productivity to get the details.
That will point you towards a paper from the Harvard Business Review that does an even more detailed case study on this team. Read that in detail. Send that around to your team or send my chapter around to your team. Advocate for that. And I think your team's going to work much better.
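To make the pull-based idea concrete for anyone reading this transcript, here is a minimal sketch in Python of how a board like that might be modeled. Everything in it, the class, the method names, the work-in-progress limit of two, is an illustrative assumption on my part, not a detail from the book or from the Broad team, who used index cards on a wall rather than software.

```python
# Minimal sketch of a pull-based workload board: proposed projects live in a
# shared backlog, and a person pulls new work only when they have capacity.

from dataclasses import dataclass, field


@dataclass
class Board:
    wip_limit: int = 2                              # max active projects per person
    backlog: list = field(default_factory=list)     # the "wall" of proposed projects
    columns: dict = field(default_factory=dict)     # person -> active projects

    def propose(self, project: str):
        """Any new idea goes on the wall, not directly onto someone's plate."""
        self.backlog.append(project)

    def pull(self, person: str, project: str):
        """A person pulls new work from the wall only when they have free capacity."""
        active = self.columns.setdefault(person, [])
        if len(active) >= self.wip_limit:
            raise ValueError(f"{person} already has {self.wip_limit} active projects")
        self.backlog.remove(project)
        active.append(project)

    def finish(self, person: str, project: str):
        """Completing a project frees up a slot for the next pull."""
        self.columns[person].remove(project)

    def prune_stale(self, still_wanted: set):
        """Periodically drop backlog items nobody has pulled or still wants."""
        self.backlog = [p for p in self.backlog if p in still_wanted]


board = Board()
board.propose("sequencing dashboard")
board.propose("sample tracking tool")
board.pull("Ana", "sequencing dashboard")   # Ana has capacity, so she pulls it
```

The work-in-progress limit is the whole trick: it is the software equivalent of only allowing one or two cards under each person's name on the wall.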
All right. Let's get that music. All right. Do we have a call this week? We do. Let's hear it. Hey, Cal. Jason from Texas. Long-time listener and reader. First-time caller. For the last couple of episodes, you've been talking about applying the distributed trust model to social media. There's a lot that I like about that, but I'd like to hear you evaluate that thought in light of Fogg's behavioral model, which says that for an action to take place, motivation, prompt, and ability have to converge.
I don't see a problem with ability, but I'm wondering about the other two. So if someone wants to follow, say, five creators, they're going to need significant motivation to be checking those sources when they're not curated in one place. Secondly, what is going to prompt them to go look at those five sources?
I think if those two things can be solved, this has a real chance. One last unrelated note, somebody was asking about reading news articles. I use Send to Kindle: I send them to my Kindle and read them later. Works for me. Thanks. Have a great day. All right. So it's a good question.
So I think what's key here is separating discovery from consumption. So the consumption problem is once I've discovered, let's say, a creator that I'm interested in, you know, how do I then consume that person's information in a way that's not going to be insurmountably high friction, right? So if there's a bunch of different people I've discovered one way or the other, put aside how I do that, how do I consume their information?
That's the consumption problem, and that's fine. We've had solutions to that before. I mean, this is what RSS readers were. If I discovered a syndicated blog that I enjoyed, I would subscribe to it. Then that person's content is added to this sort of common list of content in my RSS reader.
This is what, for example, we currently do with podcasts. Podcast players are RSS readers. The RSS feeds now are describing podcast episodes and not blog posts, but it's the exact same technology, right? So when you have a podcast, you host your MP3 files on whatever server you want to.
This is what I love about podcasts. It's not a centralized model like Facebook or Instagram, where everything is stored on the servers of a single company that makes sense of all of it and helps you discover it. No, our podcast is on a Buzzsprout server somewhere, right?
It's just a company that does nothing but host podcasts. We could have our podcast, like in the old days of podcasting, on a Mac Studio in our HQ. It doesn't matter, right? But what you do is you have an RSS feed, and every time you put out a new episode, you update that feed to say, here's the new episode.
Here's the location of the MP3 file. Here's the title of the episode. Here's the description of the episode. All a podcast app is, is an RSS reader. You subscribe to a feed. It checks these feeds. When it sees there's a new episode of a show because that RSS feed was updated, it can put that information in your app. It can go and retrieve the MP3 file from whatever server you happen to be serving it on, and then it can play it on your local device.
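For anyone reading along who wants to see just how thin that reader layer is, here is a minimal sketch assuming the third-party Python feedparser package; the feed URL below is a placeholder, not a real show.

```python
# Minimal sketch of what a podcast app does under the hood: it is just an RSS
# reader. Requires the third-party "feedparser" package.

import feedparser

FEED_URL = "https://example.com/podcast/feed.xml"  # hypothetical feed URL

feed = feedparser.parse(FEED_URL)
print(feed.feed.title)                    # the show's name, taken from the feed

for entry in feed.entries[:5]:            # the five most recent episodes
    print(entry.title)                    # episode title from the feed
    for enclosure in entry.enclosures:    # the MP3 lives on whatever server hosts it
        print("  audio:", enclosure.href)
```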
It can go and retrieve the MP3 file from whatever server you happen to be serving it on, and then it can play it on your local device. So we still use something like RSS. So consumption's fine. We get very nice interfaces for where do I pull together and read in a very nice way or listen in a very nice way or watch in a very nice way.
Because by the way, I think video RSS is going to be a big thing that's coming. You can make really nice readers. Now we go over to the discovery problem. Okay, well, how do I find the things to subscribe to in the first place? This is where distributed trust comes into play.
It's the way we used to do this pre-major social media platforms. How did I discover a new blog to read? Well, typically it would be through these distributed webs of trust. I know this person. I've been reading their stuff. I like their stuff. They link to this other person.
I trust them, so I followed that link. I liked what I saw over there, and so now I'm going to subscribe to that person. Or three or four people that I trust are in my existing web of trust have mentioned this other person over here. That now builds up this human to human curation, this human to human capital of this is a person who is worthy of attention.
So now I will go and discover them, and I like what I see, and then I subscribe, and the consumption happens in like a reader. So we've got to break apart discovery and consumption. It's the moving discovery away from algorithms and back towards distributed webs of trust. That's where things are interesting.
That's where things get interesting. That's where we get rid of this feedback cycle: production, recommendation algorithm, feedback to producers about how popular something was, which changes how they produce things, back into the recommendation algorithm, more feedback. That cycle is what creates this sort of hyper-palatable, lowest-common-denominator, amygdala-plucking, highly distracting content.
You get rid of the recommendation algorithm piece of that, that goes away. It also solves problems about disinformation and misinformation. I mean, I argued this early in the COVID pandemic. I wrote this op-ed for Wired, where I said like the biggest thing we could do for both the physical and mental health of the country right now would be to shut down Twitter.
I said what we should do instead is go back to an older Web 2.0 model, where information was posted on websites, like blogs and articles posted on websites, and yeah, it's going to be higher friction to sort of discover which of these sites you trust. But this distributed web of trust is going to make it much easier for people to curate the quality of information, right?
Like this blog here is being hosted by a center of a major university. I have all of this capital in me trusting that more than trusting johnnybananas.com/covidconspiracies. I don't trust that as much, right? Or I'm going to have to follow old-fashioned webs of trust to find my way to sort of like a new commentator on something like this.
And this is not really an argument of, yeah, but you're just going to fall back to unquestioning authority. Webs of trust work very well for independent voices. They work very well. They're very useful for critiques of major voices. It is slower for people to find independent or critical voices. But if you find them through a web of trust, they're much more powerful, and it filters out the crank stuff, which is really bad for independent and critical voices because they can get lumped in with it.
It becomes, this person here critiquing this policy, that's the same as this other person over here who says it's the lizard people. Webs of trust, I think, are a very effective way to navigate information in a low-friction information environment like the internet. So I think distributed webs of trust, I really love that model.
It's what we're doing with podcasts. It's also what we're doing with newsletters. So this is not like a model that is retroactive or reactionary. It's not regressive. It's not let's go back to some simpler technological age to try to get some... We're doing it right now in some sectors of online content and it's working great.
Podcasts are distributed trust. Algorithms don't show us what podcasts to listen to. They don't spread virally, where we're just shown it and it catches our attention. We have to hear about it. We probably have to hear about it multiple times from people we trust before we go over and we sample it, right?
That's distributed webs of trust. Email newsletters are the same thing. It's a vibrant online content community right now. How do people discover new email newsletters? People they know forward them individual email newsletters like, "You might like this." And they read it and they say, "I do and I trust you and so now I'm going to consider subscribing to this," right?
That's webs of trust. It's not an algorithm, as much as Substack is trying to get into the game of algorithmic recommendation and be like the Netflix of text. Right now that model works. So anyways, that's where I think we go. I like to think of the giant monopoly platform social media age as this aberration, this weird divergence from the ultimate trajectory of the internet as a source of good.
And the right way to move forward on that trajectory is to continually move away from the age of recommendation algorithms in the user-generated content space and return more to distributed webs of trust. Recommendation algorithms themselves, these are useful, but I think they're more useful when we put them in an environment where we don't have the user-generated content and feedback bit of that loop.
They're very useful on, like, Netflix. "Hey, you might like this show if you like that other show." That's fine. They're very useful on Amazon to say, "This book is something you might like if you like that book." That's fine. I'm happy for you to have recommendation algorithms in those contexts.
But if you hook them up with user-generated content and then feedback to the users about popularity, that's what in a Marshall McLuhan way sort of evolves the content itself in the ways that are, I think, undesirable and as we see have really negative externalities. So anyways, we've gone from geeking out on AI to geeking out on my other major topic, which is distributed webs of trust.
But I think that is the way to discover information. Hopefully that's the future of the internet as well. And I love your idea, by the way, of the Send to Kindle app. Cool app. You send articles to your Kindle, and then you can go take that Kindle somewhere outside under a tree to read news articles.
No ads, no links, no rabbit holes, no social media. It's a beautiful application. Send to Kindle. I highly recommend it. All right. I think we have a case study. This is where people send in a description of using some of my ideas out there in the real world. Have we been asking people to send these to you, Jesse?
Yeah. Yeah. Jesse@CalNewport.com. Yeah. So if you have a case study of putting any of these ideas into action, send those to Jesse@CalNewport.com. If you want to submit questions or calls, just go to thedeeplife.com/listen. Yeah. And there's also a section in there if they go to that website where they can put in a case study.
Yeah. Okay. And we have links there for submitting questions. We have a link there where you can record a call straight from your phone or browser. It's real easy. All right. Our next case study comes from Salim who says, "I work at a large healthcare IT software company in our technical solutions division.
Our work is client-based, so we'll always work with the same analyst teams as our assigned clients. While I enjoy the core work, which is problem-solving based, I was struggling with a large client load and specifically with one organization that did not align well with my communication style and work values.
This was a constant problem in my quarterly feedback, and I was struggling with convincing the staffing team to make a reassignment. Around this time, our division had recently rolled out a work plan site for employees to plan out their weekly hours in advance. The issue here was that it was communicated as a requirement.
So most of us saw this as micromanagement from upper management. The site itself is also unstructured, so we didn't see the utility in doing this, since we already log our time retroactively anyways. At this point, I had already read Deep Work and was using the time block planner, but was lacking a system for planning at a weekly timescale.
This is where I started leveraging our work plan site and structured it in terms of what I was working on during any given week. This included itemizing my recurring calls, office hours with clients, and a general estimate of how much time I would spend on client work per client.
I incorporated sections for a top priority list and a pull list backlog so I could quickly go in and reprioritize as new ideas came in or as I had some free time. I also added a section to track my completed tasks so that I could visually get a sense of my progress as the week went by.
After I made this weekly planning a habit, my team lead highlighted my approach at a monthly team meeting, and we presented on how I leveraged the tool into something useful for managing my work. I spoke to how this helped me organize my week-to-week so that I can take a proactive approach and slow down versus being at the mercy of a hive mind mentality, constantly reacting to incoming emails and team messages," and he goes on to mention some good stuff that happened after that.
All right, it's a great case study, Salim. What I like about it is that it emphasizes there are alternatives to what I call the list-reactive method. The list-reactive method says you kind of just take each day as it comes, reacting to the stuff that's coming in over the transom through email and Slack, while trying to make progress on some sort of large to-do list as well.
Like, okay, what should I work on next? I'll react to things and try to make some progress on my to-do list. It is not a very effective way to make use of your time and resources. You get caught up in things that are lower value. You lose the ability to give things the focused work required to get them done well and fast.
You fall behind on high priorities and get stuck on low priorities, so you have to be more proactive about controlling your time. Control, control, control is a big theme in how I talk about thriving in digital-age knowledge work. So I love this idea that the weekly plan discipline I talk about could be a big part of that answer.
Look at your week as a whole and say, what do I want to do with this week? Where are my calls? Where's my client office hours? When am I working on this client? Why don't I consolidate all this time into this time over here surrounding this call we're already going to have?
Why don't I cancel these two things because they're really making the rest of the week not work? When you plan your week in advance, it really helps you have a better week than if you just stay at the scale of what am I doing today or even worse, the scale of just what am I doing next?
So multi-scale planning is critical for this control, control, control rhythm that I preach. That's the only way really to survive in digital-age knowledge work. So what a cool example of weekly planning helping you feel like you actually had some autonomy once again over your schedule. All right. So we've got a cool final segment.
I want to react to an article in the news, but first let's hear from another sponsor. Look, the older I get and trust me, my birthday's in a few days. So I'm thinking about this. The more I find myself wanting to be more intentional about the way I live.
And we talk about this all the time, my month, my birthday life planning, but I also want to make sure I'm being intentional about how I eat and take care of my body. This is why I really like our sponsor Mosh, M-O-S-H, Mosh bars. I love these things. When I have these around, they're really a go-to snack that I like to turn to.
I love my Mosh bars. Let me tell you a little bit more about Mosh, which, interestingly, was started by Maria Shriver and her son, Patrick Schwarzenegger, with the mission of creating a conversation about brain health, because Maria's father suffered from Alzheimer's. So they care a lot about this.
They developed Mosh bars by joining forces with the world's top scientists and functional nutritionists to give you a protein bar that goes beyond what you normally get in a protein bar product. It has 10 delicious flavors, including three that are plant-based. It is made with ingredients that support brain health, like ashwagandha, lion's mane, collagen, and omega-3s.
Mosh bars come with a new look and a new formulation featuring a game-changing brain-boosting ingredient you won't find in any other bar. It is the first and only food brand boosted with Cognizin, a premium nootropic that supplies the brain with a patented form of citicoline. Here's the thing about Mosh bars: they taste good.
I like them because they are soft with a little bit of crunch inside of them. So you really, really crave eating these things. A lot of protein sort of gives you what you need, and it has these brain-boosting ingredients as well, which really comes from Maria and her son Patrick's real concern about brain health.
That concern is also why Mosh donates a portion of all proceeds to fund gender-based brain health research through the Women's Alzheimer's Movement. Why gender-based? Because Maria and Patrick noted that two-thirds of all Alzheimer's patients are women. So Mosh is working closely to close the gap between women's and men's health research, especially on this particular topic.
All right, so great tasting protein bars that have all this really cool stuff in them, built by cool people pushing a good cause. So if you want to find ways to give back to others and fuel your body and your brain, Mosh bars are the perfect choice for you.
Head to moshlife.com/deep to save 20% off plus free shipping on either the Best Sellers Trial Pack or the new Plant-Based Trial Pack. That's 20% off plus free shipping on either the Best Sellers or Plant-Based Trial Pack at moshlife.com/deep. Thank you, Mosh, for sponsoring this episode. I also want to talk about our friends at Shopify.
Whether you're selling a little or a lot, Shopify helps you do your things however you cha-ching. If you sell things, you need to know about Shopify. It's the global commerce platform that helps you sell at every stage of your business. From the launch your online shop stage to the first real life store stage, all the way to the "did we just hit a million order" stage, Shopify is there to help you grow.
They have an all-in-one e-commerce platform, which makes checking out online an absolute breeze with super high conversion. They also have in-person POS systems right there in the store, which people use to actually run their credit cards and do their transactions. However you're selling, Shopify is really best in class at what they're doing.
They've even added, matching the theme of today's program, an AI feature called Shopify Magic that helps you sell even more to your customers, be more successful at conversions. It's a no-brainer. You're selling something, you do need to check out Shopify. The good news is I can help you do that with a good deal.
You can sign up for a $1 per month trial period at shopify.com/deep. You got to type that all lowercase, but if you go to shopify.com/deep now, you can grow your business no matter what stage you're in. That's shopify.com/deep. All right, Jesse, let's do our final segment. All right, this article was sent to me a lot, and I guess it's because I'm mentioned in it or because it feels like it's really important.
I brought it up here on the screen for people who are watching instead of just listening. The article that most people sent me on this issue came from Axios. Emily Peck wrote it. The title of the Axios article is "Why Employers Wind Up With Mouse-Jiggling Workers." All right, so they're talking about mouse jigglers, which I had to look up.
But it is software you can run on your computer that basically moves your mouse pointer around. So it simulates you actually being there, jiggling your physical mouse. Well, it turns out a bunch of workers using mouse jigglers got fired at Wells Fargo. They discovered that they were using the mouse jigglers, and they fired workers from their wealth and investment management unit.
So we're kind of looking into this. There are a couple of reasons why the mouse jiggling is useful for remote workers. One of them is the fact that common instant messaging tools like Slack and Microsoft Teams put this little status circle next to your name. So if I'm looking at you in Slack or Teams, there's a status circle that says whether you're active or not.
The idea being like, "Hey, if you're not active, then I won't text you. I won't send a message. And if you are, like, if I know you're there working on your computer, I will." Well, if your computer goes to sleep, your circle turns to inactive. So the mouse jiggler keeps your circle active.
So if your boss is just like, "Hey, what's going on with Cal over here?" They just sort of see, "Oh, he must be working very hard because his circle is always green, so he's there on his computer." When in reality, you could be away from your computer, but the mouse jiggler is making it seem active.
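As a rough illustration of why that trick works, here is a toy sketch of the kind of idle-detection logic a chat client might use to set that circle. This is my own simplified illustration of the general idea, not Slack's or Teams' actual implementation.

```python
# Toy sketch: presence is inferred from time since the last input event, which
# is why simulated mouse movement keeps the status dot green. Illustrative
# only; not how Slack or Teams actually implement presence.

import time

IDLE_THRESHOLD_SECONDS = 10 * 60   # e.g., flip to "away" after 10 idle minutes

last_input_time = time.monotonic()

def on_input_event():
    """Called whenever the OS reports keyboard or mouse activity."""
    global last_input_time
    last_input_time = time.monotonic()

def presence_status() -> str:
    """Return 'active' if there was input recently, otherwise 'away'."""
    idle = time.monotonic() - last_input_time
    return "active" if idle < IDLE_THRESHOLD_SECONDS else "away"

# A mouse jiggler simply generates input events on a timer, so
# presence_status() never gets a chance to flip to "away".
```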
All right. So there's been a kind of a lot of outrage about the mouse jigglers and about this type of surveillance. So what do I feel about it? Well, I'm cited in this Axios article, so we can see what they think I feel about it. Let's see here. All right.
Here is how my take is described by Axios, and I'll see if I agree with this. Remote surveillance is just the latest version of a boss looking out at the office floor to check that there are butts in seats. These kinds of crude measures are part of a culture of pseudoproductivity that kicked off in the 1950s with the advent of office work, as Cal Newport writes in his latest book, with a link to Slow Productivity.
With technology-enabled 24-hour connection to the workplace, pseudoproductivity evolved in ways that wound up driving worker burnout, like replying to emails at all hours or chiming in on every Slack message. And with the rise of remote work, this push for employees to look busy and for managers to understand who's actually working got even worse, Newport told me in a recent interview.
It just spiraled completely out of control. Well, you know what? I agree with this Cal Newport character. This is the way I see this, and I think this is the right way to see this. There's a smaller argument, which I think is too narrow, which is the argument of bosses are using remote surveillance, we should tell bosses to stop using remote surveillance.
I think this is the narrower thing here. Digital tools are giving us ways to do this privacy-violating surveillance, and we should push back on that. Fair enough. It's not the bigger issue. The bigger issue is what's mentioned here, this bigger trend. This is what I outline in chapter one of my book, Slow Productivity.
It's what explicitly puts this book in the tradition of my technology writings, why this book is really a technology book, even though it's talking about knowledge work. And here's the argument. For 70 years, knowledge work has depended on what I call pseudo-productivity, this heuristic that says visible activity will be our proxy for useful effort.
We do this not because our bosses are mustache twirlers or because they're trying to exploit us, but because we didn't have a better way of measuring productivity in this new world of cognitive work. There's no widgets I can point to. There's no pile of Model Ts lined up in the parking lot that I can count.
So what we do is like, well, to see you in the office is better than not. So come to the office, do factory shifts, be here for eight hours, don't spend too much time at the coffee machine. So we had this sort of crude heuristic because we didn't know how else to manage knowledge workers.
And as pointed out in this article, that way of crudely managing productivity didn't play nicely with the front office IT revolution. And this mouse jiggler is just the latest example of this reality. When we added 24-hour remote internet-based connectivity through mobile computing that's with us at all times to the workplace, pseudo-productivity became a problem.
When pseudo-productivity meant, okay, I guess I have to come to an office for eight hours like I'm putting steering wheels on a Model T, that's kind of dumb, but I'll do it. And that's what pseudo-productivity meant. And also like, if I'm reading a magazine at my desk, keep it below where my boss can see it.
Fair enough. But once we got laptops and then we got smartphones and we got the mobile computing revolution, now pseudo-productivity meant every email I reply to is a demonstration of effort. Every Slack message I reply to is a demonstration of effort. I could be doing more effort at any point.
In the evening, I could be doing it. At my kid's soccer game, I could be showing more effort. This was impossible in 1973, completely possible in 2024. This is what leads us to things like, I'm going to have a piece of software that artificially shakes my mouse because that circle being green next to my name in Slack longer is showing more pseudo-productivity.
So the inanity of pseudo-productivity becomes pronounced and almost absurdist in its implications once we get to the digital age. That's why I wrote Slow Productivity now. That's why we need slow productivity now, because we have to replace pseudo-productivity with something that's more results-oriented and that plays nicer with the digital revolution.
So this is just like one of many, many symptoms of the diseased state of modern knowledge work that's caused by us relying on this super vague and crude heuristic of just like doing stuff is better than not doing stuff. We have to get more specific. Slow productivity gives you a whole philosophical and tactical roadmap to something more specific.
It's based on results. It's not based on activity. It's based on production over time, not on busyness in the moment. It's based on sequential focus and not on concurrent overload. It's based on quality and not activity, right? So it's an alternative to the pseudo-productivity that's causing problems like this mouse jiggler problem.
So that's the bigger problem. New technologies require us to finally do the work of really updating what we think about knowledge work. That's why I wrote that most recent book about it. It's also why I hate that status light in Slack or Microsoft Teams. Of course that's going to be a problem.
Of course that's going to be a problem. And even the underlying mentality of that status light, which is like, if you're at your computer, it's fine for someone to send you a message. Why is that fine? What if I'm at my computer? What if I'm doing something cognitively demanding?
It's a huge issue for me to have to turn over to your message. So it also underlines the degree to which the specific tools we use completely disregard the psychological realities of how people actually do cognitive effort. So we have such a mess in knowledge work right now. It's why, whatever, three of my books are about digital knowledge work.
It's why we talk about digital knowledge work so much on this technology show is because digital age knowledge work is a complete mess. The good news is that gives us a lot of low hanging fruit to pick. That's going to cause advantages, delicious advantages. So there's a lot of good work to do.
There's a lot of easy changes we could make, but anyways, I'm glad people sent me this article. I'm glad I'm appropriately quoted here. This is accurate. This is the way I think about it. And this is the big issue. Not narrow surveillance, but broad pseudo productivity plus technology is an unsustainable combination.
All right, well, I think that's all the time we have for today. Thank you everyone who sent in their questions, case studies and calls. Be back next week with another episode, though it will probably be an episode filmed from an undisclosed location. I'm doing my sort of annual retreat into the mountains for the summer.
No worries. The show will still come out on its regular basis, but just like last year, we'll be recording some of these episodes with Jesse and I in different locations and I'll be in my undisclosed mountain location. I think next week might be the first week that is the case, but the shows will be otherwise normal and I'll give you a report from what it's like from wherever I end up.
I'll tell you about my sort of deep endeavors, whatever deep undisclosed location I find, but otherwise we'll be back and I'll see you next week. And until then, as always, stay deep. Hey, if you liked today's discussion about defusing AI panic, you might also like episode 244, where I gave some of my more contemporaneous thoughts on ChatGPT right around the time that it first launched.
Check it out. That is the deep question I want to address today. How does ChatGPT work, and how worried should we be about it?