The following is a conversation with Jim Keller, his second time on the podcast. Jim is a legendary microprocessor architect and is widely seen as one of the greatest engineering minds of the computing age. In a peculiar twist of space-time in our simulation, Jim is also a brother-in-law of Jordan Peterson.
We talk about this and about computing, artificial intelligence, consciousness, and life. Quick mention of our sponsors: Athletic Greens All-in-One Nutrition Drink, Brooklinen Sheets, ExpressVPN, and Belcampo Grass-Fed Meat. Click the sponsor links to get a discount and to support this podcast. As a side note, let me say that Jim is someone who on a personal level inspired me to be myself.
There was something in his words on and off the mic, or perhaps that he even paid attention to me at all, that almost told me, "You're all right, kid." A kind of pat on the back that can make the difference between a mind that flourishes and a mind that is broken down by the cynicism of the world. So I guess that's just my brief few words of thank you to Jim, and in general, gratitude for the people who have given me a chance on this podcast, in my work, and in life.
If you enjoy this thing, subscribe on YouTube, review it on Apple Podcasts, follow on Spotify, support it on Patreon, or connect with me on Twitter @lexfridman. And now, here's my conversation with Jim Keller. What's the value and effectiveness of theory versus engineering, this dichotomy, in building good software or hardware systems?
- Well, good design is both. I guess that's pretty obvious. By engineering, do you mean, you know, reduction to practice of known methods? And then science is the pursuit of discovering things that people don't understand, or solving unknown problems. - Definitions are interesting here, but I was thinking more in theory, constructing models that kind of generalize about how things work.
And engineering is actually building stuff, the pragmatic, like, okay, we have these nice models, but how do we actually get things to work? Maybe economics is a nice example. Like, economists have all these models of how the economy works, and how different policies will have an effect, but then there's the actual, okay, let's call it engineering, of like, actually deploying the policies.
- So, computer design is almost all engineering, and reduction to practice of known methods. Now, because of the complexity of the computers we built, you know, you could think, well, we'll just go write some code, and then we'll verify it, and then we'll put it together, and then you find out that the combination of all that stuff is complicated.
And then you have to be inventive to figure out how to do it, right? So, that definitely happens a lot. And then, every so often, some big idea happens, but it might be one person. - And that idea is in what, in the space of engineering, or is it in the space of-- - Well, I'll give you an example.
So, one of the limits of computer performance is branch prediction. And there's a whole bunch of ideas about how well you could predict a branch. And people said, there's a limit to it, it's an asymptotic curve, and somebody came up with a better way to do branch prediction.
It was a lot better. And he published a paper on it, and every computer in the world now uses it. And it was one idea. So, the engineers who build branch prediction hardware were happy to drop the one kind of training array and put it in another one. So, it was a real idea.
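To make the idea of branch prediction concrete, here is a minimal Python sketch of a classic textbook scheme, the two-bit saturating counter. It is not the improved predictor Jim is referring to (the conversation does not name it); the function name, table size, and the sample loop pattern are all assumptions chosen for illustration.

```python
def simulate_two_bit_predictor(outcomes, table_size=16, pc=0):
    """Return prediction accuracy for a stream of outcomes of one branch.

    outcomes: iterable of booleans, True meaning the branch was taken.
    table_size and pc are hypothetical parameters: pc selects which
    counter in the table this branch uses.
    """
    # Counter values 0-1 predict "not taken", 2-3 predict "taken".
    counters = [1] * table_size  # start in the weakly not-taken state
    index = pc % table_size
    correct = 0
    total = 0
    for taken in outcomes:
        predicted_taken = counters[index] >= 2
        if predicted_taken == taken:
            correct += 1
        total += 1
        # Train the counter toward the actual outcome, saturating at 0 and 3.
        if taken:
            counters[index] = min(3, counters[index] + 1)
        else:
            counters[index] = max(0, counters[index] - 1)
    return correct / total


# A loop branch: taken nine times, then falls through once, repeated 100 times.
loop_branch = ([True] * 9 + [False]) * 100
accuracy = simulate_two_bit_predictor(loop_branch)
print(f"accuracy on the loop branch: {accuracy:.3f}")
```

On this pattern the counter mispredicts roughly once per loop exit, which is the kind of accuracy ceiling that better predictors were designed to push past.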
- And branch prediction is one of the key problems underlying all of sort of the lowest level of software, it boils down to branch prediction. - Boils down to uncertainty. Computers are limited by, single-thread computers are limited by two things. The predictability of the path of the branches and the predictability of the locality of data.
So, we have predictors that now predict both of those pretty well. So, memory is a couple hundred cycles away, local cache is a couple cycles away. When you're executing fast, virtually all the data has to be in the local cache. So, a simple program says, add one to every element in an array, it's really easy to see what the stream of data will be.
But you might have a more complicated program that says, get an element of this array, look at something, make a decision, go get another element, it's kind of random. And you can think, that's really unpredictable. And then you make this big predictor that looks at this kind of pattern and you realize, well, if you get this data and this data, then you probably want that one.
And if you get this one and this one and this one, you probably want that one. - And is that theory or is that engineering? Like the paper that was written, was it an asymptotic kind of discussion or is it more like, here's a hack that works well? - It's a little bit of both.
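The contrast between the easy array walk and the "kind of random" access pattern can be sketched with a toy stride prefetcher in Python. This is a deliberately simple stand-in, not the pattern-based predictor Jim describes; the function name, the 64-byte line size, and both address lists are assumptions made up for the example.

```python
def stride_prefetch_hits(addresses):
    """Count accesses that a simple stride predictor would have fetched early.

    The predictor only issues a prediction after seeing the same stride
    (difference between consecutive addresses) twice in a row.
    """
    hits = 0
    last_addr = None
    last_stride = None
    predicted = None
    for addr in addresses:
        if predicted is not None and addr == predicted:
            hits += 1
        if last_addr is not None:
            stride = addr - last_addr
            # Predict the next address only when the stride repeats.
            predicted = addr + stride if stride == last_stride else None
            last_stride = stride
        last_addr = addr
    return hits


# "Add one to every element": walk memory one 64-byte line at a time.
sequential = list(range(0, 64 * 100, 64))
# Data-dependent accesses: hypothetical addresses with no repeating stride.
scattered = [0, 4096, 640, 12800, 192, 8000, 70000, 320]

print(stride_prefetch_hits(sequential), "of", len(sequential))
print(stride_prefetch_hits(scattered), "of", len(scattered))
```

The sequential walk is predicted almost entirely after a short warm-up, while the scattered list defeats this simple scheme, which is why real hardware needs the larger pattern-matching predictors mentioned above.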
Like there's information theory, I think, somewhere. - Okay, so it's actually trying to prove-- - Yeah, but once you know the method, implementing it is an engineering problem. Now, there's a flip side of this, which is, in a big design team, what percentage of people think their plan or their life's work is engineering versus inventing things?
So lots of companies will reward you for filing patents. Many big companies get stuck because to get promoted, you have to come up with something new. And then what happens is everybody's trying to do some random new thing, 99% of which doesn't matter, and the basics get neglected.
And, or they get to, there's a dichotomy. They think like the cell library and the basic CAD tools, or basic software validation methods, that's simple stuff. They wanna work on the exciting stuff. And then they spend lots of time trying to figure out how to patent something, and that's mostly useless.
- But the breakthroughs are on the simple stuff. - No, no, you have to do the simple stuff really well. If you're building a building out of bricks, you want great bricks. So you go to two places that sell bricks, one guy says, "Yeah, they're over there in an ugly pile." And the other guy lovingly tells you about the 50 kinds of bricks, and how hard they are, and how beautiful they are, and how square they are. And, you know, which one are you gonna buy bricks from?
Which is gonna make a better house? - So you're talking about the craftsman, the person who understands bricks, who loves bricks, who loves the variety? - That's a good word. You know, good engineering is great craftsmanship. And when you start thinking engineering is about invention, and you set up a system that rewards invention, the craftsmanship gets neglected.
- Okay, so maybe one perspective is the theory, the science overemphasizes invention, and engineering emphasizes craftsmanship, and therefore, so if you, it doesn't matter what you do, theory, engineering-- - Well, everybody does. Like, read the tech rags. They're always talking about some breakthrough, or innovation, and everybody thinks that's the most important thing.
But the number of innovative ideas is actually relatively low. We need them, right? And innovation creates a whole new opportunity. Like when some guy invented the internet, right? Like, that was a big thing. The million people that wrote software against that were mostly doing engineering software writing. So the elaboration of that idea was huge.
- I don't know if you know Brendan Eich, he wrote JavaScript in 10 days. And that's an interesting story. It makes me wonder, and it was, you know, famously for many years considered to be a pretty crappy programming language. Still is, perhaps. It's been improving sort of consistently. But the interesting thing about that guy is, you know, he doesn't get any awards.
(laughs) You don't get a Nobel Prize, or a Fields Medal, or-- - For inventing a crappy piece of, you know, software code that-- - Well, that is currently the number one programming language in the world, and runs, now is increasingly running the back end of the internet. - Well, does he know why everybody uses it?
Like, that would be an interesting thing. Was it the right thing at the right time? 'Cause like, when stuff like JavaScript came out, like there was a move from, you know, writing C programs in C++ to, let's call it, what they call managed code frameworks. Where you write simple code, it might be interpreted, it has lots of libraries, productivity is high, and you don't have to be an expert.
So, you know, Java was supposed to solve all the world's problems, it was complicated. JavaScript came out, you know, after a bunch of other scripting languages. I'm not an expert on it, but-- - Yeah. - But was it the right thing at the right time? - The right thing at the right-- - Or was there something, you know, clever?
'Cause he wasn't the only one. - There's a few elements. - And maybe if he figured out what it was. - No, he didn't-- - Then he'd get a prize. (laughs) Like that's-- - Constructive theory. - Yeah, you know, maybe his problem is he hasn't defined this. - Or he just needs a good promoter.
(laughs) - Well, I think there was a bunch of blog posts written about it, which is like, wrong is right, which is like doing the crappy thing fast, just like hacking together the thing that answers some of the needs, and then iterating over time, listening to developers, like listening to people who actually use the thing.
This is something you can do more in software. - Yep. - But the right time, like you have to sense, you have to have a good instinct of when is the right time for the right tool, and make it super simple, and just get it out there. The problem is, this is true with hardware, this is less true with software, is there's backward compatibility that just drags behind you as you try to fix all the mistakes of the past.
But the timing-- - Was good. - There's something about that, and it wasn't accidental. You have to like give yourself over to the, you have to have this broad sense of what's needed now, both scientifically and the community, and just like, it was obvious that there was no, the interesting thing about JavaScript is everything that ran in the browser at the time, like Java and I think other like scheme, other programming languages, they were all in a separate external container.
And then JavaScript was literally just injected into the webpage, it was the dumbest possible thing running in the same thread as everything else. And like, it was inserted as a comment. So JavaScript code is inserted as a comment in the HTML code. And it was, I mean, it's either genius or super dumb, but it's like-- - Right, so it has no apparatus for like a virtual machine and container, it just executed in the framework of the program that's already running.
- Yeah, that's cool. - And then because something about that accessibility, the ease of its use, resulted in then developers innovating of how to actually use it. I mean, I don't even know what to make of that, but it does seem to echo across different software, like stories of different software.
PHP has the same story, really crappy language. It just took over the world. - Well, we used to have a joke that the random length instructions, variable length instructions, that's always won, even though they're obviously worse. Like, nobody knows why. x86 is arguably the worst architecture on the planet, it's one of the most popular ones.
- Well, I mean, isn't that also the story of RISC versus CISC? I mean, is that simplicity? There's something about simplicity that, in this evolutionary process, is valued. If it's simple, it spreads faster, it seems like. Or is that not always true? - Not always true. Yeah, it could be simple is good, but too simple is bad.
- So why did RISC win, you think, so far? - Did RISC win? - In the long arc of history. - We don't know. - So who's gonna win? What's RISC, what's CISC, and who's gonna win in that space, in these instruction sets? - AI software's gonna win, but there'll be little computers that run little programs like normal all over the place.
But we're going through another transformation, so. - But you think instruction sets underneath it all will change? - Yeah, they evolve slowly. They don't matter very much. - They don't matter very much, okay. - I mean, the limits of performance are, you know, predictability of instructions and data. I mean, that's the big thing.
And then the usability of it is some, you know, quality of design, quality of tools, availability. Like right now, x86 is proprietary with Intel and AMD, but they can change it any way they want independently. Right, ARM is proprietary to ARM, and they won't let anybody else change it.
So it's like a sole point. And RISC-V is open source, so anybody can change it, which is super cool, but that also might mean it gets changed in too many random ways that there's no common subset of it that people can use. - Do you like open or do you like closed?
Like if you were to bet all your money on one or the other, RISC-V versus it? - No idea. - It's case dependent? - Well, x86, oddly enough, when Intel first started developing it, they licensed it to like seven people. So it was the open architecture. And then they move faster than others and also bought one or two of them.
But there was seven different people making x86 'cause at the time there was 6502 and Z80s and 8086. And you could argue everybody thought Z80 was the better instruction set, but that was proprietary to one place. Oh, and the 6800. So there's like four or five different microprocessors. Intel went open, got the market share 'cause people felt like they had multiple sources from it.
And then over time it narrowed down to two players. - So why, you as a historian, why did Intel win for so long with their processors? I mean- - They were great. Their process development was great. - So it's just looking back to JavaScript and, like, Microsoft and Netscape and all these internet browsers.
Microsoft won the browser game because they aggressively stole other people's ideas. Like right after they did it. - You know, I don't know if Intel was stealing other people's ideas. They started making- - In a good way. Stealing in a good way, just to clarify. - They started making RAMs, random access memories.
And then at the time when the Japanese manufacturers came up, they were getting outcompeted on that. And they pivoted to microprocessors and they made the first integrated microprocessor and ran programs. It was the 4004 or something. - Who was behind that pivot? That's a hell of a pivot.
- Andy Grove, and he was great. - That's a hell of a pivot. - And then they led the semiconductor industry. Like they were just a little company. IBM, all kinds of big companies had boatloads of money, and they out-innovated everybody. - Out-innovated, okay. - Yeah, yeah. - So it's not like marketing, it's not any of that stuff.
- Their processor designs were pretty good. I think the Core 2 was probably the first one I thought was great. It was a really fast processor. And then Haswell was great. - What makes a great processor in that? - Oh, if you just look at its performance versus everybody else, it's the size of it, the usability of it.
- So it's not specific, some kind of element that makes it beautiful, it's just like literally just raw performance. Is that how you think about processors? It's just like raw performance? - Of course. It's like a horse race. The fastest one wins. Now-- - You don't care how. (laughing) So as long as it wins.
- Well, there's the fastest in the environment. Like for years you made the fastest one you could and then people started to have power limits. So then you made the fastest at the right power point. And then when we started doing multi-processors, like if you could scale your processors more than the other guy, you could be 10% faster on like a single thread, but you have more threads.
So there's lots of variability. And then Arm really explored like, you know, they have the A series and the R series and the M series, like a family of processors for all these different design points from like unbelievably small and simple. And so then when you're doing the design, it's sort of like this big palette of CPUs.
Like they're the only ones with a credible, you know, top to bottom palette. And-- - What do you mean a credible top to bottom? - Well, there's people that make microcontrollers that are small, but they don't have a fast one. There's people make fast processors, but don't have a medium one or a small one.
- Is it hard to do that full palette? That seems like a-- - Yeah, it's a lot of different-- - So what's the difference between the Arm folks and Intel in terms of the way they're approaching this problem? - Well, Intel, almost all their processor designs were, you know, very custom high end, you know, for the last 15, 20 years.
- The fastest horse possible. - Yeah. - In one horse sense. - Yeah, and they architecturally are really good, but the company itself was fairly insular to what's going on in the industry with CAD tools and stuff. And there's this debate about custom design versus synthesis and how do you approach that.
I'd say Intel was slow on getting to synthesized processors. Arm came in from the bottom and they generated IP, which went to all kinds of customers. So they had very little say in how the customer implemented their IP. So Arm is super friendly to the synthesis IP environment.
Whereas Intel said, we're gonna make this great client chip or server chip with our own CAD tools, with our own process, with our own, you know, other supporting IP and everything only works with our stuff. - So is that, is Arm winning the mobile platform space in terms of processors?
- Of course, yeah. - And so in that, what you're describing is why they're winning. - Well, they had lots of people doing lots of different experiments. So they controlled the processor architecture and IP, but they let people put it in lots of different chips. And there was a lot of variability in what happened there.
Whereas Intel, when they made their mobile, their foray into mobile, they had one team doing one part. Right, so it wasn't 10 experiments. And then their mindset was PC mindset, Microsoft software mindset. And that brought a whole bunch of things along that the mobile world, the embedded world don't do.
- Do you think it was possible for Intel to pivot hard and win the mobile market? That's a hell of a difficult thing to do, right? For a huge company to just pivot. I mean, it's so interesting to, 'cause we'll talk about your current work. It's like, it's clear that PCs were dominating for several decades, like desktop computers.
And then mobile, it's unclear. - It's a leadership question. Like Apple under Steve Jobs, when he came back, they pivoted multiple times. You know, they built iPods and iTunes and phones and tablets and great Macs. Like who knew computers should be made out of aluminum? Nobody knew that. But they're great, it's super fun.
- No, Steve? - Yeah, Steve Jobs. Like they pivoted multiple times. And the old Intel, they did that multiple times. They made DRAMs and processors and processes. - I gotta ask this. What was it like working with Steve Jobs? - I didn't work with him. - Did you interact with him?
- Twice. I said hi to him twice in the cafeteria. - What did he say? Hi? - He said, "Hey, fellas." (laughing) He was friendly. He was wandering around with somebody. He couldn't find a table 'cause the cafeteria was packed. And I gave him my table. But I worked for Mike Culbert, who talked to, like Mike was the unofficial CTO of Apple and a brilliant guy.
And he worked for Steve for 25 years, maybe more. And he talked to Steve multiple times a day. And he was one of the people who could put up with Steve's, let's say, brilliance and intensity. And Steve really liked him. And Steve trusted Mike to translate the shit he thought up into engineering products that worked.
And then Mike ran a group called Platform Architecture. And I was in that group. So many times I'd be sitting with Mike and the phone would ring. And it'd be Steve. And Mike would hold the phone like this 'cause Steve would be yelling about something or other. - Yeah.
And then he would translate. - And he'd translate. And then he would say, "Steve wants us to do this." So. - Was Steve a good engineer or no? - I don't know. He was a great idea guy. - Idea person. - And he's a really good selector for talent.
- Yeah. That seems to be one of the key elements of leadership. Right? - And then he was a really good first principles guy. Like somebody say something couldn't be done and he would just think, "That's obviously wrong." Right? But maybe it's hard to do. Maybe it's expensive to do.
Maybe we need different people. There's a whole bunch of, if you wanna do something hard, maybe it takes time. Maybe you have to iterate. There's a whole bunch of things you could think about. But saying it can't be done is stupid. - How would you compare, so it seems like Elon Musk is more engineering centric.
But is also, I think he considers himself a designer too. He has a design mind. Steve Jobs feels like he is much more idea space, design space versus engineering. - Yeah. - Just make it happen. Like the world should be this way. Just figure it out. - But he used computers.
He had computer people talk to him all the time. Like Mike was a really good computer guy. He knew what computers could do. - Computer meaning computer hardware? Like a whole lot of stuff. - Hardware, software, all the pieces. - The whole thing. - And then he would have an idea about what could we do with this next.
That was grounded in reality. It wasn't like he was just finger painting on the wall and wishing somebody would interpret it. So he had this interesting connection because he wasn't a computer architect or designer, but he had an intuition from the computers we had to what could happen. - It's interesting you say intuition because it seems like he was pissing off a lot of engineers with his intuition about what can and can't be done.
There are all these stories about floppy disks and all that kind of stuff. - Yeah, so Steve, the first round, like he'd go into a lab and look at what's going on and hate it and fire people or ask somebody in the elevator what they're doing for Apple and not be happy.
When he came back, my impression was he surrounded himself with this relatively small group of people and didn't really interact outside of that as much. And then the joke was, you'd see like somebody moving a prototype through the quad with a black blanket over it. And that was 'cause it was secret, partly from Steve 'cause they didn't want Steve to see it until it was ready.
- Yeah, the dynamic with Jony Ive and Steve is interesting. It's like you don't wanna, he ruins as many ideas as he generates. It's a dangerous kind of line to walk. - If you have a lot of ideas, like Gordon Bell was famous for ideas, right? And it wasn't that the percentage of good ideas was way higher than anybody else.
It was, he had so many ideas and he was also good at talking to people about it and getting the filters right and seeing through stuff. Whereas Elon was like, hey, I wanna build rockets. So Steve would hire a bunch of rocket guys and Elon would go read rocket manuals.
- So Elon's a better engineer, in a sense, or more like a love and passion for the manuals. - And the details, the data and the understanding. - The craftsmanship too, right? - Well, I guess you had craftsmanship too, but of a different kind. What do you make of the, just to stay on this for just a little longer, what do you make of like the anger and the passion and all that, the firing and the mood swings and the madness, being emotional and all that, that's Steve and I guess Elon too.
So what, is that a bug or a feature? - It's a feature. So there's a graph, which is y-axis productivity, x-axis at zero is chaos, and infinity is complete order. So as you go from the origin, as you improve order, you improve productivity. And at some point productivity peaks and then it goes back down again.
Too much order, nothing can happen. - Yes, but the question is, how close to the chaos is that? - No, here's the thing, is once you start moving in a direction of order, the force vector to drive you towards order is unstoppable. - Oh, this is a slippery slope.
- And every organization will move to the place where their productivity is stymied by order. - So you need a-- - So the question is, who's the counterforce? Like, 'cause it also feels really good. As you get more organized and productivity goes up, the organization feels it, they orient towards it, right?
They hire more people. They get more guys who can run process, you get bigger. Right, and then inevitably, the organization gets captured by the bureaucracy that manages all the processes. Right, and then humans really like that. And so if you just walk into a room and say, guys, love what you're doing, but I need you to have less order.
If you don't have some force behind that, nothing will happen. - I can't tell you on how many levels that's profound. So-- - So that's why I say it's a feature. Now, could you be nicer about it? I don't know, I don't know any good examples of being nicer about it.
Well, the funny thing is to get stuff done, you need people who can manage stuff and manage people, 'cause humans are complicated. They need lots of care and feeding, and you need to tell them they look nice and they're doing good stuff, and pat 'em on the back. Right?
- I don't know. You tell me. Is that needed? - Oh yeah. - Do humans need that? - I had a friend, he started to manage a group, and he said, I figured it out. You have to praise them before they do anything. I was waiting 'til they were done, and they were always mad at me.
Now I tell 'em what a great job they're doing while they're doing it. But then you get stuck in that trap, 'cause then when you're not doing something, how do you confront these people? - I think a lot of people that had trauma in their childhood would disagree with you, successful people, that you need to first do the rough stuff and then be nice later.
I don't know. - Okay, but-- - Being nice-- - Engineering companies are full of adults who had all kinds of wretched childhoods. You know. - I don't know. - Most people had okay childhoods. - Well, I don't know if-- - And lots of people only work for praise, which is weird.
- You mean like everybody? (laughing) - I'm not that interested in it, but. - Well, you're probably looking for somebody's approval. Even still. - Yeah, maybe. I should think about that. - Maybe somebody who's no longer with us kind of thing. I don't know. - I used to call up my dad and tell him what I was doing.
He was very excited about engineering and stuff. - You got his approval? - Yeah, a lot. I was lucky. He decided I was smart and unusual as a kid, and that was okay when I was really young. So when I did poorly in school, I was dyslexic. I didn't read until I was third or fourth grade.
They didn't care. My parents were like, "Oh, he'll be fine." So I was lucky. That was cool. - Is he still with us? You miss him? - Sure, yeah. He had Parkinson's and then cancer. His last 10 years were tough. And it killed him. Killing a man like that's hard.
- The mind? - Well, it's pretty good. Parkinson's causes slow dementia, and the chemotherapy, I think, accelerated it. But it was like hallucinogenic dementia. So he was clever and funny and interesting, and it was pretty unusual. - Do you remember conversations from that time? Like what, do you have fond memories of the guy?
- Yeah, oh yeah. - Anything come to mind? - A friend told me one time I could draw a computer on the whiteboard faster than anybody he'd ever met, and I said, "You should meet my dad." Like when I was a kid, he'd come home and say, "I was driving by this bridge, and I was thinking about it," and he'd pull out a piece of paper, and he'd draw the whole bridge. He was a mechanical engineer.
And he would just draw the whole thing, and then he would tell me about it, and tell me how he would've changed it. And he had this idea that he could understand and conceive anything. And I just grew up with that, so that was natural. So when I interview people, I ask them to draw a picture of something they did on a whiteboard, and it's really interesting.
Like some people draw a little box, and then they'll say, "And it just talks to this." And I'll be like, "Oh, this is frustrating." And then I had this other guy come in one time, he says, "Well, I designed a floating point in this chip, but I'd really like to tell you how the whole thing works, and then tell you how the floating point works inside it.
Do you mind if I do that?" And he covered two whiteboards in like 30 minutes. And I hired him. He was great. - Just craftsman. I mean, that's the craftsmanship to that. - Yeah, but also the mental agility to understand the whole thing. Put the pieces in context. Real view of the balance of how the design worked.
Because if you don't understand it properly, when you start to draw it, you'll fill up half the whiteboard with a little piece of it. Your ability to lay it out in an understandable way takes a lot of understanding. - And be able to sort of zoom into the detail and then zoom out to the big picture.
- And zoom out really fast. - What about the impossible thing? You said your dad believed that you could do anything. That's a weird feature for a craftsman. - Yeah. - It seems that that echoes in your own behavior. Like that's the-- - Well, it's not that anybody can do anything right now.
It's that if you work at it, you can get better at it and there might not be a limit. And he did funny things. Like he always wanted to play piano. So at the end of his life, he started playing the piano. When he had Parkinson's and he was terrible.
But he thought if he really worked at it in this life, maybe the next life he'd be better at it. - He might be onto something. - Yeah, indeed. (laughing) He enjoyed doing it. - Yeah. - So that's pretty funny. - Do you think the perfect is the enemy of the good in hardware and software engineering?
It's like we were talking about JavaScript a little bit and the messiness of the 10 day building process. - Yeah, it's creative tension, right? So creative tension is you have two different ideas that you can't do both, right? And, but the fact that you wanna do both causes you to go try to solve that problem.
That's the creative part. So if you're building computers, like some people say we have the schedule and anything that doesn't fit in the schedule we can't do. Right, so they throw out the perfect 'cause they have a schedule. I hate that. Then there's other people who say we need to get this perfectly right, and no matter what, you know, more people, more money, right?
And there's a really clear idea about what you want. Some people are really good at articulating it. So let's call that the perfect, yeah. - Yeah. - All right, but that's also terrible 'cause they never ship anything, and you never hit any goals. So now you have your framework.
- Yes. - You can't throw out stuff 'cause you can't get it done today, 'cause maybe you'll get it done tomorrow with the next project, right? You can't, so you have to, I work with a guy that I really like working with, but he over-filters his ideas. - Over-filters?
- He'd start thinking about something, and as soon as he figured out what's wrong with it, he'd throw it out. And then I start thinking about it, and you know, you come up with an idea, and you find out what's wrong with it, and then you give it a little time to set, 'cause sometimes, you know, you figure out how to tweak it, or maybe that idea helps some other idea.
So idea generation is really funny. So you have to give your ideas space. Like spaciousness of mind is key. But you also have to execute programs and get shit done. And then it turns out, computer engineering's fun because it takes, you know, a hundred people to build a computer, 200 or 300, whatever the number is, and people are so variable about, you know, temperament and, you know, skill sets and stuff, that in a big organization, you find the people who love the perfect ideas, and the people that wanna get stuff done yesterday, and people like to come up with ideas, and people like to, let's say, shoot down ideas, and it takes the whole, it takes a large group of people.
- Some are good at generating ideas, some are good at filtering ideas, and then all, in that giant mess, you're somehow, I guess the goal is for that giant mess of people to find the perfect path through the tension, the creative tension. But like, how do you know when, you said there's some people good at articulating what perfect looks like, what a good design is.
Like, if you're sitting in a room, and you have a set of ideas about, like, how to design a better processor, how do you know this is something special here, this is a good idea, let's try this. - Have you ever brainstormed an idea with a couple people that were really smart?
And you kinda go into it, and you don't quite understand it, and you're working on it, and then you start, you know, talking about it, putting it on the whiteboard, maybe it takes days or weeks, and then your brains start to kinda synchronize. It's really weird. Like, you start to see what each other is thinking.
And it starts to work. Like, you can see where, like, my talent in computer design is I can see how computers work in my head, like, really well. And I know other people can do that, too. And when you're working with people that can do that, like, it is kind of an amazing experience.
And then, every once in a while, you get to that place, and then you find the flaw, which is kinda funny, 'cause you can fool yourself. But-- - The two of you kinda drifted along into a direction that was useless. - Yeah, that happens, too. Like, you have to, 'cause, you know, the nice thing about computer design, there's always reduction to practice.
Like, you come up with your good ideas, and I know some architects who really love ideas, and then they work on 'em, and they put it on the shelf, and they go work on the next idea, and put it on the shelf, and they never reduce it to practice, so they find out what's good and bad.
'Cause almost every time I've done something really new, by the time it's done, like, the good parts are good, but I know all the flaws, like-- - Yeah, would you say your career, just your own experience, is your career defined by, mostly by flaws or by successes? Like, if-- - Again, there's great tension between those.
If you haven't tried hard, right, and done something new, right, then you're not gonna be facing the challenges when you build it, then you find out all the problems with it, and-- - But when you look back, do you see problems, or? Okay. - When I look back, I think earlier in my career, like, EV5 was the second Alpha chip.
I was so embarrassed about the mistakes, I could barely talk about it. And it was in the Guinness Book of World Records, and it was the fastest processor on the planet. So it was, and at some point I realized that was really a bad mental framework to deal with, like, doing something new.
We did a bunch of new things, and some worked out great, and some were bad, and we learned a lot from it, and then the next one we learned a lot. That also, EV6 also had some really cool things in it. I think the proportion of good stuff went up, but it had a couple fatal flaws in it that were painful.
And then, yeah. - You learned to channel the pain into, like, pride. - Not pride, really, just realization about how the world works, or how that kind of idea set works. - Life is suffering, that's the reality. - What-- - No, it's not. I know the Buddha said that, and a couple other people are stuck on it.
No, it's, you know, there's this kind of weird combination of good and bad, and light and darkness that you have to tolerate and deal with. Yeah, there's definitely lots of suffering in the world. - Depends on the perspective. It seems like there's way more darkness, but that makes the light part really nice.
What computing hardware, or just any kind of, even software design, are you, do you find beautiful? From your own work, from other people's work, that you're just, we were just talking about the battleground of flaws and mistakes and errors, but things that were just beautifully done. Is there something that pops to mind?
- Well, when things are beautifully done, usually there's a well-thought-out set of abstraction layers. - So the whole thing works in unison nicely. - Yes, and when I say abstraction layer, that means two different components, when they work together, they work independently. They don't have to know what the other one is doing.
- So that decoupling. - Yeah, so the famous one was the network stack. There's a seven-layer network stack, you know, data transport and protocol and all the layers, and the innovation was that when they really got that right, 'cause networks before that didn't define those very well, the layers could innovate independently, and occasionally the layer boundary would, you know, the interface would be upgraded.
And that let, you know, the design space breathe. You could do something new in layer seven without having to worry about how layer four worked. And so good design does that, and you see it in processor designs. When we did the Zen design at AMD, we made several components very modular.
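The layering point can be shown with a small Python sketch. This is an illustration only, not anything from the network stack or the Zen design: the names `Transport`, `SimpleTransport`, `ChunkedTransport`, and `application_layer` are made up. The upper layer is written once against an agreed interface, and the lower layer is swapped out without the upper layer changing.

```python
from typing import Protocol


class Transport(Protocol):
    """The agreed interface: the upper layer only ever sees send()."""

    def send(self, payload: bytes) -> bytes: ...


class SimpleTransport:
    """Hypothetical lower layer that passes bytes through unchanged."""

    def send(self, payload: bytes) -> bytes:
        return payload


class ChunkedTransport:
    """Hypothetical replacement that moves data in 4-byte chunks internally."""

    def send(self, payload: bytes) -> bytes:
        chunks = [payload[i:i + 4] for i in range(0, len(payload), 4)]
        return b"".join(chunks)


def application_layer(transport: Transport, message: str) -> str:
    """Upper layer: encodes text, hands it down, decodes what comes back."""
    return transport.send(message.encode("utf-8")).decode("utf-8")


# The same upper-layer code runs over either lower layer.
results = [application_layer(t, "hello, layers")
           for t in (SimpleTransport(), ChunkedTransport())]
```

Because `application_layer` depends only on the interface, each side can be tested and changed on its own, which is the property described here.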
And, you know, my insistence at the top was I wanted all the interfaces defined before we wrote the RTL for the pieces. One of the verification leads said, "If we do this right, I can test the pieces so well independently that when we put it together, we won't find all these interaction bugs 'cause the floating point knows how the cache works." And I was a little skeptical, but he was mostly right.
That the modularity of the design greatly improved the quality. - Is that universally true in general? Would you say about good designs, the modularity is like usually-- - Well, we talked about this before. Humans are only so smart, and we're not getting any smarter, right? But the complexity of things is going up.
So, you know, a beautiful design can't be bigger than the person doing it. It's just, you know, their piece of it. Like the odds of you doing a really beautiful design with something that's way too hard for you is low, right? If it's way too simple for you, it's not that interesting.
It's like, well, anybody could do that. But when you get the right match of your expertise and, you know, mental power to the right design size, that's cool, but that's not big enough to make a meaningful impact in the world. So now you have to have some framework to design the pieces so that the whole thing is big and harmonious, but, you know, when you put it together, it's sufficiently interesting to be used, and so that's what a beautiful design is.
- Matching the limits of that human cognitive capacity to the module you can create, and creating a nice interface between those modules, and thereby, do you think there's a limit to the kind of beautiful complex systems we can build with this kind of modular design? It's like, you know, we build increasingly more complicated, you can think of like the internet.
Okay, let's scale it down. You can think of like social network, like Twitter as one computing system. But those are little modules, right? - But it's built on so many components nobody at Twitter even understands. - Right. - So if an alien showed up and looked at Twitter, he wouldn't just see Twitter as a beautiful, simple thing that everybody uses, which is really big.
You would see the network it runs on, the fiber optics, the data's transported, the computers, the whole thing is so bloody complicated, nobody at Twitter understands it. And so-- - I think that's what the alien would see. So yeah, if an alien showed up and looked at Twitter, or looked at the various different networked systems that you could see on Earth.
- So imagine they were really smart and they could comprehend the whole thing. And then they sort of evaluated the human and thought, this is really interesting, no human on this planet comprehends the system they built. - No individual, well, would they even see individual humans? Like we humans are very human-centric, entity-centric, and so we think of us as the central organism and the networks as just a connection of organisms.
But from a perspective of, from an outside perspective, it seems like we're just one organism. - Yeah, I get it. We're the ants and they'd see the ant colony. - The ant colony, yeah. Or the result of production of the ant colony, which is like cities and it's, in that sense, humans are pretty impressive.
The modularity that we're able to, and how robust we are to noise and mutation, all that kind of stuff. - Well, that's 'cause it's stress-tested all the time. - Yeah. - You know, you build all these cities with buildings and you get earthquakes occasionally, and some wars, earthquakes. Viruses every once in a while.
- You know, changes in business plans for, you know, like shipping or something. Like, as long as it's all stress-tested, then it keeps adapting to the situation. So, it's a curious phenomenon. - Well, let's go, let's talk about Moore's Law a little bit. At the broad view of Moore's Law, where it's just exponential improvement of computing capability.
Like, OpenAI, for example, recently published this kind of paper, it's looking at the exponential improvement in the training efficiency of neural networks. For like ImageNet and all that kind of stuff, we just got better on this, this is purely software side, just figuring out better tricks and algorithms for training neural networks.
And that seems to be improving significantly faster than the Moore's Law prediction, you know? So, that's in the software space. What do you think, if Moore's Law continues, or if the general version of Moore's Law continues, do you think that comes mostly from the hardware, from the software, some mix of the two, some interesting, totally, so not the reduction of the size of the transistor kind of thing, but more in the totally interesting kinds of innovations in the hardware space, all that kind of stuff?
- Well, there's like a half a dozen things going on in that graph. So, one is, there's initial innovations that had a lot of headroom to be exploited. So, you know, the efficiency of the networks has improved dramatically. And then, the decomposability of those, and the use, you know, they started running on one computer, then multiple computers, then multiple GPUs, and then arrays of GPUs, and they're up to thousands.
And at some point, so it's sort of like, they were going from like a single computer application to a thousand computer application. So, that's not really a Moore's Law thing, that's an independent vector. How many computers can I put on this problem? 'Cause the computers themselves are getting better on like a Moore's Law rate, but their ability to go from one to 10, to 100, to 1,000, - Yeah.
- you know, is something. And then, multiplied by, you know, the amount of compute it took to resolve like AlexNet to ResNet to Transformers. It's been quite, you know, steady improvements. - But those are like S-curves, aren't they? - Yeah. - That's exactly the kind of S-curves that are underlying Moore's Law from the very beginning.
- So. - So, what's the biggest, what's the most productive, rich source of S-curves in the future, do you think? Is it hardware, is it software, or is it-- - So, hardware is gonna move along relatively slowly. Like, you know, double performance every two years. There's still-- - I like how you call that slow.
- Yeah, that's the slow version. The snail's pace of Moore's Law, maybe we should, we should, we should trademark that one. Whereas, the scaling by number of computers, you know, can go much faster, you know. I'm sure at some point, Google had a, you know, their initial search engine was running on a laptop, you know, like.
- Yeah. - And at some point, they really worked on scaling that, and then they factored the indexer from, you know, this piece, and this piece, and this piece, and they spread the data on more and more things, and, you know, they did a dozen innovations. But as they scaled up the number of computers on that, it kept breaking, finding new bottlenecks in their software and their schedulers, and made 'em rethink, like, it seems insane to do a scheduler across a thousand computers, to schedule parts of it, and then send the results to one computer.
But if you wanna schedule a million searches, that makes perfect sense. So, there's, the scaling by just quantity is probably the richest thing. But then, as you scale quantity, like a network that was great on a hundred computers may be completely the wrong one. You may pick a network that's 10 times slower on 10,000 computers, like per computer.
But if you go from a hundred to 10,000, that's a hundred times. So, that's one of the things that happened when we did internet scaling, is the efficiency went down. Not up. The future of computing is inefficiency, not efficiency. - But scale, inefficient scale. - It's scaling faster than inefficiency bites you.
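The arithmetic behind that trade can be written out as a quick Python sketch. The function name `net_speedup` is made up for illustration, and the model assumes the work divides evenly across machines, which real systems only approximate.

```python
def net_speedup(small_cluster: int, large_cluster: int,
                per_computer_slowdown: float) -> float:
    """Overall gain when each machine gets slower but there are many more of them."""
    scale_factor = large_cluster / small_cluster   # 10,000 / 100 = 100x more machines
    return scale_factor / per_computer_slowdown    # divided by the per-machine penalty


# Numbers from the conversation: 100 -> 10,000 computers, 10x slower per computer.
speedup = net_speedup(100, 10_000, 10)
```

With those numbers the result is a tenfold gain overall, even though each machine is used ten times less efficiently.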
And as long as there's dollar value there, like, scaling costs lots of money. But Google showed, Facebook showed, everybody showed that the scale was where the money was at. - And so, it was worth it financially. Do you think, is it possible that basically the entirety of Earth will be like a computing surface?
Like, this table will be doing computing, this hedgehog will be doing computing, like everything really inefficient, dumb computing will be whatever. - The science fiction books, they call it computronium. - Computronium? - We turn everything into computing. Well, most of the elements aren't very good for anything. Like, you're not gonna make a computer out of iron.
Like, you know, silicon and carbon have nice structures. - Well, we'll see what you can do with the rest of it. People talk about, well, maybe we can turn the sun into a computer, but it's hydrogen. And a little bit of helium, so. - What I mean is more like actually just adding computers to everything.
- Oh, okay. So you're just converting all the mass of the universe into a computer? - No, no, no, so not using-- - That'd be ironic from the simulation point of view, is like, the simulator built mass to simulate, like. - Yeah, I mean, yeah, so, I mean, ultimately this is all heading towards a simulation, yes.
- Yeah, well, I think I might have told you this story. At Tesla, they were deciding, so they wanna measure the current coming out of the battery, and they decide between putting a resistor in there and putting a computer with a sensor in there. And the computer was faster than the computer I worked on in 1982.
And we chose the computer 'cause it was cheaper than the resistor. So, sure, this hedgehog, you know, it costs $13, and we can put an AI that's as smart as you in there for five bucks, it'll have one. So computers will be everywhere. - I was hoping it wouldn't be smarter than me, because-- - Well, everything's gonna be smarter than you.
- But you were saying it's inefficient. I thought it was better to have a lot of dumb things. - Well, Moore's Law will slowly compact that stuff. - So even the dumb things will be smarter than us? - The dumb things are gonna be smart, or they're gonna be smart enough to talk to something that's really smart.
It's like, well, just remember, a big computer chip, it's like an inch by an inch, and 40 microns thick. It doesn't take very many atoms to make a high-power computer, and 10,000 of 'em can fit in a shoebox. But you have the cooling and power problems, but people are working on that.
- But they still can't write compelling poetry or music or understand what love is or have a fear of mortality, so we're still winning. - Neither can most of humanity. - Well, they can write books about it. But speaking about this walk along the path of innovation towards the dumb things being smarter than humans, you are now the CTO of Tenstorrent, as of two months ago.
They build hardware for deep learning. How do you build scalable and efficient deep learning? This is such a fascinating space. - Yeah, yeah, so it's interesting. So up until recently, I thought there was two kinds of computers. There are serial computers that run like C programs, and then there's parallel computers.
So the way I think about it is, parallel computers have given parallelism. Like, GPUs are great 'cause you have a million pixels, and modern GPUs run a program on every pixel. They call it the shader program. Right, so, or like finite element analysis.
You build something, you make this into little tiny chunks, you give each chunk to a computer, so you're given all these chunks, you have parallelism like that. But most C programs, you write this linear narrative, and you have to make it go fast. To make it go fast, you predict all the branches, all the data fetches, and you run that more in parallel, but that's found parallelism.
So, AI is, I'm still trying to decide how fundamental this is. It's a given parallelism problem. But the way people describe the neural networks, and then how they write them in PyTorch, it makes graphs. - Yeah, that might be fundamentally different than the GPU kind of-- - Parallelism, yeah, it might be.
Because when you run the GPU program on all the pixels, you're running, you know, it depends, you know, this group of pixels say it's background blue, and it runs a really simple program. This pixel is, you know, some patch of your face, so you have some really interesting shader program to give you an impression of translucency.
But the pixels themselves don't talk to each other. There's no graph, right? So, you do the image, and then you do the next image, and you do the next image, and you run eight million pixels, eight million programs every time, and modern GPUs have like 6,000 thread engines in them.
So, you know, to get eight million pixels, each one runs a program on, you know, 10 or 20 pixels, and that's how they work, but there's no graph. - But you think graph might be a totally new way to think about hardware? - So, Raja Koduri and I have been having this good conversation about given versus found parallelism, and then the kind of walk, 'cause we got more transistors, like, you know, computers way back when did stuff on scalar data.
Now we did it on vector data, famous vector machines. Now we're making computers that operate on matrices, right? And then the category we said was next was spatial. Like, imagine you have so much data that, you know, you wanna do the compute on this data, and then when it's done, it says, send the result to this pile of data, run some software on that.
And it's better to think about it spatially than to move all the data to a central processor and do all the work. - So, spatially, you mean moving in the space of data as opposed to moving the data? - Yeah, you have a petabyte data space spread across some huge array of computers, and when you do a computation somewhere, you send the result of that computation or maybe a pointer to the next program to some other piece of data and do it.
But I think a better word might be graph, and all the AI neural networks are graphs. Do some computations, send the result here, do another computation, do a data transformation, do a merging, do a pooling, do another computation. - Is it possible to compress and say how we make this thing efficient, this whole process efficient, this different?
- So first, the fundamental elements in the graphs are things like matrix multiplies, convolutions, data manipulations, and data movements. - Yeah. - So GPUs emulate those things with their little singles, you know, basically running a single-threaded program. - Yeah. - And then there's, you know, NVIDIA calls it a warp, where they group a bunch of programs that are similar together, so for efficiency and instruction use.
And then at a higher level, you kind of, you take this graph and you say this part of the graph is a matrix multiplier, which runs on these 32 threads. But the model at the bottom was built for running programs on pixels, not executing graphs. - So it's emulation, ultimately.
- Yes. - So is it possible to build something that natively runs graphs? - Yes, so that's what Tenstorrent did. So-- - Where are we on that? How, like, in the history of that effort, are we in the early days? - Yeah, I think so. Tenstorrent was started by a friend of mine, Ljubisa Bajic, and I was his first investor.
So I've been, you know, kind of following him and talking to him about it for years. And in the fall, when I was considering things to do, I decided, you know, we held a conference last year with a friend who organized it, and we wanted to bring in thinkers.
And two of the people were Andrej Karpathy and Chris Lattner. And Andrej gave this talk, it's on YouTube, called Software 2.0, which I think is great. Which is, we went from programmed computers, where you write programs, to data program computers. You know, like the future is, you know, of software is data programs, the networks.
And I think that's true. And then Chris has been working, he worked on LLVM, the low-level virtual machine, which became the intermediate representation for all compilers. And now he's working on another project called MLIR, which is mid-level intermediate representation, which is essentially under the graph about how do you represent that kind of computation and then coordinate large numbers of potentially heterogeneous computers.
And I would say technically Tenstorrent's you know, two pillars of those two ideas, software 2.0 and mid-level representation. But it's in service of executing graph programs. The hardware is designed to do that. - So it's including the hardware piece. - Yeah. And then the other cool thing is, for a relatively small amount of money, they did a test chip and two production chips.
So it's like a super effective team. And unlike some AI startups, if you don't build the hardware to run the software that they really want to do, then you have to fix it by writing lots more software. So the hardware naturally does matrix multiply, convolution, the data manipulations, and the data movement between processing elements that you can see in the graph, which I think is all pretty clever.
And that's what I'm working on now. - So I think it's called the Grayskull processor, introduced last year. It's, you know, there's a bunch of measures of performance. We're talking about horses. It does 368 trillion operations per second. Seems to outperform NVIDIA's Tesla T4 system. So these are just numbers.
What do they actually mean in real world performance? Like what are the metrics for you that you're chasing in your horse race? Like what do you care about? - Well, first, so the native language of, you know, people who write AI network programs is PyTorch now. PyTorch, TensorFlow, there's a couple others.
- Do you think PyTorch has won over TensorFlow? Or is it just- - I'm not an expert on that. I know many people who have switched from TensorFlow to PyTorch. - Yeah. - And there's technical reasons for it. - I use both. Both are still awesome. - Both are still awesome.
- But the deepest love is for PyTorch currently. - Yeah, there's more love for that. And that may change. So the first thing is when they write their programs, can the hardware execute it pretty much as it was written? Right, so PyTorch turns into a graph. We have a graph compiler that makes that graph.
Then it fractions the graph down. So if you have big matrix multiply, we turn it into right-sized chunks to run on the processing elements. It hooks all the graph up. It lays out all the data. There's a couple of mid-level representations of it that are also simulatable. So that if you're writing the code, you can see how it's gonna go through the machine, which is pretty cool.
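To make the "right-sized chunks" idea concrete, here is a minimal Python sketch of splitting one large matrix multiply into tile-sized pieces. It is an assumption-laden toy, not the Tenstorrent compiler: `tiled_matmul`, the tile size, and the list-of-rows matrix layout are all made up for illustration, and each innermost tile product stands in for the unit of work a processing element would receive.

```python
def matmul(a, b):
    """Reference multiply of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def tiled_matmul(a, b, tile):
    """Same result, computed as a series of tile x tile block products."""
    n, k, m = len(a), len(b), len(b[0])
    c = [[0] * m for _ in range(n)]
    for i0 in range(0, n, tile):            # tile row of the output
        for j0 in range(0, m, tile):        # tile column of the output
            for k0 in range(0, k, tile):    # tile along the shared dimension
                # One chunk of work: multiply two small blocks and accumulate.
                for i in range(i0, min(i0 + tile, n)):
                    for j in range(j0, min(j0 + tile, m)):
                        acc = 0
                        for kk in range(k0, min(k0 + tile, k)):
                            acc += a[i][kk] * b[kk][j]
                        c[i][j] += acc
    return c


a = [[i + j for j in range(5)] for i in range(4)]       # 4 x 5
b = [[i * j + 1 for j in range(3)] for i in range(5)]   # 5 x 3
tiled = tiled_matmul(a, b, tile=2)
```

The tiled version produces the same matrix as the plain one; only the order and grouping of the work changes, which is what lets the chunks be handed out independently.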
And then at the bottom, it schedules kernels like math, data manipulation, data movement kernels, which do this stuff. So we don't have to run, write a little program to do matrix multiply 'cause we have a big matrix multiplier. Like there's no SIMD program for that, but there is scheduling for that, right?
So one of the goals is if you write a piece of PyTorch code that looks pretty reasonable, you should be able to compile it, run it on the hardware without having to tweak it and do all kinds of crazy things to get performance. - There's not a lot of intermediate steps.
It's running directly as written. - Like on a GPU, if you write a large matrix multiply naively, you'll get five to 10% of the peak performance of the GPU. Right, and then there's a bunch of people who published papers on this, and I read them about what steps do you have to do.
And it goes from pretty reasonable, well, transpose one of the matrices. So you do row order, not column ordered. You know, block it so that you can put a block of the matrix on different SMs, you know, groups of threads. But some of it gets into little details, like you have to schedule it just so, so you don't have register conflicts.
So they call them CUDA ninjas. - CUDA ninjas, I love it. - To get to the optimal point, you either use a pre-written library, which is a good strategy for some things, or you have to be an expert in microarchitecture to program it. - Right, so the optimization step is way more complicated with the GPU.
- So our goal is if you write PyTorch, that's good PyTorch, you can do it. Now there's, as the networks are evolving, you know, they've changed from convolutional to matrix multiply. People are talking about conditional graphs, they're talking about very large matrices, they're talking about sparsity. They're talking about problems that scale across many, many chips.
So the native, you know, data item is a packet. Like, so you send a packet to a processor, it gets processed, it does a bunch of work, and then it may send packets to other processors, and they execute in like a data flow graph kind of methodology. - Got it.
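The packet-driven execution described here can be sketched in a few lines of Python. Everything in the sketch is hypothetical: the graph, the node names, and the single queue standing in for the on-chip network are made up for illustration, and real hardware would run the nodes concurrently rather than one at a time.

```python
from collections import deque

# Hypothetical graph: node name -> (operation, downstream node names).
GRAPH = {
    "scale": (lambda x: x * 2, ["add_one", "square"]),
    "add_one": (lambda x: x + 1, ["sink"]),
    "square": (lambda x: x * x, ["sink"]),
    "sink": (lambda x: x, []),
}


def run(graph, start, value):
    """Deliver packets node to node until none are left in flight."""
    queue = deque([(start, value)])   # a packet is (destination, payload)
    delivered = []
    while queue:
        node, payload = queue.popleft()
        operation, targets = graph[node]
        result = operation(payload)   # the node does its work on the packet
        if not targets:
            delivered.append(result)  # end of the graph: keep the result
        for target in targets:
            queue.append((target, result))  # forward new packets downstream
    return delivered


outputs = run(GRAPH, "scale", 3)
```

No node knows about the whole program; each one only processes what arrives and forwards packets to its neighbors, which is the data flow style being described.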
- We have a big network on chip, and then 16, the next second chip has 16 ethernet ports to hook lots of them together, and it's the same graph compiler across multiple chips. - So that's where the scale comes in. - So it's built to scale naturally. Now, my experience with scaling is as you scale, you run into lots of interesting problems.
So scaling is a mountain to climb. - Yeah. - So the hardware is built to do this, and then we're in the process of-- - Is there a software part to this, with ethernet and all that? - Well, the protocol at the bottom, you know, we send, it's an ethernet PHY, but the protocol basically says, send the packet from here to there, it's all point to point.
The header bit says which processor to send it to, and we basically take a packet off our on-chip network, put an ethernet header on it, send it to the other end, strip the header off and send it to the local thing. It's pretty straightforward. - Human to human interaction is pretty straightforward too, but when you get a million of us, we could do some crazy stuff together.
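The wrap, send, strip sequence described for the chip-to-chip link can be sketched in Python. This is a toy under stated assumptions, not the actual protocol: the 2-byte header holding a destination processor id is made up, and a real Ethernet frame header carries destination and source MAC addresses plus an EtherType instead.

```python
import struct

# Hypothetical 2-byte header: destination processor id, network byte order.
HEADER = struct.Struct("!H")


def wrap(dest: int, packet: bytes) -> bytes:
    """Sending side: put a header in front of the on-chip packet."""
    return HEADER.pack(dest) + packet


def unwrap(frame: bytes):
    """Receiving side: read the destination, strip the header, keep the packet."""
    (dest,) = HEADER.unpack_from(frame)
    return dest, frame[HEADER.size:]


frame = wrap(42, b"tensor-chunk")
dest, payload = unwrap(frame)
```

The packet that comes out the far end is byte-for-byte the one that went in, so the on-chip network on each side never has to know a link was crossed.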
- Yeah, it could be fun. - So is that the goal, is scale? So like, for example, I have been recently doing a bunch of robots at home for my own personal pleasure. Am I going to ever use Tenstorrent or is this more for? - There's all kinds of problems, like there's small inference problems or small training problems or big training problems.
- What's the big goal? Is it the big inference training problems or the small training problems? - One of the goals is to scale from 100 milliwatts to a megawatt, you know, so like really have some range on the problems and the same kind of AI programs work at all different levels.
So that's the goal. Since the natural data item is a packet that we can move around, it's built to scale, but so many people have, you know, small problems. - Right. - Right, but, you know-- - Like inside that phone is a small problem to solve. So do you see Tenstorrent potentially being inside a phone?
- Well, the power efficiency of local memory, local computation and the way we built it is pretty good. And then there's a lot of efficiency on being able to do conditional graphs and sparsity. I think for complicated networks that want to go in a small form factor, it's quite good, but we have to prove that. That's a fun problem.
- And that's the early days of the company, right? It's a couple of years, you said? But you think you invested, you think they're legit and so you join. - Yeah, I do. Well, it's also, it's a really interesting place to be. Like the AI world is exploding, you know, and I looked at some other opportunities like build a faster processor, which people want, but that's more on an incremental path than what's gonna happen in AI in the next 10 years.
So this is kind of, you know, an exciting place to be part of. - The revolutions will be happening in the very space that Tenstorrent is. - And then lots of people are working on it, but there's lots of technical reasons why some of them, you know, aren't gonna work out that well.
And, you know, that's interesting. And there's also the same problem about getting the basics right. Like we've talked to customers about exciting features. And at some point we realized that, we should have realized, they wanna hear first about memory bandwidth, local bandwidth, compute intensity, programmability. They want to know the basics, power management, how the network ports work, what are the basics, do all the basics work?
'Cause it's easy to say, we've got this great idea, you know, to crack GPT-3. But the people we talked to wanna say, if I buy the, so we have a PCI Express card with our chip on it. If you buy the card, you plug it in your machine, you download the driver, how long does it take me to get my network to run?
- Right. - Right, you know, that's a real question. - It's a very basic question. - So. - Yeah, is there an answer to that yet? Or is it trying to get to-- - Our goal is like an hour. - Okay, when can I buy a Tenstorrent? - Pretty soon.
- For my, for the small case training. - Yeah, pretty soon, months. - Good, I love the idea of you inside a room with Andrej Karpathy and Chris Lattner. Very, very interesting, very brilliant people, very out of the box thinkers, but also like first principles thinkers. - Well, they both get stuff done.
They not only get stuff done to get their own projects done, they talk about it clearly, they educate large numbers of people and they've created platforms for other people to go do their stuff on. - Yeah, the clear thinking that's able to be communicated is kind of impressive. - It's kind of remarkable, yeah, I'm a fan.
- Well, let me ask, 'cause I talk to Chris actually a lot these days. He's been one of the, just to give him a shout out, he's been so supportive as a human being. So everybody's quite different. Like great engineers are different, but he's been like sensitive to the human element in a way that's been fascinating.
Like he was one of the early people on this stupid podcast that I do to say like, don't quit this thing and also talk to whoever the hell you wanna talk to. That kind of, from a legit engineer to get like props and be like, you can do this.
That was, I mean, that's what a good leader does, right? To just kinda let a little kid do his thing, like go do it, let's see what turns out. That's a pretty powerful thing. But what's your sense about, he used to be, now I think stepped away from Google, right?
He's at SiFive, I think. What's really impressive to you about the things that Chris has worked on? 'Cause we mentioned the optimization, the compiler design stuff, the LLVM. Then there's, he's also at Google worked on the TPU stuff. He's obviously worked on Swift, so the programming language side. Talking about people that work in the entirety of the stack.
From your time interacting with Chris and knowing the guy, what's really impressive to you that just inspires you? - Well, like LLVM became the de facto platform for compilers, like it's amazing. And it was good code quality, good design choices. He hit the right level of abstraction. There's a little bit of the right time, the right place.
And then he built a new programming language called Swift, which after, let's say, some adoption resistance became very successful. I don't know that much about his work at Google, although I know that that was typical: they started the TensorFlow stuff and it was new. They wrote a lot of code, and then at some point it needed to be refactored because its development slowed down, while PyTorch started a little later and then passed it.
So he did a lot of work on that. And then his idea about MLIR, which is what people started to realize: the complexity of the software stack above the low-level IR was getting so high that forcing the features of that into one level was putting too much of a burden on it.
So he's splitting that into multiple pieces. And that was one of the inspirations for our software stack where we have several intermediate representations that are all executable and you can look at them and do transformations on them before you lower the level. So that was, I think we started before MLIR really got far enough along to use, but we're interested in that.
- He's really excited about MLIR. He's, that's his like little baby. So he, and there seems to be some profound ideas on that that are really useful. - So each one of those things has been, as the world of software gets more and more complicated, how do we create the right abstraction levels to simplify it in a way that people can now work independently on different levels of it?
So I would say all three of those projects, LLVM, Swift, and MLIR, did that successfully. So I'm interested in what he's gonna do next in the same kind of way. - Yes. So on either the TPU or maybe the NVIDIA GPU side, how does Tenstorrent, you think, or the ideas underlying it, it doesn't have to be Tenstorrent, just this kind of graph-focused, graph-centric hardware, deep-learning-centric hardware, beat NVIDIA's?
Do you think it's possible for it to basically overtake NVIDIA? - Sure. - What's that process look like? What's that journey look like, you think? - Well, GPUs were built to run shader programs on millions of pixels, not to run graphs. - Yes. - So there's a hypothesis that says, given the way the graphs are built, it's going to be really inefficient to compute them this way.
And then the primitive is not a SIMD program, it's matrix multiply and convolution. And then the data manipulations are fairly extensive, like how do you do a fast transpose with a program? I don't know if you've ever written a transpose program. They're ugly and slow, but in hardware you can do really well.
Like I'll give you an example. So when GPU accelerators started doing triangles, like, so you have a triangle which maps onto a set of pixels. So it's very easy, straightforward to build a hardware engine that'll find all those pixels. And it's kind of weird 'cause you walk along the triangle till you get to the edge, and then you have to go back down to the next row and walk along, and then you have to decide on the edge, if the line of the triangle is like half on the pixel, what's the pixel color?
'Cause it's half of this pixel and half the next one. That's called rasterization. - And you're saying that can be done in hardware? - No, that's an example of that operation as a software program is really bad. I've written a program that did rasterization. The hardware that does it is actually less code than the software program that does it, and it's way faster.
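To make the "rasterization as a software program" point concrete, here is a minimal Python sketch of finding the pixels a triangle covers. It is an illustration only, not the program Jim wrote: instead of walking along each row to the edge, it uses the common edge-function test on every pixel center, and it ignores the half-covered edge pixels (anti-aliasing) he mentions. The names `edge` and `rasterize` are made up for this example.

```python
def edge(ax, ay, bx, by, px, py):
    """Signed area term: positive when (px, py) is to the left of edge a->b."""
    return (bx - ax) * (py - ay) - (by - ay) * (px - ax)


def rasterize(v0, v1, v2, width, height):
    """Return the set of (x, y) pixels whose centers lie inside the triangle.

    Brute force over the whole grid; a hardware engine would only visit
    pixels near the triangle and would test many of them in parallel.
    """
    covered = set()
    area = edge(*v0, *v1, *v2)  # sign tells us the winding order
    if area == 0:
        return covered  # degenerate triangle covers nothing
    for y in range(height):
        for x in range(width):
            cx, cy = x + 0.5, y + 0.5  # sample at the pixel center
            w0 = edge(*v1, *v2, cx, cy)
            w1 = edge(*v2, *v0, cx, cy)
            w2 = edge(*v0, *v1, cx, cy)
            if area > 0:
                inside = w0 >= 0 and w1 >= 0 and w2 >= 0
            else:
                inside = w0 <= 0 and w1 <= 0 and w2 <= 0
            if inside:
                covered.add((x, y))
    return covered


pixels = rasterize((0, 0), (4, 0), (0, 4), width=4, height=4)
print(sorted(pixels))
```

Even this simplified version runs three multiply-and-compare tests per pixel in a serial loop, which hints at why a dedicated hardware unit is both smaller and much faster.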
Right, so there are certain times when the abstraction you have, rasterize a triangle, execute a graph, components of a graph, but the right thing to do in the hardware-software boundary is for the hardware to naturally do it. - And so the GPU is really optimized for the rasterization of triangles.
- Well, no, that's just, well, like in a modern, that's a small piece of modern GPUs. What they did is they still rasterized triangles when you're running a game, but for the most part, most of the computation here in the GPU is running shader programs, but they're single-threaded programs on pixels, not graphs.
- And to be honest, let's say I don't actually know the math behind shading and lighting and all that kind of stuff. I don't know what-- - They look like little simple floating-point programs or complicated ones. You can have 8,000 instructions in a shader program. - But I don't have a good intuition why it could be parallelized so easily.
- No, it's 'cause you have 8 million pixels in every single. So when you have a light that comes down, the angle, the amount of light, like say this is a line of pixels across this table, the amount of light on each pixel is subtly different. - And each pixel is responsible for figuring out what it's on.
- Figuring it out. So that pixel says, "I'm this pixel. "I know the angle of the light. "I know the occlusion. "I know the color I am." Like every single pixel here is a different color. Every single pixel gets a different amount of light. Every single pixel has a subtly different translucency.
So to make it look realistic, the solution was you run a separate program on every pixel. - See, but I thought there's like reflection from all over the place. Is it every pixel's-- - Yeah, but there is. So you build a reflection map, which also has some pixelated thing.
And then when the pixel's looking at the reflection map, it has to calculate what the normal of the surface is, and it does it per pixel. By the way, there's boatloads of hacks on that. You may have a lower resolution light map, your reflection map. There's all these hacks they do.
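The "separate program on every pixel" idea can be sketched in a few lines of Python. This is a toy under stated assumptions: `shade_pixel` is a hypothetical stand-in for a shader program and computes only a simple diffuse (Lambertian) term from a per-pixel surface normal, with none of the occlusion, translucency, or reflection-map hacks described above.

```python
def shade_pixel(normal, light_dir, base_color):
    """Toy shader: scale the pixel's color by how directly the light hits it."""
    dot = sum(n * l for n, l in zip(normal, light_dir))
    intensity = max(0.0, dot)  # surfaces facing away from the light get none
    return tuple(c * intensity for c in base_color)


def render(normals, colors, light_dir):
    """Run the same small program independently on every pixel.

    No pixel depends on its neighbor's result, which is why a GPU can run
    millions of these at once.
    """
    return [
        [shade_pixel(n, light_dir, c) for n, c in zip(normal_row, color_row)]
        for normal_row, color_row in zip(normals, colors)
    ]


normals = [[(0.0, 0.0, 1.0), (0.0, 0.0, -1.0)]]
colors = [[(1.0, 0.5, 0.0), (1.0, 0.5, 0.0)]]
image = render(normals, colors, light_dir=(0.0, 0.0, 1.0))
print(image)
```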
But at the end of the day, it's per pixel computation. - And it so happens you can map graph-like computation onto this pixel-centric computation. - You could do floating point programs on convolutions and matrices. And NVIDIA invested for years in CUDA. First for HPC, and then they got lucky with the AI trend.
- But do you think they're going to essentially not be able to hardcore pivot out of their-- - We'll see. That's always interesting. How often do big companies hardcore pivot? Occasionally. - How much do you know about NVIDIA, folks? - Some. - Some? - Yeah. - Well, I'm curious as well.
Who's ultimately, as a-- - Well, they've innovated several times, but they've also worked really hard on mobile. They worked really hard on radios. They're fundamentally a GPU company. - Well, they tried to pivot. There's an interesting little game and play in autonomous vehicles, right? Or semi-autonomous, like playing with Tesla and so on, and seeing that's dipping a toe into that kind of pivot.
- They came out with this platform, which is interesting technically, but it was like a 3,000 watt, 1,000 watt, $3,000 GPU platform. - I don't know if it's interesting technically. It's interesting philosophically. Technically, I don't know if it's the execution, the craftsmanship is there. I'm not sure. But I didn't get a sense-- - But they were repurposing GPUs for an automotive solution.
- Right, it's not a real pivot. - They didn't build a ground-up solution. Like the chips inside Tesla are pretty cheap. Mobileye has been doing this. They're doing the classic work from the simplest thing. They were building 40 square millimeter chips, and Nvidia, their solution, had two 800 millimeter chips and two 200 millimeter chips.
Boatloads of really expensive DRAMs, and it's a really different approach. So Mobileye fit the, let's say, automotive cost and form factor, and then they added features as it was economically viable. And Nvidia said, "Take the biggest thing, "and we're gonna go make it work." And that's also influenced Waymo.
There's a whole bunch of autonomous startups where they have a 5,000 watt server in their trunk. But that's 'cause they think, "Well, 5,000 watts and $10,000 is okay "'cause it's replacing a driver." Elon's approach was that board has to be cheap enough to put it in every single Tesla, whether they turn on autonomous driving or not.
And Mobileye was like, "We need to fit in the BOM "and cost structure that car companies do." So they may sell you a GPS for 1,500 bucks, but the BOM for that's like $25. - Well, and for Mobileye, it seems like neural networks were not first-class citizens, like the computation.
They didn't start out as a-- - Yeah, it was a CV problem. - Yeah. - And did classic CV and found stoplights and lines, and they were really good at it. - Yeah, and they never, I mean, I don't know what's happening now, but they never fully pivoted. I mean, it's like it's the Nvidia thing.
Then, as opposed to, so if you look at the new Tesla work, it's like neural networks from the ground up, right? - Yeah, and even Tesla started with a lot of CV stuff in it, and Andrej's basically been eliminating it. Move everything into the network. - So without, this isn't like confidential stuff, but you sitting on a porch looking over the world, looking at the work that Andrej's doing, that Elon's doing with Tesla Autopilot, do you like the trajectory of where things are going on the hardware side?
- Well, they're making serious progress. I like the videos of people driving the beta stuff. Like, it's taken some pretty complicated intersections and all that, but it's still an intervention per drive. I mean, I have Autopilot, the current Autopilot, in my Tesla, I use it every day. - Do you have full self-driving beta or no?
So you like where this is going? - They're making progress. It's taking longer than anybody thought. You know, my wonder was, is, you know, hardware three, is it enough computing? Off by two, off by five, off by 10, off by 100. - Yeah. - And I thought it probably wasn't enough, but they're doing pretty well with it now.
- Yeah. - And one thing is, the data set gets bigger, the training gets better. And then there's this interesting thing is, you sort of train and build an arbitrary size network that solves the problem. And then you refactor the network down to the thing that you can afford to ship, right?
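One way to picture "refactor the network down to the thing that you can afford to ship" is magnitude pruning: train freely, then keep only as many weights as the target hardware can afford. This is an assumption chosen for illustration; the conversation does not say which technique Tesla uses, and `prune_to_budget` is a hypothetical helper operating on a flat list of weights.

```python
def prune_to_budget(weights, budget):
    """Keep the `budget` largest-magnitude weights and zero out the rest."""
    if budget >= len(weights):
        return list(weights)
    # Rank weight positions from largest to smallest absolute value.
    ranked = sorted(range(len(weights)), key=lambda i: abs(weights[i]), reverse=True)
    keep = set(ranked[:budget])
    return [w if i in keep else 0.0 for i, w in enumerate(weights)]


trained = [0.1, -2.0, 0.5, 0.05]
shipped = prune_to_budget(trained, budget=2)
print(shipped)
```

In practice a pruned network is usually fine-tuned again afterward, and other approaches such as distillation or quantization serve the same goal.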
So the goal isn't to build a network that fits in the phone, it's to build something that actually works. And then how do you make that most effective on the hardware you have? And they seem to be doing that much better than a couple of years ago. - Well, the one really important thing is also what they're doing well is how to iterate that quickly, which means like, it's not just about one time deployment, one building, it's constantly iterating the network and trying to automate as many steps as possible, right?
- Yeah. - And that's actually the principles of the Software 2.0, like you mentioned with Andrej, is it's not just, I mean, I don't know what the actual, his description of Software 2.0 is, if it's just high-level philosophical or there are specifics, but the interesting thing about what that actually looks like in the real world is, it's that, what I think Andrej calls the data engine.
It's like, it's the iterative improvement of the thing. You have a neural network that does stuff, fails at a bunch of things, and learns from it over and over and over. So you're constantly discovering edge cases. So it's very much about data engineering, like figuring out, it's kind of what you were talking about with Tenstorrent, is you have the data landscape.
You have to walk along that data landscape in a way that's constantly improving the neural network. And that feels like that's the central piece that they've got to solve. And there's two pieces of it. Like you find edge cases that don't work, and then you define something that goes and gets you data for that.
But then the other constraint is whether you have to label it or not. Like the amazing thing about like the GPT-3 stuff is it's unsupervised. So there's essentially infinite amount of data. Now there's obviously infinite amount of data available from cars of people who are successfully driving. But the current pipelines are mostly running on labeled data, which is human limited.
So when that becomes unsupervised, right? It'll create unlimited amount of data, which then they'll scale. Now the networks that may use that data might be way too big for cars, but then there'll be the transformation from now we have unlimited data. I know exactly what I want. Now can I turn that into something that fits in the car?
And that process is gonna happen all over the place. Every time you get to the place where you have unlimited data and that's what software 2.0 is about, unlimited data training networks to do stuff without humans writing code to do it. - And ultimately also trying to discover, like you're saying, the self-supervised formulation of the problem.
So the unsupervised formulation of the problem. Like in driving, there's this really interesting thing, which is you look at a scene that's before you and you have data about what a successful human driver did in that scene one second later. It's a little piece of data that you can use just like with GPT-3 as training.
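The "what did the human do one second later" idea amounts to turning a driving log into training pairs with no human labeling. The sketch below assumes a hypothetical log format, a time-sorted list of `(timestamp_s, frame, control)` tuples, and is not a description of any company's actual pipeline.

```python
def make_training_pairs(log, horizon_s=1.0):
    """Pair each frame with the control the driver applied ~horizon_s later.

    `log` is a list of (timestamp_s, frame, control) sorted by timestamp.
    Frames too close to the end of the log have no future control and
    are dropped.
    """
    pairs = []
    j = 0
    for t, frame, _ in log:
        target_time = t + horizon_s
        # Advance to the first log entry at or after the target time.
        while j < len(log) and log[j][0] < target_time:
            j += 1
        if j == len(log):
            break
        pairs.append((frame, log[j][2]))
    return pairs


log = [
    (0.0, "frame0", 0.0),
    (0.5, "frame1", 0.1),
    (1.0, "frame2", 0.2),
    (1.5, "frame3", 0.3),
]
print(make_training_pairs(log))
```

The label comes for free from the recording itself, which is what makes the supply of training data effectively unbounded.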
Currently, even though Tesla says they're using that, it's an open question to me, how far can you, can you solve all of the driving with just that self-supervised piece of data? And like, I think- - Well, that's what Comma AI is doing. - That's what Comma AI is doing, but the question is how much data, so what Comma AI doesn't have is as good of a data engine, for example, as Tesla does.
That's where the, like the organization of the data. I mean, as far as I know, I haven't talked to George, but they do have the data. The question is how much data is needed? 'Cause we say infinite very loosely here. And then the other question, which you said, I don't know if you think it's still an open question, is are we on the right order of magnitude for the compute necessary?
Is this, is it like what Elon said, this chip that's in there now is enough to do full self-driving, or do we need another order of magnitude? I think nobody actually knows the answer to that question. I like the confidence that Elon has, but. - Yeah, we'll see. There's another funny thing is you don't learn to drive with infinite amounts of data.
You learn to drive with an intellectual framework that understands physics and color and horizontal surfaces and laws and roads and all your experience from manipulating your environment. There's so many factors go into that. So then when you learn to drive, driving is a subset of this conceptual framework that you have.
And so with self-driving cars right now, we're teaching them to drive with driving data. You never teach a human to do that. You teach a human all kinds of interesting things, like language, like don't do that, watch out. There's all kinds of stuff going on. - This is where you, I think, previous time we talked about where you poetically disagreed with my naive notion about humans.
I just think that humans will make this whole driving thing really difficult. - Yeah, all right. I said humans don't move that slow. It's a ballistics problem. - It's a ballistics, humans are a ballistics problem, which is like poetry to me. - It's very possible that in driving they're indeed purely a ballistics problem.
And I think that's probably the right way to think about it. But I still, they still continue to surprise me, those damn pedestrians, the cyclists, other humans in other cars. - Yeah, but it's gonna be one of these compensating things. So like when you're driving, you have an intuition about what humans are going to do, but you don't have 360 cameras and radars and you have an attention problem.
So the self-driving car comes in with no attention problem, 360 cameras, a bunch of other features. So they'll wipe out a whole class of accidents. Emergency braking with radar, and especially as it gets AI enhanced, will eliminate collisions. But then you have the other problems of these unexpected things where you think your human intuition is helping, but then the cars also have a set of hardware features that you're not even close to.
- And the key thing, of course, is if you wipe out a huge number of kind of accidents, then it might be just way safer than a human driver, even if humans are still a problem. That's hard to figure out. - Yeah, that's probably what'll happen. Autonomous cars will have a small number of accidents humans would have avoided, but they'll get rid of the bulk of them.
- What do you think about like Tesla's Dojo efforts, or it can be bigger than Tesla in general. It's kind of like Tenstorrent, trying to innovate. This is the dichotomy, should a company try to from scratch build its own neural network training hardware? - Well, first, I think it's great.
So we need lots of experiments, right? And there's lots of startups working on this and they're pursuing different things. Now, I was there when we started Dojo, and it was sort of like, what's the unconstrained computer solution to go do very large training problems? And then there's fun stuff like, we said, well, we have this 10,000 watt board to cool.
Well, you go talk to guys at SpaceX and they think 10,000 watts is a really small number, not a big number. And there's brilliant people working on it. I'm curious to see how it'll come out. I couldn't tell you. I know it pivoted a few times since I left.
- So the cooling does seem to be a big problem. I do like what Elon said about it, which is like, we don't wanna do the thing unless it's way better than the alternative, whatever the alternative is. So it has to be way better than racks of GPUs. - Yeah, and the other thing is just like the Tesla autonomous driving hardware, it was only serving one software stack.
And the hardware team and the software team were tightly coupled. If you're building a general purpose AI solution, then there's so many different customers with so many different needs. Now, something Andrej said is, I think this is amazing, 10 years ago, like vision, recommendation, language were completely different disciplines.
He said the people literally couldn't talk to each other. And three years ago, it was all neural networks, but the very different neural networks. And recently it's converging on one set of networks. They vary a lot in size, obviously they vary in data, vary in outputs, but the technology has converged a good bit.
- Yeah, these transformers behind GPT-3, it seems like they could be applied to video, they could be applied to a lot of, and it's like, and they're all really simple. - And it was like, they literally replace letters with pixels, it does vision, it's amazing. - And then size actually improves the thing.
So the bigger it gets, the more compute you throw at it, the better it gets. - And the more data you have, the better it gets. So then you start to wonder, well, is that a fundamental thing or is this just another step to some fundamental understanding about this kind of computation?
Which is really interesting. - Us humans don't want to believe that that kind of thing will achieve conceptual understanding, as you were saying, like you'll figure out physics, but maybe it will. Maybe. - Probably will. Well, it's worse than that. It'll understand physics in ways that we can't understand.
I like your Stephen Wolfram talk where he said, there's three generations of physics. There was physics by reasoning, well, big things should fall faster than small things, right, that's reasoning. And then there's physics by equations. But the number of programs in the world that are solved with the single equations is relatively low.
Almost all programs have more than one line of code, maybe 100 million lines of code. So he said, now we're going to physics by equation, which is his project, which is cool. I might point out there was two generations of physics before reasoning, habit, like all animals know things fall and birds fly and predators know how to solve a differential equation to cut off an accelerating, curving animal path.
And then there was, the gods did it, right? So there's five generations. Now, software 2.0 says programming things is not the last step. Data, so there's going to be a physics past Stephen Wolfram's concept. - That's not explainable to us humans. - And actually, there's no reason that I can see why even that's a limit.
Like, there's something beyond that. I mean, usually when you have this hierarchy, it's not like, well, if you have this step and this step and this step, and they're all qualitatively different and conceptually different, it's not obvious why, you know, six is the right number of hierarchy steps and not seven or eight or-- - Well, then it's probably impossible for us to comprehend something that's beyond the thing that's not explainable.
- Yeah, but the thing that, you know, understands the thing that's not explainable to us, well, conceives the next one, and like, I'm not sure why there's a limit to it. Like, our brain hurts, that's a sad story. - If we look at our own brain, which is an interesting illustrative example, in your work with Tenstorrent and trying to design deep learning architectures, do you think about the brain at all?
Maybe from a hardware designer perspective, if you could change something about the brain, what would you change, or do you-- - Funny question. - Like, how would you-- - So, your brain is really weird. Like, you know, your cerebral cortex, where we think we do most of our thinking, is what, like six or seven neurons thick?
- Yeah. - Like, that's weird. Like, all the big networks are way bigger than that. Like, way deeper. So, that seems odd. And then, you know, when you're thinking, if the input generates a result you can use, it goes really fast, but if it can't, that generates an output that's interesting, which turns into an input, and then your brain, to the point where you mull things over for days, and how many trips through your brain is that, right?
Like, it's, you know, 300 milliseconds or something to get through seven levels of neurons. I forget the number exactly. But then it does it over and over and over as it searches. And the brain clearly looks like some kind of graph, 'cause you have a neuron with connections, and it talks to other ones, and it's locally very computationally intense, but it also does sparse computations across a pretty big area.
- There's a lot of messy biological type of things, and it's meaning, like, first of all, there's mechanical, chemical, and electrical signals, it's all that's going on. Then there's the asynchronicity of signals, and there's just a lot of variability that seems continuous and messy, and just a mess of biology, and it's unclear whether that's a good thing, or it's a bad thing, because if it's a good thing, then we need to run the entirety of the evolution.
Well, we're gonna have to start with basic bacteria to create something-- - But imagine you could build a brain with 10 layers. Would that be better or worse? Or more connections, or less connections? Or, you know, we don't know to what level our brains are optimized. But if I was changing things, like, you know you can only hold seven numbers in your head.
Like, why not 100, or a million? - Never thought of that. - And why can't we have a floating point processor that can compute anything we want, like, and see it all properly? Like, that would be kind of fun. And why can't we see in four or eight dimensions?
Like, 3D is kind of a drag. Like, all the hard math transforms are up in multiple dimensions. So there's, you know, you could imagine a brain architecture that, you know, you could enhance with a whole bunch of features that would be, you know, really useful for thinking about things.
- It's possible that the limitations you're describing are actually essential for, like, the constraints are essential for creating, like, the depth of intelligence. Like, that, the ability to reason, you know. - Yeah, it's hard to say, 'cause, like, your brain is clearly a parallel processor. You know, 10 billion neurons talking to each other at a relatively low clock rate.
But it produces something that looks like a serial thought process, it's a serial narrative in your head. - That's true. - But then, there are people, famously, who are visual thinkers. Like, I think I'm a relatively visual thinker. I can imagine any object and rotate it in my head and look at it.
And there are people who say they don't think that way at all. And recently, I read an article about people who say they don't have a voice in their head. They can talk, but when they, you know, it's like, well, what are you thinking? They'll describe something that's visual.
So that's curious. Now, if you're saying, if we dedicated more hardware to holding information, like, you know, 10 numbers or a million numbers, like, would that distract us from our ability to form this kind of singular identity? - Like, it dissipates somehow. - Right, but maybe, you know, future humans will have many identities that have some higher-level organization, but can actually do lots more things in parallel.
- Yeah, there's no reason, if we're thinking modularly, there's no reason we can't have multiple consciousnesses in one brain. - Yeah, and maybe there's some way to make it faster so that the, you know, the area of the computation could still have a unified feel to it while still having way more ability to do parallel stuff at the same time.
Could definitely be improved. - Could be improved? - Yeah. - Okay, well, it's pretty good right now. Actually, people don't give it enough credit. The thing is pretty nice. The fact that the right ends seem to be, give a nice, like, spark of beauty to the whole experience. I don't know.
I don't know if it can be improved easily. - It could be more beautiful. - I don't know how, yeah. - What do you mean, how? All the ways you can imagine. - No, but that's the whole point. I wouldn't be able to imagine, the fact that I can imagine ways in which it could be more beautiful means-- - So do you know, you know, Iain Banks, his stories?
So the super smart AIs there live, mostly live in the world of what they call infinite fun because they can create arbitrary worlds. So they interact in, you know, the story has it. They interact in the normal world and they're very smart and they can do all kinds of stuff.
And, you know, a given mind can, you know, talk to a million humans at the same time 'cause we're very slow. And for reasons, you know, artificial to the story, they're interested in people and doing stuff, but they mostly live in this other land of thinking. - My inclination is to think that the ability to create infinite fun will not be so fun.
- That's sad. - Well-- - Why? There's so many things to do. Imagine being able to make a star, move planets around. - Yeah, yeah, but because we can imagine that is why life is fun, if we actually were able to do it, it'd be a slippery slope where fun wouldn't even have a meaning because we just consistently desensitize ourselves by the infinite amounts of fun we're having.
The sadness, the dark stuff is what makes it fun, I think. That could be the Russian-- - It could be the fun makes it fun and the sadness makes it bittersweet. - Yeah, that's true. Fun could be the thing that makes it fun. So what do you think about the expansion, not through the biology side, but through the BCI, the brain-computer interfaces?
Now you got a chance to check out the Neuralink stuff. - It's super interesting. Like, humans like our thoughts to manifest as action. You know, like as a kid, you know, like shooting a rifle was super fun. Driving a mini bike, doing things. And then computer games, I think, for a lot of kids became the thing where they, you know, they can do what they want.
They can fly a plane, they can do this, they can do this, right? But you have to have this physical interaction. Now imagine, you know, you could just imagine stuff and it happens, right? Like really richly and interestingly. Like we kind of do that when we dream. Like dreams are funny because like if you have some control or awareness in your dreams, like it's very realistic looking or not realistic, depends on the dream, but you can also manipulate that.
And you know, what's possible there is odd. And the fact that nobody understands it's hilarious, but. - Do you think it's possible to expand that capability through computing? - Sure. - Is there some interesting, so from a hardware designer perspective, is there, do you think it'll present totally new challenges in the kind of hardware required that like, so this hardware isn't standalone computing.
- Well, just take it from this. So today, computer games are rendered by GPUs. - Right. - Right, so, but you've seen the GAN stuff, right? Where trained neural networks render realistic images, but there's no pixels, no triangles, no shaders, no light maps, no nothing. So the future of graphics is probably AI, right?
- Yes. - Now that AI is heavily trained by lots of real data. Right, so if you have an interface with an AI renderer, right, so if you say render a cat, it won't say, well, how tall is the cat and how big, you know, it'll render a cat.
And you might say, well, a little bigger, a little smaller, you know, make it a tabby, shorter hair, you know, like you could tweak it. Like the amount of data you'll have to send to interact with a very powerful AI renderer could be low. - But the question is, for brain-computer interfaces, we'd need to render not onto a screen, but render onto the brain.
And like directly, so there's a bandwidth. - Well, we could do it both ways. I mean, our eyes are really good sensors. It could render onto a screen, and we could feel like we're participating in it. You know, they're gonna have, you know, like the Oculus kind of stuff.
It's gonna be so good when a projection to your eyes, you think it's real. You know, they're slowly solving those problems. And I suspect when the renderer of that information into your head is also AI mediated, you know, they'll be able to give you the cues that, you know, you really want for depth and all kinds of stuff.
Like your brain is partly faking your visual field, right? Like your eyes are twitching around, but you don't notice that. Occasionally they blank, you don't notice that. You know, there's all kinds of things. Like you think you see over here, but you don't really see there. It's all fabricated.
- Yeah. - So. - Yeah, peripheral vision is fascinating. - So if you have an AI renderer that's trained to understand exactly how you see and the kind of things that enhance the realism of the experience, it could be super real actually. So I don't know what the limits of that are.
But obviously if we have a brain interface that goes in inside your visual cortex in a better way than your eyes do, which is possible, it's a lot of neurons. - Yeah. - Maybe that'll be even cooler. - But the really cool thing is that it has to do with the infinite fun that you were referring to, which is our brains seem to be very limited.
And like you said, computations. - Also very plastic. - Very plastic, yeah. So it's an interesting combination. - The interesting open question is the limits of that neuroplasticity. Like how flexible is that thing? 'Cause we haven't really tested it. - We know about the experiments where they put like a pressure pad on somebody's head and had a visual transducer pressurize it and somebody slowly learned to see.
- Yep. Especially at a young age, if you throw a lot at it, like what can it, so can you like arbitrarily expand it with computing power? So connected to the internet directly somehow? - Yeah, the answer's probably yes. - So the problem with biology and ethics is like, there's a mess there.
Like us humans are perhaps unwilling to take risks into directions that are full of uncertainty. So it's like-- - No, no. 90% of the population's unwilling to take risks. The other 10% is rushing into the risks, unaided by any infrastructure whatsoever. That's where all the fun happens in society.
There's been huge transformations in the last couple thousand years. - Yeah, it's funny. I've gotten the chance to interact with, this is Matthew Johnson from Johns Hopkins. He's doing this large-scale study of psychedelics. It's becoming more and more, I've gotten a chance to interact with that community of scientists working on psychedelics.
But because of that, that opened the door to me to all these, what do they call it, psychonauts, the people who, like you said, the 10% who are like, I don't care, I don't know if there's a science behind this. I'm taking this spaceship to, if I'll be the first on Mars, I'll be. Psychedelics are interesting in the sense that in another dimension, like you said, it's a way to explore the limits of the human mind.
Like, what is this thing capable of doing? 'Cause you kinda, like when you dream, you detach it. I don't know exactly the neuroscience of it, but you detach your reality from what your mind, the images your mind is able to conjure up, and your mind goes into weird places.
And like entities appear. Somehow Freudian type of trauma is probably connected in there somehow, but you start to have like these weird, vivid worlds that-- - So do you actively dream? - No. - Why not? I have like six hours of dreams a night. It's like really useful time.
- I know, I haven't, I don't for some reason. I just knock out, and I have sometimes anxiety-inducing kinda like very pragmatic nightmare type of dreams, but nothing fun, nothing-- - Nothing fun? - Nothing fun. I try, I unfortunately mostly have fun in the waking world, which is very limited in the amount of fun you can have.
- It's not that limited either. - Yeah, that's why-- - Maybe we'll have to talk. (laughing) - Yeah, I need instructions. Yeah, why-- - There's like a manual for that. You might wanna-- - I'll look it up. I'll ask Elon. What do you dream? - You know, years ago, and I read about, you know, like a book about how to have, you know, become aware of your dreams.
I worked on it for a while. Like there's this trick about, you know, imagine you can see your hands and look out, and I got somewhat good at it. But mostly, when I'm thinking about things or working on problems, I prep myself before I go to sleep. It's like I pull into my mind all the things I wanna work on or think about.
And then that, let's say, greatly improves the chances that I'll work on that while I'm sleeping. - And once-- - And then I also, you know, basically ask to remember it. And I often remember very detailed-- - Within the dream or outside the dream. - Well, to bring it up in my dreaming and then to remember it when I wake up.
It's more of a meditative practice. You say, you know, to prepare yourself to do that. Like if you go to sleep, you know, still gnashing your teeth about some random thing that happened that you're not really that interested in, you'll dream about it. - That's really interesting. Maybe-- - But you can direct your dreams somewhat by prepping.
- Yeah, I'm gonna have to try that. It's really interesting. Like the most important, the interesting, not like, what did this guy send in an email, kind of like stupid worry stuff, but like fundamental problems you're actually concerned about in prepping-- - And interesting things you're worried about or a book you're reading or, you know, some great conversation you had or some adventure you wanna have.
Like there's a lot of space there. And it seems to work that, you know, my percentage of interesting dreams and memories went up. - Is there a, is that the source of, if you were able to deconstruct like where some of your best ideas came from, is there a process that's at the core of that?
Like so some people, you know, walk and think, some people like in the shower, the best ideas hit 'em. If you talk about like Newton, the apple hitting him on the head. - No, I found out a long time ago, I process things somewhat slowly. So like in college, I had friends who could study at the last minute and get an A the next day.
I can't do that at all. So I always front-loaded all the work. Like I'd do all the problems early, you know, for finals, like the last three days, I wouldn't look at a book. Because I want, you know, 'cause like a new fact the day before finals may screw up my understanding of what I thought I knew.
So my goal was to always get it in and give it time to soak. And I used to, you know, I remember when we were doing like 3D calculus, I would have these amazing dreams of 3D surfaces with normal, you know, calculating the gradient and just like all come up.
So it was like really fun, like very visual. And if I got cycles of that, that was useful. And the other is, don't over-filter your ideas. Like I like that process of brainstorming where lots of ideas can happen. I like people who have lots of ideas. - And they just let them sit.
- Then there's, yeah, I'll let them sit and let it breathe a little bit and then reduce it to practice. Like at some point you really have to, does it really work? Like, you know, is this real or not? Right, but you have to do both. There's creative tension there.
Like how do you be both open and, you know, precise? - Have you had ideas that you just, that sit in your mind for like years before the? - Sure. - That's an interesting way to, is generate ideas and just let them sit. Let them sit there for a while.
- I think I have a few of those ideas. - You know, that was so funny. Yeah, I think that's, you know, creativity discipline or something. - For the slow thinkers in the room, I suppose. As I, some people, like you said, are just like, like the. - Yeah, it's really interesting.
There's so much diversity in how people think. You know, how fast or slow they are, how well they remember or don't. Like, you know, I'm not super good at remembering facts, but processes and methods. Like in our engineering, I went to Penn State and almost all our engineering tests were open book.
I could remember the page and not the formula. But as soon as I saw the formula, I could remember the whole method if I'd learned it. - Yeah. - So it's a funny, where some people could, you know, I just watched friends like flipping through the book trying to find the formula, even knowing that they'd done just as much work.
And I would just open the book, you know, it's on page 27, bottom half, I could see the whole thing visually. - Yeah. - And you know. - And you have to learn that about yourself and figure out what the, how to function optimally. - I had a friend who was always concerned he didn't know how he came up with ideas.
He had lots of ideas, but he said they just sort of popped up like you'd be working on something, you have this idea, like where does it come from? But you can have more awareness of it. Like how your brain works is a little murky as you go down from the voice in your head or the obvious visualizations.
Like when you visualize something, how does that happen? - Yeah, that's weird. - You know, if I say, you know, visualize volcano, it's easy to do, right? - And what does it actually look like when you visualize it? - I can visualize to the point where I don't see very much out of my eyes and I see the colors of the thing I'm visualizing.
- Yeah, but there's like a, there's a shape, there's a texture, there's a color, but there's also conceptual visualization. Like what are you actually visualizing when you're visualizing volcano? Just like with peripheral vision, you think you see the whole thing. - Yeah, yeah, yeah, that's a good way to say it.
You know, you have this kind of almost peripheral vision of your visualizations. They're like these ghosts. But if you work on it, you can get a pretty high level of detail. - And somehow you can walk along those visualizations and come up with an idea, which is weird. - But when you're thinking about solving problems, like you're putting information in, you're exercising the stuff you do know, you're sort of teasing the area that you don't understand and don't know, but you can almost feel that process happening.
Like I know sometimes when I'm working really hard on something, I get really hot when I'm sleeping. And it's like, I've got the blanket thrown off, I wake up, the whole blanket's thrown on the floor. And every time, I wake up and think, wow, that was great.
- Are you able to reverse engineer what the hell happened there? - Well, sometimes it's vivid dreams and sometimes it's this kind of, like you say, like shadow thinking that you sort of have this feeling you're going through this stuff, but it's not that obvious. - Isn't that so amazing that the mind just does all these little experiments?
I never, I always thought it's like a river that you can't, you're just there for the ride. But you're right, if you prep it, maybe-- - It's all understandable. Meditation really helps. You gotta start figuring out, you need to learn the language of your own mind. And there's multiple levels of it.
But-- - The abstractions again, right? - It's somewhat comprehensible and observable and feelable or whatever the right word is. Yeah, you're not alone for the ride. You are the ride. - I have to ask you, hardware engineer, working on neural networks now, what's consciousness? What the hell is that thing?
Is that just some little weird quirk of our particular computing device? Or is it something fundamental that we really need to crack open if we're to build good computers? Do you ever think about consciousness? Like why it feels like something to be-- - I know, it's really weird. So, I mean, everything about it's weird.
First, it's a half a second behind reality. It's a post hoc narrative about what happened. You've already done stuff by the time you're conscious of it. And your consciousness generally is a single threaded thing, but we know your brain is 10 billion neurons running some crazy parallel thing. And there's a really big sorting thing going on there.
It also seems to be really reflective in the sense that you create a space in your head. Like we don't really see anything, right? Like photons hit your eyes, it gets turned into signals, it goes through multiple layers of neurons. I'm so curious that that looks glassy and that looks not glassy.
Like how the resolution of your vision is so high you have to go through all this processing. Where for most of it, it looks nothing like vision. Like there's no theater in your mind, right? So we have a world in our heads. We're literally just isolated behind our sensors.
But we can look at it, speculate about it, speculate about alternatives, problem solve, what if. There's so many things going on and that process is lagging reality. - And it's single threaded even though the underlying thing is like massively parallel. - So it's so curious. So imagine you're building an AI computer.
If you wanted to replicate humans, well you'd have huge arrays of neural networks and apparently only six or seven deep, which is hilarious. They don't even remember seven numbers but I think we can upgrade that a lot. And then somewhere in there, you would train the network to create basically the world that you live in.
- So it tells stories to itself about the world that it's perceiving. - Well, create the world, tell stories in the world and then have many dimensions of, like side shows to it. We have an emotional structure. We have a biological structure and that seems hierarchical too. Like if you're hungry, it dominates your thinking.
If you're mad, it dominates your thinking. And we don't know if that's important to consciousness or not, but it certainly disrupts, intrudes in the consciousness. So there's lots of structure to that and we like to dwell on the past. We like to think about the future. We like to imagine, we like to fantasize.
And the somewhat circular observation of that is the thing we call consciousness. Now, if you created a computer system and did all those things, create worldviews, create the future alternate histories, dwelled on past events accurately or semi-accurately. - Will consciousness just spring up like naturally? - Well, would that look and feel conscious to you?
Like you seem conscious to me, but I don't know. - External observer sense. Do you think a thing that looks conscious is conscious? Like do you, again, this is like an engineering kind of question I think, because, like if we want to engineer consciousness, is it okay to engineer something that just looks conscious?
Or is there a difference between-- - Well, we all have consciousness 'cause it's a super effective way to manage our affairs. - Yeah, it's a social element, yeah. - Well, it gives us a planning system. We have a huge amount of stuff. Like when we're talking, like the reason we can talk really fast is we're modeling each other in really high-level detail.
- And consciousness is required for that. - Well, all those components together manifest consciousness, right? So if we make intelligent beings that we want to interact with, that we're like wondering what they're thinking, looking forward to seeing them, when we interact with them, they're interesting, surprising, fascinating, they will probably feel conscious like we do and we'll perceive them as conscious.
I don't know why not, but you never know. - Another fun question on this, because from a computing perspective, we're trying to create something that's human-like or superhuman-like. Let me ask you about aliens. - Aliens. - Do you think there's intelligent alien civilizations out there? And do you think their technology, their computing, their AI bots, their chips are of the same nature as ours?
- Yeah, I've got no idea. If there's lots of aliens out there, they've been awfully quiet. I mean, there's speculation about why. There seems to be more than enough planets out there. - There's a lot. - There's intelligent life on this planet that seems quite different. Dolphins seem plausibly understandable.
Octopuses don't seem understandable at all. If they lived longer than a year, maybe they would be running the planet. They seem really smart. And their neural architecture is completely different than ours. Now, who knows how they perceive things. - I mean, that's the question, is for us intelligent beings, we might not be able to perceive other kinds of intelligence if they become sufficiently different than us.
So we cannot understand all of this. - We live in the current constrained world. It's three-dimensional geometry and the geometry defines a certain amount of physics. And there's how time work seems to work. There's so many things that seem like a whole bunch of the input parameters to another conscious being are the same.
Like if it's biological, biological things seem to be in a relatively narrow temperature range. Because organics aren't stable, too cold or too hot. So if you specify the list of things that input to that, but as soon as we make really smart beings and they go solve how to think about a billion numbers at the same time and how to think in n dimensions.
There's a funny science fiction book where all the society had uploaded into this matrix. And at some point, some of the beings in the matrix thought, I wonder if there's intelligent life out there. So they had to do a whole bunch of work to figure out like how to make a physical thing 'cause their matrix was self-sustaining.
And they made a little spaceship and they traveled to another planet. When they got there, there was like life running around, but there was no intelligent life. And then they figured out that there were these huge organic matrices all over the planet. Inside there were intelligent beings and they uploaded themselves into that matrix.
So everywhere intelligent life was, soon as it got smart, it up-leveled itself into something way more interesting than 3D geometry. - Yeah, it escaped, whatever this, up-leveled is better. - No, not escaped. - Up-leveled is better. The essence of what we think of as an intelligent being, I tend to like the thought experiment of the organism, like humans aren't the organisms.
I like the notion of like Richard Dawkins and memes that ideas themselves are the organisms that are just using our minds to evolve. So we're just like meat receptacles for ideas to breed and multiply and so on. And maybe those are the aliens. - So Jordan Peterson has a line that says, you think you have ideas, but ideas have you.
- Yeah, good line. And then we know about the phenomenon of groupthink and there's so many things that constrain us. But I think you can examine all that and not be completely owned by the ideas and completely sucked into groupthink. And part of your responsibility as a human is to escape that kind of phenomena, which isn't, it's one of the creative tension things again.
You're constructed by it, but you can still observe it and you can think about it and you can make choices about to some level, how constrained you are by it. And it's useful to do that. But at the same time, and it could be by doing that, the group in society you're part of becomes collectively even more interesting.
So the outside observer will think, wow, all these Lexes running around with all these really independent ideas have created something even more interesting in the aggregate. So I don't know. Those are lenses to look at the situation. That'll give you some inspiration, but I don't think they're constrained. - Right.
As a small little quirk of history, it seems like you're related to Jordan Peterson, like you mentioned. He's going through some rough stuff now. Is there some comment you can make about the roughness of the human journey, the ups and downs? - Well, I became an expert in benzo withdrawal, which is, you take benzodiazepines, and at some point they interact with GABA circuits to reduce anxiety and do a hundred other things.
There's actually no known list of everything they do 'cause they interact with so many parts of your body. And then once you're on them, you habituate to them and you have a dependency. It's not like a drug dependency where you're trying to get high. It's a metabolic dependency.
And then if you discontinue them, there's a funny thing called kindling, which is if you stop them, you'll have horrible withdrawal symptoms, and then if you go back on them at the same level, you won't be stable. And that unfortunately happened to him. - Because it's so deeply integrated into all the kinds of systems in the body.
- It literally changes the size and numbers of neurotransmitter sites in your brain. So there's a process called the Ashton Protocol where you taper it down slowly over two years. The people who go through that go through unbelievable hell. And what Jordan went through seemed to be worse because on advice of doctors, well, stop taking these and take this.
It was a disaster. And he got some, yeah, it was pretty tough. He seems to be doing quite a bit better intellectually. You can see his brain clicking back together. I spent a lot of time with him. I've never seen anybody suffer so much. - Well, his brain is also like this powerhouse, right?
So I wonder, does a brain that's able to think deeply about the world suffer more through these kinds of withdrawals? - I don't know. I've watched videos of people going through withdrawal. They all seem to suffer unbelievably. And my heart goes out to everybody. And there's some funny math about this.
Some doctor says, best he can tell, the standard recommendations, don't take them for more than a month and then taper over a couple of weeks. Many doctors prescribe them endlessly, which is against the protocol, but it's common, right? And then something like 75% of people, when they taper, it's, you know, half the people have difficulty, but 75% get off okay.
20% have severe difficulty, and 5% have life-threatening difficulty. And if you're one of those, it's really bad. And the stories that people have on this are heartbreaking and tough. - So you put some of the fault at the doctors. Do they just not know what the hell they're doing? - Oh, no, it's hard to say.
It's one of those commonly prescribed things. Like one doctor said, what happens is, if you're prescribed them for a reason, and then you have a hard time getting off, the protocol basically says you're either crazy or dependent and you get kind of pushed into a different treatment regime where you're a drug addict or a psychiatric patient.
And so like one doctor said, you know, I prescribed them for 10 years thinking I was helping my patients and I realized I was really harming them. And, you know, the awareness of that is slowly coming up. The fact that they're casually prescribed to people is horrible and it's bloody scary.
And some people are stable on them, but they're on them for life. Like once you, you know, it's another one of those drugs. But benzos long range have real impacts on your personality. People talk about the benzo bubble where you get disassociated from reality and your friends a little bit.
It's really terrible. - The mind is terrifying. We were talking about how the infinite possibility of fun, but like it's the infinite possibility of suffering too, which is one of the dangers of like expansion of the human mind. It's like, I wonder if all the possible experiences that an intelligent computer can have, is it mostly fun or is it mostly suffering?
So like if you brute force expand the set of possibilities, like are you going to run into some trouble in terms of like torture and suffering and so on? Maybe our human brain is just protecting us from much more possible pain and suffering. Maybe the space of pain is like much larger than we could possibly imagine.
And that-- - The world's in a balance. You know, all the literature on religion and stuff is, you know, the struggle between good and evil is balanced for us, very finely tuned for reasons that are complicated. But that's a long philosophical conversation. - Speaking of balance that's complicated, I wonder because we're living through one of the more important moments in human history with this particular virus, it seems like pandemics have at least the ability to kill off most of the human population at their worst.
And there's just fascinating 'cause there's so many viruses in this world, there's so many. I mean, viruses basically run the world in the sense that they've been around a very long time. They're everywhere. They seem to be extremely powerful in a distributed kind of way, but at the same time, they're not intelligent and they're not even living.
Do you have like high level thoughts about this virus that like in terms of you being fascinated or terrified or somewhere in between? - So I believe in frameworks, right? So like one of them is evolution. Like we're evolved creatures, right? - Yes. - And one of the things about evolution is it's hyper competitive.
And it's not competitive out of a sense of evil, it's competitive in a sense of there's endless variation and variations that work better win. And then over time, there's so many levels of that competition. Like multicellular life partly exists because of the competition between different kinds of life forms.
And we know sex partly exists to scramble our genes so that we have genetic variation against the invasion of the bacteria and the viruses and it's endless. Like I read some funny statistic, like the density of viruses and bacteria in the ocean is really high. And one third of the bacteria die every day because of viruses invading them.
Like one third of them. - Wow. - Like I don't know if that number is true, but it was like, the amount of competition and what's going on is stunning. And there's a theory that as we age, we slowly accumulate bacteria and viruses and as our immune system kind of goes down, that's what slowly kills us.
- It just feels so peaceful from a human perspective when we sit back and are able to have a relaxed conversation and there's wars going on out there. - Like right now, you're harboring how many bacteria? - There's ones, many of them are parasites on you and some of them are helpful and some of them are modifying your behavior and some of them are, it's really wild.
But this particular manifestation is unusual in the demographic, how it hit, and the political response that it engendered and the healthcare response it engendered and the technology it engendered, it's kind of wild. - Yeah, the communication on Twitter that it led to. - Every level. - Yeah, all that kind of stuff, at every single level, yeah.
- But what usually kills life, the big extinctions are caused by meteors and volcanoes. - That's the one you're worried about as opposed to human-created bombs that we launch-- - Solar flares are another good one. Occasionally, solar flares hit the planet. - So it's nature. - Yeah, it's all pretty wild.
- Another historic moment, this is perhaps outside but perhaps within your space of frameworks that you think about that just happened, I guess, a couple weeks ago is, I don't know if you're paying attention at all, is the GameStop and Wall Street Bets. - It was super fun. - So it's really fascinating.
There's kind of a theme to this conversation we're having today 'cause it's like neural networks, it's cool how there's a large number of people in a distributed way, almost having a kind of fund, were able to take on the powerful elite hedge funds, centralized powers, and overpower them. Do you have thoughts on this whole saga?
- I don't know enough about finance but it was like the Elon, Robinhood guy when they talked. - Yeah, what'd you think about that? - Well, the Robinhood guy didn't know how the finance system worked, that was clear. He was treating the people who settled the transactions as a black box and suddenly somebody called him up and said, "Hey, black box calling you.
Your transaction volume means you need to put out $3 billion right now." And he's like, "I don't have $3 billion. I don't even make any money on these trades. Why do I owe $3 billion while you're sponsoring the trade?" So there was a set of abstractions that, I don't think either, like now we understand it.
This happens in chip design. You buy wafers from TSMC or Samsung or Intel and they say it works like this and you do your design based on that and then chip comes back and it doesn't work. And then suddenly you start having to open the black boxes. Do transistors really work like they said?
What's the real issue? There's a whole set of things that created this opportunity and somebody spotted it. Now, people spot these kinds of opportunities all the time. There's been flash crashes. Short squeezes are fairly regular. Every CEO I know hates the shorts because they're trying to manipulate their stock in a way that they make money and deprive value from both the company and the investors.
So the fact that some of these stocks were so shorted, it's hilarious that this hasn't happened before. I don't know why. I don't actually know why some serious hedge funds didn't do it to other hedge funds. And some of the hedge funds actually made a lot of money on this.
So my guess is we know 5% of what really happened and that a lot of the players don't know what happened and the people who probably made the most money aren't the people that they're talking about. - Do you think there was something, I mean, this is the cool kind of Elon, you're the same kind of conversationalist, which is like first principles questions of like what the hell happened.
Just very basic questions of like, was there something shady going on? What, who are the parties involved? It's the basic questions that everybody wants to know about. - Yeah, so we're in a very hyper-competitive world, right? But transactions like buying and selling stock is a trust event. I trust the company represented itself properly.
I bought the stock 'cause I think it's gonna go up. I trust that the regulations are solid. Now, inside of that, there's all kinds of places where humans over-trust, and this exposed, let's say, some weak points in the system. I don't know if it's gonna get corrected. I don't know if we have close to the real story.
My suspicion is we don't. And listen to that guy, he was like a little wide-eyed about and then he did this and then he did that. And I was like, I think you should know more about your business than that. But again, there's many businesses when like this layer is really stable, you stop paying attention to it.
You pay attention to the stuff that's bugging you or new. Like you don't pay attention to the stuff that just seems to work all the time. You just, you know, sky's blue every day, California. And once in a while, it rains there. It's like, what do we do? Somebody go bring in the lawn furniture.
You know, like it's getting wet. You don't know why it's getting wet. - Yeah, it doesn't know. - It was blue for like 100 days and now it's, you know. - But part of the problem here with Vlad, the CEO of Robinhood is the scaling that we've been talking about is there's a lot of unexpected things that happen with the scaling.
And you have to be, I think the scaling forces you to then return to the fundamentals. - Well, it's interesting because when you buy and sell stocks, the scaling is, you know, the stocks only move in a certain range. And if you buy a stock, you can only lose that amount of money.
On the short market, you can lose a lot more than you can benefit. Like it has a weird cost function or whatever the right word for that is. So he was trading in a market where he wasn't actually capitalized for the downside. If it got outside a certain range.
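The asymmetry described here, where a buyer's loss is capped at what was paid while a short seller's loss has no cap, can be sketched in a few lines of Python. This is only an illustration: the function names, the $20 entry price, and the 100-share position are hypothetical, and fees, margin interest, and forced liquidation are ignored.

```python
def long_pnl(entry_price: float, exit_price: float, shares: int) -> float:
    """Profit or loss from buying at entry_price and selling at exit_price."""
    return (exit_price - entry_price) * shares


def short_pnl(entry_price: float, exit_price: float, shares: int) -> float:
    """Profit or loss from selling short at entry_price and buying back at exit_price."""
    return (entry_price - exit_price) * shares


if __name__ == "__main__":
    entry, shares = 20.0, 100  # hypothetical position, not from the conversation
    for exit_price in (0.0, 20.0, 100.0, 400.0):
        print(
            f"exit at ${exit_price:>6.2f}: "
            f"long {long_pnl(entry, exit_price, shares):>10.2f}  "
            f"short {short_pnl(entry, exit_price, shares):>10.2f}"
        )
```

In this sketch the long position can never lose more than entry price times shares, because the price cannot fall below zero. The short position's best case is that same amount, while its loss keeps growing as the price rises.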
Now, whether something nefarious has happened, I have no idea, but at some point, the financial risk to both him and his customers was way outside of his financial capacity and his understanding of how the system worked was clearly weak or he didn't represent himself. I don't know the person. - There's a-- - When I listened to him, it could have been the surprise question was like, and then these guys called and, you know, it sounded like he was treating stuff as a black box.
Maybe he shouldn't have, but maybe he has a whole pile of experts somewhere else and it was going on. I don't know. - Yeah. I mean, this is one of the qualities of a good leader is under fire, you have to perform. And that means to think clearly and to speak clearly.
And he dropped the ball on those things 'cause, and understand the problem, quickly learn and understand the problem like at the basic level, like what the hell happened. And my guess is, you know, at some level, it was amateurs trading against, you know, experts/insiders/people with, you know, special information.
- Outsiders versus insiders. - Yeah. And the insiders, you know, my guess is the next time this happens, will make money on it. - The insiders always win? - Well, they have more tools and more incentive. I mean, this always happens. Like the outsiders are doing this for fun.
The insiders are doing this 24/7. - But there's numbers in the outsiders. This is the interesting thing is, it could be a new chapter. - Well, there's numbers on the insiders too. - Different kind of numbers. - Different kind of numbers. - But this could be a new era because, I don't know, at least I didn't expect that a bunch of Redditors could, you know, there's millions of people can get together.
- It was a surprise attack. The next one will be a surprise. - But don't you think the crowd, the people are planning the next attack? - We'll see. It has to be a surprise, it can't be the same game. - And so the insiders-- - It could be there's a very large number of games to play and they can be agile about it.
I don't know, I'm not an expert. - Right, that's a good question. The space of games, how restricted is it? - Yeah, and the system is so complicated, it could be relatively unrestricted. And also like, you know, during the last couple of financial crashes, you know, what set it off was, you know, sets of derivative events where, you know, Nassim Taleb's, you know, thing is, they're trying to lower volatility in the short run by creating tail events.
And systems always evolve towards that and then they always crash. Like the S curve is the, you know, start low, ramp, plateau, crash. It's 100% effective. - In the long run. Let me ask you some advice to put on your profound hat. There's a bunch of young folks who listen to this thing for no good reason whatsoever.
Undergraduate students, maybe high school students, maybe just young folks, young at heart, looking for the next steps to take in life. What advice would you give to a young person today about life, maybe career, but also life in general? - Get good at some stuff. Well, get to know yourself, right?
Like get good at something that you're actually interested in. You have to love what you're doing to get good at it. You really gotta find that. Don't waste all your time doing stuff that's just boring or bland or numbing, right? Don't let old people screw you. Well, people get talked into doing all kinds of shit and racking up huge student debts and like there's so much crap going on, you know?
- And it drains your time and drains your-- - You know, the Eric Weinstein thesis that the older generation won't let go and they're trapping all the young people. - I think there's some truth to that. - Yeah, sure. Just because you're old doesn't mean you stop thinking. I know lots of really original old people.
I'm an old person. But you have to be conscious about it. You can fall into the ruts and then do that. I mean, when I hear young people spouting opinions, it sounds like they come from Fox News or CNN, I think they've been captured by groupthink and memes. - They're supposed to think on their own.
- So if you find yourself repeating what everybody else is saying, you're not gonna have a good life. Like that's not how the world works. It seems safe, but it puts you at great jeopardy for being boring or unhappy. - How long did it take you to find the thing that you have fun with?
- I don't know. I've been a fun person since I was pretty little. - So everything. - I've gone through a couple periods of depression in my life. - For a good reason or for a reason that doesn't make any sense? - Yeah. Yeah, like some things are hard.
Like you go through mental transitions in high school. I was really depressed for a year. And I think I had my first midlife crisis at 26. I kind of thought, is this all there is? Like I was working at a job that I loved, but I was going to work and all my time was consumed.
- What's the escape out of that depression? What's the answer to is this all there is? - Well, a friend of mine, I asked him 'cause he was working his ass off. I said, "What's your work-life balance?" Like there's work, friends, family, personal time. Are you balancing in that?
And he said, "Work 80%, family 20%, and I try to find some time to sleep." Like there's no personal time. There's no passionate time. Like young people are often passionate about work. So, and I was sort of like that. But you need to have some space in your life for different things.
- And that creates, that makes you resistant to the deep dips into depression kind of thing. - Yeah, well, you have to get to know yourself too. Meditation helps. Some physical, something physically intense helps. - Like the weird places your mind goes kind of thing. - And why does it happen?
Why do you do what you do? - Like triggers, like the things that cause your mind to go to different places kind of thing, or like events. - Your upbringing, for better or worse, whether your parents are great people or not, you come into adulthood with all kinds of emotional burdens.
And you can see some people are so bloody stiff and restrained and they think the world's fundamentally negative, like you maybe. You have unexplored territory. - Yeah. - Or you're afraid of something. - Definitely afraid of quite a few things. - Then you gotta go face 'em. Like what's the worst thing that can happen?
You're gonna die, right? Like that's inevitable. You might as well get over that, like 100%, that's right. Like people are worried about the virus, but the human condition is pretty deadly. - There's something about embarrassment that's, I've competed a lot in my life, and I think if I'm to introspect it, the thing I'm most afraid of is being humiliated, I think.
- Nobody cares about that. Like you're the only person on the planet who cares about you being humiliated. It's like a really useless thought. - It is. - It's like, you're all humiliated, something happened in a room full of people and they walk out and they didn't think about it one more second.
Or maybe somebody told a funny story to somebody else and then it dissipates throughout, yeah. - No, I know it too. I've been really embarrassed about shit that nobody cared about but myself. - Yeah. - It's a funny thing. - So the worst thing ultimately is just-- - Yeah, but that's a cage and you have to get out of it.
Like once you, here's the thing, once you find something like that, you have to be determined to break it. 'Cause otherwise you'll just, so you accumulate that kind of junk and then you die as a mess. - So the goal, I guess it's like a cage within a cage.
I guess the goal is to die in the biggest possible cage. - Well, ideally you'd have no cage. People do get enlightened, I've met a few, it's great. - You found a few? There's a few out there? I don't know. - Of course there are. - Wow. - Either that or they have, you know, it's a great sales pitch.
There's like enlightened people who write books and do all kinds of stuff. - It's a good way to sell a book, I'll give you that. - You've never met somebody you just thought, they just kill me. Like this, like mental clarity, humor. - No, 100%, but I just feel like they're living in a bigger cage.
They have their own. - You still think there's a cage? - There's still a cage. - You secretly suspect there's always a cage. - There's nothing outside the universe. - There's nothing outside the cage. (laughing) - You work, you worked at a bunch of companies, you led a lot of amazing teams.
I don't, I'm not sure if you've ever been like at the early stages of a startup, but do you have advice for somebody that wants to do a startup or build a company, like build a strong team of engineers that are passionate and just want to solve a big problem?
Like is there a more specifically on that point? - Well, you have to be really good at stuff. If you're gonna lead and build a team, you better be really interested in how people work and think. - The people or the solution to the problem. So there's two things, right?
One is how people work and the other is the-- - Well, actually, there's quite a few successful startups. It's pretty clear the founders don't know anything about people. Like the idea was so powerful that it propelled them. But I suspect somewhere early, they hired some people who understood people.
'Cause people really need a lot of care and feeding to collaborate and work together and feel engaged and work hard. Like startups are all about outproducing other people. Like you're nimble because you don't have any legacy. You don't have a bunch of people who are depressed about life, just showing up.
So startups have a lot of advantages that way. - Do you like the, Steve Jobs talked about this idea of A players and B players. I don't know if you know this formulation. - Yeah, I know. - That organizations that get taken over by B player leaders often really underperform; they hire C players.
That said in big organizations, there's so much work to do. And there's so many people who are happy to do what the leadership or the big idea people would consider menial jobs. And you need a place for them, but you need an organization that both values and rewards them, but doesn't let them take over the leadership of it.
- Got it. But so you need to have an organization that's resistant to that. But in the early days, the notion with Steve was that like one B player in a room of A players will be like destructive to the whole. - I've seen that happen. I don't know if it's like always true.
Like, you run into people who are clearly B players, but they think they're A players. And so they have a loud voice at the table and they make lots of demands for that. But there's other people who are like, I know who I am. I just want to work with cool people on cool shit and just tell me what to do and I'll go get it done.
So you have to, again, this is like people skills. Like what kind of person is it? I've met some really great people I love working with that weren't the biggest idea people, the most productive ever, but they show up, they get it done. They create connection and community that people value.
It's pretty diverse. I don't think there's a recipe for that. - I gotta ask you about love. - I heard you're into this now. - Into this love thing? - Yeah. Do you think this is your solution to your depression? - No, I'm just trying to, like you said, the enlightened people on occasion, trying to sell a book.
I'm writing a book about love. - You're writing a book about love? - No, I'm not. I'm not. (laughing) - A friend of mine, he said, you should really write a book about your management philosophy, he said, it'd be a short book. (laughing) - Well, that one was thought pretty well.
What role do you think love, family, friendship, all that kind of human stuff play in a successful life? You've been exceptionally successful in the space of like running teams, building cool shit in this world, creating some amazing things. Did love get in the way? Did love help? Did family get in the way?
Did family help? Friendship? - You want the engineer's answer? - Please. - But first, love is functional, right? - It's functional in what way? - So, we habituate ourselves to the environment. And actually Jordan told me, Jordan Peterson told me this line. So, you go through life and you just get used to everything, except for the things you love.
They remain new. Like, this is really useful for, you know, like other people's children and dogs and trees. You just don't pay that much attention to them. Your own kids, you're monitoring them really closely. Like, and if they go off a little bit, because you love them, if you're smart, if you're gonna be a successful parent, you notice it right away.
You don't habituate to the things you love. And if you wanna be successful at work, if you don't love it, you're not gonna put the time in. Somebody else will, somebody else that loves it. Like, 'cause it's new and interesting and that lets you go to the next level.
- So, it's the thing, it's just a function that generates newness and novelty and surprises you and all those kinds of things. - It's really interesting. There's people who figured out lots of frameworks for this. Like, humans seem to go in partnership, go through interest. Like, suddenly somebody's interesting and then you're infatuated with them and then you're in love with them.
And then, you know, different people have ideas about parental love or mature love. Like, you go through a cycle of that, which keeps us together and it's super functional for creating families and creating communities and making you support somebody despite the fact that you don't love them. Like, and it can be really enriching.
You know, now, in the work-life balance scheme, if all you do is work, you think you may be optimizing your work potential, but if you don't love your work or you don't have family and friends and things you care about, your brain isn't well-balanced. Like, everybody knows the experience of you worked on something all week, you went home and took two days off and you came back in.
The odds of you working on the thing, picking up right where you left off, is zero. Your brain refactored it. But being in love is great. It, like, changes the color of the light in a room. Like, it creates a spaciousness that's different. It helps you think. It makes you strong.
- Bukowski had this line about love being a fog that dissipates with the first light of reality in the morning. - That's depressing. I think it's the other way around. - It lasts. Well, like you said, it's a function. It's a thing that generates-- - It can be the light that actually enlivens your world and creates the interest and the power and the strength to go do something.
Well, it's like, that sounds like, you know, there's like physical love, emotional love, intellectual love, spiritual love, right? - Isn't it all the same thing, kinda? - Nope. You should differentiate that, maybe that's your problem. In your book, you should refine that a little bit. - It's the different chapters?
- Yeah, there's different chapters. - What's the, what's, these are, aren't these just different layers of the same thing, or the stack of physical-- - People, some people are addicted to physical love and they have no idea about emotional or intellectual love. I don't know if they're the same things.
I think they're different. - That's true, they could be different. I guess the ultimate goal is for it to be the same. - Well, if you want something to be bigger and interesting, you should find all its components and differentiate them, not clump it together. Like, people do this all the time, they, yeah, the modularity.
Get your abstraction layers right, and then you can, you have room to breathe. - Well, maybe you can write the foreword to my book about love. - Yeah, or the afterword. - And the afterword. - You really tried. I feel like Lex has made a lot of progress in this book.
Well, you have things in your life that you love. - Yeah, yeah. And they are, you're right, they're modular. It's quite-- - And you can have multiple things with the same person or the same thing, but, yeah. - Depending on the moment of the day. - Yeah, there's, like, what Bukowski described is that moment when you go from being in love to having a different kind of love.
- Yeah, yeah, just a transition. - But when it happens, if you'd read the owner's manual and you believed it, you would've said, oh, this happened. It doesn't mean it's not love, it's a different kind of love. But maybe there's something better about that. As you grow old, if all you do is regret how you used to be, it's sad, right?
You should've learned a lot of things, 'cause, like, who you can be in your future self is actually more interesting and possibly delightful than being a mad kid in love with the next person. That's super fun when it happens, but that's 5% of the possibility. (Lex laughs) - Yeah, that's right, that there's a lot more fun to be had in the long-lasting stuff.
- Yeah, or meaning, if that's your thing. - Meaning, which is a kind of fun. It's a deeper kind of fun. - And it's surprising. The thing I like is surprises. You just never know what's gonna happen. But you have to look carefully and you have to work at it and you have to think about it.
- Yeah, you have to see the surprises when they happen. You have to be looking for it. From the branching perspective, you mentioned regrets. Do you have regrets about your own trajectory? - Oh yeah, of course. Yeah, some of it's painful, but you wanna hear the painful stuff? (Lex laughs) I would say, in terms of working with people, when people did stuff I didn't like, especially if it was a bit nefarious, I took it personally and I also felt it was personal about them.
But a lot of times, like humans, most humans are a mess, right? And then they act out and they do stuff. And this psychologist I heard a long time ago said, "You tend to think somebody does something to you, but really what they're doing is they're doing what they're doing while they're in front of you.
It's not that much about you." - Yeah. - Right? And as I got more interested in, when I work with people, I think about them, and probably analyze them, and understand them a little bit. And then when they do stuff, I'm way less surprised. And if it's bad, I'm way less hurt.
And I react way less. I sort of expect everybody's got their shit. - Yeah, and it's not about you as much. - It's not about me that much. It's like you do something and you think you're embarrassed, but nobody cares. And somebody's really mad at you, the odds of it being about you, no, they're getting mad the way they're doing that because of some pattern they learned.
And maybe you can help them if you care enough about it. Or you could see it coming and step out of the way. Like I wish I was way better at that. I'm a bit of a hothead. - You regret that? You said with Steve that was a feature, not a bug.
- Yeah, well, he was using it as the counterforce to orderliness that would crush his work. - Well, you were doing the same. - Eh, maybe. I don't think my vision was big enough. It was more like I just got pissed off and did stuff. - I'm sure that's what Steve, yeah, you're telling-- - I don't know if it had the, it didn't have the amazing effect of creating the trillion dollar company.
It was more like I just got pissed off and left and/or made enemies that I shouldn't have. Yeah, it's hard. Like I didn't really understand politics until I worked at Apple, where Steve was a master player of politics and his staff had to be or they wouldn't survive him.
And it was definitely part of the culture. And then I've been in companies where they say it's political but it's all fun and games compared to Apple. And it's not that the people at Apple are bad people, it's just they operate politically at a higher level. It's not like, oh, somebody said something bad about somebody else, which is most politics.
They had strategies about accomplishing their goals, sometimes over the dead bodies of their enemies. With sophistication-- - Game of Thrones. - Yeah, more Game of Thrones, sophistication and a big time factor rather than a-- - That requires a lot of control over your emotions, I think, to have a bigger strategy in the way you behave.
- Yeah, and it's effective in the sense that coordinating thousands of people to do really hard things, where many of the people in there don't understand themselves, much less how they're participating, creates all kinds of drama and problems whose solution is political in nature. Like how do you convince people, how do you leverage them, how do you motivate them, how do you get rid of them?
There's so many layers of that that are interesting. And even though some of it, let's say, may be tough, it's not evil unless you use that skill to evil purposes. Which some people obviously do. But it's a skill set that operates. And I wish I'd, I was interested in it, but it was sort of like, I'm an engineer, I do my thing.
And there's times when I could have had a way bigger impact if I knew how to, if I paid more attention and knew more about that. - About the human layer of the stack. - Yeah, that human political power expression layer of the stack. Which is complicated. And there's lots to know about it.
I mean, people who are good at it, they're just amazing. And when they're good at it, and let's say, relatively kind and oriented in a good direction, you can really feel, you can get lots of stuff done and coordinate things that you never thought possible. But all people like that also have some pretty hard edges 'cause it's a heavy lift.
And I wish I'd spent more time on that. And I wish I'd spent more time with that when I was younger. But maybe I wasn't ready. You know, I was a wide-eyed kid for 30 years. - Still a bit of a kid. - I know. - What do you hope your legacy is when there's a book, like "A Hitchhiker's Guide to the Galaxy," and this is like a one-sentence entry about Jim Keller.
From like, that guy lived at some point. There's not many, you know, not many people will be remembered. You're one of the sparkling little human creatures that had a big impact on the world. How do you hope you'll be remembered? - My daughter was trying to get, she added to my Wikipedia page to say that I was a legend and a guru.
But they took it out, so she put it back in, she's 15. I think that was probably the best part of my legacy. She got her sister, and they were all excited. They were trying to put it in the references 'cause there's articles with that in the title. - Calling you that?
So in the eyes of your kids, you're a legend. - Well, they're pretty skeptical 'cause they know you better than that. They're like, "Dad!" So yeah, that kind of stuff is super fun. In terms of the big legend stuff, I don't care. - You don't care? - Legacy, I don't really care.
- You're just an engineer. - Yeah, I've been thinking about building a big pyramid. So I had a debate with a friend about whether pyramids or craters are cooler. And he realized that there's craters everywhere, but they built a couple pyramids 5,000 years ago. - And they remember you for a while.
- We're still talking about it. I think that would be cool. - Those aren't easy to build. - Oh, I know. And they don't actually know how they built them, which is great. - It's either AGI or aliens could be involved. So I think you're gonna have to figure out quite a few more things than just-- - I know.
- The basics of civil engineering. So I guess you hope your legacy is pyramids. - That would be cool. And my Wikipedia page, getting updated by my daughter periodically. Like those two things would pretty much make it. - Jim, it's a huge honor talking to you again. I hope we talk many more times in the future.
I can't wait to see what you do with Tenstorrent. I can't wait to use it. I can't wait for you to revolutionize yet another space in computing. It's a huge honor to talk to you. Thanks for talking today. - This was fun. - Thanks for listening to this conversation with Jim Keller.
And thank you to our sponsors, Athletic Greens All-in-One Nutrition Drink, Brooklinen Sheets, ExpressVPN, and Belcampo Grass-Fed Meat. Click the sponsor links to get a discount and to support this podcast. And now let me leave you with some words from Alan Turing. "Those who can imagine anything can create the impossible."
Thank you for listening and hope to see you next time. (upbeat music)