Travis Oliphant: NumPy, SciPy, Anaconda, Python & Scientific Programming | Lex Fridman Podcast #224


Chapters

0:00 Introduction
1:11 Early programming
22:52 SciPy
39:46 Open source
51:29 NumPy
88:44 Guido van Rossum
101:02 Efficiency
109:54 Objects
116:52 Numba
125:58 Anaconda
130:25 Conda
146:01 Quansight Labs
149:37 OpenTeams
157:10 GitHub
162:40 Marketing
167:18 Great programming
178:08 Hiring
182:06 Advice for young people

Whisper Transcript

00:00:00.000 | The following is a conversation with Travis Oliphant,
00:00:03.600 | one of the most impactful programmers
00:00:05.520 | and data scientists ever.
00:00:07.920 | He created NumPy, SciPy, and Anaconda.
00:00:12.760 | NumPy formed the foundation
00:00:14.500 | of tensor-based machine learning in Python.
00:00:17.080 | SciPy formed the foundation
00:00:18.880 | of scientific programming in Python.
00:00:20.960 | And Anaconda, specifically with Conda,
00:00:23.980 | made Python more accessible to a much larger audience.
00:00:27.620 | Travis's life work across a large number of programming
00:00:31.200 | and entrepreneurial efforts has and will continue
00:00:34.760 | to have immeasurable impact on millions of lives
00:00:38.460 | by empowering scientists and engineers in big companies,
00:00:42.440 | small companies, and open-source communities
00:00:45.380 | to take on difficult problems
00:00:47.200 | and solve them with the power of programming.
00:00:50.520 | Plus, he's a truly kind human being,
00:00:53.440 | which is something that when combined with vision
00:00:56.000 | and ambition makes for a great leader
00:00:58.400 | and a great person to chat with.
00:01:01.160 | To support this podcast,
00:01:02.320 | please check out our sponsors in the description.
00:01:04.880 | This is the Lex Fridman Podcast,
00:01:06.960 | and here is my conversation with Travis Oliphant.
00:01:10.640 | What was the first computer program you've ever written?
00:01:14.480 | Do you remember?
00:01:15.320 | - Whoa, that's a good question.
00:01:16.920 | I think it was in fourth grade.
00:01:18.360 | Just a simple loop in BASIC.
00:01:20.920 | - BASIC.
00:01:21.760 | - BASIC, yeah, on an Atari 800,
00:01:23.320 | Atari 400, I think, or maybe it was an Atari 800.
00:01:26.840 | It was a part of a class,
00:01:28.320 | and we were just doing basic loops to print things out.
00:01:32.560 | - Did you use goto statements?
00:01:34.920 | - Yes, yes, we used goto statements.
00:01:37.760 | - I remember in the early days,
00:01:39.560 | that's when I first realized
00:01:41.160 | there's principles to programming,
00:01:43.320 | when I was told that don't use goto statements.
00:01:45.720 | Those are bad software engineering principles.
00:01:48.360 | It goes against what great, beautiful code is.
00:01:52.040 | I was like, oh, okay, there's rules to this game.
00:01:54.800 | - I didn't see that until high school
00:01:56.240 | when I took an AP Computer Science course.
00:01:58.360 | I did a lot of other kinds of just programming in TI,
00:02:02.200 | but finally when I took an AP Computer Science course
00:02:04.160 | in Pascal.
00:02:05.720 | - Wow.
00:02:06.560 | - Yeah, it was Pascal.
00:02:07.440 | That's when I, oh, there are these principles.
00:02:09.720 | - Not C or C++?
00:02:11.320 | - No, I didn't take C until the next year in college.
00:02:14.660 | I had a course in C, but I haven't done much in Pascal,
00:02:18.080 | just that AP Computer Science course.
00:02:21.320 | - Now, sorry for the romanticized question,
00:02:23.480 | but when did you first fall in love with programming?
00:02:26.760 | - Oh, man, good question.
00:02:27.880 | I think actually when I was 10.
00:02:30.280 | My dad got us a Timex Sinclair,
00:02:33.460 | and he was excited about the spreadsheet capability,
00:02:37.180 | and then, but I made him get the BASIC,
00:02:39.560 | the add-ons so we could actually program in BASIC,
00:02:41.880 | and just being able to write instructions
00:02:44.520 | and have the computer do something.
00:02:45.960 | Then we got a TI-99, TI-99/4A when I was about 12,
00:02:50.040 | and I would just, it had sprites and graphics and music.
00:02:52.960 | You could actually program it to do music.
00:02:55.320 | That's when I really sort of fell in love with programming.
00:02:58.600 | - So this is a full, like a real computer
00:03:01.080 | with memory and storage and processors,
00:03:04.600 | so we're not, 'cause you say TI, it's not--
00:03:06.360 | - Yeah, the Timex Sinclair was one of the very first.
00:03:08.240 | It was a cheap, cheap, I think it was,
00:03:11.420 | well, it was still expensive, but it was 2K of memory.
00:03:14.440 | We got the 16K add-on pack.
00:03:16.780 | But yeah, it had memory, and you could program it.
00:03:19.000 | You had the, in order to store your programs,
00:03:20.920 | you had to attach a tape drive.
00:03:22.880 | Remember that old, the sound that would play
00:03:24.400 | when you converted the modem,
00:03:28.000 | it would convert digital bits to audio files
00:03:30.360 | and set on a tape drive.
00:03:31.920 | Still remember that sound, but that was the storage.
00:03:34.760 | - And what was the programming language, do you remember?
00:03:36.480 | - It was BASIC. - It was BASIC.
00:03:37.600 | - And then they had a VisiCalc,
00:03:38.960 | and so a little bit of spreadsheet programming in VisiCalc,
00:03:41.160 | but mostly just some BASIC.
00:03:42.760 | - Do you remember what kind of things
00:03:44.960 | drew you to programming?
00:03:46.320 | Was it working with data?
00:03:48.740 | Was it video games?
00:03:49.960 | - Math. - Games?
00:03:51.360 | - Math. - Math.
00:03:52.200 | - Math-y stuff?
00:03:53.020 | - Yeah, I've always loved math,
00:03:54.800 | and a lot of people think they don't like math
00:03:58.080 | because I think when they're exposed to it early,
00:04:00.440 | it's about memory.
00:04:02.000 | You know, when you're exposed to math early,
00:04:03.240 | you have a good short-term memory,
00:04:04.280 | it remembers times tables.
00:04:05.920 | And I do have a reasonably, I mean, not perfect,
00:04:08.600 | but a reasonably long little short-term memory buffer.
00:04:12.480 | And so I did great at times tables.
00:04:14.320 | I said, "Oh, I'm good at math."
00:04:15.840 | But I started to really like math,
00:04:17.400 | just the problem-solving aspect.
00:04:20.300 | And so computing was problem-solving applied.
00:04:25.020 | And so that's always kind of been the draw,
00:04:28.260 | kind of coupled with the mathematics.
00:04:30.460 | - Did you ever see the computer
00:04:31.820 | as like an extension of your mind,
00:04:34.660 | like something able to achieve--
00:04:36.500 | - Not till later.
00:04:37.740 | - Okay. - Yeah, not then.
00:04:39.340 | - It's just like a little set of puzzles
00:04:40.860 | that you can play with,
00:04:41.700 | and you can play with math puzzles.
00:04:43.540 | - Yeah, it was too rudimentary early on.
00:04:46.100 | Like it was sort of, yeah, it was a lot of work
00:04:49.180 | to actually take a thought you'd have
00:04:51.460 | and actually get it implemented.
00:04:53.100 | And that's still work, but it's getting easier.
00:04:56.020 | And so, yeah, I would say that's definitely
00:04:58.260 | what attracted me to Python,
00:04:59.580 | is that that was more real, right?
00:05:02.140 | I could think in Python.
00:05:04.860 | Speaking of foreign language,
00:05:05.820 | I only speak another language fluently,
00:05:07.740 | besides English, which is Spanish.
00:05:09.060 | And I remember the day when I would dream in Spanish.
00:05:11.700 | And you start to think in that language.
00:05:13.420 | And then you actually,
00:05:14.260 | I do definitely believe that language
00:05:16.980 | limits or expands your thinking.
00:05:19.580 | There's some languages that actually
00:05:21.160 | lead you to certain thought processes.
00:05:23.820 | - Yeah, like, so I speak Russian fluently.
00:05:27.260 | And that's certainly a language
00:05:30.060 | that leads you down certain thought processes.
00:05:33.220 | Well, yeah, I mean, there's a history
00:05:36.180 | of the two world wars,
00:05:39.340 | of millions of people starving to death
00:05:43.020 | or near to death throughout its history,
00:05:45.180 | of suffering, of injustice,
00:05:47.580 | like this promise sold to the people
00:05:49.780 | and then the carpet or whatever
00:05:52.700 | is swept from under them.
00:05:54.380 | It's like broken promises.
00:05:55.700 | And all of that pain and melancholy
00:05:57.700 | is in the language, the sad songs,
00:06:01.020 | the sad, hopeful songs, the over-romanticized,
00:06:04.180 | like, I love you, I hate you,
00:06:06.220 | the sort of the swings between
00:06:08.780 | all the various spectrums of emotion.
00:06:11.540 | So that's all within the language.
00:06:13.580 | The way it's twisted,
00:06:14.900 | there's a strong culture of rhyming poetry.
00:06:19.500 | So like the bards, like the sync,
00:06:21.860 | there's a musicality to the language too.
00:06:24.740 | - Did Dostoevsky write in Russian?
00:06:27.420 | - Yeah, so like,
00:06:28.380 | (speaking in foreign language)
00:06:32.140 | All the--
00:06:32.980 | - The ones that I know about,
00:06:33.820 | which are translated.
00:06:34.660 | And I'm curious how the translations.
00:06:36.380 | - So Dostoevsky did not use
00:06:39.980 | the musicality of the language too much.
00:06:42.180 | So it actually translates pretty well
00:06:44.180 | because it's so philosophically dense
00:06:46.540 | that the story does a lot of the work.
00:06:48.460 | But there's a bunch of things that are untranslatable.
00:06:51.160 | Certainly the poetry is not translatable.
00:06:53.580 | I actually have a few conversations coming up offline
00:06:57.940 | and also in this podcast
00:06:59.140 | with people who've translated Dostoevsky.
00:07:01.980 | And that's for people who worked in this field,
00:07:06.340 | know how difficult that is.
00:07:07.380 | Sometimes you can spend months
00:07:10.380 | thinking about a single sentence in context,
00:07:13.980 | 'cause there's just a magic captured by that sentence.
00:07:16.380 | And how do you translate just in the right way?
00:07:18.940 | Because those words can be really powerful.
00:07:22.380 | There's a famous line,
00:07:24.220 | "Beauty will save the world" from Dostoevsky.
00:07:26.460 | There's so many ways to translate that.
00:07:29.500 | And you're right,
00:07:30.540 | the language gives you the tools
00:07:32.700 | with which to tell the story,
00:07:34.140 | but it also leads your mind down certain trajectories
00:07:37.260 | and paths to where over time,
00:07:39.660 | as you think in that language,
00:07:41.140 | you become a different human being.
00:07:42.740 | - Yes.
00:07:43.780 | Yeah, that's a fascinating reality, I think.
00:07:45.860 | I know people have explored that,
00:07:47.020 | but it's, I guess, rediscovered.
00:07:49.740 | - Well, we don't, we live in our own little pockets.
00:07:52.580 | This is the sad thing,
00:07:54.180 | is I feel like, unfortunately,
00:07:56.900 | given time and getting older,
00:07:59.140 | I'll never know China, the Chinese world,
00:08:03.620 | 'cause I don't truly know the language.
00:08:05.780 | Same with Japanese.
00:08:06.980 | I don't truly know Japanese and Portuguese and Brazil,
00:08:10.340 | that whole South American continent.
00:08:12.060 | Like, yeah, I'll go to Brazil and Argentina,
00:08:14.460 | but will I truly understand the people
00:08:17.140 | if I don't understand the language?
00:08:18.500 | It's sad because I wonder how much,
00:08:23.500 | how many geniuses we're missing
00:08:25.220 | because so much of the scientific world,
00:08:28.540 | so much of the technical world is in English,
00:08:31.300 | and so much of it might be lost
00:08:33.140 | because we don't have the common language.
00:08:36.100 | - I completely agree.
00:08:36.940 | I'm very much in that vein of,
00:08:39.620 | there's a lot of genius out there that we miss,
00:08:41.780 | and we're sort of fortunate when it bubbles up
00:08:45.020 | into something that we can understand or process.
00:08:48.660 | There's a lot we miss.
00:08:50.420 | So that's why I tend to lean towards
00:08:51.620 | really loving democratization
00:08:54.060 | or things that empower people
00:08:55.420 | or very resistant to sort of authoritarian structures.
00:09:00.100 | Fundamentally for that reason,
00:09:01.900 | well, several reasons, but it just hurts us.
00:09:04.460 | We're worse off.
00:09:06.420 | - So speaking of languages that empower you,
00:09:09.020 | so Python was the first language for me
00:09:11.820 | that I really enjoyed thinking in, as you said.
00:09:16.820 | - Sounds like you shared my experience too.
00:09:18.500 | - So when did you first,
00:09:19.620 | do you remember when you first kind of connected with Python,
00:09:21.860 | maybe even fell in love with Python?
00:09:23.740 | - It's a good question.
00:09:24.580 | It was a process that took about a year.
00:09:26.500 | I first encountered Python in 1997.
00:09:29.460 | I was a graduate student studying biomedical engineering
00:09:31.700 | at the Mayo Clinic, and I had previously,
00:09:34.660 | I'd been involved in taking information from satellites.
00:09:39.300 | I was an electrical engineering student,
00:09:41.300 | used to taking information
00:09:42.620 | and trying to get something out of it,
00:09:43.980 | doing some data processing information out of it.
00:09:46.100 | And I'd done that in MATLAB.
00:09:47.620 | I'd done that in Perl.
00:09:49.100 | I'd done that in scripting on a VMS.
00:09:52.540 | There's actually a VAX VMS system,
00:09:54.220 | and they had their own little scripting tools
00:09:56.340 | around Fortran.
00:09:57.980 | Done a lot of that.
00:09:58.860 | And then as a graduate student,
00:10:00.860 | I was looking for something and encountered Python,
00:10:04.420 | and because Python had an array,
00:10:06.180 | had two things that made me not filter it away.
00:10:09.100 | Because I was filtering a bunch of stuff.
00:10:10.420 | It was Yorick, I looked at Yorick,
00:10:11.740 | I looked at a few other languages that are out there
00:10:14.460 | at the time in 1997, but it had arrays.
00:10:17.740 | There's a library called Numeric
00:10:19.100 | that had just been written in '95,
00:10:20.900 | like not very, not too much earlier,
00:10:23.780 | by an MIT alum, Jim Hugunin.
00:10:26.980 | You know, and I went back and read the mailing list
00:10:29.140 | to see the history of how it grew,
00:10:30.340 | and there was a very interesting,
00:10:31.260 | it's fascinating to do that, actually,
00:10:32.420 | to see how this emergent cooperation,
00:10:36.020 | unstructured cooperation happens in the open source world
00:10:39.500 | that led to a lot of this collective programming,
00:10:43.340 | which is something maybe we might get into a little later,
00:10:45.180 | about what that looks like.
00:10:46.140 | - What gap did Numeric fill?
00:10:48.340 | - Numeric filled the gap of having an array object.
00:10:50.300 | So instead-- - There was no array object.
00:10:51.620 | - There was no array,
00:10:52.460 | there was a one-dimensional byte concept,
00:10:55.340 | but there was no N-dimensional,
00:10:57.540 | two, three, four-dimensional tensor, they call it now.
00:11:00.660 | I'm still in the category that a tensor is another thing,
00:11:03.220 | and it's just an n-d array, we should call it,
00:11:05.180 | but kind of lost that battle.
00:11:07.140 | - There's many battles in this world,
00:11:10.140 | some of which we win, some we lose.
00:11:12.060 | - That's exactly right.
00:11:14.180 | But it had no math to it.
00:11:17.140 | So Numeric had math and a basic way to think in arrays.
00:11:20.780 | So I was looking for that, and it had complex numbers.
00:11:23.600 | A lot of programming languages,
00:11:26.460 | and you can see it because, you know,
00:11:28.100 | if you're just a computer scientist,
00:11:29.500 | you think, "Ah, complex numbers are just two floats."
00:11:32.060 | So people can build that on.
00:11:34.980 | But in practice, a complex number
00:11:36.740 | is one of the significant algebras
00:11:38.980 | that helps connect a lot of physical
00:11:40.740 | and mathematical ideas,
00:11:42.260 | particularly to FFT for an actual engineer.
00:11:45.100 | And it's a really important concept,
00:11:48.140 | and not having it means you have to develop it
00:11:50.860 | several times, and those times may not share an approach.
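A minimal sketch of what built-in complex numbers buy you, assuming nothing beyond the Python standard library. The `dft` helper is a hypothetical name used only for this illustration, and it is a naive O(n^2) discrete Fourier transform rather than a real FFT.

```python
import cmath
import math


def dft(samples):
    """Naive discrete Fourier transform (O(n^2)); illustration only, not an FFT."""
    n = len(samples)
    return [
        sum(x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in enumerate(samples))
        for k in range(n)
    ]


# One full cosine cycle sampled at 8 points.
n = 8
signal = [math.cos(2 * math.pi * t / n) for t in range(n)]
spectrum = dft(signal)

# For this input the energy lands in bins 1 and n-1, each with magnitude n/2.
magnitudes = [abs(c) for c in spectrum]
```

Because the complex type and the `2j` literal are part of the language, the transform reads close to its textbook formula and nobody has to invent their own pair-of-floats representation first.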
00:11:54.300 | One of the common things in programming,
00:11:55.700 | one of the things programming enables is abstractions.
00:11:59.100 | But when you have shared abstractions, it's even better.
00:12:01.180 | It sort of gets to the level of language
00:12:02.980 | of actually we all think of this the same way,
00:12:05.520 | which is both powerful and dangerous, right?
00:12:07.940 | Because powerful in that we now can quickly make
00:12:11.740 | bigger and higher level things
00:12:13.340 | on top of those abstractions. Dangerous,
00:12:14.800 | because it also limits us as to the things
00:12:17.100 | we maybe left behind in producing that abstraction,
00:12:20.500 | which is at the heart of programming today,
00:12:21.900 | and actually building around the programming world.
00:12:24.140 | So I think it's a fascinating philosophical topic.
00:12:26.540 | - Yeah, that will continue for many years, I think.
00:12:28.700 | - For many years.
00:12:29.540 | - As we build more and more and more abstractions.
00:12:31.260 | - Yes, I often think about, you know,
00:12:32.340 | we have a world that's built on these abstractions
00:12:35.060 | that, were they the only ones possible?
00:12:37.500 | Certainly not, but they led to,
00:12:39.860 | now it's very hard to do it differently.
00:12:42.300 | Like there's an inertia that's very hard to,
00:12:44.980 | you know, push out, push away from.
00:12:47.500 | There's, it has implications for things like,
00:12:49.620 | you know, the Julia language,
00:12:50.740 | which you have heard of, I'm sure.
00:12:52.700 | And I've met the creators, and I like Julia.
00:12:55.700 | It's a really cool language,
00:12:56.580 | but they've struggled to kind of,
00:12:58.540 | against just the tide of like this inertia
00:13:01.300 | of people using Python.
00:13:03.380 | And, you know, there's strategies to approach that,
00:13:05.800 | but nonetheless, it's a phenomenon.
00:13:07.580 | And sometimes, so I love complex numbers,
00:13:09.580 | and I love arrays, so I looked at Python.
00:13:12.100 | And then I had the experience, I did some stuff in Python,
00:13:15.260 | and I was just doing my PhD, so I was out,
00:13:17.860 | my focus was on, I was actually doing a combination
00:13:20.940 | of MRI and ultrasound, and looking at a phenomenon
00:13:23.780 | called elastography, which is you push waves
00:13:25.700 | into the body, and observe those waves,
00:13:28.460 | like you can actually measure them,
00:13:30.320 | and then you do mathematical inversion
00:13:32.780 | to see what the elasticity is.
00:13:35.220 | And so that's the problem I was solving,
00:13:36.820 | is how to do that with both ultrasound and MRI.
00:13:39.780 | I needed some tool to do that with.
00:13:41.380 | So I was starting to use Python in '97.
00:13:44.260 | In '98, I went back, looked at what I'd written,
00:13:47.340 | and realized I could still understand it,
00:13:49.560 | which is not the experience I'd had
00:13:50.900 | when doing Perl in '95, right?
00:13:53.660 | I'd done the same thing, and then I looked back,
00:13:55.620 | and I'd forgotten what I was even saying.
00:13:58.380 | Now, you know, I'm not saying, so that made me,
00:14:00.700 | hey, this may work, I like this.
00:14:02.420 | This is something I can retain
00:14:05.020 | without becoming an expert, per se.
00:14:07.660 | And so that led me to go, hmm, I'm gonna push more to this.
00:14:10.380 | And then '98 was kind of when I started
00:14:14.820 | to fall in love with Python, I would say.
00:14:16.820 | - A few peculiar things about Python,
00:14:20.900 | so maybe compare it to Perl,
00:14:22.940 | compare it to some of the other languages.
00:14:24.580 | So there's no braces.
00:14:26.340 | - Yeah, yeah.
00:14:27.180 | - So space is used, indentation, I should say,
00:14:31.980 | is used as part of the language.
00:14:33.980 | - Yeah, right.
00:14:35.540 | - So did you, I mean, that's quite a leap.
00:14:40.020 | Were you comfortable with that leap,
00:14:41.220 | or were you just very open-minded?
00:14:42.740 | - It's a good question.
00:14:43.900 | I was open-minded, so I was cognizant of the concern.
00:14:48.060 | And it definitely has, it has specific challenges.
00:14:52.100 | You know, cut and pasting, for example,
00:14:53.980 | when you're cut and pasting code,
00:14:55.540 | and if your editors aren't supportive of that,
00:14:57.300 | if you're putting it into a terminal,
00:14:59.060 | and particularly in the past,
00:15:00.100 | when terminals didn't necessarily have the intelligence
00:15:02.500 | to manage it now.
00:15:03.340 | Now, IPython and Jupyter Notebooks handle it just fine,
00:15:06.020 | so there's really no problem,
00:15:06.900 | but in the past, it created some challenges,
00:15:08.820 | formatting challenges, also mixed tabs and spaces.
00:15:12.540 | If editors weren't, you weren't clear
00:15:14.820 | on what was happening, you would have these issues.
00:15:16.940 | So there were really concrete reasons about it
00:15:19.260 | that I heard and understood.
00:15:20.460 | I never really encountered a problem with it,
00:15:23.460 | personally, like it was occasional annoyances,
00:15:26.540 | but I really liked the fact
00:15:28.500 | that it didn't have all these extra characters, right?
00:15:31.140 | That these extra characters didn't show up
00:15:33.180 | in my visual field when I was just trying
00:15:35.500 | to process understanding a snippet of code.
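The mixed tabs and spaces problem mentioned above can be reproduced in a few lines. This is a sketch assuming Python 3, which refuses to guess when a block mixes the two; the source string is a made-up snippet for illustration.

```python
# Line 2 is indented with a tab, line 3 with eight spaces. They may look
# aligned in an editor, but Python 3 treats the mix as ambiguous.
source = "if True:\n\tx = 1\n        y = 2\n"

try:
    compile(source, "<example>", "exec")
    error = None
except IndentationError as exc:  # TabError is a subclass of IndentationError
    error = exc

if error is not None:
    print(type(error).__name__, "-", error.msg)
```

Editors that convert tabs to spaces consistently avoid the issue entirely, which matches the point that tooling has made this mostly a problem of the past.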
00:15:38.060 | - Yeah, there's a cleanness to it.
00:15:39.300 | But I mean, the idea is supposed to be
00:15:41.220 | that Perl also has a cleanness to it
00:15:43.380 | because of the minimalism of like how many characters
00:15:46.580 | it takes to express a certain thing.
00:15:48.380 | So it's very compact.
00:15:49.860 | But what you realize with that compactness comes,
00:15:53.220 | there's a culture that prizes compactness.
00:15:57.220 | And so the code gets more and more compact
00:15:58.940 | and less and less readable to a point where it's like,
00:16:01.900 | like to be a good programmer in Perl,
00:16:05.460 | you write code that's basically unreadable.
00:16:07.860 | - Right. - There's a culture.
00:16:09.140 | - Correct, and you're proud of it.
00:16:10.900 | - Yeah, you're proud of it.
00:16:12.500 | - Right, exactly, and it's like feels good.
00:16:14.180 | And it's really selective.
00:16:16.300 | Like it means you have to be an expert in Perl
00:16:19.380 | to understand it.
00:16:20.420 | Whereas Python allowed you not to have to be an expert.
00:16:23.020 | You don't have to take all this brain energy.
00:16:24.780 | You could leverage, what I say,
00:16:25.700 | you could leverage your English language center,
00:16:28.240 | which you're using all the time.
00:16:30.020 | I've wondered about other languages,
00:16:31.260 | particularly non-Latin-based languages.
00:16:34.740 | Latin-based languages with the characters are at least similar.
00:16:37.260 | I think people have an easier time,
00:16:38.620 | but I don't know what it's like to be a Japanese
00:16:41.300 | or a Chinese person trying to learn a different syntax.
00:16:45.740 | Like what would computer programming look like in that?
00:16:49.700 | I haven't looked at that at all,
00:16:50.740 | but it certainly doesn't,
00:16:52.100 | leveraging your Chinese language center,
00:16:54.260 | I'm not sure Python or any programming language does that.
00:16:57.020 | But that was a big deal.
00:16:58.100 | The fact that it was accessible, I could be a scientist.
00:17:00.300 | What I really liked is many programming languages
00:17:02.860 | really demand a lot of you, and you can get a lot,
00:17:04.980 | you do a lot if you learn it.
00:17:07.160 | But Python enables you to do a lot
00:17:08.860 | without demanding a lot of you.
00:17:10.420 | There's nuance to that statement,
00:17:13.060 | but it certainly is more accessible.
00:17:15.300 | So more people could actually, as a scientist,
00:17:17.980 | as somebody who, or an engineer,
00:17:19.820 | who was trying to solve another problem
00:17:21.420 | besides programming, I could still use this language
00:17:24.900 | and get things done and be happy about it.
00:17:27.300 | Now I was also comfortable in C at that time.
00:17:30.060 | - And MATLAB you did a little bit of that.
00:17:30.900 | - And MATLAB I did a lot before that, exactly.
00:17:33.140 | So I was comfortable in,
00:17:34.860 | those three languages were really the tools I used
00:17:37.580 | during my studies and schooling.
00:17:39.560 | But to your point about language helping you think,
00:17:42.620 | one of the big things about MATLAB was it was,
00:17:44.580 | and APL before it, I don't know if you remember APL.
00:17:47.660 | - Nope.
00:17:48.500 | - APL is actually the predecessor of array-based programming,
00:17:51.660 | which I think is really an underappreciated,
00:17:54.180 | if I talk to people who are just steeped
00:17:55.340 | in computer programming, computer science,
00:17:57.900 | most of the people that Microsoft has hired
00:17:59.460 | in the past, for example,
00:18:01.100 | Microsoft as a company generally did not understand
00:18:03.900 | array-based programming, culturally they didn't understand it
00:18:06.620 | so they kept missing the boat,
00:18:08.580 | kept missing the understanding of what this was.
00:18:11.580 | They've gotten better, but there's still a whole culture
00:18:14.020 | of folks that doesn't, programming,
00:18:15.660 | that's systems programming or web programming
00:18:18.900 | or lists and maps and what about an n-dimensional array?
00:18:22.540 | Oh yeah, that's just an implementation detail.
00:18:24.700 | Well, you can think that, but then actually
00:18:27.340 | if you have that as a construct,
00:18:28.820 | you actually think differently.
00:18:29.860 | APL was the first language to understand that
00:18:31.660 | and it was in the 60s.
00:18:33.500 | The challenge of APL is APL had very dense,
00:18:36.780 | not only glyphs, like new characters, new glyphs,
00:18:39.340 | they even had a new keyboard
00:18:40.480 | because to produce those glyphs,
00:18:42.340 | this was back in the early days of computing
00:18:43.980 | when the QWERTY keyboard maybe wasn't as established.
00:18:47.980 | Like, well, we can have a new keyboard, no big deal.
00:18:50.780 | But it was a big deal and it didn't catch on
00:18:52.900 | and the language APL, very much like Perl,
00:18:56.500 | as people would pride themselves on how much,
00:18:58.620 | could they write the game of life in 30 characters of APL?
00:19:03.100 | APL has characters that mean summation
00:19:06.100 | and they have adverbs, they would have adjectives
00:19:08.780 | and these things called adverbs,
00:19:10.060 | which are like methods, like reduction,
00:19:12.260 | it would be an adverb on an add operator.
00:19:14.680 | But using these tools, you could construct
00:19:18.660 | and then you start to think at that level,
00:19:20.900 | you think in n dimensions, it's something I like to say,
00:19:22.900 | and you start to think differently about data at that point.
00:19:26.540 | It really helps.
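The "adverb" idea, where reduction modifies an operator such as add, can be loosely sketched in plain Python as a function that takes an operator and returns a new function. `reduce_with` is a hypothetical name for this illustration, and this is only an analogy to the APL concept, not a description of how APL implements it.

```python
from functools import reduce
import operator


def reduce_with(op):
    """Turn a two-argument operator into a function that folds a whole sequence."""
    return lambda values: reduce(op, values)


# "Reduce" applied to "add" gives summation; applied to "mul" it gives a product.
total = reduce_with(operator.add)
product = reduce_with(operator.mul)

sum_result = total([1, 2, 3, 4])
product_result = product([1, 2, 3, 4])
```

The design point is that reduction is defined once and then combined with any operator, rather than writing a separate loop for each kind of summary.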
00:19:27.540 | - Yeah, I mean, outside of programming,
00:19:30.100 | if you really internalize linear algebra as a course,
00:19:33.700 | I mean, it philosophically allows you
00:19:35.580 | to think of the world differently.
00:19:37.220 | It's almost like liberating.
00:19:38.540 | You don't have to think about the individual numbers
00:19:42.100 | in the n dimensional array.
00:19:44.220 | You could think of it as an object in itself
00:19:46.140 | and all of a sudden this world can open up.
00:19:48.500 | You're saying MATLAB and APL were like the early,
00:19:52.620 | I don't know if many languages got that right ever.
00:19:54.980 | - No, no, no, they didn't.
00:19:57.660 | Even still, I would say, I mean,
00:19:59.540 | NumPy is an inheritor of the traditions.
00:20:03.020 | I would say APL, J was another version that was,
00:20:06.580 | what it did is not have the glyphs,
00:20:08.340 | just have short characters,
00:20:09.700 | but still a Latin keyboard could type them.
00:20:11.740 | And then Numeric inherited from that
00:20:14.540 | in terms of let's add arrays plus broadcasting,
00:20:17.660 | plus methods, reduction,
00:20:19.700 | even some of the language like rank
00:20:21.100 | is a concept that was in Python,
00:20:23.140 | it's still in Python, for the number of dimensions.
00:20:25.900 | That's different than say the rank of a matrix,
00:20:29.460 | which people think of as well.
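The two meanings of "rank" mentioned here can be told apart in a short sketch, assuming NumPy is installed. `ndarray.ndim` reports the number of dimensions, while `numpy.linalg.matrix_rank` reports the linear-algebra rank; the array values are made up for illustration.

```python
import numpy as np

# A 3x3 matrix whose rows are all identical.
a = np.ones((3, 3))

# Number of dimensions: the array-language sense of "rank".
dims = a.ndim

# Linear-algebra rank: only one linearly independent row.
lin_rank = int(np.linalg.matrix_rank(a))

# A three-dimensional array has ndim 3, whatever values it holds.
cube_dims = np.zeros((2, 3, 4)).ndim
```

Here `dims` and `lin_rank` differ for the same array, which is why the two senses of the word are easy to confuse.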
00:20:31.140 | So it came from that tradition,
00:20:33.060 | but NumPy is a very pragmatic, practical tool.
00:20:37.940 | NumPy inherited from Numeric
00:20:39.260 | and we can get to where NumPy came from,
00:20:40.820 | which is the current array,
00:20:42.220 | at least current as of 2015, 2017,
00:20:46.100 | now there's a ton of them over the past two or three years.
00:20:49.300 | We can get into that too.
00:20:50.300 | - So if we just sort of linger on the early days
00:20:52.780 | of what was your favorite feature of Python?
00:20:56.220 | Do you remember like what--
00:20:57.060 | - Yeah.
00:20:58.020 | - It's so interesting to linger on like the,
00:21:02.260 | what really makes you connect with a language?
00:21:06.300 | I'm not sure it's obvious to introspect that.
00:21:09.380 | - No, it isn't, and I've thought about that
00:21:11.180 | at some length.
00:21:12.820 | I think definitely the fact that I could read it later,
00:21:16.420 | that I could use it productively without becoming an expert.
00:21:19.500 | Like other languages I had to put more effort into.
00:21:21.420 | - Right, that's like an empirical observation,
00:21:23.940 | like you're not analyzing any one aspect of the language,
00:21:26.460 | it just seems time after time,
00:21:28.700 | you look back, it's somehow readable.
00:21:30.540 | - It's somehow readable,
00:21:31.380 | and then it was sort of,
00:21:32.220 | I could take executable English
00:21:35.340 | and translate it to Python more easily.
00:21:36.780 | Like I didn't have to go,
00:21:37.620 | there was no translation layer.
00:21:39.780 | As an engineer or as a scientist,
00:21:41.580 | I could think about what I wanted to do,
00:21:43.220 | and then the syntax wasn't that far behind it.
00:21:45.900 | - Yeah.
00:21:46.740 | - Right, now there are some warts there still,
00:21:49.220 | it wasn't perfect.
00:21:50.620 | There's some areas where I'm like,
00:21:51.460 | "Ah, it'd be better if this were different,
00:21:52.780 | "or if this were different."
00:21:54.380 | Some of those things got added to the language too.
00:21:56.580 | I was really grateful for some of the early pioneers
00:21:58.580 | in the Python ecosystem back,
00:22:00.220 | 'cause Python got written in '91,
00:22:01.900 | is when the first version came out.
00:22:03.120 | But Guido was very open to users,
00:22:06.540 | and one of the sets of users
00:22:07.620 | were people like Jim Hugunin,
00:22:08.700 | and David Ascher, and Paul Dubois,
00:22:11.260 | and Konrad Hinsen.
00:22:13.460 | These were people that were on the mailing list,
00:22:15.380 | and they were just asking for things like,
00:22:16.780 | "Hey, we really should have complex numbers
00:22:18.260 | "in this language."
00:22:19.220 | So let's, you know, there's a j,
00:22:21.580 | there's a 1j, right?
00:22:22.540 | And the fact that they went the engineering route of j
00:22:24.380 | is interesting.
00:22:25.220 | I don't think that's entirely favoring engineers,
00:22:28.660 | I think it's because i is so often used
00:22:30.460 | as the index of a for loop.
00:22:32.100 | (Lex laughs)
00:22:32.940 | I think that's actually why.
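As a small illustration of the complex-number support discussed here, the sketch below uses only built-in Python syntax and the standard `cmath` module. The variable names are illustrative and not from the conversation.

```python
import cmath

# Python writes the imaginary unit with the engineering-style suffix j.
z = 3 + 4j

# Built-in operations work on complex values directly.
magnitude = abs(z)          # distance from the origin in the complex plane
conjugate = z.conjugate()   # flips the sign of the imaginary part

# 1j squared is -1, as expected for the imaginary unit.
unit_squared = 1j * 1j

# Euler's identity: exp(i*pi) + 1 is zero up to floating-point rounding.
euler_residual = cmath.exp(1j * cmath.pi) + 1
```

Because `i` stays free as an ordinary variable name, a loop such as `for i in range(n)` never collides with the literal syntax.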
00:22:34.260 | - Probably.
00:22:35.100 | - Right.
00:22:35.940 | - That's the pragmatic aspect.
00:22:36.780 | - But the fact that complex numbers were there,
00:22:38.260 | I love that.
00:22:39.100 | The fact that I could write N-D array constructs,
00:22:41.460 | and that reduction was there.
00:22:42.820 | Very simple to write summations,
00:22:44.620 | and broadcasting was there.
00:22:46.540 | I could do addition of whole arrays.
00:22:48.440 | So that was cool.
00:22:50.380 | Those are some things I loved about it.
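The array features mentioned here (reduction, broadcasting, whole-array addition) can be sketched with modern NumPy. This is an assumption made for illustration: the library in use at the time was Numeric, and the array values below are invented.

```python
import numpy as np

# A small 2-D array: [[0, 1, 2], [3, 4, 5]]
a = np.arange(6).reshape(2, 3)

# Reduction: collapse an axis with an operation such as addition.
total = a.sum()                          # sum of every element
column_sums = np.add.reduce(a, axis=0)   # sum down each column

# Whole-array addition: no explicit loop is needed.
doubled = a + a

# Broadcasting: the length-3 row is stretched across both rows of a.
row = np.array([10, 20, 30])
shifted = a + row
```

The same summation written with nested `for` loops would take several lines and run far slower on large arrays.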
00:22:52.660 | - I don't know what to start talking to you about,
00:22:54.820 | 'cause you've created so many incredible projects
00:22:57.860 | that basically changed the whole landscape of programming.
00:23:00.140 | But okay, let's start with,
00:23:02.380 | let's go chronologically with SciPy.
00:23:06.060 | You created SciPy over two decades ago now?
00:23:09.100 | - Yes. - Right?
00:23:09.940 | - Yes, I love to talk about SciPy.
00:23:10.780 | SciPy was really my baby.
00:23:12.980 | - What is it?
00:23:14.460 | What was its goal?
00:23:15.420 | What is its goal?
00:23:16.420 | How does it work?
00:23:17.260 | - Yeah, fantastic.
00:23:18.080 | So SciPy was effectively,
00:23:20.300 | here I am using Python to do stuff
00:23:22.980 | that I previously used MATLAB to do.
00:23:25.020 | And I was using Numeric,
00:23:26.020 | which is an array library that made a lot of it possible.
00:23:28.340 | But there's things that were missing.
00:23:29.940 | Like I didn't have an ordinary differential equation solver,
00:23:32.140 | I could just call, right?
00:23:33.460 | I didn't have integration.
00:23:35.260 | Hey, I wanted to integrate this function.
00:23:37.180 | Okay, well, I don't have just a function
00:23:38.820 | I can call to do that.
00:23:40.580 | These are things I remember being critical things
00:23:42.540 | that I was missing.
00:23:43.700 | Optimization, I just wanna pass a function to an optimizer
00:23:46.780 | and have it tell me what the optimum value is.
00:23:49.120 | Those are things like,
00:23:50.940 | well, why don't we just write a library
00:23:52.540 | that adds these tools?
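The "pass a function to an integrator or optimizer" idea described here is roughly what present-day SciPy offers. The sketch below uses `scipy.integrate.quad` and `scipy.optimize.minimize_scalar`; the functions being integrated and minimized are made-up examples, not anything from the conversation.

```python
import math

from scipy.integrate import quad
from scipy.optimize import minimize_scalar

# Integrate sin(x) from 0 to pi; the exact answer is 2.
# quad returns the estimate and an estimate of the absolute error.
area, error_estimate = quad(math.sin, 0.0, math.pi)


def bowl(x):
    """A simple one-dimensional function with its minimum at x = 2."""
    return (x - 2.0) ** 2 + 1.0


# Hand the function to the optimizer and let it find the optimum.
result = minimize_scalar(bowl)
best_x = result.x        # location of the minimum
best_value = result.fun  # function value at the minimum
```

In both calls the user supplies only a plain Python function; the numerical method lives inside the library.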
00:23:54.340 | And I started to post to the mailing list,
00:23:55.700 | and there'd previously been,
00:23:57.300 | people have discussed,
00:23:58.140 | I remember Konrad Hinsen saying,
00:23:59.140 | wouldn't it be great if we had this optimizer library?
00:24:00.980 | Or David Ascher would say this stuff.
00:24:02.620 | And I'm ambitious,
00:24:06.140 | ambitious is the wrong word,
00:24:06.980 | and eager, and probably more time than sense.
00:24:11.380 | I was a poor graduate student.
00:24:13.620 | My wife thinks I'm working on my PhD, and I am.
00:24:15.900 | But part of the PhD that I loved
00:24:17.260 | was the fact that it's exploratory.
00:24:19.180 | You're not just taking orders,
00:24:21.580 | fulfilling a list of things to do,
00:24:23.540 | you're trying to figure out what to do.
00:24:25.780 | And so I thought, well, I'm writing tools
00:24:27.940 | for my own use in a PhD,
00:24:29.180 | so I'll just start this project.
00:24:32.180 | And so in '99, '98 was when I first started
00:24:34.980 | to write libraries for Python.
00:24:36.660 | But really when I fell in love with Python in '98,
00:24:38.260 | I thought, oh, well, there's just a few things missing.
00:24:39.740 | Like, oh, I need a reader to read DICOM files.
00:24:42.700 | I was in medical imaging,
00:24:43.540 | and DICOM was a format that,
00:24:45.060 | I want to be able to load that into Python.
00:24:46.980 | Okay, how do I write a reader for that?
00:24:48.220 | So I wrote something called,
00:24:49.980 | it was an IO package, right?
00:24:51.740 | And that was my very first extension module,
00:24:54.540 | which is C.
00:24:55.380 | So I wrote C code to extend Python
00:24:57.100 | so that in Python I could write things more easily.
00:24:59.740 | That combination kind of hooked me.
00:25:02.300 | It was the idea that I could,
00:25:03.340 | here's this powerful tool I can use
00:25:04.860 | as a scripting language and a high-level language
00:25:06.540 | to think about, but that I can extend easily.
00:25:09.740 | - In C. - Easily in C.
00:25:11.460 | Easily for me because I knew enough C.
00:25:13.820 | And then Guido had written a,
00:25:15.300 | I mean, the only, the hard part of extending Python
00:25:17.260 | was something called the way memory management works,
00:25:19.540 | and you have to reference counting.
00:25:21.100 | And so there's a tracking of reference counting
00:25:23.860 | you have to do manually.
00:25:25.540 | And if you don't, you have memory leaks.
00:25:27.540 | And so that's hard.
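The reference counting described here can be observed from Python itself in CPython with `sys.getrefcount`. This is only a sketch of the bookkeeping: in a C extension the increments and decrements are written by hand (with `Py_INCREF` and `Py_DECREF`), and a missed decrement is what produces the leak.

```python
import sys

data = [1.0, 2.0, 3.0]

# getrefcount reports one extra reference for its own argument,
# so only the differences between readings are meaningful.
before = sys.getrefcount(data)

alias = data                      # a second name for the same object
with_alias = sys.getrefcount(data)

del alias                         # dropping the name releases that reference
after_delete = sys.getrefcount(data)
```

Python does this automatically for every assignment and deletion; the object is freed when the count reaches zero.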
00:25:29.060 | Plus in C, it's much more,
00:25:31.060 | you have to put more effort into it.
00:25:32.220 | It's not just, I have to now think about pointers
00:25:34.740 | and I have to think about stuff that is different.
00:25:37.660 | I have to kind of,
00:25:38.500 | you're like putting a new cartridge in your brain.
00:25:40.780 | Like you're, okay, I'm thinking about MRI,
00:25:42.420 | now I'm thinking about programming.
00:25:44.140 | And there are distinct modules
00:25:45.340 | you end up having to think about.
00:25:46.660 | So it's harder.
00:25:47.500 | When I was just in Python,
00:25:48.320 | I could just think about MRI and high-level writing.
00:25:51.540 | But I could do that, and that kind of, I liked it.
00:25:54.020 | I found that to be enjoyable and fun.
00:25:55.780 | And so I ended up, oh,
00:25:57.220 | well, let me just add a bunch of stuff to Python
00:25:59.060 | to do integration.
00:26:00.600 | Well, and the cool thing is,
00:26:01.660 | is that the power of the internet,
00:26:03.060 | I just looking around and I found,
00:26:04.300 | oh, there's this Netlib,
00:26:06.320 | which has hundreds of Fortran routines
00:26:08.900 | that people had written in the '60s and the '70s
00:26:11.380 | and the '80s and Fortran 77, fortunately,
00:26:13.940 | it wasn't Fortran 66,
00:26:14.980 | it had been ported to Fortran 77.
00:26:17.220 | And Fortran 77 is actually a really great language.
00:26:21.700 | Fortran 90 probably is my favorite Fortran
00:26:24.140 | because it's got complex numbers, got arrays,
00:26:26.820 | and it's pretty high level.
00:26:27.740 | Now, the problem with it is you'd never wanna write
00:26:29.820 | a program in Fortran 90 or Fortran 77,
00:26:32.300 | but it's totally fine to write a subroutine in.
00:26:34.620 | Right, and so, and then Fortran
00:26:36.660 | kind of got a little off course
00:26:37.700 | when they tried to compete with C++.
00:26:39.100 | But at the time, I just want libraries that do something,
00:26:42.220 | like, oh, here's an ordinary differential equation.
00:26:43.980 | Here's integration.
00:26:44.900 | Here's Runge-Kutta integration.
00:26:46.820 | Already done.
00:26:47.660 | I don't have to think about that algorithm.
00:26:48.820 | I mean, you could, but it's nice to have somebody
00:26:50.460 | who's already done one and tested it.
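For readers unfamiliar with Runge-Kutta integration, here is a minimal sketch of the classic fourth-order textbook method in pure Python. It is not the Fortran code from Netlib, and the test equation (exponential decay) and function names are chosen only for illustration.

```python
import math


def rk4_step(f, t, y, h):
    """Advance y' = f(t, y) by one step of size h using classic RK4."""
    k1 = f(t, y)
    k2 = f(t + h / 2, y + h / 2 * k1)
    k3 = f(t + h / 2, y + h / 2 * k2)
    k4 = f(t + h, y + h * k3)
    return y + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)


def integrate(f, t0, y0, t1, steps):
    """Integrate from t0 to t1 with a fixed number of equal steps."""
    h = (t1 - t0) / steps
    t, y = t0, y0
    for _ in range(steps):
        y = rk4_step(f, t, y, h)
        t += h
    return y


# Exponential decay y' = -y with y(0) = 1 has the exact solution exp(-t).
approx = integrate(lambda t, y: -y, 0.0, 1.0, 1.0, 100)
exact = math.exp(-1.0)
```

A production solver adds step-size control and error estimates, which is a large part of why reusing a tested library routine is attractive.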
00:26:51.820 | And so I sort of started this journey in '98, really,
00:26:55.100 | if you look back at the mailing list,
00:26:56.020 | there's sort of this productive era of me
00:26:58.620 | writing an extension module
00:27:01.140 | to connect Runge-Kutta integration to Python
00:27:04.580 | and making an ordinary differential equation solver,
00:27:06.700 | and then releasing that as a package.
00:27:09.180 | So we could call it ODEPACK, I think I called it then,
00:27:11.860 | QUADPACK, and then I just made these packages.
00:27:14.420 | Eventually, that became Multipack
00:27:16.260 | 'cause they were originally modular.
00:27:17.580 | You can install them separately.
00:27:19.140 | But a massive problem in Python
00:27:20.700 | was actually just getting your stuff installed.
00:27:23.420 | At the time, releasing software for me,
00:27:25.820 | like, today, people think, what does that mean?
00:27:27.580 | Well, then it meant some poorly-written webpage,
00:27:30.780 | I had some bad webpage up, and I put a tarball,
00:27:33.100 | just a gzip tarball of source code.
00:27:35.780 | That was the release.
00:27:37.180 | - But, okay, can we just end that?
00:27:39.140 | - Sure. - Because
00:27:39.980 | the community aspect of creating the package
00:27:44.380 | and sharing that, that's rare.
00:27:49.020 | - To both have-- - What do you mean?
00:27:50.220 | - At that time, so like the raw--
00:27:51.060 | - Yeah, it was pretty early, yeah.
00:27:52.900 | - Well, not rare.
00:27:54.740 | Maybe you can correct me on this,
00:27:57.060 | but it seems like in the scientific community,
00:27:59.700 | so many people, you were basically solving
00:28:01.980 | the problems you needed to solve
00:28:04.100 | to process the particular application,
00:28:06.940 | the data that you need.
00:28:08.620 | And to also have the mind
00:28:10.980 | that I'm going to make this usable for others, that's--
00:28:15.420 | - I would say I was inspired.
00:28:16.580 | I'd been inspired by Linux,
00:28:18.100 | been inspired by Linus and him making his code available,
00:28:21.860 | and I was starting to use Linux at the time.
00:28:23.300 | I went, "This is cool."
00:28:24.460 | So I'd kind of been previously primed that way.
00:28:27.140 | And generally, I was into science
00:28:29.220 | because I liked the sharing notion.
00:28:31.020 | I liked the idea of, hey, let's,
00:28:32.660 | if collectively we build knowledge and share it,
00:28:34.820 | we can all be better off.
00:28:35.740 | - Okay, so you were energized by that idea.
00:28:37.460 | - So I was energized by that already, right?
00:28:39.540 | And I can't deny that, I was.
00:28:40.940 | I'm sort of, I had this very,
00:28:42.900 | I liked that part of science, that part of sharing.
00:28:45.700 | And then all of a sudden, oh, wait, here's something,
00:28:47.300 | and here's something I could do.
00:28:49.940 | And then I slowly over years learned how to share better
00:28:52.780 | so that you could actually engage more people faster.
00:28:55.140 | One of the key things was actually giving people
00:28:56.700 | a binary they could install, right?
00:28:59.020 | So that it wasn't just, here's source code, good luck.
00:29:01.460 | - Compile this, and then--
00:29:02.660 | - It's compiled, ready to install, just, you know, so.
00:29:05.740 | In fact, a lot of the journey from '98, even through 2012,
00:29:08.300 | when I started Anaconda, was about that.
00:29:10.780 | Like, it's why, you know, it's really the key
00:29:13.260 | as to why a scientist with dreams of doing MRI research
00:29:17.460 | ended up starting a software company
00:29:19.500 | that installs software.
00:29:22.260 | - I work with a few folks now that don't program,
00:29:25.800 | like, on the creative side, the video side, the audio side.
00:29:29.620 | And because my whole life is run on scripts,
00:29:32.500 | I have to try to get them, I'm having now the task
00:29:35.180 | of teaching them how to do Python enough to run the scripts.
00:29:39.220 | And so I've been actually facing this,
00:29:40.820 | whether it's on Anaconda or some,
00:29:43.300 | with the task of how do I minimally explain,
00:29:45.700 | basically to my mom, how to write a Python script.
00:29:48.940 | And it's an interesting challenge.
00:29:50.540 | I have to, it's a to-do item for me to figure out,
00:29:52.820 | like, what is the minimal amount of information
00:29:55.340 | I have to teach, what are the tools you use,
00:29:57.980 | that when you enjoy it, to your effect of it--
00:30:00.820 | - And they're related, those are two related questions.
00:30:02.540 | - And then the debugging, like the iterative process
00:30:05.500 | of running the script to figure out what the error is,
00:30:07.860 | maybe even for some people to do the fix yourself.
00:30:11.620 | So do you compile it, do you,
00:30:13.620 | like, how do you distribute that code to them?
00:30:15.660 | And it's interesting because I think
00:30:18.580 | it's exactly what you're talking about.
00:30:20.140 | If you increase the circle of empathy,
00:30:24.300 | the circle of people that are able to use your programs,
00:30:28.060 | you increase its, like, effectiveness and its power.
00:30:32.940 | And so you have to think, you know, can I write scripts,
00:30:37.060 | can I write programs that can be used by medical engineers,
00:30:40.180 | by all kinds of people that don't know programming?
00:30:43.940 | And actually, maybe plant a seed,
00:30:46.020 | have them catch the bug of programming
00:30:48.420 | so that they start on their journey.
00:30:50.220 | That's a huge responsibility and ultimately
00:30:52.100 | it has to do with the Amazon one-click buy,
00:30:55.380 | like, how frictionless can you make the early steps?
00:30:58.820 | - Frictionless is actually really key.
00:31:00.420 | To go in any community is, any friction point,
00:31:03.100 | you're just gonna lose some people.
00:31:04.620 | - Yeah. - Right?
00:31:05.780 | Sometimes you may wanna intentionally do that
00:31:09.100 | if you're early enough on, you need a lot of help,
00:31:11.660 | you need people who have the skills,
00:31:13.380 | you might actually, it's helpful,
00:31:14.740 | you don't necessarily have too many users
00:31:16.860 | as opposed to contributors if you're early on.
00:31:20.380 | Anyway, SciPy started in '98,
00:31:23.140 | but it really emerged as this collection of modules
00:31:25.780 | that I was just putting on the net,
00:31:27.380 | people were downloading.
00:31:28.620 | And you know, I think I got 100 users, right,
00:31:31.620 | by the end of that year.
00:31:33.020 | But the fact that I got 100 users and more than that,
00:31:35.700 | people started to email me with fixes.
00:31:39.420 | And that was actually intoxicating, right?
00:31:41.340 | That was the, you know, here I'm writing papers,
00:31:44.260 | I'm giving conferences, and I get people to say hello,
00:31:46.220 | but yeah, good job, but mostly it was,
00:31:48.500 | you're viewed with, it's competitive, right?
00:31:51.580 | You publish a paper and people are like,
00:31:52.900 | oh, it wasn't my paper, you know?
00:31:54.620 | I was starting to see that sense of academic life
00:31:59.260 | where it was so much, I thought there was
00:32:00.620 | this cooperative effort, but it sounds like
00:32:02.020 | we're here just to one-up each other.
00:32:04.220 | - Right.
00:32:05.060 | - And you know, that's not true across the board,
00:32:07.700 | but a lot of that's there.
00:32:08.580 | But here in this world, I was getting responses
00:32:11.300 | from people all over the world.
00:32:13.460 | You know, I remember Pearu Peterson in Estonia, right,
00:32:16.060 | was one of the first people.
00:32:17.340 | And he sent me back this makefile,
00:32:18.740 | 'cause you know, the first thing he did is,
00:32:19.580 | "Yeah, your build thing stinks,
00:32:21.140 | "and here's a better makefile."
00:32:23.020 | Now it was a complex makefile,
00:32:24.380 | I don't think I ever understood that makefile actually,
00:32:26.540 | but it worked, and it did a lot more,
00:32:29.220 | and so I was like, thanks, this is cool.
00:32:30.980 | And that was my first kind of engagement
00:32:32.500 | with community development.
00:32:35.100 | But you know, the process was, he sent me a patch file,
00:32:37.660 | I had to upload a new tarball.
00:32:39.860 | And I just found I really loved that.
00:32:41.580 | And the style back then was, here's a mailing list,
00:32:43.620 | it's very, it wasn't as, it certainly weren't
00:32:46.180 | the tools that are available today, it was very early on.
00:32:48.900 | But I really started to, that's the whole year,
00:32:50.700 | I think I did about seven packages that year, right?
00:32:54.580 | And then by the end of the year,
00:32:55.540 | I collected them into a thing called Multipack.
00:32:57.820 | So '99, there was this thing called Multipack,
00:32:59.740 | and that's when a high school student,
00:33:01.780 | I know he was a high school student at the time,
00:33:03.020 | a guy named Robert Kern, took that package
00:33:07.140 | and made a Windows installer, right?
00:33:09.700 | And then of course a massive increase of usage.
00:33:12.660 | - So by the way, most of this development was under Linux.
00:33:15.860 | - Yes, yes, it was on Linux, I was a Linux developer,
00:33:18.580 | doing it on a Unix box.
00:33:20.220 | I mean, at the time, I was actually getting into,
00:33:22.980 | I had a new hard drive, did some kernel programming
00:33:25.060 | to make the hard drive work.
00:33:26.460 | I mean, not programming, but modification to the kernel
00:33:28.740 | so I could actually get a hard drive working.
00:33:31.140 | I love that aspect of it.
00:33:32.300 | I was also, at school, I was building a cluster,
00:33:36.060 | I took Mac computers, and you put Yellow Dog Linux on 'em.
00:33:39.940 | At the Mayo Clinic, they were just,
00:33:42.180 | all these Macs that were older,
00:33:43.500 | they were just getting rid of,
00:33:44.700 | and so I kind of got permission to go grab 'em together,
00:33:46.820 | I put about 24 of 'em together in a cluster in a cabinet,
00:33:50.340 | and put Yellow Dog Linux on 'em all,
00:33:51.660 | and I wrote a C++ program to do MRI simulation.
00:33:56.220 | That was what I was doing at the same time
00:33:58.900 | for my day job, so to speak.
00:34:01.380 | So I was loving the whole process.
00:34:03.460 | At the same time, I was, oh,
00:34:04.540 | I need an ordinary differential equation.
00:34:06.260 | That's why ordinary differential equations were key,
00:34:08.140 | was because that's the heart of a Bloch equation
00:34:09.820 | for simulating MRI, is an ODE solver.
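To show why an ODE solver sits at the heart of such a simulation, here is a minimal sketch of the Bloch equations (rotating frame, no RF pulse) solved with present-day `scipy.integrate.solve_ivp`. The relaxation times, off-resonance rate, and starting magnetization are invented values in arbitrary units, and this is not the C++ simulation described in the conversation.

```python
import math

from scipy.integrate import solve_ivp

# Illustrative parameters (assumptions, arbitrary units).
T1 = 1.0                       # longitudinal relaxation time
T2 = 0.1                       # transverse relaxation time
M0 = 1.0                       # equilibrium magnetization
OFFSET = 2.0 * math.pi * 5.0   # off-resonance precession rate, rad per unit time


def bloch(t, m):
    """Right-hand side of the Bloch equations for m = (mx, my, mz)."""
    mx, my, mz = m
    return [
        OFFSET * my - mx / T2,    # precession plus transverse decay
        -OFFSET * mx - my / T2,   # precession plus transverse decay
        (M0 - mz) / T1,           # recovery toward equilibrium
    ]


# Start with the magnetization tipped fully into the transverse plane.
t_end = 0.2
solution = solve_ivp(bloch, (0.0, t_end), [1.0, 0.0, 0.0], rtol=1e-9, atol=1e-12)

mx, my, mz = solution.y[:, -1]
transverse = math.hypot(mx, my)   # decays as exp(-t / T2)
```

For this simple case the result can be checked by hand: the transverse magnitude should equal `exp(-t_end / T2)` and the longitudinal component should equal `M0 * (1 - exp(-t_end / T1))`.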
00:34:12.420 | And so that's, but I actually did that,
00:34:15.700 | it doesn't happen at the same time.
00:34:16.940 | That's why, kind of, what you're working on
00:34:18.540 | and what you're interested in, they're coinciding.
00:34:20.500 | I was definitely scratching my own itch
00:34:22.380 | in terms of building stuff,
00:34:24.940 | which helped in the sense that I was using it for me,
00:34:27.020 | so at least I had one user.
00:34:28.500 | I had one person who was like, well, no, this is better,
00:34:30.340 | I like this interface better,
00:34:31.380 | and I had the experience of MATLAB
00:34:33.260 | to guide some of what those APIs might look like.
00:34:36.420 | But you're just doing yourself,
00:34:37.660 | you're building all this stuff.
00:34:38.940 | But with the Windows installer,
00:34:40.020 | it was the first time I realized,
00:34:40.980 | oh, yeah, the binary installer really helps people.
00:34:43.700 | And so that led to spending more time
00:34:46.940 | on that side of things.
00:34:49.060 | So around 2000, so I graduated my PhD in 2000,
00:34:52.740 | end of year, end of 2000.
00:34:53.740 | So '99, doing a lot of work there,
00:34:56.620 | '98, doing a lot of work there,
00:34:57.700 | '99, kind of spending more time on my PhD,
00:35:00.740 | helping people use the tools,
00:35:02.380 | thinking about what do I wanna go from here.
00:35:04.020 | There was a company, there was a guy, actually,
00:35:05.580 | Eric Jones and Travis Vaught,
00:35:07.580 | they were two friends who founded a company called Enthought.
00:35:11.020 | It's here in Austin, still here.
00:35:13.580 | And they, Eric contacted me at the time
00:35:16.020 | when I was a graduate student still,
00:35:19.340 | and he said, hey, why don't you come down,
00:35:20.820 | we wanna build a company,
00:35:23.220 | we're thinking of a scientific company
00:35:25.700 | and we wanna take what you're doing
00:35:27.540 | and kind of add it to some stuff that he'd done,
00:35:29.460 | he'd written some tools.
00:35:31.180 | And then Pearu Peterson had done f2py,
00:35:32.820 | let's come together and build,
00:35:34.340 | pull this all together and call it SciPy.
00:35:36.700 | So that's the origin of the SciPy brand.
00:35:39.460 | It came from Multipack and a whole bunch of modules
00:35:42.180 | I'd written, plus a few things from some other folks,
00:35:44.460 | and then pulled together in a single installer.
00:35:47.540 | SciPy was really a distribution of Python,
00:35:49.500 | masquerading as a library.
00:35:51.220 | - How did you think about SciPy in context of Python,
00:35:54.340 | in context of Numeric?
00:35:55.660 | Like what-- - So we saw SciPy
00:35:56.900 | as a way to make an R&D environment for Python,
00:35:59.980 | like use Python, depended on Numeric.
00:36:03.340 | So Numeric was the array library we depended on.
00:36:05.500 | And then from there, extend it
00:36:07.060 | with a bunch of modules that allowed for,
00:36:09.460 | and at the time, the original vision of SciPy
00:36:11.500 | was to have plotting, was to have, you know,
00:36:14.300 | REPL environment and kind of a whole,
00:36:17.020 | really a whole data environment
00:36:18.580 | that you could then install and get going with.
00:36:20.980 | And that was kind of the thinking.
00:36:22.980 | It didn't really evolve that way, right?
00:36:24.980 | It sort of had a, but one, it's really hard
00:36:28.340 | to do massive scale projects with open source collectives.
00:36:33.340 | Actually, there's sort of an intrinsic cooperation limit
00:36:38.460 | as to which, you know, too many cooks in the kitchen,
00:36:40.580 | you know, you can do amazing infrastructure work.
00:36:42.740 | When it comes down to bringing it all together
00:36:44.220 | into a single deliverable,
00:36:45.860 | that actually requires a little more product management
00:36:49.660 | that is not, that doesn't really emerge
00:36:52.820 | from the same dynamic.
00:36:53.980 | So it struggled, you know, it struggled to get,
00:36:56.900 | almost too many voices, it's hard to have everybody agree,
00:36:59.060 | you know, consensus doesn't really work at that scale.
00:37:02.100 | You end up with politics, you end up with the same kind
00:37:03.900 | of things that's happened in large organizations
00:37:06.100 | trying to decide on what to do together.
00:37:09.420 | So consensus building was still, was challenging at scale
00:37:12.380 | as more people came in, right?
00:37:13.860 | Early on, it's fine 'cause there's nobody there.
00:37:15.700 | And so it works, but then as you get more successful
00:37:17.780 | and more people use it, all of a sudden,
00:37:19.020 | oh, there's this scale at which this doesn't work anymore
00:37:22.340 | and we have to come up with different approaches.
00:37:24.020 | So SciPy came out officially in 2001,
00:37:26.740 | was the first release, most of the time,
00:37:28.940 | I remember the days of getting that release ready,
00:37:31.100 | it was a Windows installer and there were bugs on how,
00:37:34.060 | you know, the Windows compiler handled complex numbers
00:37:36.340 | and you were chasing segmentation faults.
00:37:38.580 | It was, it's a lot of work.
00:37:40.460 | There was a lot of,
00:37:42.260 | effort had nothing to do with my area of study.
00:37:45.540 | And at the same time, I had just gotten an offer.
00:37:47.540 | So he wondered if I wanted to come down
00:37:48.820 | and help him start that company with his friend.
00:37:51.500 | And I, at the time, I was like, I was so intrigued,
00:37:53.420 | but I was squaring a path, an academic path.
00:37:56.620 | And I had just got an offer to go and teach
00:37:59.100 | at my alma mater.
00:38:00.020 | So I took that tenure track position.
00:38:02.460 | And SciPy and kind of,
00:38:03.900 | then I started working on SciPy as a professor too.
00:38:07.100 | - Okay.
00:38:07.940 | - So that's, I left, I've got the Mayo Clinic, graduated,
00:38:10.020 | wrote my thesis using SciPy, wrote, you know,
00:38:12.340 | there's images that were created.
00:38:15.540 | Now the plotting tool I used was something
00:38:17.340 | from Yorick actually.
00:38:18.660 | It was a plotting, a PLT,
00:38:21.020 | kind of a plotting language that I used.
00:38:22.660 | - Yorick is a programming language?
00:38:23.940 | - It was a programming language.
00:38:24.780 | It had a plotting tool, DISLIN.
00:38:27.540 | It had integration to DISLIN.
00:38:28.940 | I ended up using DISLIN plus some of the plotting
00:38:31.340 | from Yorick, linked to from Python.
00:38:33.780 | Anyway, it was, people don't plot that way now,
00:38:37.180 | but this is before, and SciPy was trying to add plotting.
00:38:40.260 | - Yeah. - Right?
00:38:41.460 | It didn't have much success.
00:38:42.580 | Really the success of plotting came from John Hunter,
00:38:45.580 | who had a similar experience to my experience,
00:38:47.420 | my kind of maverick experience as a person
00:38:49.660 | just trying to get stuff done and kind of having more time
00:38:51.700 | than money maybe, right?
00:38:53.820 | - And John Hunter created what?
00:38:55.300 | - Matplotlib.
00:38:56.260 | - He's the creator of Matplotlib?
00:38:57.180 | - Yeah, so John Hunter was, you know,
00:38:59.140 | he wasn't a student at the time, but he was an,
00:39:00.580 | he was working in quant field,
00:39:01.780 | and he said, "We need better plotting."
00:39:03.500 | So he just went out and said, "Cool, I'll make a new project,
00:39:05.460 | "and we'll call it Matplotlib," and he released it in 2001,
00:39:08.260 | about the same time that SciPy came out.
00:39:09.900 | And it was separate library, separate install,
00:39:12.980 | used Numeric, SciPy used Numeric.
00:39:15.540 | And so SciPy, you know, in 2001 we released SciPy,
00:39:18.980 | and then Enthought created a conference called SciPy,
00:39:22.380 | which brought people together to talk about the space.
00:39:25.460 | And that conference is still ongoing.
00:39:26.700 | It's one of the favorite conferences of a lot of people,
00:39:28.460 | because it's, you know, it's changed over the years,
00:39:30.820 | but early on it was, you know, a collection of 50 people
00:39:33.740 | who care about, scientists mostly, you know,
00:39:36.700 | practicing scientists who want to care about coding
00:39:39.300 | and doing it well and not using MATLAB.
00:39:42.180 | And I remember being driven by, you know, I like MATLAB,
00:39:44.140 | but I didn't like the fact that,
00:39:46.460 | so I'm not opposed to proprietary software.
00:39:48.100 | I'm actually not an open source zealot.
00:39:50.260 | I love open source for what it brings,
00:39:52.700 | but I also see the role for proprietary software.
00:39:54.500 | But what I didn't like was the fact
00:39:55.540 | that I would develop code and publish it,
00:39:58.700 | and then effectively telling somebody,
00:39:59.940 | "Here, to run my code,
00:40:00.860 | "you have to have this proprietary software."
00:40:02.540 | - Right, and there's also culture around MATLAB,
00:40:05.400 | as much, 'cause I've talked to a few folks,
00:40:07.500 | MathWorks creates MATLAB.
00:40:09.900 | - Yeah.
00:40:10.860 | - I mean, there's just a culture, they try really hard,
00:40:13.940 | but it just is this corporate IBM style culture
00:40:16.820 | that's like, or whatever.
00:40:18.420 | I don't want to say negative things about IBM or whatever,
00:40:20.820 | but there's a--
00:40:22.260 | - No, it's really that connection.
00:40:23.740 | It's something I'm in the middle of right now
00:40:24.980 | is the business of open source
00:40:27.020 | and how do you connect the ethos of cooperative development
00:40:30.840 | with the necessity of creating profits, right?
00:40:34.780 | And like right now today, you know,
00:40:36.700 | I'm still in the middle of that.
00:40:38.100 | That's actually the early days
00:40:40.060 | of me exploring this question.
00:40:42.220 | 'Cause I was writing SciPy, I mean, as an aside,
00:40:44.700 | I also had, so I had three kids at the time.
00:40:46.540 | I have six kids now.
00:40:47.880 | I got married early, wanted a family.
00:40:49.980 | I had three kids, and I remember reading,
00:40:52.620 | I read Richard Stallman's post,
00:40:54.300 | and I was a fan of Stallman.
00:40:55.540 | I would read his work.
00:40:56.660 | I liked his collective ideas he would have.
00:40:58.780 | Certainly the ideas on IP law, I read a lot of his stuff.
00:41:01.740 | But then he said, you know,
00:41:03.900 | "Okay, well, how do I make money with this?
00:41:05.700 | "How do I make a living?
00:41:06.620 | "How do I pay for my kids?"
00:41:07.740 | All this stuff was in my mind,
00:41:09.300 | young graduate student making no money,
00:41:10.640 | thinking I gotta get a job.
00:41:12.060 | And he said, "Well, you know,
00:41:13.220 | "I think just be like me and don't have kids, right?
00:41:15.760 | "That's just don't, don't."
00:41:17.100 | - That's his take on it, that's his--
00:41:18.540 | - That was the, what he said in that moment, right?
00:41:20.860 | That's the thing I read, and I went,
00:41:22.360 | "Okay, this is a train I can't get on."
00:41:23.660 | - Yeah.
00:41:24.980 | There has to be a way to preserve the culture of open source
00:41:27.460 | and still be able to make sufficient money to feed your--
00:41:29.940 | - Yes, exactly, there's gotta be.
00:41:31.500 | Well, so that actually led me to a study of economics.
00:41:34.500 | 'Cause at the time, I was ignorant, and I really was.
00:41:36.660 | And I'm actually, I'm embarrassed for educational system
00:41:39.420 | that they could let me, and I was valedictorian
00:41:41.300 | in my high school class, and I did super well in college.
00:41:43.740 | And academically, I did great, right?
00:41:47.620 | But the fact that I could do that and then be clueless
00:41:49.980 | about this key part of life,
00:41:52.760 | it led me to go, "There's a problem."
00:41:54.420 | Like, I should've learned this in fifth grade.
00:41:56.660 | I should've learned this in eighth grade.
00:41:58.380 | Like, everybody should come out
00:41:59.220 | with a basic knowledge of economics.
00:42:01.700 | - You're an interesting example
00:42:02.820 | because you've created tools that changed the lives
00:42:05.460 | of probably millions of people,
00:42:07.620 | and the fact that you don't understand,
00:42:09.500 | at the time of the creation of those tools,
00:42:11.580 | the basic economics of how to build up a giant system
00:42:14.420 | is a problem.
00:42:15.260 | - Yeah, it's a problem.
00:42:16.100 | And so, during my PhD at the same time,
00:42:18.260 | this is back in '98, '99, at the same time,
00:42:20.700 | I was in a library, I was reading books on capitalism,
00:42:23.380 | I was reading books on Marxism,
00:42:24.700 | I was reading books on, you know, what is this thing?
00:42:27.700 | What does it mean?
00:42:29.740 | And I encountered, basically, I encountered a set of writings
00:42:33.140 | from people that said they were the inheritors
00:42:34.620 | of Adam Smith.
00:42:35.580 | Read Adam Smith for the first time, right?
00:42:37.220 | Which is the wealth of nations
00:42:38.620 | and kind of this notion of emergent societies,
00:42:42.500 | and realized, oh, there's this whole world out here
00:42:45.140 | of people, and the challenge of economics
00:42:48.180 | is also political.
00:42:49.500 | Like, 'cause economics, you know, people,
00:42:53.140 | different parties running for office,
00:42:54.940 | they want their economic friends.
00:42:58.100 | They want their economists to back them up, right?
00:43:00.060 | Or to be their magicians,
00:43:02.980 | like the magicians in Pharaoh's court, right?
00:43:04.620 | The people that are gonna say,
00:43:05.620 | hey, this is, you should listen to me
00:43:06.980 | because I've got the expert who says this.
00:43:09.420 | And so, it gets really muddled, right?
00:43:11.540 | But I was looking at it as a scientist,
00:43:13.900 | going, what is this space?
00:43:14.740 | What does this mean?
00:43:15.820 | How does Paris get fed?
00:43:16.940 | How does, what is money?
00:43:18.380 | How does it work?
00:43:19.380 | And I found a lot of writings that I really loved.
00:43:21.580 | I found some things that I really loved,
00:43:22.820 | and I learned from that.
00:43:23.980 | It was writings from people like Von Mises.
00:43:26.260 | He wrote a paper in 1920
00:43:27.940 | that still should be read more than it is.
00:43:30.100 | It was the economic calculation problem
00:43:33.060 | of the socialist commonwealth.
00:43:34.540 | It was basically in response
00:43:35.380 | to the Bolshevik Revolution in 1917.
00:43:37.100 | And his basic argument was,
00:43:39.020 | it's not gonna work to not have private property.
00:43:41.780 | You're not gonna be able to come up with prices.
00:43:43.380 | The bureaucrats aren't gonna be able to determine
00:43:45.180 | how to allocate resources without a price system.
00:43:47.660 | And a price system emerges from people making trades.
00:43:51.700 | And they can only make trades
00:43:52.780 | if they have authority over the thing they're trading.
00:43:55.460 | And that creates information flow
00:43:58.020 | that you just don't have
00:43:59.660 | if you try to top down it.
00:44:01.300 | - Right. - Right.
00:44:02.140 | It's like, huh, that's a really good point.
00:44:04.780 | - Yeah, the prices have a signal that's used.
00:44:06.860 | And it's important to have that signal
00:44:09.400 | when you're trying to build a community
00:44:11.020 | of productive people-- - Yeah.
00:44:11.860 | - Like you would in the software engineering space.
00:44:13.700 | - Yeah, the prices are actually
00:44:14.860 | an important signaling mechanism.
00:44:17.100 | - Yeah. - Right?
00:44:17.940 | And that money is just a bartering tool.
00:44:20.820 | - Yeah. - Right?
00:44:21.660 | So this is the first time I've encountered
00:44:22.540 | any of this concept, right?
00:44:23.860 | And the fact that, oh, this is actually really critical.
00:44:26.600 | Like, it's so critical to our prosperity
00:44:29.340 | and that we're dangerously not learning about this,
00:44:34.100 | not teaching our children about this.
00:44:36.140 | - So you had the three kids,
00:44:37.260 | and you had to make some hard decisions.
00:44:38.100 | - Had to make some money, right?
00:44:38.940 | Had to figure it out.
00:44:39.900 | But I didn't really care.
00:44:40.740 | I mean, I've never been driven by money,
00:44:42.460 | just need it, in fact. - Yeah, right.
00:44:44.740 | To eat.
00:44:45.580 | So how did that resolve itself in terms of SciPy?
00:44:49.100 | - So I would say it didn't really resolve itself.
00:44:51.340 | It sort of started a journey that I'm continuing on.
00:44:53.380 | I'm still on, I would say.
00:44:54.220 | I don't think it resolved itself.
00:44:55.680 | But I will say I went in eyes wide open.
00:44:59.260 | Like, I knew that there were problems
00:45:00.940 | with giving stuff away and creating the market externalities
00:45:05.940 | that the fact that, yeah, people might use it,
00:45:09.740 | and I might not get paid for it,
00:45:10.580 | and I'll have to figure something else out to get paid.
00:45:13.060 | Like, at least I can say I'm not bitter
00:45:14.940 | that a lot of people have used stuff that I've written,
00:45:17.220 | and I haven't necessarily benefited economically from it.
00:45:20.260 | I've heard other people be bitter about that
00:45:22.300 | when they write or they talk,
00:45:23.260 | like, oh, I should've gotten more value out of this.
00:45:24.900 | And I'm also, I want to create systems
00:45:27.740 | that let people like me,
00:45:29.860 | who might have these desires to do things,
00:45:31.540 | let them benefit, so it actually creates more of the same.
00:45:34.700 | - Not to turn on your bitterness mojo,
00:45:36.880 | but there's some aspect, I wish there was mechanisms
00:45:40.460 | for me to reward whoever created SciPy and NumPy,
00:45:43.580 | 'cause it brought so much joy to my life.
00:45:45.300 | - I appreciate that.
00:45:46.140 | - You know what I mean?
00:45:46.980 | - The tip jar notion was there.
00:45:48.340 | I appreciate that, and I think--
00:45:49.180 | - But there should be a very frictionless mechanism.
00:45:51.860 | - There should be a frictionless mechanism, I totally agree.
00:45:53.340 | I would love to talk about some of the ideas I have,
00:45:55.220 | 'cause I actually came across,
00:45:56.220 | I think I've come up with some interesting notions
00:45:58.180 | that could work, but they'll require,
00:46:00.580 | anything that will work takes time to emerge, right?
00:46:03.940 | Things don't just turn overnight.
00:46:04.940 | That's definitely one thing I've also understood and learned
00:46:07.360 | is any fixes, that's why it's kind of funny,
00:46:10.100 | we often give credit to, oh, this president gets elected,
00:46:12.940 | and oh, look how great things have done.
00:46:14.420 | And I saw that when I had a transition at Anaconda
00:46:18.340 | when a new CEO came in, right?
00:46:19.500 | And it's like the success that's happening,
00:46:22.180 | there's an inertia there, right?
00:46:24.380 | - And sometimes the decision you made 10 years before
00:46:26.780 | is the reason why the success is--
00:46:29.020 | - Right, exactly, so we're sort of just
00:46:30.700 | running around taking credit for stuff.
00:46:32.180 | - The credit assignment has a delay to it
00:46:35.180 | that makes the credit assignment
00:46:36.820 | basically wrong more than right.
00:46:39.180 | - Wrong more than right, exactly,
00:46:40.340 | and so I'm like, oh, this is,
00:46:42.180 | that's the stuff I would read a ton about early on.
00:46:45.720 | So I don't, I feel like I'm with you.
00:46:47.900 | I want the same thing, I want to be able to,
00:46:49.500 | and honestly, not for personally, I've been happy,
00:46:51.740 | I've been happy, I feel like I don't have any,
00:46:53.820 | I mean, we've been done reasonably okay,
00:46:55.540 | but I've had to pursue it.
00:46:56.940 | Like, that's really what started my
00:46:58.980 | trajectory from academia, is reading that stuff
00:47:02.200 | led me to say, oh, entrepreneurship matters.
00:47:04.740 | I love software, but we need more entrepreneurs,
00:47:09.180 | and I want to understand that better.
00:47:10.360 | So once I kind of had that virus infect my brain,
00:47:14.940 | even though I was on a trajectory
00:47:17.580 | to go to a tenure track position at a university,
00:47:20.640 | and I was there for six years,
00:47:22.780 | I was kind of already out the door when I started,
00:47:26.060 | and we can get into that, but--
00:47:27.660 | - Well, can I just ask a quick question on,
00:47:30.340 | is there some design principles
00:47:32.760 | that were in your mind around SciPy?
00:47:34.740 | Like, was there some key ideas
00:47:36.460 | that were just sticking to you,
00:47:38.060 | that this is the fundamental ideas?
00:47:40.300 | - Yeah, I would say so.
00:47:41.140 | I would think it's basically accessibility to scientists.
00:47:43.660 | Like, give them, give scientists and engineers
00:47:46.500 | tools that they don't have to think a lot about programming,
00:47:48.360 | so give them really good building blocks.
00:47:50.260 | Give them functions that they want to call,
00:47:51.820 | and sort of just the right length of spelling.
00:47:54.320 | There's one tradition in programming
00:47:58.140 | where it's like, make very, very long names, right?
00:48:01.860 | And you can see it in some programming languages
00:48:03.660 | where the names get, take half the screen.
00:48:06.540 | And in the Fortran world,
00:48:09.860 | characters would have to be six letters early on, right?
00:48:12.260 | And that's way too much, too little,
00:48:14.320 | but I was like, I liked to have names
00:48:16.820 | that were informative, but short.
00:48:18.940 | - So even though Python,
00:48:21.140 | well, this is a different conversation,
00:48:22.300 | but documentation is doing some work there.
00:48:25.860 | So when you look at great scientific libraries
00:48:29.180 | and functions, there's a richness of documentation
00:48:32.700 | that helps you get into the details.
00:48:34.820 | The first glance at a function gives you the intuition
00:48:37.620 | of all it needs to do by looking at the headers and so on,
00:48:40.540 | but to get the depths of all the complexities involved,
00:48:43.420 | all the options involved,
00:48:44.740 | documentation does some of the work.
00:48:45.580 | - Documentation is essential.
00:48:47.020 | - Yeah.
00:48:47.860 | - So we thought about several things.
00:48:50.500 | One is we wanted plotting.
00:48:51.940 | We wanted interactive environment.
00:48:53.580 | We wanted good documentation.
00:48:54.860 | These are things we knew we wanted.
00:48:56.780 | The reality is those took about 10 years to evolve, right?
00:49:00.460 | Given the fact that we didn't have a big budget,
00:49:02.060 | it was all volunteer labor.
00:49:03.100 | It was sort of, when Enthought got created
00:49:06.940 | and they started to try to find projects,
00:49:10.060 | people would pay for pieces
00:49:11.060 | and they were able to fund some of it,
00:49:13.740 | not nearly enough to keep up with what was necessary.
00:49:15.780 | And no criticism, just simply the reality.
00:49:18.860 | I mean, it's hard to start a business and then do consulting
00:49:22.380 | and then also promote an open source project
00:49:24.300 | that's still fairly new.
00:49:26.180 | SciPy was fairly niche.
00:49:27.780 | We stayed connected all while I was a student,
00:49:30.140 | sorry, a professor.
00:49:30.980 | I went to BYU and started to teach electrical engineering,
00:49:33.580 | all the applied math courses.
00:49:35.060 | I loved teaching signal processing,
00:49:37.020 | probability theory, electromagnetism.
00:49:39.180 | I was, if you look at Rate My Professor,
00:49:40.980 | which my kids love to do,
00:49:42.540 | I wasn't, I got some bad reviews
00:49:45.060 | because people--
00:49:46.900 | - What was the criticism?
00:49:48.580 | - I would speak too high of a level.
00:49:51.140 | I definitely had a calibration problem
00:49:52.660 | coming out of graduate work
00:49:55.000 | where I hate to be condescending to people.
00:49:56.980 | Like I really have a ton of respect
00:49:58.380 | for people fundamentally.
00:49:59.300 | Like my fundamental thing is I respect people.
00:50:02.060 | Sometimes that can lead to a,
00:50:03.900 | I was thinking they had more knowledge than they did.
00:50:07.640 | And so I would just speak at a very high level,
00:50:10.100 | assume they got it.
00:50:11.060 | - But they need to rise to the standard that you set.
00:50:14.340 | I mean, that's one of the,
00:50:15.260 | some of the greatest teachers do that.
00:50:17.180 | - And I agree, and that was kind of what was inspiring me.
00:50:19.020 | But you also have to,
00:50:21.260 | I cannot say I was an articulate
00:50:24.820 | of some of the greatest teachers.
00:50:26.340 | I was, like one classic example,
00:50:28.580 | when I first taught at BYU,
00:50:30.420 | my very first class, it was overheads,
00:50:31.980 | transparencies, overheads.
00:50:34.100 | Before projectors were really that common.
00:50:36.380 | Transparencies, I'm writing my notes out.
00:50:38.260 | I go in, room's half dark.
00:50:40.540 | I just blaring through these transparencies.
00:50:42.780 | Here it is, here it is, here it is.
00:50:44.940 | And I gave a quiz after two weeks.
00:50:47.520 | Nobody knew anything.
00:50:48.940 | Nothing I had taught had gotten anywhere.
00:50:51.020 | (laughing)
00:50:52.300 | And I realized, okay, I'm not, this is not working.
00:50:54.180 | So I put away the transparencies,
00:50:56.420 | and I turned around and just started using the chalkboard.
00:50:58.900 | And what it did is it slowed me down.
00:51:00.780 | Right, the chalkboard just slowed me down
00:51:02.300 | and gave people time to process and to think.
00:51:04.460 | And then that made me focus.
00:51:06.120 | My writing wasn't great on the chalkboard,
00:51:07.940 | but I really love that part of like the teaching.
00:51:10.560 | So that entered SciPy's world
00:51:12.220 | in terms of we always understood
00:51:13.340 | that there's a didactic aspect of SciPy.
00:51:15.420 | Kind of how do you take the knowledge and then produce it?
00:51:18.640 | The challenge we had was the scope.
00:51:21.020 | Like ultimately SciPy was everything, right?
00:51:23.420 | And so 2001 when it first came out,
00:51:25.620 | people were starting to use it.
00:51:26.820 | No, this is cool.
00:51:27.660 | This is a tool we actually use.
00:51:29.580 | At the same time, 2001 timeframe,
00:51:31.420 | there was a little bit of like the Hubble Space Telescope.
00:51:33.940 | The folks at Hubble had started to say,
00:51:35.380 | hey, Python, we're gonna use Python
00:51:36.620 | for processing images from Hubble.
00:51:38.720 | And so Perry Greenfield was a good friend
00:51:40.820 | and running that program.
00:51:42.420 | And he had called me before I left BYU and said,
00:51:45.060 | you know, we wanna do this,
00:51:47.020 | but Numeric actually has some challenges
00:51:49.020 | in terms of, you know, it's not a,
00:51:50.620 | the array doesn't have enough types.
00:51:52.700 | We need more operations.
00:51:54.260 | You know, broadcast needs to be a little more settled.
00:51:56.660 | They wanted record arrays.
00:51:57.940 | They wanted, you know, record arrays are like a data frame,
00:52:00.580 | but a little bit different.
00:52:02.220 | But they wanted more structured data.
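The "record array" idea mentioned here survives in today's NumPy as structured arrays. The following is a minimal sketch, assuming a current NumPy install; the field names and values are hypothetical and chosen only for illustration.

```python
import numpy as np

# Hypothetical observation records; the field names are illustrative only.
obs_dtype = np.dtype([("target", "U8"), ("exposure_s", "f4"), ("counts", "i8")])
obs = np.array(
    [("M31", 120.0, 5400), ("M42", 30.0, 9100), ("M87", 300.0, 2500)],
    dtype=obs_dtype,
)

# Each field can be pulled out like a column of a table ...
rates = obs["counts"] / obs["exposure_s"]

# ... while each element of the array is still one whole record (a row).
long_exposures = obs[obs["exposure_s"] > 100.0]
```

Every element carries all three fields, which is what makes this "like a data frame, but a little bit different": the data is laid out row by row in one block of memory.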
00:52:03.820 | So he had called me even early on then.
00:52:06.020 | And he said, you know,
00:52:06.860 | would you wanna work on something to make this work?
00:52:08.300 | And I said, yeah, I'm interested, but I'm going here.
00:52:10.140 | And I, you know, we'll see if I have time.
00:52:12.100 | So in the meantime, while I was teaching
00:52:13.340 | and SciPy was emerging, and I had a student,
00:52:15.660 | I was constantly, while I was teaching,
00:52:16.860 | trying to figure a way to fund this stuff.
00:52:18.840 | So I had a graduate student, my only graduate student,
00:52:21.660 | a Chinese fellow, Liu Hongze is his name, great guy.
00:52:26.260 | He wrote a bunch of stuff for iterative linear algebra,
00:52:29.900 | like got into writing some of the iterative
00:52:31.380 | linear algebra tools that are currently there in SciPy.
00:52:34.340 | And they've gotten better since, but this is in 2005.
00:52:38.220 | Kept working on SciPy,
00:52:39.260 | but Perry has started working on a replacement
00:52:43.060 | to Numeric called Numarray.
00:52:45.260 | And in 2004, a package called ndimage,
00:52:49.020 | it was an image processing library
00:52:50.740 | that was written for Numarray.
00:52:53.220 | And it had in it a morphology tool.
00:52:55.540 | I don't know if you know what morphology is.
00:52:56.740 | It's opening, dilation, you know, there was sort of this,
00:52:59.580 | as a medical imaging student, I knew what it was
00:53:02.420 | because it was used in segmentation a lot.
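For readers unfamiliar with morphology, here is a small sketch of dilation and opening on a binary mask. It assumes the modern `scipy.ndimage` module, not the original 2004 package, and the toy mask is invented for illustration.

```python
import numpy as np
from scipy import ndimage

# Toy binary mask: a 3x3 blob plus one isolated "noise" pixel.
mask = np.zeros((7, 7), dtype=bool)
mask[2:5, 2:5] = True
mask[0, 6] = True

# A full 3x3 structuring element (an illustrative choice, not the default).
square = np.ones((3, 3), dtype=bool)

# Dilation grows every foreground region by the structuring element.
dilated = ndimage.binary_dilation(mask, structure=square)

# Opening is erosion followed by dilation: it removes features smaller
# than the structuring element while keeping larger ones.
opened = ndimage.binary_opening(mask, structure=square)
```

In this toy case the opening deletes the lone pixel and leaves the blob intact, which is why operations like this show up in segmentation cleanup.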
00:53:04.380 | And in fact, I'd wanted to do something like that
00:53:06.460 | in Python, in SciPy, but just had never gotten around to it.
00:53:10.220 | So when it came out, but it worked only on Numarray,
00:53:14.180 | and SciPy needed Numeric, and so we effectively
00:53:17.140 | had the beginning of this split.
00:53:20.020 | And Numeric and Numarray didn't share data.
00:53:22.500 | They were just two, so you could have a gigabyte
00:53:24.420 | of Numarray data and a gigabyte of Numeric data,
00:53:26.540 | and they wouldn't share it.
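To make "sharing data" concrete, the sketch below contrasts a zero-copy view with a copy, using today's NumPy and the standard-library `array` module as a stand-in for a second array library. This is only an illustration of the concept, not a reconstruction of how Numeric or Numarray stored their data.

```python
import array
import numpy as np

# A buffer owned by a different container type (stand-in for "the other library").
raw = array.array("d", [1.0, 2.0, 3.0])

# Zero-copy: NumPy wraps the same memory through the buffer protocol.
shared = np.frombuffer(raw, dtype=np.float64)

# Copy: a second, independent block of memory holding the same values.
copied = np.array(raw)

raw[0] = 99.0  # a write through the original container
# `shared` sees the change because it is the same memory; `copied` does not.
```

Without a sharing mechanism, every hand-off between libraries looks like `copied`: a gigabyte in one becomes a second gigabyte in the other.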
00:53:27.380 | And so you had these, then you had these
00:53:28.820 | scientific libraries written on top.
00:53:31.300 | I got really bugged by that.
00:53:32.940 | I got really like, oh man, this is not good.
00:53:35.060 | We're not cooperating now.
00:53:36.300 | We're not, we're sort of redoing each other's work,
00:53:37.980 | and we're just this young community.
00:53:40.380 | So that's what led me, even though I knew it was risky,
00:53:43.940 | because my, you know, I was on a tenure track position.
00:53:47.140 | 2004, I got reviewed.
00:53:48.540 | They said, hey, things are going okay.
00:53:49.540 | You're doing well.
00:53:50.380 | Papers coming out.
00:53:51.540 | But you're kind of spending a lot of time
00:53:52.380 | on this open source stuff.
00:53:53.360 | Maybe do a little less of that,
00:53:54.780 | and a little more of the paper writing and grant writing,
00:53:57.260 | which was naive, but it was definitely the time,
00:53:59.780 | you know, the thinking.
00:54:00.620 | - It still goes on.
00:54:01.740 | - Still goes on.
00:54:03.060 | - You're basically creating a thing
00:54:05.120 | which enables science in the 21st century.
00:54:08.340 | - Right.
00:54:09.340 | - Maybe don't emphasize that so much
00:54:11.060 | for your tenure.
00:54:12.060 | - Right.
00:54:12.900 | (laughing)
00:54:13.720 | It illustrates some of the challenges.
00:54:14.860 | - Yes.
00:54:15.700 | - It does, and it's, people mean well,
00:54:18.940 | but we've gotten broken in a bunch of ways.
00:54:21.820 | - Certain things, programming,
00:54:23.660 | understanding the role of software engineering
00:54:25.420 | and programming in society is a little bit lacking.
00:54:27.860 | - Exactly.
00:54:28.700 | Now, I was in an electrical engineering position.
00:54:30.020 | - Right.
00:54:30.860 | That's even worse.
00:54:32.260 | There.
00:54:33.140 | - Yeah, it was very, they were very focused.
00:54:34.700 | And so, you know, good people, and I had a great time.
00:54:37.300 | I loved my time, I loved my teaching,
00:54:38.940 | I loved all the things I did there.
00:54:40.460 | The problem was, this split was happening
00:54:42.540 | in this community that I loved.
00:54:43.740 | - Yeah.
00:54:44.580 | - I saw people, and I went, oh my gosh,
00:54:45.460 | this is gonna be, this is not great.
00:54:47.780 | And so, I happened, you know, fate,
00:54:50.060 | I had a class I had signed up for.
00:54:52.740 | I was trying to build an MRI system,
00:54:54.860 | so I had a kind of a radio, instead of a radio,
00:54:58.300 | a digital radio class, it was a digital MRI class.
00:55:00.340 | - Mm-hmm.
00:55:01.820 | - And I had people sign up, two people signed up,
00:55:04.020 | then they dropped, and so I had nobody in this class.
00:55:06.660 | So, and I didn't have any other courses to teach,
00:55:08.820 | and I thought, oh, I've got some time,
00:55:10.940 | and I'll just write a merger of Numeric and Numarray.
00:55:14.820 | Like, I'll basically take the Numeric code base,
00:55:16.980 | add the features Numarray was adding,
00:55:19.240 | and then kind of come up with a single array library
00:55:21.180 | that everybody can use.
00:55:22.460 | So that's where NumPy came from, was my thinking,
00:55:25.500 | hey, I can do this, and who else is going to?
00:55:27.860 | 'Cause at that point, I'd been around the community
00:55:29.260 | long enough, and I'd written enough C code,
00:55:30.820 | I knew the structures.
00:55:33.260 | In fact, my first contribution to Numeric
00:55:35.060 | had been writing the C API documentation
00:55:38.580 | that went in the first documentation for NumPy,
00:55:41.100 | for Numeric, sorry, this is Paul Dubois,
00:55:43.020 | David Ascher, Konrad Hinsen, and myself.
00:55:45.100 | I got credit 'cause I wrote this chapter,
00:55:47.580 | which is all the C API of Numeric, all the C stuff.
00:55:51.240 | So I said, I'm probably the one to do it,
00:55:53.780 | nobody else is gonna do this.
00:55:54.780 | So it was sort of, out of a sense of duty and passion,
00:55:58.340 | knowing that, I don't think my academic,
00:56:01.460 | I don't think the department here is gonna appreciate this,
00:56:03.940 | but it's the right thing to do.
00:56:05.900 | - Can we just linger on that moment?
00:56:08.660 | - Yeah. - Because the importance
00:56:10.780 | of the way you thought and the action you took,
00:56:13.660 | I feel is understated and is rare,
00:56:17.960 | and I would love to see so much more of it,
00:56:19.900 | because what happens as the tools become more popular,
00:56:23.680 | there's a split that happens.
00:56:27.180 | And it's a truly heroic and impactful action
00:56:30.940 | to in that early split to step up,
00:56:34.780 | and it's like great leaders throughout history,
00:56:37.820 | like, what is it, Braveheart,
00:56:39.660 | like get on a horse and rally the troops,
00:56:42.500 | because I think that can make a big difference.
00:56:46.060 | We have TensorFlow versus PyTorch
00:56:48.180 | in the machine learning community.
00:56:49.020 | - We have the same problem today.
00:56:50.380 | - Yeah, I wonder-- - It's actually bigger.
00:56:52.540 | - I wonder if it's possible in the early days
00:56:56.620 | to rally the troops.
00:56:58.220 | - It is possible, especially in the early days.
00:57:00.020 | The longer it goes, the harder, right?
00:57:01.620 | And the more energy in the factions, the harder.
00:57:03.980 | But in the early days, it is possible,
00:57:05.740 | and it's extremely helpful,
00:57:07.700 | and there's a willingness there,
00:57:09.140 | but the challenge is there's usually
00:57:11.220 | not a willingness to fund it.
00:57:13.020 | There's not a willingness to,
00:57:14.940 | like I was literally walking into a field,
00:57:17.580 | saying I'm gonna do this, and here I am,
00:57:20.220 | I have five kids at home now.
00:57:22.120 | (laughing)
00:57:23.780 | - Pressure builds.
00:57:24.860 | - Sometimes my wife hears these stories,
00:57:26.300 | and she's like, you did what?
00:57:28.140 | (laughing)
00:57:29.060 | I thought you were actually on a path
00:57:31.460 | to make sure we had resources and money.
00:57:33.420 | - Oh, wow.
00:57:34.260 | - But again, there's an aspect,
00:57:36.420 | I'm a very hopeful person.
00:57:37.860 | I'm an optimistic person by nature.
00:57:39.660 | I love people.
00:57:41.100 | I learned that about myself later on.
00:57:43.140 | Part of my, my religious beliefs actually lead to that,
00:57:48.380 | and it's why I hold them dear,
00:57:49.880 | because it's actually how I feel about,
00:57:51.300 | it's what leads me to these attitudes,
00:57:53.460 | sort of this hopefulness and this sense of,
00:57:55.900 | yeah, it may not work out for me financially,
00:57:58.620 | or maybe, but that's not the ultimate gain.
00:58:00.620 | Like, that's a thing, but it's not,
00:58:02.920 | you know, that's not the scorecard for me.
00:58:05.540 | And so I just wanted to be helpful,
00:58:07.060 | and I knew, and partly because these SciPy conferences,
00:58:09.300 | 'cause the mailing list conversations,
00:58:10.860 | I knew there was a lot of need for this, right?
00:58:13.300 | And so I had this, it wasn't like I was alone
00:58:15.500 | in terms of no feedback.
00:58:16.500 | I had these people who knew, but it was crazy.
00:58:19.480 | Like, people who, at the time, said,
00:58:20.780 | yeah, we didn't think you'd be able to do it.
00:58:22.340 | We thought it was crazy.
00:58:23.180 | - And also, instructive, like practically speaking,
00:58:26.760 | that you had a cool feature
00:58:28.740 | that you were chasing, the morphology, like the--
00:58:30.860 | - Yes.
00:58:31.700 | - Like, it's not just like-- - There's an end result.
00:58:33.500 | - It's not some visionary thing,
00:58:35.180 | I'm going to unite the community.
00:58:36.860 | You were like-- - Correct.
00:58:38.100 | - You were actually practically,
00:58:39.560 | this is what one person actually could do,
00:58:42.140 | and actually build.
00:58:42.980 | - 'Cause that is important,
00:58:44.260 | 'cause you can get over your skis.
00:58:46.340 | - Yeah.
00:58:47.500 | - You can definitely get over your skis.
00:58:49.100 | And I had, in fact, this almost got me over my skis, right?
00:58:52.200 | I would say, well, in retrospect, I hate looking back.
00:58:56.180 | I can tell you all the flaws with NumPy, right?
00:58:58.580 | When I go into it, there's lots of stuff
00:59:00.300 | that I'm like, oh man, that's embarrassing.
00:59:01.720 | That was wrong.
00:59:02.560 | I wish I had somebody to slap me with a wet fish there.
00:59:04.340 | Like, I needed, like, what I'd wished I'd had
00:59:07.060 | was somebody with more experience,
00:59:09.420 | and certainly library writing and array library.
00:59:11.860 | Like, I wish I had me.
00:59:12.820 | I could go back in time and go, do this, do that.
00:59:14.560 | Here's a Morton Bean.
00:59:15.500 | 'Cause there's things we did that are still there
00:59:18.140 | that are problematic, that created challenges for later.
00:59:20.960 | And I didn't know it at the time.
00:59:22.500 | Didn't understand how important that was.
00:59:24.460 | And in many cases, didn't know what to do.
00:59:26.500 | Like, there was pieces of the design of NumPy,
00:59:29.100 | I didn't know what to do until five years ago.
00:59:31.360 | Now I know what they should have been,
00:59:32.900 | but I didn't know at the time, and I couldn't get the help.
00:59:35.380 | Anyway, so I wrote it.
00:59:36.700 | It took about, it took four months to write the first version
00:59:39.580 | and about 14 months to make it usable.
00:59:42.580 | But it was that first four months of intense writing,
00:59:47.820 | coding, getting something out the door that worked.
00:59:50.620 | That was, it was definitely challenging.
00:59:52.420 | And then the big thing I did was create a new type object
00:59:54.940 | called dtype.
00:59:56.140 | That was probably the contribution.
00:59:58.820 | And then the fact that I added broad,
01:00:01.020 | not just broadcasting, but advanced indexing.
01:00:03.500 | So that you could do masked indexing and indirect indexing
01:00:08.460 | instead of just slicing.
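The features named here (the dtype object, broadcasting, and masked and indirect indexing alongside plain slicing) can be seen side by side in a short sketch using current NumPy. The array contents are arbitrary.

```python
import numpy as np

# The dtype object describes how each element is stored.
a = np.arange(12, dtype=np.dtype("int32")).reshape(3, 4)

# Broadcasting: a shape (4,) array is stretched across each row of a (3, 4) array.
row_offsets = np.array([10, 20, 30, 40], dtype=np.int32)
shifted = a + row_offsets

# Slicing returns a view onto the same memory.
first_two_rows = a[:2]

# Masked (boolean) indexing selects the elements where the mask is True.
evens = a[a % 2 == 0]

# Indirect (integer-array) indexing picks arbitrary positions: a[2, 1] and a[0, 3].
picked = a[[2, 0], [1, 3]]
```

Slices share memory with the original array, while masked and indirect indexing return new arrays. That difference is one of the things that separates "advanced" indexing from slicing.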
01:00:09.940 | - So for people who don't know, and maybe you can elaborate.
01:00:12.820 | So NumPy, I guess the vision in the narrowest sense
01:00:17.660 | is to have this object that represents
01:00:21.460 | n-dimensional arrays.
01:00:23.180 | And like at any level of abstraction you want,
01:00:26.300 | but basically it could be a black box
01:00:28.220 | that you can investigate in ways
01:00:30.060 | that you would naturally want to investigate such objects.
01:00:33.340 | - Yes, exactly.
01:00:34.180 | So you could do math on it easily.
01:00:35.740 | - Math on it easily, yeah.
01:00:37.180 | - So it had an associated library of math operations.
01:00:39.860 | And effectively SciPy became an even larger
01:00:43.020 | set of math operations.
01:00:44.940 | So the key for me was, I was gonna write NumPy
01:00:48.020 | and then move SciPy to depend on NumPy.
01:00:50.380 | In fact, early on, one of the initial proposals
01:00:52.980 | was that we would just write SciPy
01:00:54.580 | and it would have the Numeric object inside of it.
01:00:56.660 | And it'd be SciPy.array or something.
01:00:59.780 | That turned out to be problematic
01:01:01.420 | because Numeric already had a little mini library
01:01:03.980 | of linear algebra and some functions.
01:01:06.380 | And it had enough momentum, enough users
01:01:08.900 | that nobody wanted to, they wanted backward compatibility.
01:01:12.060 | One of the big challenges of NumPy
01:01:13.740 | was that it had to be backward compatible
01:01:15.000 | with both Numeric and Numarray
01:01:16.980 | in order to allow both of those communities
01:01:18.500 | to come together.
01:01:19.340 | There was a ton of work in creating
01:01:21.140 | that backward compatibility.
01:01:22.580 | That also created echoes in today's object.
01:01:25.420 | Like some of the complexity in today's object
01:01:27.180 | is actually from that goal of backward compatibility
01:01:30.100 | to these other communities.
01:01:31.380 | Which, if you didn't have that,
01:01:33.020 | you'd do something different.
01:01:34.620 | Which is instructive because a lot of things are there.
01:01:37.740 | You're thinking, what is that there for?
01:01:38.920 | It's like, well, it's a remnant.
01:01:41.340 | It's an artifact of its historical existence.
01:01:45.180 | - By the way, I love the empathy
01:01:46.780 | and the lack of ego behind that.
01:01:48.460 | 'Cause I feel, you see that in the split
01:01:51.380 | in the JavaScript frameworks, for example,
01:01:53.340 | the arbitrary branching.
01:01:54.820 | - Right.
01:01:55.660 | - Is, I think in order to unite people,
01:01:59.020 | you have to kind of put your ego aside
01:02:00.580 | and truly listen to others.
01:02:01.820 | Like, what do you love about Numarray?
01:02:04.820 | What do you love about Numeric?
01:02:06.060 | Like actually get a sense,
01:02:07.500 | we're talking about languages earlier,
01:02:08.900 | sort of empathize to the culture of the people
01:02:11.580 | that love something about this particular API.
01:02:14.700 | Some, the naming style or the,
01:02:18.980 | the actual usage patterns and truly understand them.
01:02:22.860 | And so that you can create that same draw
01:02:26.780 | in the united thing. - I completely agree.
01:02:29.420 | And you have to also have enough passion that you'll do it.
01:02:32.500 | It can't be just like a perfunctory,
01:02:34.700 | oh yeah, so I'll listen to you.
01:02:35.700 | I'll listen to you and then,
01:02:36.980 | I'm not really that excited about it.
01:02:38.420 | So it really is an aspect, it's a philosophical,
01:02:41.060 | like there's a philia, there's a love,
01:02:42.980 | an esteeming of others.
01:02:44.300 | It's actually at the heart of what,
01:02:47.100 | it's sort of a life philosophy for me, right?
01:02:49.260 | That I'm constantly pursuing and that helped,
01:02:51.580 | absolutely helped.
01:02:52.700 | - Makes me wonder in a philosophical,
01:02:54.300 | like looking at human civilization as one object,
01:02:57.500 | it makes me wonder how we can copy and paste
01:02:59.420 | Travis's in this book.
01:03:01.180 | - Well, in some aspects, maybe.
01:03:03.340 | - Some aspects, right, right, exactly.
01:03:05.260 | Well, I, it's like-- - It's a good question.
01:03:07.340 | How do we teach this?
01:03:08.180 | How do we encourage it?
01:03:09.340 | How do we lift it?
01:03:10.180 | - Because so much of the software world,
01:03:12.780 | it's giant communities, right?
01:03:15.200 | But it seems like so much is moved
01:03:16.660 | by like little individuals.
01:03:18.220 | You talk about like Linus Torvalds.
01:03:21.080 | It's like, can you, could you have not,
01:03:23.420 | could you have had Linux without him?
01:03:26.060 | Could you, it's like-- - Yeah, Guido and Python.
01:03:28.020 | - Guido and Python. - Guido and Python.
01:03:29.380 | I mean, the SciPy community particularly,
01:03:30.980 | it's like I said, we wanted to build this big thing,
01:03:32.820 | but ultimately we didn't.
01:03:33.780 | What happened is we had mavericks and champions
01:03:36.060 | like John Hunter, who created Matplotlib.
01:03:37.780 | We had Fernando Perez, who created IPython.
01:03:40.060 | And so we sort of inspired each other,
01:03:42.300 | but in the credit, there's sort of a culture
01:03:43.980 | of this selfless, give the stewardship mentality
01:03:47.860 | as opposed to ownership mentality,
01:03:49.140 | but stewardship and community-focused,
01:03:54.140 | but intentional work.
01:03:56.620 | Like not waiting for everybody else to do the work,
01:03:58.920 | but you're doing it for the benefit of others
01:04:00.700 | and not worried about what you're gonna get.
01:04:02.900 | You're not worried about the credit,
01:04:04.860 | you're not worried about what you're gonna get,
01:04:05.880 | you're worried about, I later realized
01:04:07.580 | that I have to worry a little about credit,
01:04:09.000 | not because I want the credit,
01:04:10.300 | because I want people to understand
01:04:11.380 | what led to the results.
01:04:12.580 | It's not about me, it's I wanna understand
01:04:15.820 | this is what led to the result.
01:04:17.380 | And this is what had no impact on the result.
01:04:21.500 | Let's promote, just like you said,
01:04:23.420 | I wanna promote the attributes
01:04:24.660 | that help make us better off.
01:04:26.540 | How do we make more of Wes McKinney?
01:04:28.820 | Like Wes McKinney was critical to the success of Python
01:04:31.620 | because of his creation of pandas,
01:04:33.420 | which is the roots of that were all the way back
01:04:36.420 | in Numeric and Numarray and NumPy,
01:04:40.260 | where NumPy created an array of records.
01:04:43.180 | Wes started to use that almost like a data frame,
01:04:46.000 | except it's an array of records.
01:04:47.860 | And data frame, the challenge is,
01:04:49.780 | okay, if you wanna augment it at another column,
01:04:52.240 | you have to insert, you have to do all this memory movement
01:04:54.700 | to insert a column.
01:04:55.640 | Whereas data frames became,
01:04:57.180 | oh, I'm gonna have a loose collection of arrays.
01:05:00.480 | So it's a record of arrays that is a part of a data frame.
01:05:03.980 | And we thought about that back in the memory days,
01:05:05.560 | but Wes ended up doing the work to build it.
01:05:08.940 | And then also the operations that were relevant
01:05:11.300 | for data processing.
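The contrast between an array of records and a record of arrays can be sketched roughly as follows. This is a minimal illustration using NumPy structured arrays and a plain dictionary of columns; the field names (`id`, `price`, `qty`) are hypothetical, and the dictionary only stands in for the idea of a data frame, not for how pandas is actually implemented.

```python
import numpy as np

# Array of records: one contiguous block, each element holds every field.
records = np.array(
    [(1, 2.5), (2, 3.5), (3, 4.5)],
    dtype=[("id", "i4"), ("price", "f8")],
)

# Adding a column means building a wider record layout and copying
# every existing field into the new block of memory.
wider = np.empty(records.shape, dtype=records.dtype.descr + [("qty", "i4")])
for name in records.dtype.names:
    wider[name] = records[name]
wider["qty"] = [10, 20, 30]

# Record of arrays: each column is its own array, so adding a column
# is one more entry in the mapping and existing columns are untouched.
columns = {name: records[name].copy() for name in records.dtype.names}
columns["qty"] = np.array([10, 20, 30], dtype="i4")
```

In the first layout each record grows from 12 to 16 bytes, so all rows have to be rewritten; in the second, the existing columns never move.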
01:05:12.620 | What I noticed is just that each of these little things
01:05:15.220 | creates just another tick, another up.
01:05:17.380 | So NumPy ultimately took a little while,
01:05:19.940 | about six months in, people started joining me.
01:05:22.700 | Francesc Alted, Robert Kern, Charles Harris.
01:05:27.300 | And these people are many of the unsung heroes, I would say.
01:05:30.300 | People who are, they sometimes don't get the credit
01:05:32.940 | they deserve because they were critical both to support,
01:05:36.540 | like, it's hard and you need some support,
01:05:39.060 | people need support.
01:05:40.340 | And I needed just encouragement.
01:05:41.600 | And they were helping, encouraged by contributing.
01:05:43.860 | And once, the big thing for me was when John Hunter,
01:05:47.260 | he had previously done kind of a simple thing
01:05:50.200 | called numerix to kind of, between Numeric and Numarray,
01:05:53.060 | he had a little high level tool
01:05:55.120 | that would just select each one for Matplotlib.
01:05:57.900 | In 2006, he finally said, we're gonna just make NumPy
01:06:01.420 | the dependency of Matplotlib.
01:06:03.220 | As soon as he did that,
01:06:04.420 | and I remember specifically when he did that,
01:06:06.060 | I said, okay, we've done it.
01:06:08.420 | That was when I knew we had succeeded, success.
01:06:11.220 | Before then, it was still, I didn't know, sure.
01:06:13.580 | But that kind of started a roller coaster.
01:06:15.020 | And then 2006 to 2009,
01:06:17.860 | and then I've been floored by what it's done.
01:06:21.380 | I knew it would help.
01:06:22.860 | I had no idea how much it would help.
01:06:24.740 | - And it has to do with, again, the language thing.
01:06:28.620 | It just, people started to think in terms of NumPy.
01:06:31.900 | - Yes.
01:06:33.020 | - And that opened up a whole new way of thinking.
01:06:36.460 | And part of the story that you kind of mentioned,
01:06:39.220 | but maybe you can elaborate,
01:06:42.980 | is it seems like at some point in this story,
01:06:46.340 | Python took over science and data science.
01:06:50.780 | - Yes.
01:06:51.620 | - And bigger than that,
01:06:54.820 | the scientific community started to think like programmers
01:07:00.140 | or started to utilize the tools of computers to do,
01:07:04.300 | like at a scale that wasn't done with Fortran.
01:07:06.660 | Like at this gigantic scale,
01:07:09.300 | they started to opening their heart.
01:07:10.740 | And then Python was the thing.
01:07:12.020 | I mean, there's a few other competitors, I guess,
01:07:14.260 | but Python, I think, really, really took over.
01:07:16.940 | - I agree.
01:07:17.780 | There's a lot of stories here
01:07:18.600 | that are kind of during this journey,
01:07:19.700 | because this is sort of the start of this journey in 2005, '06.
01:07:23.260 | So my tenure committee, I applied for tenure in 2006, 2007.
01:07:28.180 | It came back, I split the department.
01:07:29.780 | I was very polarizing.
01:07:31.300 | I had some huge fans,
01:07:32.580 | and then some people said, "No way."
01:07:34.140 | Right, so I was a polarizing figure in the department.
01:07:36.860 | It went all the way up to the university president.
01:07:39.780 | Ultimately, my department chair had the sway.
01:07:42.780 | And they didn't say no,
01:07:43.780 | they said, "Come back in two years and do it again."
01:07:46.340 | And I went, "Eh."
01:07:48.100 | At that point, I was like,
01:07:49.700 | I mean, I had this interest in entrepreneurship,
01:07:52.820 | this interest in not the academic circles,
01:07:56.380 | not the, like, how do we make industry work?
01:07:59.700 | So I do have to give credit to that exploration of economics
01:08:03.060 | because that led me, oh, I had a lot of opinions.
01:08:06.540 | I was actually very libertarian at the time.
01:08:09.500 | And I still have some libertarian trends,
01:08:11.820 | but I'm more of a,
01:08:13.300 | I'm more of a collectivist libertarian.
01:08:15.740 | - So you value broadly, philosophically, freedom.
01:08:18.660 | - I value broadly, philosophically, freedom,
01:08:20.300 | but I also understand the power of communities,
01:08:23.420 | like the power of collective behavior.
01:08:26.160 | And so what's that balance, right, that makes sense?
01:08:29.780 | So by the time I was just,
01:08:31.500 | I gotta go out and explore this entrepreneur world.
01:08:33.340 | So I left academia, I said, "No, thanks."
01:08:35.500 | Called my friend, Eric, here, who had,
01:08:38.620 | his company was going, I said,
01:08:39.720 | "Hey, could I join you and start this trend?"
01:08:43.060 | And he, at that time, they were using SciPy a lot,
01:08:45.900 | they were trying to get clients,
01:08:47.100 | and so I came down to Texas.
01:08:48.700 | And in Texas is where I sort of,
01:08:51.120 | it's my entrepreneur world, right?
01:08:53.380 | I left academia and went to entrepreneur world in 2007.
01:08:57.300 | So I moved here in 2007, kind of took a leap,
01:08:59.860 | knew nothing really about business,
01:09:01.540 | knew nothing about a lot of stuff there.
01:09:03.680 | There's, for a long time,
01:09:06.900 | I've kept some connections to a lot of academics,
01:09:08.920 | because I still value it,
01:09:10.020 | I still love the scientific tradition,
01:09:12.500 | I still value the essence and the soul and the heart
01:09:15.220 | of what is possible.
01:09:17.260 | Don't like a lot of the administration
01:09:21.340 | and the kind of, we can go into detail about why and where
01:09:24.540 | and how this happens, what are some of the challenges.
01:09:26.260 | - I mean, I don't know, but I'm with you.
01:09:28.460 | So I'm still affiliated with MIT,
01:09:31.820 | I still love MIT because there's magic there.
01:09:35.580 | There's people I talk to, like researchers, faculty,
01:09:40.340 | in those conversations and the whiteboard
01:09:42.700 | and just the conversation, that's magic there.
01:09:46.220 | All the other stuff, the administration,
01:09:48.140 | all that kind of stuff seems to,
01:09:51.020 | you don't wanna say too harshly criticize
01:09:54.900 | sort of bureaucracies, but there's a lag
01:09:57.700 | that seems to get in the way of the magic.
01:10:00.820 | And I still have a lot of hope that that can change
01:10:05.820 | because I don't often see that particular type of magic
01:10:10.780 | elsewhere in the industry.
01:10:12.820 | So we need that and we need that flame going.
01:10:15.780 | - I agree.
01:10:16.620 | - And it's the same thing as exactly as you said,
01:10:19.100 | it has the same kind of elements
01:10:20.540 | like the open source community does.
01:10:23.700 | But then if you, like the reason I stepped away,
01:10:27.180 | the reason I'm here just like you did in Austin
01:10:29.920 | is like if I wanna build one robot, I'll stay at MIT.
01:10:33.260 | But if I wanna build millions and make money
01:10:37.140 | enough to where I can explore the magic of that,
01:10:39.780 | then you can't.
01:10:40.980 | And I think that dance is--
01:10:44.140 | - That translational dance has been lost a bit.
01:10:46.740 | - Yeah.
01:10:47.580 | - Right, and there's a lot of reasons for that.
01:10:48.660 | I'm certainly not an expert on this stuff.
01:10:50.180 | I can opine like anybody else,
01:10:51.660 | but I realized that I wanted to explore entrepreneurship
01:10:55.820 | which I, and really figure out,
01:10:57.740 | and it's been a driving passion for 20 years, 25 years.
01:11:01.560 | How do we connect capital markets and company,
01:11:06.500 | 'cause again, I fell in love with the notion,
01:11:07.860 | oh, profit seeking on its own is not a bad thing.
01:11:11.140 | It's actually a coordination mechanism
01:11:13.520 | for allocating resources that, in an emergent way, right,
01:11:17.980 | that respects everybody's opinions, right?
01:11:20.740 | So this is actually powerful.
01:11:22.300 | So I say all the time when I make a company
01:11:25.300 | and we do something that makes a profit,
01:11:27.260 | what we're saying is, hey, we're collecting
01:11:28.820 | of the world's resources and voluntarily people
01:11:30.700 | are asking us to do something they like.
01:11:33.020 | And that's a huge deal.
01:11:34.020 | And so I really like that energy.
01:11:36.140 | So that's what I came to do and to learn
01:11:37.560 | and to try to figure out, and that's what I've been
01:11:39.180 | kind of stumbling through for the past 14 years.
01:11:41.100 | - And that's 2007.
01:11:42.540 | - 2007, yeah.
01:11:43.380 | - And so you were still--
01:11:44.980 | - So NumPy was just emerging, right?
01:11:47.860 | One of the things I had done, it's worth mentioning
01:11:49.980 | because it emphasizes the exploratory nature
01:11:52.500 | of my thinking at the time.
01:11:53.820 | I said, well, I don't know how to fund this thing.
01:11:55.240 | I've got a graduate student I'm paying for
01:11:56.700 | and I've got no funding for him.
01:11:57.860 | And I had done some fundraising from the public
01:12:00.500 | to try to get public fundraisers from my lab.
01:12:02.820 | I didn't really want to go out and just do
01:12:04.260 | the fundraising circuit the way it's traditionally done.
01:12:06.920 | So I wrote a book and I said, I'm gonna write a book
01:12:09.940 | and I'm gonna charge for it.
01:12:11.420 | It was called Guide to NumPy.
01:12:12.700 | And so ultimately NumPy became
01:12:14.020 | documentation-driven development
01:12:15.960 | because I basically wrote the book
01:12:17.260 | and made sure the stuff worked so the book would work.
01:12:19.740 | So it really helped actually make NumPy become a thing.
01:12:23.020 | So writing that book, and it was not a,
01:12:26.740 | I mean, it's not a page turner.
01:12:28.180 | Guide to NumPy is not a book you pick up and go,
01:12:29.900 | oh, this is great, over the fire.
01:12:31.520 | But it's where you could find the details.
01:12:33.660 | Like how'd all this work?
01:12:34.700 | - And a lot of people loved that book.
01:12:36.540 | - And so a lot of people ended up,
01:12:38.020 | so I said, look, I need to, so I'm gonna charge for it.
01:12:41.600 | And I got some flack for that.
01:12:42.740 | Not that much, just probably five angry messages,
01:12:45.900 | people yelling at me saying I was a bad guy
01:12:49.980 | for charging for this book.
01:12:51.380 | - Was one of them Richard Stallman?
01:12:53.300 | - No. - Just kidding.
01:12:54.140 | - No, I haven't really had any interaction
01:12:55.780 | with him personally, like I said.
01:12:57.780 | But there were a few, but actually surprisingly not.
01:13:01.300 | There was actually a lot of people like,
01:13:02.780 | no, it's fine, you can charge for a book.
01:13:04.280 | That's no big deal.
01:13:05.120 | We know that's a way you can try
01:13:06.660 | to make money around open source.
01:13:08.540 | So what I did, I did it in an interesting way.
01:13:10.180 | I said, well, kind of my ideas around IP law and stuff.
01:13:14.300 | I love the idea you can share something,
01:13:15.580 | you can spread it.
01:13:16.420 | Like once it's, the fact that you have a thing
01:13:18.300 | and copying is free, but the creation is not free.
01:13:21.660 | So how do you fund the creation and allow the copying?
01:13:25.400 | Right, and in software it's a little more complicated
01:13:26.820 | than that because creation is actually a continuous thing.
01:13:29.180 | You know, it's not like you build a widget and it's done.
01:13:31.220 | It's sort of a process of emerging
01:13:32.660 | and continuing to create.
01:13:34.580 | But I wrote the book and had this
01:13:35.980 | market-determined price thing.
01:13:37.540 | I said, look, I need, I think I said 250,000.
01:13:40.740 | If I make 250,000 from this book, I'll make it free.
01:13:44.300 | So as soon as I get that much money,
01:13:45.780 | or I said five years, right?
01:13:48.100 | So there's a time limit.
01:13:49.020 | - That's really cool. - Like forever.
01:13:50.020 | - I didn't know this story.
01:13:50.860 | - Yeah, so I released it on this.
01:13:53.140 | And it's actually interesting 'cause one of the people
01:13:55.860 | who also thought that was interesting ended up being
01:13:57.980 | Chris White, who was the director of DARPA project
01:14:01.380 | that we got funding through at Anaconda.
01:14:02.980 | And the reason he even called us back is 'cause he remembered
01:14:05.380 | my name from this book and he thought that was interesting.
01:14:08.140 | And so even though we hadn't gone to the demo days,
01:14:10.900 | we applied and the people said, yeah, nobody ever gets this
01:14:13.540 | without coming to the demo day first.
01:14:15.420 | This is the first time I've seen it.
01:14:16.340 | But it's because I knew Chris had done this,
01:14:18.940 | had this interaction.
01:14:19.780 | So it did have impact.
01:14:21.700 | I was actually really, really pleased by the result.
01:14:23.900 | I mean, I ended up in three years, I made 90,000.
01:14:27.380 | So sold 30,000 copies by myself.
01:14:29.500 | I just put it up on, used PayPal and sold it.
01:14:32.100 | And those are my first taste of kind of, okay,
01:14:36.060 | this can work to some degree.
01:14:37.620 | And all over the world, right?
01:14:40.380 | From Germany to Japan, it was actually, it did work.
01:14:44.500 | And so I appreciated the fact that PayPal existed
01:14:47.060 | and I had a way to make, to get the money.
01:14:49.140 | The distribution was simple.
01:14:51.180 | This is pre-Amazon book stuff.
01:14:53.460 | So it was just publishing a website.
01:14:55.300 | It was the popularity of SciPy emerging
01:14:57.100 | and getting company usage.
01:14:58.940 | I ended up not letting it go to five years
01:15:00.580 | and not trying to make the full amount because,
01:15:03.300 | you know, a year and a half later, I was at Enthought.
01:15:05.300 | I had left academia, was at Enthought
01:15:06.660 | and I kind of had a full-time job.
01:15:07.860 | And then actually what happened is the documentation people,
01:15:09.980 | there's a group that said, hey,
01:15:10.820 | we want to do documentation for SciPy as a collective.
01:15:14.220 | And they were essentially needing the stuff in the book.
01:15:18.580 | And so they kind of asked,
01:15:20.260 | hey, can we just use the stuff from your book?
01:15:21.820 | And at that point I said, yeah, I'll just open it up.
01:15:24.060 | So that's, but it has served its purpose.
01:15:27.180 | And the money that I made actually funded my grad student.
01:15:30.900 | Like it was actually, you know,
01:15:32.060 | I paid him 25,000 a year out of that money.
01:15:35.300 | - The funny thing is if you do a very similar
01:15:37.340 | kind of experiment now with NumPy or something like it,
01:15:40.580 | you could probably make a lot more.
01:15:42.380 | - It's probably true.
01:15:43.700 | - Because of the tooling and the community building.
01:15:46.340 | - Yeah, I agree.
01:15:47.180 | - Like the, and social media,
01:15:49.140 | there's just a virality to that kind of idea.
01:15:51.500 | - I agree, there'd be things to do.
01:15:52.660 | I've thought about that.
01:15:53.500 | And really I've thought about a couple of books
01:15:56.020 | or a couple of things that could be done there.
01:15:57.420 | And I just haven't, right?
01:15:58.900 | Even, I tried to hire a ghostwriter this year too,
01:16:01.860 | to see if that would help, but it didn't.
01:16:04.780 | But part of my problem is this,
01:16:06.180 | I've been so excited by a number of things
01:16:08.060 | that stemmed from that.
01:16:09.740 | So I came here, worked at Enthought for four years.
01:16:12.980 | Graciously, you know, Eric made me president
01:16:14.980 | and we started to work closely together.
01:16:16.260 | We actually helped him buy out his partner.
01:16:18.460 | It didn't end great.
01:16:20.700 | Like unfortunately Eric and I aren't real,
01:16:22.860 | aren't friends now.
01:16:24.540 | I still respect him.
01:16:25.380 | I have a lot, I wish we were,
01:16:26.620 | but he didn't like the fact that I,
01:16:29.700 | that Peter and I started Anaconda, right?
01:16:31.700 | That was not, I mean,
01:16:33.760 | so there's two sides to that story.
01:16:36.200 | So I'm not gonna go into it, right?
01:16:37.360 | - Sure, but you, as human beings
01:16:40.560 | and you wish you still could be friends.
01:16:42.320 | - I do, I do.
01:16:43.920 | It saddens me.
01:16:45.160 | - I mean, that's a story of great minds
01:16:49.040 | building great companies.
01:16:51.480 | Somehow it's sad that when there's that kind of.
01:16:55.000 | - And I hold him in esteem.
01:16:57.360 | I'm grateful for him.
01:16:58.200 | I think he's, they're doing,
01:16:59.040 | you know, Enthought still exists.
01:17:00.320 | They're doing great work helping scientists.
01:17:02.520 | They still run the SciPy conference.
01:17:05.040 | They're in the, they have an R&D platform
01:17:06.520 | they're selling now that's a tool
01:17:08.640 | that you can go get today, right?
01:17:10.080 | So they've been, Enthought has played a role
01:17:13.520 | in the SciPy, in supporting the community around SciPy.
01:17:17.560 | I would say.
01:17:18.400 | They ended up not being able to,
01:17:20.560 | they ended up building a tool suite
01:17:22.000 | to write GUI applications.
01:17:24.040 | Like that's where they could actually make
01:17:25.440 | that the business could work.
01:17:26.640 | And so the supporting SciPy and NumPy itself
01:17:29.440 | wasn't as possible.
01:17:30.560 | Like they didn't, they tried.
01:17:31.960 | I mean, it was not just because,
01:17:33.280 | it was just 'cause the business aspect.
01:17:34.520 | So, and then I wanted to build a company
01:17:36.240 | that could do, that could get venture funding, right?
01:17:39.080 | For better or worse.
01:17:39.920 | I mean, that's a longer story.
01:17:41.040 | We could talk a lot about that, but.
01:17:42.360 | - And that's where Anaconda came to be.
01:17:44.200 | - That's where Anaconda came to be.
01:17:45.040 | - So let me ask you, it's a little bit for fun
01:17:48.040 | because you built this amazing thing.
01:17:50.040 | And so let's talk about like an old warrior
01:17:54.640 | looking over old battles.
01:17:56.260 | (laughs)
01:17:58.320 | You know, there's a sad letter in 2012
01:18:01.480 | that you wrote to the NumPy mailing list
01:18:04.360 | announcing that you're leaving NumPy.
01:18:06.320 | And some of the things you've listed
01:18:08.560 | is some of the things you regret
01:18:10.720 | or not regret necessarily, but some things to think about.
01:18:14.440 | If you could go back and you could fix stuff about NumPy
01:18:17.640 | or both sort of on a personal level,
01:18:20.640 | but also like looking forward,
01:18:21.960 | what kind of things would you like to see changed?
01:18:24.560 | - Good question.
01:18:25.400 | So I think there's technical questions
01:18:26.360 | and social questions right there.
01:18:28.200 | First of all, you know, I wrote NumPy as a service
01:18:33.440 | and I spent a lot of time doing it
01:18:35.040 | and then other people came to help make it happen.
01:18:36.800 | NumPy succeeded because of the work of a lot of people, right?
01:18:39.880 | So it's important to understand that.
01:18:42.240 | I'm grateful for the opportunity, the role I could play
01:18:45.120 | and grateful that things I did had an impact,
01:18:47.620 | but they only had the impact they had
01:18:49.240 | because of the other people that came to the story.
01:18:52.240 | And so they were essential.
01:18:53.480 | But the way data types were handled,
01:18:55.760 | the way data types, we had array scalars, for example,
01:18:59.320 | that are really just a substitute for a type concept.
01:19:03.800 | Right, so we had array scalars that are actual Python objects
01:19:06.960 | so that there's for every, for a 32 bit float
01:19:09.480 | or a 16 bit float or a 16 bit integer,
01:19:13.120 | Python doesn't have a natural, it's just one integer,
01:19:15.960 | it's one float.
01:19:17.000 | Well, what about these lower precision types,
01:19:19.920 | these larger precision types?
01:19:21.560 | So we had them in NumPy
01:19:23.640 | so that you could have a collection of them,
01:19:25.300 | but then have an object in Python that was one of them.
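What an array scalar looks like in practice can be sketched as below. This is a small illustration using NumPy's public scalar types, not a description of how they are implemented internally.

```python
import numpy as np

arr = np.array([1.5, 2.5], dtype=np.float32)

# Indexing a single element returns an array scalar: a Python object
# that carries the same 32-bit precision as the array it came from.
x = arr[0]
is_float32 = type(x) is np.float32        # True
is_python_float = isinstance(x, float)    # False: not Python's built-in float

# Converting to the built-in type widens it to Python's single float type.
as_builtin = float(x)

# The same pattern holds for sized integers such as a 16-bit integer.
small = np.int16(7)
small_is_int = isinstance(small, int)     # False: not Python's built-in int
```

Each sized type gets its own scalar class, which is why a value taken out of an array keeps its precision instead of collapsing into Python's one float or one integer.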
01:19:28.740 | And there's questions about, like in retrospect,
01:19:31.400 | I wouldn't have created those
01:19:32.920 | if I'd improved the type system.
01:19:34.880 | And like made the type system actually a Python type system
01:19:38.020 | as opposed to currently, it's a Python one level type system.
01:19:41.400 | I don't know if you know the difference
01:19:42.240 | between Python one, Python two, it's kind of technical,
01:19:44.280 | kind of depth, but Python two,
01:19:45.880 | one of its big things that Guido did,
01:19:47.320 | it was really brilliant, it was the actually,
01:19:50.220 | Python one, all classes, new objects were one.
01:19:55.080 | So if a user wrote a class,
01:19:56.880 | it was an instance of a single Python type
01:19:59.600 | called the class type.
01:20:02.000 | In Python two, he used a meta typing hook
01:20:06.240 | to actually go, oh, we can extend this
01:20:07.960 | and have users write classes that are new types.
01:20:10.800 | So he was able to have your user classes be actual types
01:20:13.320 | and the Python type system got a lot more rich.
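The end result of that change is still visible in current Python, where a user-written class is itself a real type. A minimal sketch, with `Quaternion` as a purely hypothetical class name chosen for illustration:

```python
class Quaternion:
    """Hypothetical user-defined class, used only for illustration."""

    def __init__(self, w, x, y, z):
        self.w, self.x, self.y, self.z = w, x, y, z


q = Quaternion(1.0, 0.0, 0.0, 0.0)

# The class itself is an instance of the built-in metatype `type` ...
class_is_a_type = isinstance(Quaternion, type)
# ... and each object reports the user's class as its own type.
instance_type = type(q)
```

Here `type(q)` is `Quaternion` itself rather than one generic type shared by every user-defined class.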
01:20:16.480 | I barely understood that at the time that NumPy was written
01:20:19.160 | and so I essentially, in NumPy,
01:20:21.400 | created a type system that was Python one era.
01:20:24.420 | It was every dtype is an instance of the same type
01:20:29.240 | as opposed to having new dtypes be really just Python types
01:20:33.160 | with additional metadata.
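The design being described can be glimpsed from the public API. This is a hedged sketch: newer NumPy releases have reworked parts of the dtype internals, so only checks that should hold across versions are shown.

```python
import numpy as np

f4 = np.dtype("float32")
i2 = np.dtype("int16")

# Both descriptors are instances of the common np.dtype base ...
both_are_dtypes = isinstance(f4, np.dtype) and isinstance(i2, np.dtype)
# ... and neither descriptor is itself a Python class.
descriptor_is_class = isinstance(f4, type)

# The Python class lives separately, as the array scalar type.
scalar_class = f4.type
scalar_is_class = isinstance(scalar_class, type)
```

The data type descriptor and the Python class are two separate objects, which is the duplication that a dtype-as-Python-type design would have avoided.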
01:20:34.280 | - What's the cost of that?
01:20:35.440 | Is it efficiency, is it usability?
01:20:37.200 | - It's usability primarily.
01:20:38.840 | The cost isn't really efficiency,
01:20:40.320 | it's the fact that it's clumsy to create new types.
01:20:45.080 | It's hard, and then one of the challenges,
01:20:47.560 | you wanna create new types, you want a quaternion type
01:20:49.500 | or you wanna add a new posit type or you wanna,
01:20:54.200 | so it's hard.
01:20:55.040 | Now, if we had done that well,
01:20:59.200 | when Numba came on the scene,
01:21:00.440 | we could actually compile Python code,
01:21:02.880 | it would integrate with that type system much cleaner
01:21:05.160 | and now all of a sudden you could
01:21:06.860 | do gradual typing more easily.
01:21:08.520 | You could actually have Python when you add Numba
01:21:10.560 | plus better typing, could actually be a,
01:21:13.560 | you'd smooth out a lot of rough edges.
01:21:16.880 | - But there's already, there's like,
01:21:18.880 | but are you talking about from the perspective
01:21:20.960 | of developers within NumPy or users of NumPy?
01:21:23.880 | - Developers of new, not really users of NumPy so much,
01:21:27.120 | it's the development of NumPy.
01:21:28.720 | - So you're thinking about like how to design NumPy
01:21:32.200 | so that its contributors--
01:21:33.920 | - Yeah, the contributors, it's easier.
01:21:35.920 | - It's easier.
01:21:36.760 | - It's less work to make it better and to keep it maintained
01:21:39.360 | and where that's impacted things, for example, is the GPU.
01:21:43.440 | Like all of a sudden GPUs start getting added
01:21:45.560 | and we don't have them in NumPy.
01:21:48.400 | Like NumPy should just work on GPUs.
01:21:50.600 | The fact that we have to download a whole other object
01:21:52.720 | called Coupie to have arrays on GPUs
01:21:54.800 | is just an artifact of history.
01:21:57.440 | Like there's no fundamental reason for it.
01:21:59.200 | - Well, that's really interesting
01:22:00.200 | if we could sort of go on that tangent briefly
01:22:02.520 | is you have PyTorch and other libraries like TensorFlow
01:22:07.520 | that basically tried to mimic NumPy.
01:22:11.840 | Like you've created a sort of platonic form
01:22:15.720 | of what a multi-dimensional--
01:22:16.920 | - Yeah, exactly.
01:22:17.760 | Well, and the problem was they didn't realize that.
01:22:19.520 | - Yeah.
01:22:20.360 | - The platonic form has a lot of edges.
01:22:21.800 | They're like, "Well, we should cut those out
01:22:23.360 | before we present it."
01:22:24.200 | - So I wonder if you can comment,
01:22:26.960 | is there like a difference between their implementations?
01:22:29.400 | Do you wish that they were all using NumPy
01:22:31.440 | or like in this abstraction of GPU?
01:22:34.040 | And sorry to interrupt that there's GPUs, ASICs,
01:22:38.240 | there might be other neuromorphic computing,
01:22:40.080 | there might be other kind of,
01:22:41.600 | or the aliens will come with a new kind of computer,
01:22:43.960 | like an abstraction that NumPy should just operate nicely
01:22:47.880 | over the things that are more and more
01:22:50.280 | and smarter and smarter with these multi-dimensional arrays.
01:22:54.200 | - Yeah, yeah.
01:22:55.520 | There's several comments there.
01:22:56.920 | We are working on something now called data-apis.org,
01:23:00.400 | data-api.org, you can go there today.
01:23:02.600 | And it's our answer, it's my answer.
01:23:05.320 | It's not just me, it's me and Rolf and Athen and Aaron
01:23:09.160 | and a lot of companies are helping us at Quantsight Labs.
01:23:12.200 | It's not unifying all the arrays,
01:23:14.560 | it's creating an API that is unified.
01:23:17.200 | So we do care about this
01:23:19.400 | and we're trying to work through it.
01:23:21.320 | I actually had the chance to go and meet
01:23:22.600 | with the TensorFlow team and the PyTorch team
01:23:25.400 | and talk to them after exiting Anaconda.
01:23:29.160 | Just talking to them,
01:23:30.000 | 'cause the first year after leaving Anaconda in 2018,
01:23:34.000 | I became deeply aware of this and realized that,
01:23:36.040 | oh, this split in the array community that exists today
01:23:39.000 | makes what I was concerned about in 2005 pretty parochial.
01:23:43.060 | It's a lot worse, right?
01:23:45.920 | Now there's a lot more people,
01:23:47.360 | so perhaps the industry can sustain more stacks, right?
01:23:51.520 | There's a lot of money,
01:23:52.640 | but it makes it a lot less efficient.
01:23:54.440 | But I've also learned to appreciate,
01:23:56.800 | it's okay to have some competition,
01:23:58.520 | it's okay to have different implementations,
01:24:00.840 | but it's better if you can at least refactor some parts.
01:24:03.640 | I mean, you're gonna be more efficient
01:24:05.040 | if you can refactor parts.
01:24:07.080 | - It's nice to have competition over things,
01:24:09.640 | over what is nice to have competition.
01:24:11.880 | - They're innovative.
01:24:12.720 | - Yeah, innovative and then maybe on the infrastructure,
01:24:16.000 | whatever, however you define infrastructure,
01:24:18.240 | that maybe it's nice to have come together.
01:24:21.440 | - Exactly, I agree.
01:24:22.480 | And I think, but it was interesting to hear the stories.
01:24:24.680 | I mean, TensorFlow came out of a C++ library,
01:24:29.100 | Jeff Dean wrote, I think,
01:24:30.240 | that was basically how they were doing inference, right?
01:24:33.600 | And then they realized,
01:24:34.440 | oh, we could do this TensorFlow thing.
01:24:36.480 | That C++ library, then what was interesting to me
01:24:38.440 | was the fact that both Google and Facebook did not,
01:24:42.640 | it's not like they supported Python or NumPy initially,
01:24:45.000 | they just realized they had to.
01:24:47.240 | They came to this world and then all these were like,
01:24:48.880 | hey, where's the NumPy interface?
01:24:50.720 | Oh, and then they kind of came late to it
01:24:52.600 | and then they had these bolt-ons.
01:24:54.840 | TensorFlow's bolt-on, I don't mean to offend,
01:24:57.320 | but it was so bad.
01:24:58.520 | It's the first time that I, I'm usually,
01:25:01.800 | I mean, one of the challenges I have
01:25:04.200 | is I don't criticize enough,
01:25:05.800 | 'cause in the sense that I don't give people input enough.
01:25:09.000 | - I think it's universally agreed upon
01:25:11.720 | that the bolt-ons on TensorFlow.
01:25:13.680 | - When I went to, it was a talk given at Mallorca in Spain
01:25:17.080 | and a great guy came and gave a talk.
01:25:19.560 | I said, you should never show that API again
01:25:21.400 | at a PyData conference.
01:25:23.040 | Like that was, that's terrible.
01:25:24.840 | Like you're taking this beautiful system we've created
01:25:27.080 | and like you're corrupting all these poor Python people,
01:25:29.440 | forcing them to write code like that
01:25:30.840 | or thinking they should.
01:25:32.640 | Fortunately, they adopted Keras as their,
01:25:35.640 | and Keras is better.
01:25:36.760 | And so Keras TensorFlow is fine, is reasonable.
01:25:40.360 | But they bolted it on.
01:25:42.680 | Facebook did too.
01:25:43.640 | Like Facebook had their own C++ library for doing inference
01:25:48.160 | and they also had the same reaction, they had to do this.
01:25:51.160 | One big difference is Facebook,
01:25:52.840 | maybe because the way it's situated in part of FAIR,
01:25:55.240 | part of their research library,
01:25:56.600 | TensorFlow is definitely used and they have to make,
01:25:59.760 | they couldn't just open it up
01:26:00.680 | and let the community change what that is
01:26:03.160 | 'cause I guess they were worried
01:26:04.640 | about disrupting their operations.
01:26:06.880 | Facebook's been much more open to having community input
01:26:10.680 | on the structure itself.
01:26:12.360 | Whereas Google and TensorFlow,
01:26:14.200 | they're really eager to have community users.
01:26:15.960 | People use it and build the infrastructure,
01:26:17.480 | but it's much more walled.
01:26:18.800 | Like it's harder to become a contributor to TensorFlow.
01:26:21.600 | - And it's also, this is a very difficult question to answer
01:26:24.760 | and I don't mean to be throwing shade at anybody,
01:26:27.080 | but you have to wonder, it's the Microsoft question,
01:26:30.320 | of when you have a tool like PyTorch or TensorFlow,
01:26:33.920 | how much are you tending to the hackers
01:26:36.280 | and how much are you tending to the big corporate clients?
01:26:39.200 | - Correct.
01:26:40.040 | - And so like the ones that,
01:26:42.560 | do you tend to the millions of people
01:26:44.160 | that are giving you almost no money
01:26:46.440 | or do you tend to the few that are giving you a ton of money?
01:26:50.320 | I tend to stand with the people.
01:26:54.000 | - Right.
01:26:54.840 | - 'Cause I feel like if you nurture the hackers,
01:26:57.760 | you will make the right decisions in the long term
01:27:00.200 | that will make the companies happy.
01:27:02.000 | - I lean that way too.
01:27:03.280 | I totally agree.
01:27:04.120 | - But then you have to find the right dance.
01:27:05.680 | - But it's a balance.
01:27:07.080 | 'Cause you can lean to the hackers and run out of money.
01:27:08.960 | - Yeah, exactly.
01:27:10.280 | Exactly.
01:27:11.480 | Which has been some of the challenge I've faced.
01:27:13.560 | - Yes.
01:27:14.400 | - In the sense that,
01:27:15.760 | I would look at some of the experiments like NumPy,
01:27:18.000 | the fact that we have the split is a factor
01:27:20.000 | of I wasn't able to collect more money
01:27:21.800 | towards NumPy development.
01:27:22.880 | - Yeah.
01:27:23.720 | - Right, I mean, I didn't succeed in the early days
01:27:26.560 | of getting enough financial contribution to NumPy
01:27:29.640 | so that I could work on it.
01:27:31.160 | I couldn't work on it full time.
01:27:32.480 | I had to just catch an hour here, an hour there.
01:27:35.720 | And I basically did not like that.
01:27:37.960 | Like I've wanted to be able to do something about that
01:27:40.000 | for a long time and trying to figure out
01:27:41.360 | well there's lots of ways.
01:27:42.880 | I mean, possibly one could say,
01:27:44.560 | we had an offer from Microsoft early days of Anaconda.
01:27:48.080 | 2014 they offered to come buy us, right?
01:27:50.240 | The problem was the right people at Microsoft
01:27:52.640 | didn't offer to buy us.
01:27:53.480 | And they were still,
01:27:54.800 | it was really, we were like a second,
01:27:57.960 | they had really bought, they just bought R,
01:27:59.600 | the R company called, it was not RStudio
01:28:02.720 | but it was another R company that was emergent.
01:28:05.600 | And it was kind of a,
01:28:07.080 | well, we should also get a Python play.
01:28:09.280 | But they were really doubling down on R, right?
01:28:11.680 | And so it was like--
01:28:12.520 | - It was where you would go to die.
01:28:14.280 | So it's not, it wasn't, it was before Satya was there.
01:28:17.080 | - Satya had just started.
01:28:18.560 | - Just started.
01:28:19.400 | - Right, and the offer was coming from someone
01:28:21.680 | two levels down from him.
01:28:23.000 | - Gotcha.
01:28:23.840 | - Right, and if it had come from Scott Guthrie,
01:28:26.520 | so I got a chance to meet Scott Guthrie,
01:28:28.200 | great guy, I like him.
01:28:29.680 | If the offer had come from him,
01:28:31.440 | we'd probably be at Microsoft right now.
01:28:33.080 | - That'd be fascinating.
01:28:34.400 | That would be really nice actually,
01:28:36.040 | especially given what Microsoft has since done
01:28:38.640 | for the open source community and all those things.
01:28:40.080 | - Yes, I think they're doing well.
01:28:41.520 | I really like some of the stuff they've been doing.
01:28:43.640 | They're still working, and they've hired Guido now,
01:28:46.360 | and they've hired a lot of Python developers.
01:28:48.120 | - Wait, Guido's now at Microsoft?
01:28:49.440 | - Yeah, he works at Microsoft.
01:28:52.400 | Which means he retired, then he came out of retirement,
01:28:54.600 | and he's working on--
01:28:55.440 | - I was just talking to him
01:28:56.280 | and he didn't mention this part.
01:28:57.720 | - Well.
01:28:58.560 | - I should investigate this further.
01:29:01.200 | 'Cause I know he loved Dropbox,
01:29:02.520 | but I wasn't sure what he was doing,
01:29:03.920 | what he was up to.
01:29:05.040 | - Well, he was kind of saying he would retire,
01:29:06.440 | and it's literally been five years
01:29:09.640 | since I last sat down and really talked to Guido.
01:29:12.080 | Guido's a technology expert.
01:29:16.040 | He's a, so I came, I was excited
01:29:18.200 | because I'd finally figured out the type system for NumPy.
01:29:20.720 | I wanted to kind of talk about that with him,
01:29:22.280 | and I kind of overwhelmed him.
01:29:24.000 | - Could you stay in that, just for a brief moment,
01:29:26.640 | 'cause you're a fascinating person
01:29:28.240 | in the history of programming.
01:29:29.400 | He is a fascinating person.
01:29:31.280 | What have you learned from Guido about programming,
01:29:36.560 | about life?
01:29:37.600 | - Yeah, yeah, a lot, actually.
01:29:39.200 | I've been a fan of Guido's.
01:29:41.280 | We have a chance to talk.
01:29:42.560 | Some, I wouldn't say, we talk all the time,
01:29:44.840 | not really at all.
01:29:45.680 | He may, but we talk enough to, I respect his,
01:29:48.880 | like when I first started NumPy,
01:29:49.720 | one of the first things I did was I had,
01:29:51.320 | I asked Guido for a meeting with him
01:29:53.720 | and Paul Dubois in San Mateo.
01:29:55.440 | And I went and met him for lunch.
01:29:56.960 | And basically to say, maybe we can actually,
01:29:59.240 | part of the strategy for NumPy
01:30:00.760 | was to get it into Python 3,
01:30:02.480 | and maybe be part of Python.
01:30:04.120 | And so we talked about that.
01:30:04.960 | - That's a cool conversation.
01:30:06.080 | - And about that approach, right?
01:30:06.920 | - I would have loved to be a fly on the wall.
01:30:09.200 | - That was good.
01:30:10.040 | And over the years from Guido, I learned,
01:30:12.720 | so he was open.
01:30:14.840 | Like he was willing to listen to people's ideas, right?
01:30:18.720 | And over the years.
01:30:19.720 | Now generally, I'm not saying universally that's been true,
01:30:22.600 | but generally that's been true.
01:30:24.360 | So he's willing to listen.
01:30:25.680 | He's willing to defer.
01:30:27.240 | Like on the scientific side, he would just kind of defer.
01:30:29.080 | He didn't really always understand what we were doing.
01:30:31.760 | And he'd defer.
01:30:32.840 | One place where he didn't enough
01:30:35.640 | was we missed a matrix multiply operator.
01:30:37.720 | Like that finally got added to Python,
01:30:39.640 | but about 10 years later than it should have.
01:30:42.240 | But the reason was because nobody,
01:30:44.800 | it takes a lot of effort.
01:30:46.240 | And I learned this while I was writing NumPy.
01:30:48.200 | I also wrote tools to, I became a Python dev,
01:30:50.200 | and I added some pieces of Python.
01:30:52.360 | Like the memoryview object.
01:30:53.440 | I wanted the structure of NumPy into Python.
01:30:55.720 | So we didn't get NumPy into Python,
01:30:57.000 | but we got the basic structure of it into Python.
01:30:59.800 | So you could build on it.
01:31:01.040 | Nobody did for a while,
01:31:01.920 | but eventually database authors started to.
01:31:04.760 | And it's a lot better, they did.
01:31:06.160 | And also Antoine Pitrou and Stefan Krah
01:31:09.000 | actually fixed the memoryview object.
01:31:10.800 | 'Cause I wrote the underlying infrastructure in C,
01:31:13.320 | but the Python exposure was terrible
01:31:15.560 | until they came in and fixed it.
01:31:16.680 | Partly because I was writing NumPy,
01:31:18.120 | and NumPy was the Python exposure.
01:31:20.000 | I didn't really care about
01:31:21.240 | if you didn't have NumPy installed.
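The memoryview object mentioned here can be tried without NumPy installed. The following is a minimal sketch using only the standard library; the variable names are illustrative and not taken from the conversation. It shows a view sharing memory with its source, a view describing typed data, and a flat buffer reinterpreted with a 2-D shape.

```python
from array import array

# A memoryview shares memory with its source instead of copying it.
raw = bytearray(b"abcdef")
window = memoryview(raw)[2:4]   # zero-copy slice over bytes 2 and 3
window[0] = ord("X")            # writes through to the underlying bytearray

# A view also describes the layout of typed data (format, item size, shape).
samples = array("d", [1.0, 2.0, 3.0])
view = memoryview(samples)
print(view.format, view.itemsize, view.shape)

# The same flat buffer can be reinterpreted with a multi-dimensional shape.
grid = memoryview(bytearray(range(6))).cast("B", shape=[2, 3])
print(grid.tolist())
```

The shape, format, and item-size metadata is the kind of array structure that other libraries can build on without depending on NumPy itself.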
01:31:22.840 | Anyway, Guido: open to ideas,
01:31:25.400 | technologically brilliant.
01:31:27.280 | Like really, I really got a lot of respect for him
01:31:29.480 | when I saw what he did with this type class merger thing.
01:31:33.400 | It was actually tricky.
01:31:34.640 | And then willing to share.
01:31:37.480 | Willing to share his ideas.
01:31:38.480 | So the other thing, early on in 1998,
01:31:40.280 | I said I wrote my first extension module.
01:31:42.320 | The reason I could is 'cause he'd written this blog post
01:31:44.880 | on how to do reference counting.
01:31:46.480 | And without it, I would have been lost.
01:31:50.120 | But he was willing to at least try to write this post.
01:31:53.320 | And so he's been motivated early on with Python.
01:31:56.160 | It was like computer science for everybody.
01:31:58.240 | He kind of had this early on desire to,
01:31:59.920 | oh, maybe we should be pushing programming to more people.
01:32:02.080 | So he had this populist notion, I guess, or populist sense.
01:32:06.520 | I learned that there's a certain skill,
01:32:08.720 | and I've seen it in other people too,
01:32:10.600 | of engaging with contributors sufficiently to,
01:32:14.000 | 'cause when somebody engages with you
01:32:15.680 | and wants to contribute to you,
01:32:16.520 | if you ignore them, they go away.
01:32:18.440 | So building that early contributor base
01:32:19.800 | requires real engagement with other people.
01:32:23.360 | And he would do that.
01:32:24.560 | - Can you also comment on this tragic stepping down
01:32:29.120 | from his position as the benevolent dictator for life
01:32:32.920 | over the wars?
01:32:34.440 | - The walrus operator?
01:32:36.600 | - The walrus operator was the last battle.
01:32:39.240 | I don't know if that's the cause of it,
01:32:40.920 | but there's this, for people who don't know,
01:32:43.680 | you can look up, there's the walrus operator,
01:32:45.680 | which looks like a colon and equal sign.
01:32:49.600 | - Yeah, colon, equal sign.
01:32:50.840 | And it actually does maybe the thing
01:32:54.720 | that an equal sign should be doing.
01:32:57.600 | - Yeah, maybe, right, exactly.
01:33:00.280 | - But it's just historically,
01:33:02.080 | equal sign means something else.
01:33:03.600 | It just means assignment.
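For readers who have not seen it, the walrus operator (`:=`, an assignment expression added in Python 3.8) binds a name and yields the value inside a larger expression, whereas plain `=` is a statement. A small sketch follows; the sample line and variable names are made up for illustration.

```python
import re

line = "01:32:34.440 | - The walrus operator?"
pattern = r"(\d+):(\d+):(\d+)"

# Plain assignment is a statement, so binding and testing take two steps.
match = re.match(pattern, line)
if match:
    first_field = match.group(1)

# The walrus operator binds and tests in a single expression.
if (m := re.match(pattern, line)):
    hours, minutes, seconds = (int(g) for g in m.groups())

# It is also handy in loops that read until an empty value arrives.
chunks = []
source = iter(["ab", "cd", ""])
while (chunk := next(source)):
    chunks.append(chunk)
```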
01:33:05.280 | So he stepped down over this.
01:33:07.320 | What do you think about the pressure of leadership?
01:33:10.400 | - It's something that, you mentioned the letter I wrote
01:33:12.320 | in NumPy at the time.
01:33:13.680 | That was a hard time, actually.
01:33:15.280 | I mean, there's been really hard times.
01:33:17.120 | It was hard.
01:33:19.560 | You get criticized, right, and you get pushed,
01:33:21.680 | and you get, not everybody loves what you do.
01:33:23.840 | Like, any time you do anything that has impact at all,
01:33:26.920 | you're not universally loved, right?
01:33:28.600 | You get some real critics.
01:33:29.800 | And that's an important energy
01:33:32.000 | because it's impossible that you did everything right.
01:33:35.120 | You need people to be pushing.
01:33:37.200 | But sometimes people can get mean, right?
01:33:39.320 | People can, I prefer to give people the benefit of the doubt.
01:33:43.120 | I don't immediately assume they have bad intentions.
01:33:45.840 | And maybe for other, you know, maybe that doesn't happen
01:33:48.200 | for everybody, for whatever reason, their past,
01:33:50.240 | their experience with people, they sometimes have bad,
01:33:53.080 | so they immediately attribute to you bad intentions.
01:33:54.920 | You're like, where did this come from?
01:33:56.120 | I mean, I'm definitely open to criticism,
01:33:57.800 | but I think you're misinterpreting the whole point.
01:34:00.560 | 'Cause I would get that.
01:34:01.800 | You know, certainly when I started Anaconda,
01:34:03.680 | you know, I've been, sometimes I say to people,
01:34:07.160 | I know I'm, I care enough about entrepreneurship
01:34:09.840 | to make some open source people uncomfortable.
01:34:12.280 | And I care enough about open source
01:34:13.560 | to make investors uncomfortable.
01:34:15.600 | So I sort of, you know, create,
01:34:17.640 | you create kind of doubters on both sides.
01:34:19.920 | - So when you have, and this is just a plea
01:34:23.920 | to the listener and the public.
01:34:26.080 | I've noticed this too, that there's a tendency,
01:34:29.960 | and social media makes this worse,
01:34:31.840 | when you don't have perfect information about the situation,
01:34:35.600 | you tend to fill the gaps with the worst possible,
01:34:39.320 | or at least the bad story that fills those gaps.
01:34:43.120 | And I think it's good to live life,
01:34:47.000 | maybe not fully naively, but filling in the gaps
01:34:49.760 | with the good, with the best, with the positive,
01:34:54.760 | with the hopeful explanation of why you see this.
01:34:57.320 | So if you see somebody like you trying to make money
01:35:00.280 | on a book about NumPy, there's a million stories around that
01:35:04.080 | that are positive, and those are good to think about,
01:35:07.880 | to project positive intent on other people.
01:35:10.640 | Because for many reasons,
01:35:12.560 | usually because people are good
01:35:13.960 | and they do have good intent.
01:35:15.600 | And also when you project that positive intent,
01:35:17.500 | people will step up to that too.
01:35:19.440 | - Yes, it's a great point.
01:35:21.760 | - It has this kind of viral nature to it.
01:35:24.280 | And of course, what Twitter early on figured out,
01:35:27.760 | Facebook, is that they can make a lot of money
01:35:30.360 | in engagement from the negative.
01:35:32.280 | - Yes.
01:35:33.120 | - And so like, there's this, we're fighting this mechanism.
01:35:35.440 | - I agree. - Which is challenging.
01:35:36.560 | - It's like easier.
01:35:37.560 | - It's just easier to be. - To be negative.
01:35:39.560 | - And then for some reason, something in our minds
01:35:41.920 | really enjoys sharing that and getting all excited
01:35:45.280 | about the negativity.
01:35:46.280 | - We do, yeah.
01:35:47.120 | - But-- - Some protective mechanism
01:35:48.600 | perhaps that we're gonna get eaten if we don't.
01:35:50.880 | - Exactly, for us to be effective as a group of people
01:35:53.200 | in a software engineering project,
01:35:54.600 | you have to project positive intent, I think.
01:35:56.880 | - I totally agree, totally agree.
01:35:58.280 | And I think that's very,
01:35:59.400 | and so that happens in this space.
01:36:01.640 | But Python has done a reasonable job in the past,
01:36:03.840 | but here is a situation where I think
01:36:05.440 | it started to get this pressure where it didn't.
01:36:07.840 | I really didn't know enough about what happened.
01:36:10.440 | I've talked to several people about it,
01:36:12.160 | and I know most of the steering committee members today,
01:36:15.240 | one person nominated me for that role,
01:36:17.920 | but it's the wrong role for me right now, right?
01:36:20.880 | I have a lot of respect for the Python developer space
01:36:24.040 | and the Python developers.
01:36:25.460 | I also understand the gap between
01:36:26.920 | computer science Python developers
01:36:28.840 | and array programming developers or science developers.
01:36:31.440 | And in fact, Python succeeds in the array space
01:36:34.600 | the more it has people in that boundary.
01:36:36.560 | And there's often very few.
01:36:38.000 | Like I was playing a role in that boundary,
01:36:39.440 | and working like everything to try to keep up
01:36:42.600 | with even what Guido was saying.
01:36:45.520 | Like I'm a C programmer, but not a computer scientist.
01:36:49.080 | Like I was an engineer and physicist and mathematician,
01:36:52.600 | and I didn't always understand what they were talking about
01:36:56.400 | and why they would have opinions the way they did.
01:36:58.360 | So you have to listen and try to understand.
01:37:00.280 | Then you also have to explain your point of view
01:37:02.120 | in a way they can understand.
01:37:03.520 | And that takes a lot of work.
01:37:04.840 | And that communication is always the challenge.
01:37:07.900 | And it's just what we're describing here
01:37:09.200 | about the negativity is just another form of that.
01:37:11.520 | Like how do we come together?
01:37:12.560 | And it does appear we're wired anyway to at least have a,
01:37:15.880 | there's a part of us that will enemy, friend, enemy.
01:37:18.880 | And we see, yeah, it's like,
01:37:21.360 | why are we wiring on the enemy front?
01:37:22.960 | - Yeah.
01:37:23.800 | - So why are we pushing that?
01:37:24.760 | Why are we promoting that so deeply?
01:37:26.680 | - Assume friend until proven otherwise.
01:37:28.440 | - Yes, yes.
01:37:30.000 | - So 'cause you have such a fascinating mind
01:37:31.640 | and all of this, let me just ask you these questions.
01:37:34.120 | So one interesting side on the Python history
01:37:38.000 | is the move from Python 2 to Python 3.
01:37:41.000 | You mentioned the move from Python 1 to Python 2,
01:37:43.720 | but the move from Python 2 to Python 3
01:37:46.800 | is a little bit interesting
01:37:47.920 | because it took a very long time.
01:37:50.040 | It broke in quite a small way backward compatibility,
01:37:55.040 | but even that small way seemed to have been
01:37:56.960 | very painful for people.
01:37:58.640 | Are there lessons you draw--
01:38:00.000 | - Oh, man, tons of lessons.
01:38:01.480 | - From how long it took and how painful it seemed to be?
01:38:05.520 | - Yeah, tons of lessons.
01:38:07.000 | Well, I mentioned here earlier that NumPy was written in 2005.
01:38:11.840 | It was in 2005 that I actually went to Guido
01:38:15.520 | to talk about getting NumPy into Python 3.
01:38:17.240 | Like my strategy was to,
01:38:18.880 | oh, we were moving to Python 3, let's have that be,
01:38:20.920 | and it seems funny in retrospect
01:38:22.200 | because like, wait, Python 3, that was in 2020, right?
01:38:25.480 | When we finally ended support for Python 2,
01:38:27.760 | or at least 2017.
01:38:29.000 | The reason it took a long time,
01:38:30.880 | a lot of time, I think it was because,
01:38:32.720 | well, one of the things is there wasn't much
01:38:34.000 | to like about Python 3.
01:38:36.240 | 3.0, 3.1, it really wasn't until 3.3.
01:38:40.280 | Like I consider Python 3.3 to be Python 3.0.
01:38:43.600 | But it wasn't until Python 3.3
01:38:44.880 | that I felt there was enough stuff in it
01:38:47.200 | to make it worth anybody using it, right?
01:38:49.800 | And then 3.4 started to be, oh, yeah, I want that.
01:38:52.600 | And then 3.5 has the matrix multiply operator,
01:38:54.880 | and now it's like, okay, we gotta use that.
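The matrix multiply operator referred to here is `@`, which Python 3.5 added by letting a class define `__matmul__`. NumPy arrays implement it; the sketch below uses a tiny, hypothetical `Matrix` class instead so that it runs with the standard library alone.

```python
class Matrix:
    """A tiny, hypothetical 2-D matrix type used only to illustrate `@`."""

    def __init__(self, rows):
        self.rows = [list(row) for row in rows]

    def __matmul__(self, other):
        # Python calls this method to evaluate `self @ other`.
        columns = list(zip(*other.rows))
        return Matrix(
            [[sum(a * b for a, b in zip(row, col)) for col in columns]
             for row in self.rows]
        )


a = Matrix([[1, 2], [3, 4]])
b = Matrix([[5, 6], [7, 8]])
product = a @ b
print(product.rows)
```

Before `@` existed, the same operation had to be spelled as a function or method call, which reads poorly in longer linear algebra expressions.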
01:38:56.520 | - Plus the libraries that started leveraging
01:38:58.400 | some of the features of Python 3.
01:38:59.640 | - Exactly.
01:39:00.760 | So it really, the challenge was, it was,
01:39:03.800 | but it also illustrated a truism
01:39:05.680 | that when you have inertia,
01:39:08.200 | when you have a group of people using something,
01:39:10.480 | it's really hard to move them away from it.
01:39:11.920 | You can't just change the world on them.
01:39:13.880 | And Python 3, it made some,
01:39:15.440 | I think it fixed some things Guido had always hated.
01:39:17.200 | I think he didn't like the fact that print was a statement.
01:39:19.440 | He wanted to make it a function.
01:39:20.760 | But in some sense, that's a bit of gratuitous change
01:39:23.200 | to the language.
01:39:24.120 | And you could argue, and people have.
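As a small illustration of the print change discussed here: in Python 3, `print` is an ordinary function, so it accepts keyword arguments and can be passed around like any other callable. This is only a sketch; the strings and names are arbitrary.

```python
import io

buffer = io.StringIO()

# Keyword arguments control the separator, the line ending, and the destination.
print("NumPy", "SciPy", "Numba", sep=", ", end="\n", file=buffer)

# Because print is an object, it can be aliased or handed to other code.
log = print
log("via alias", file=buffer)

captured = buffer.getvalue()
```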
01:39:27.320 | But one of the challenges was there weren't enough features
01:39:31.520 | and too many changes without features.
01:39:34.960 | And so that empathy for the end user
01:39:37.440 | as to why they would switch wasn't there.
01:39:40.480 | I think also it illustrated just the funding realities.
01:39:42.920 | Like Python wasn't funded.
01:39:45.040 | Like it was also a project
01:39:46.160 | with a bunch of volunteer labor, right?
01:39:48.280 | It had more people, so more volunteer labor,
01:39:50.240 | but it was still, it was funded in the sense
01:39:52.240 | that at least Guido had a job.
01:39:53.440 | And I've learned some of the behind the scenes
01:39:55.280 | on that now since, since talking to people
01:39:57.080 | who have lived through it.
01:39:57.920 | And maybe not on air, we can talk about some of that.
01:40:01.520 | But it's interesting to see, but Guido had a job,
01:40:03.640 | but his full-time job wasn't just work on Python.
01:40:07.080 | Like he had other things to do.
01:40:08.920 | - Just wild.
01:40:09.920 | - It is wild, isn't it?
01:40:10.920 | - It's wild how few people are funded.
01:40:13.360 | - Yes.
01:40:14.200 | - And how much impact they have.
01:40:15.240 | - Yes.
01:40:16.200 | - Maybe that's a feature and not a bug, I don't know.
01:40:17.920 | - Maybe, yes, exactly.
01:40:19.080 | At least early on, it's sort of, I know, yeah.
01:40:21.840 | - It's like Olympic athletes are often severely underfunded,
01:40:25.200 | but maybe that's what brings out the greatness.
01:40:27.920 | - Yes, correct, no, exactly.
01:40:29.680 | Maybe this is the essential part of it.
01:40:31.880 | 'Cause I do think about that in terms of,
01:40:33.680 | I currently have an incubator for open source startups.
01:40:36.200 | Like what I'm trying to do right now
01:40:37.640 | is create the environment I wished had existed
01:40:40.480 | when I was leaving academia with NumPy
01:40:42.880 | and trying to figure out what to do.
01:40:44.120 | I'm trying to create those opportunities and environments.
01:40:46.120 | So, and that's what drives me still,
01:40:49.320 | is how do I make the world easier
01:40:50.760 | for the open source entrepreneur?
01:40:52.600 | - So let me stay, I mean, I could probably stay
01:40:55.480 | at NumPy for a long time, but this is a fun question.
01:41:00.920 | So Andrej Karpathy leads the Tesla Autopilot team,
01:41:04.640 | and he's also one of the most like legit programmers I know.
01:41:09.640 | It's like he builds stuff from scratch a lot,
01:41:13.720 | and that's how he builds intuition
01:41:15.000 | about how a problem works.
01:41:16.160 | He just builds it from scratch, and I always love that.
01:41:18.280 | And the primary language he uses is Python
01:41:21.280 | for the intuition building.
01:41:23.000 | But he posted something on Twitter
01:41:27.160 | saying that they got a significant improvement
01:41:31.240 | on some aspect of their like data loading, I think,
01:41:35.600 | by switching away from np.sqrt.
01:41:39.800 | So NumPy's implementation of square root
01:41:42.120 | to math.sqrt.
01:41:43.280 | And then somebody else commented
01:41:44.480 | that you can get even a much greater improvement
01:41:48.080 | by using the vanilla Python square root,
01:41:51.680 | which is like-- - Power 0.5.
01:41:53.640 | - Power 0.5. - Yeah, yeah.
01:41:55.200 | - And it's fascinating to me.
01:41:56.520 | I just wanted to--
01:41:57.480 | So that was-- - Absolutely.
01:41:59.480 | - That was some shade throwing at some--
01:42:02.080 | - No, no, and yes, we're talking about.
01:42:04.640 | - It's a good way to ask the trade-off
01:42:08.040 | between usability and efficiency broadly in NumPy,
01:42:12.080 | but also on these like specific weird quirks
01:42:14.880 | of like a single function.
01:42:16.680 | - Yep, so on that point,
01:42:19.080 | if you use a NumPy math function on a scalar,
01:42:23.600 | it's gonna be slower than using a Python function
01:42:26.680 | on that scalar. - Yeah.
01:42:27.920 | - But because the math object in NumPy
01:42:32.160 | is more complicated, right?
01:42:33.760 | 'Cause you can also call that math object on an array.
01:42:36.720 | And so effectively, it goes through a similar machine.
01:42:39.520 | There aren't enough of the, which you would do,
01:42:42.160 | and you could do like checks and fast paths.
01:42:45.920 | So yeah, if you're basically doing a list,
01:42:48.760 | if you run over a list, in fact,
01:42:50.640 | for problems that are less than 1,000,
01:42:53.680 | even maybe 10,000 is probably the,
01:42:55.320 | if you're going more than 10,000,
01:42:56.880 | that's where you definitely need to be using arrays.
01:42:59.040 | But if you're less than that, and for reading,
01:43:01.160 | if you're doing a reading process,
01:43:02.720 | and essentially it's not compute bound, it's IO bound,
01:43:05.560 | and so you're really taking lists of 1,000 at a time
01:43:08.440 | and doing work on it, yeah, you could be faster
01:43:10.440 | just using Python, straight up Python.
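The scalar case described above can be measured directly. The sketch below times three ways of taking the square root of a single Python float; NumPy is treated as optional, and no particular timings are assumed because they depend on the machine and library versions.

```python
import math
import timeit

try:
    import numpy as np  # optional; the comparison still runs without it
except ImportError:
    np = None

x = 2.0

# Three spellings of "square root of one scalar".
candidates = {
    "math.sqrt(x)": lambda: math.sqrt(x),
    "x ** 0.5": lambda: x ** 0.5,
}
if np is not None:
    candidates["np.sqrt(x)"] = lambda: np.sqrt(x)

values = {name: float(fn()) for name, fn in candidates.items()}
timings = {name: timeit.timeit(fn, number=100_000) for name, fn in candidates.items()}

for name in candidates:
    print(f"{name:14s} value={values[name]:.6f}  time={timings[name]:.4f}s")
```

The comparison is only meaningful for scalars and short lists; on large arrays the per-call overhead is paid once for the whole array rather than once per element.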
01:43:12.720 | - See, but also-- - And then--
01:43:15.040 | - This is the, sorry to interrupt,
01:43:16.320 | but there's the fundamental questions
01:43:18.640 | when you look at the long arc of history,
01:43:21.200 | it's very possible that np.sqrt is much faster.
01:43:25.520 | - It could be.
01:43:26.360 | - So in terms of don't worry about it,
01:43:29.440 | it's the evils of over-optimization,
01:43:31.640 | or whatever, all the different quotes around that,
01:43:34.040 | is sometimes obsessing about this particular little quirk
01:43:39.040 | is not sufficient.
01:43:41.680 | - For somebody like, if you're trying to optimize your path,
01:43:45.200 | I mean, I agree, premature optimization
01:43:47.640 | creates all kinds of challenges, right,
01:43:49.320 | because now, but you may have to do it.
01:43:51.840 | - I believe the quote is, it's the root of all evil.
01:43:53.880 | - It's the root of all evil, right.
01:43:55.560 | Let's give Donald Knuth, I think,
01:43:57.040 | or we should go to somebody else.
01:43:59.080 | - Well, Don Knuth is kind of like Mark Twain,
01:44:00.800 | people just attribute stuff to him, I don't--
01:44:02.840 | - And it's fine, because he's brilliant, so.
01:44:05.040 | No, I was a LaTeX user myself,
01:44:07.640 | and so I have a lot of respect,
01:44:07.640 | and he did more than that, of course,
01:44:10.820 | but yeah, someone I really appreciate
01:44:14.120 | in the computer science space.
01:44:15.640 | Yeah, I don't, I think that's appropriate,
01:44:17.080 | there's a lot of little things like that,
01:44:18.320 | where people actually, if you understood it,
01:44:20.120 | you go, yeah, of course that's the case.
01:44:22.640 | And the other part, the other part I didn't mention,
01:44:25.040 | and Numba was a thing we wrote early on,
01:44:27.960 | and I was really excited by Numba,
01:44:29.040 | because it's something we wanted,
01:44:30.040 | it was a compiler for Python syntax,
01:44:32.120 | and I wanted it from the beginning of writing NumPy,
01:44:35.440 | because of this function question,
01:44:38.280 | like, taking, the power of arrays
01:44:41.880 | is really that you can write functions using all of it.
01:44:45.120 | It has implicit looping, right,
01:44:47.000 | so you don't worry about,
01:44:47.840 | I write this N-dimensional for loop with four loops,
01:44:50.400 | four for statements, you just say,
01:44:51.960 | oh, big four-dimensional array,
01:44:53.560 | I'm gonna do this operation, this plus, this minus,
01:44:55.720 | this reduction, and you get this,
01:44:57.640 | it's called vectorization in other areas,
01:44:59.520 | but you can basically think at a high level
01:45:01.400 | and get massive amounts of computation done,
01:45:03.600 | with the added benefit of, oh, it can be parallelized,
01:45:07.560 | easily, it can be put in parallel,
01:45:09.000 | you don't have to think about that.
01:45:09.960 | In fact, it's worse to go decompose your,
01:45:12.720 | you write the for loops,
01:45:14.120 | and then try to infer parallelism from for loops.
01:45:16.240 | So that's actually a harder problem,
01:45:17.560 | than to take the array problem,
01:45:19.600 | and just automatically parallelize that problem.
01:45:22.000 | That's what, and so functions in NumPy
01:45:25.280 | are called universal functions, ufuncs,
01:45:27.040 | so square root is an example of a ufunc,
01:45:28.960 | there are others, sine, cosine, add, subtract,
01:45:32.360 | in fact, one of the first libraries to SciPy
01:45:34.480 | was something called special,
01:45:35.480 | where I added Bessel functions,
01:45:36.880 | and all these special functions that come up in physics,
01:45:40.200 | and I added them as ufuncs, so they could work on arrays.
01:45:43.000 | So I understood ufuncs very, very well
01:45:44.680 | from day one inside of Numeric,
01:45:45.920 | that was one of the things we tried to make better in NumPy
01:45:47.760 | was how do they work, can they do broadcasting,
01:45:50.360 | what does broadcasting mean?
01:45:51.960 | But one of the problems is,
01:45:53.880 | okay, what do I do with a Python scalar?
01:45:56.520 | So what happens, the Python scalar gets broadcast
01:45:59.800 | to a zero dimensional array,
01:46:01.320 | and then it goes through the whole same machinery
01:46:02.800 | as if it were a 10,000 dimensional array,
01:46:05.040 | and then it kind of unpacks the element,
01:46:07.640 | and then does the addition.
01:46:09.000 | That's not to mention the function it calls,
01:46:12.580 | in the case of square root,
01:46:13.640 | is just the C library square root.
01:46:15.960 | In some cases, like Python's power,
01:46:18.160 | there's some optimizations they're doing
01:46:20.360 | that can be faster than just calling the C library square root.
01:46:23.760 | - In the interpreter, or in the--
01:46:25.320 | - No, in the C code, in the Python runtime.
01:46:27.640 | - In the Python runtime, so they really optimize it,
01:46:30.960 | and they have the freedom to do that,
01:46:32.120 | 'cause they don't have to worry about--
01:46:32.960 | - 'Cause it's just a scalar.
01:46:34.080 | - It's just a scalar.
01:46:35.200 | - They don't have to worry about the fact that,
01:46:36.320 | oh, this could be an object with many pieces.
01:46:39.360 | The ufunc machinery is also generic,
01:46:41.060 | in the sense that typecasting and broadcasting,
01:46:44.600 | broadcasting's idea of, I'm gonna go,
01:46:46.160 | I have a zero dimensional array,
01:46:47.360 | I have a scalar with a four dimensional array,
01:46:49.240 | and I add them.
01:46:50.480 | Oh, I have to kind of coerce the shape of this guy
01:46:54.640 | to make it work against the whole four dimensional array.
01:46:56.880 | So it's the idea of, I can do a one dimensional array
01:46:59.680 | against a two dimensional array, and have it make sense.
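The shape coercion described here follows a simple rule: align the two shapes from the trailing dimension, and treat a missing or length-1 dimension as stretchable. The function below is a pure-Python sketch of that rule as NumPy documents it; `broadcast_shape` is a hypothetical helper written for illustration, not a NumPy function.

```python
from itertools import zip_longest


def broadcast_shape(shape_a, shape_b):
    """Return the shape that results from broadcasting two shapes together."""
    result = []
    # Walk both shapes from the last dimension; missing dimensions count as 1.
    for dim_a, dim_b in zip_longest(reversed(shape_a), reversed(shape_b), fillvalue=1):
        if dim_a == dim_b or dim_b == 1:
            result.append(dim_a)
        elif dim_a == 1:
            result.append(dim_b)
        else:
            raise ValueError(f"shapes {shape_a} and {shape_b} do not broadcast")
    return tuple(reversed(result))


# A scalar (shape ()) against a four-dimensional array.
print(broadcast_shape((), (2, 3, 4, 5)))
# A one-dimensional array against a two-dimensional array.
print(broadcast_shape((3,), (2, 3)))
```

Only the shapes are computed here; in NumPy the stretched dimensions are not copied, they are iterated with a stride of zero.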
01:47:02.200 | - Well, that's what NumPy does,
01:47:03.200 | is it challenges you to reformulate, rethink your problem,
01:47:07.040 | as a multi-dimensional array problem,
01:47:09.040 | versus move away from scalars completely.
01:47:12.640 | - Right, exactly.
01:47:13.480 | - Yeah.
01:47:14.320 | - Exactly, in fact, that's where some of the edge cases,
01:47:16.000 | boundaries are, is that, well, they're still there,
01:47:18.960 | and this is where array scalars are particular.
01:47:21.080 | So array scalars are particularly bad,
01:47:23.120 | in the sense that they were written
01:47:24.360 | so that you could optimize the math on them,
01:47:26.820 | but that hasn't happened.
01:47:28.300 | And so their default is to coerce the array scalar
01:47:32.800 | to a zero dimensional array,
01:47:33.760 | and then use the NumPy machinery.
01:47:36.000 | That's what, and you could specialize,
01:47:38.200 | but it doesn't happen all the time.
01:47:39.960 | So in fact, when we first wrote Numba,
01:47:41.760 | we do comparisons and say, look, it's a thousand X speed up.
01:47:44.760 | We're lying a little bit in the sense that,
01:47:47.160 | well, we first do the 40 X slowdown
01:47:50.240 | of using array scalars inside of a loop.
01:47:52.280 | 'Cause if you just used Python scalars,
01:47:53.560 | you'd already be 10 times faster.
01:47:56.200 | But then we would get a hundred times faster
01:47:58.080 | over that using just compilation.
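
To make the distinction between array scalars and Python scalars concrete, here is a small sketch. It only shows the types involved; the speed ratios quoted in the conversation are the speaker's and are not measured here, and the variable names are illustrative.

```python
import numpy as np

arr = np.array([1.5, 2.5, 3.5])

elem = arr[0]           # indexing a NumPy array returns an array scalar
py_val = elem.item()    # .item() converts it to a built-in Python float

print(type(elem).__name__)
print(type(py_val).__name__)

# Iterating over arr.tolist() yields plain Python floats, so the
# arithmetic in the loop body never touches the array-scalar type.
total = 0.0
for v in arr.tolist():
    total += v
print(total)
```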
01:48:00.320 | But what we do is compile the loop
01:48:01.600 | from out of the interpreter to machine code.
01:48:04.020 | And then that's always been the power of Python,
01:48:06.280 | is this extensibility so that you can,
01:48:08.280 | 'cause people say, oh, Python's so slow.
01:48:09.680 | Well, sure, if you do all your logic
01:48:11.520 | in the runtime of the Python interpreter, yeah.
01:48:13.920 | But the power is that you don't have to.
01:48:15.800 | You write all the logic,
01:48:17.260 | which you do in the high level is just high level logic.
01:48:19.860 | And the actual calls you're making
01:48:21.920 | could be on gigabyte arrays of data,
01:48:24.400 | and that's all done at compiled speeds.
01:48:26.880 | And the fact that integration is,
01:48:29.120 | one, can happen, but two, is separable.
01:48:32.400 | That's one of the,
01:48:34.040 | the language like Julia says, we're gonna be all in one.
01:48:36.400 | You can do all of it together.
01:48:37.400 | And then there's, the jury's out, is that possible?
01:48:39.880 | I tend to think that you're gonna,
01:48:41.760 | there's separate concerns there.
01:48:43.280 | You wanna precompile,
01:48:44.320 | and generally you will wanna precompile
01:48:46.520 | some of your loops.
01:48:48.380 | Like SciPy has a compilation step.
01:48:50.160 | To install SciPy, it takes about two hours.
01:48:53.240 | If you have many, many machines,
01:48:54.080 | maybe you can get it down to one hour.
01:48:55.440 | But to compile those libraries takes a while.
01:48:57.920 | You don't wanna do that at runtime.
01:48:59.920 | You don't wanna do that all the time.
01:49:00.800 | You wanna have this precompiled binary available
01:49:02.720 | that you're then just linking into.
01:49:04.400 | So there's real questions about the whole,
01:49:07.360 | source code, code is,
01:49:09.800 | running binary code is more than source code.
01:49:11.820 | It's creating object code, it's the linker,
01:49:13.840 | it's the loader, it's how that's interpreted
01:49:15.600 | inside of a virtual memory space.
01:49:17.640 | There's a lot of details there that actually,
01:49:19.160 | I didn't understand for a long time
01:49:20.520 | until I read books on the topic.
01:49:23.000 | And it led to, the more you know,
01:49:25.160 | the better off you are, and you can do more details.
01:49:28.440 | But sometimes it helps with abstractions too.
01:49:31.280 | - Well, the problem, as we mentioned earlier
01:49:33.480 | with abstractions is you kind of sometimes assume
01:49:36.760 | that whoever implemented this thing
01:49:41.520 | had your case in mind and found the optimal solution.
01:49:45.000 | - Yes.
01:49:45.840 | - Or like you assume certain things.
01:49:47.320 | I mean, there's a lot of,
01:49:48.160 | - Correct.
01:49:49.000 | - One of the really powerful things to me early on,
01:49:52.800 | I mean, it sounds silly to say, but with Python,
01:49:55.480 | probably one of the reasons I fell in love with it
01:49:58.440 | is dictionaries.
01:49:59.800 | - Yes.
01:50:00.920 | - So obviously probably most languages have some,
01:50:04.520 | - Mapping concept.
01:50:05.360 | - Some mapping concept,
01:50:06.480 | but it felt like it was a first-class citizen.
01:50:09.040 | And it was just my brain was able to think in dictionaries.
01:50:12.200 | But then there is the thing that I guess I still use
01:50:14.640 | to this day is ordered dictionaries,
01:50:16.920 | because that seems like a more natural way
01:50:20.120 | to construct dictionaries.
01:50:21.680 | - Yeah.
01:50:22.520 | - And from a computer science perspective,
01:50:23.760 | the running time cost is not that significant.
01:50:26.040 | There's a lot of things to understand about dictionaries
01:50:30.400 | that the abstraction kind of doesn't necessarily
01:50:34.840 | incentivize you to understand.
01:50:37.400 | - Right, do you really understand the notion of a hash map
01:50:39.400 | and how that dictionary is implemented?
01:50:41.080 | But you're right, dictionaries are a good example
01:50:43.480 | of an abstraction that's powerful.
01:50:44.960 | And I agree with you, I love dictionaries too.
01:50:47.840 | Took me a while to understand that once you do,
01:50:49.160 | you realize, oh, they're everywhere.
01:50:50.320 | And Python uses them everywhere too.
01:50:52.800 | Like it's actually constructed,
01:50:54.120 | that one of the foundational things is dictionaries,
01:50:55.800 | and it does everything with dictionaries.
01:50:57.000 | - Yeah.
01:50:57.840 | - So it is, it's powerful.
01:50:58.680 | Ordered dictionaries came later,
01:51:00.200 | but it is very, very powerful.
01:51:02.240 | It took me a little while coming from
01:51:04.160 | just the array programming entirely
01:51:06.000 | to understand these other objects,
01:51:07.400 | like dictionaries and lists and tuples and binary trees.
01:51:11.640 | Like I said, I wasn't a computer scientist,
01:51:13.040 | but I studied arrays first.
01:51:15.160 | And so I was very array-centric.
01:51:16.840 | And you realize, oh, these others
01:51:18.000 | do have purposes and value actually.
01:51:20.160 | I agree.
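
As a brief sketch of the dictionary and ordered-dictionary behavior discussed above: since Python 3.7 the built-in `dict` preserves insertion order, while `collections.OrderedDict` adds order-aware operations and comparisons. The word list is made up for illustration.

```python
from collections import OrderedDict

# A plain dict used as a counter; keys come back in insertion order.
counts = {}
for word in ["numpy", "scipy", "numpy", "conda"]:
    counts[word] = counts.get(word, 0) + 1
print(list(counts))

# OrderedDict supports reordering and treats order as part of equality.
ordered = OrderedDict(counts)
ordered.move_to_end("numpy")
print(list(ordered))

same_items_as_dict = dict(ordered) == counts            # order ignored
same_as_ordered = OrderedDict(counts) == ordered        # order compared
print(same_items_as_dict, same_as_ordered)
```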
01:51:22.080 | - There's a friendliness about,
01:51:24.340 | like one way to think about arrays is,
01:51:28.280 | arrays are just like full of numbers,
01:51:31.920 | but to make them accessible to humans
01:51:35.000 | and make them less error-prone to human users,
01:51:38.680 | sometimes you want to attach names,
01:51:41.480 | human interpretable names that are sticky to those arrays.
01:51:44.680 | So that's how you start to think about dictionaries.
01:51:47.480 | - Yeah, that's a good point.
01:51:48.520 | - You start to convert numbers
01:51:50.520 | into something that's human interpretable.
01:51:52.120 | And that's actually the tension I've had with NumPy
01:51:55.320 | because I've built so much tooling around
01:51:58.800 | human interpretability and also protecting me
01:52:04.540 | from a year later not making the mistakes
01:52:07.200 | by being, I wanted to force myself
01:52:09.240 | to use English versus numbers.
01:52:12.920 | - Yes, so there's a project called Labeled Arrays.
01:52:15.680 | Like very early it was recognized that,
01:52:18.040 | oh, we need, we're indexing NumPy with just numbers,
01:52:21.320 | all the columns and particularly the dimensions.
01:52:23.660 | I mean, if you have an image,
01:52:25.520 | you don't necessarily need to label each column or row,
01:52:27.680 | but if you have a lot of images
01:52:29.160 | or you have another dimension,
01:52:30.440 | you'd at least like to label the dimension
01:52:31.680 | as this is X, this is Y, this is Z,
01:52:33.120 | or this is, give us some human meaning
01:52:34.640 | or some domain-specific meaning.
01:52:36.760 | That was one of the impetuses for pandas actually,
01:52:39.680 | was just, oh, we do need to label these things.
01:52:43.040 | And Labeled Array was an attempt to add that,
01:52:45.800 | like a lighter weight version of that.
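
The motivation for labeling dimensions can be sketched with plain NumPy plus a tuple of names. This is a hypothetical helper written for illustration only; `reduce_over` and the dimension names are assumptions, not the API of Labeled Arrays, pandas, or NumPy.

```python
import numpy as np

# Hypothetical data: 2 images, each with 3 rows and 4 columns.
stack = np.arange(24).reshape(2, 3, 4)
dims = ("image", "y", "x")   # illustrative labels kept alongside the array

def reduce_over(arr, dims, name, func=np.sum):
    """Apply func along the axis labeled `name`; return result and remaining labels."""
    axis = dims.index(name)
    remaining = tuple(d for d in dims if d != name)
    return func(arr, axis=axis), remaining

# Refer to the axis by meaning ("image") rather than by position (0).
summed, summed_dims = reduce_over(stack, dims, "image")
print(summed.shape, summed_dims)
```

Referring to an axis by name rather than by number is the kind of protection against later mistakes that the conversation describes.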
01:52:47.680 | And there's been, like that's an example of something
01:52:49.360 | I think NumPy could add, could be added to NumPy.
01:52:53.080 | But one of the challenges again, how do you fund this?
01:52:55.000 | Like I said, one of the tragedies I think is that,
01:52:58.280 | so I never had the chance to,
01:53:00.240 | I was never paid to work on NumPy, right?
01:53:02.380 | So I've always just done it in my spare time,
01:53:04.400 | always taken from one thing,
01:53:05.880 | taken from another thing to do it.
01:53:07.920 | And at the time, I mean, today,
01:53:09.760 | it would be the wrong time today,
01:53:11.000 | like paying me to work on NumPy now
01:53:12.160 | would not be a good use of effort.
01:53:13.480 | But we are finally at Quansight Labs,
01:53:16.640 | I'm actually paying people to work on NumPy and SciPy,
01:53:19.480 | which is I'm thrilled with, I'm excited by.
01:53:21.980 | I've wanted to do that.
01:53:22.820 | That's what I always wanted to do from day one,
01:53:24.280 | it just took me a while to figure out a mechanism to do that.
01:53:27.640 | - Even like in the university setting, respecting that,
01:53:31.600 | like pushing students, young minds,
01:53:34.560 | the young graduate students to contribute
01:53:38.000 | and then figuring out financial mechanisms
01:53:41.160 | that enable them to contribute
01:53:43.280 | and then sort of reward them
01:53:45.280 | for their innovative scientific journey,
01:53:48.000 | that would be nice.
01:53:49.180 | But then also there's just a better allocation of resources.
01:53:52.480 | It's 20 year anniversary since 9/11,
01:53:55.740 | and I was just looking,
01:53:57.020 | we spent over $6 trillion in the Middle East after 9/11
01:54:02.020 | in the various efforts there.
01:54:04.580 | And sort of to put politics and all that aside,
01:54:08.060 | it's just, you think about the education system,
01:54:10.120 | all the other ways we could have possibly allocated
01:54:12.280 | that money.
01:54:13.120 | To me, to take it back,
01:54:18.320 | the amount of impact you would have
01:54:21.220 | by allocating a little bit of money to the programmers
01:54:26.220 | that build the tools that run the world, it's fascinating.
01:54:30.140 | I mean-- - It is.
01:54:31.480 | - I don't know, I think, again,
01:54:34.980 | there is some aspect to being broke
01:54:38.060 | as somewhat of a feature, not a bug,
01:54:40.260 | that you make sure that your--
01:54:42.100 | - You still manage that.
01:54:43.460 | - Right, no, I know.
01:54:45.340 | But I don't think that's a big part.
01:54:47.060 | So it's like, I think you can have enough money
01:54:50.760 | and actually be wealthy while maintaining your values.
01:54:53.920 | - Agreed, agreed.
01:54:55.560 | There's an old adage that nations that trade together
01:54:57.840 | don't go to war together.
01:54:59.480 | I've often thought about nations that code together.
01:55:01.760 | - Yeah, code together.
01:55:02.600 | (laughing)
01:55:03.880 | - Because one of the things I love about open source
01:55:05.400 | is it's global, it's multinational.
01:55:08.040 | There aren't national boundaries.
01:55:09.200 | One of the challenges with business and open source
01:55:10.800 | is the fact that, well, business is national.
01:55:13.000 | Businesses are entities that are recognized
01:55:14.600 | in legal jurisdictions, right,
01:55:16.280 | and have laws that are respected in those jurisdictions,
01:55:18.320 | and hiring, and yet the open source ecosystem
01:55:21.400 | is not, it's not there.
01:55:23.080 | Like, currently, one of the problems we're solving
01:55:25.120 | is hiring people all over the world, right,
01:55:27.240 | 'cause we, it's a global effort.
01:55:29.640 | And I've had the chance to work,
01:55:30.720 | and I've loved the chance.
01:55:31.960 | I've never been to, like, Iran,
01:55:35.300 | but I once had a conference where I was able
01:55:37.080 | to talk to people there, right,
01:55:38.700 | and talk to folks in Pakistan.
01:55:40.960 | Never been there, but we had a call,
01:55:44.080 | and there were people there,
01:55:45.320 | like, just scientists and normal people,
01:55:47.600 | and, you know, and it's, there's a certain amount
01:55:50.520 | of humanizing, right, that gets away from the,
01:55:54.360 | like, we often get the memes of society
01:55:56.200 | that bubble up and get discussed,
01:55:58.560 | but the memes are not even an accurate reflection
01:56:00.760 | of the reality of what people are.
01:56:02.440 | - Well, if you look at the major power centers
01:56:05.440 | that are leading to something like cyber war
01:56:08.240 | in the next few decades, it's the United States,
01:56:11.400 | it's Russia, and China.
01:56:13.360 | And those three countries in particular
01:56:16.120 | have incredible developers.
01:56:18.280 | So if they work together, I think that's one way,
01:56:21.360 | the politicians can do their stupid bickering,
01:56:23.380 | but, like, there's a layer of infrastructure, of humanity,
01:56:27.380 | if they collaborate together,
01:56:29.440 | that I think can prevent major military conflict,
01:56:34.100 | which would, I think, most likely happen at the cyber level
01:56:37.880 | versus the actual hot war level.
01:56:39.820 | - You're right.
01:56:40.660 | No, I think that's a good prediction.
01:56:43.320 | Nations that code together don't go to war together.
01:56:46.560 | - Don't go to war together.
01:56:47.400 | That's a hope, right?
01:56:48.620 | That's one of the philosophical hopes, but yeah.
01:56:51.140 | - So you mentioned the project of Numba,
01:56:55.640 | which is fascinating.
01:56:58.520 | So from the early days, there was kind of a pushback
01:57:01.560 | on Python that it's not fast.
01:57:04.560 | You know, you see C,
01:57:05.560 | if you wanna write something that's fast,
01:57:06.920 | you use C, C++.
01:57:08.240 | If you wanna write something that's usable and friendly,
01:57:11.320 | but slow, you use Python.
01:57:13.240 | And so what is Numba?
01:57:15.840 | What is its goal?
01:57:16.800 | How does it work?
01:57:17.620 | - Great, yeah.
01:57:18.460 | Yes, that was the argument.
01:57:19.760 | And the reality was people would write high-level code
01:57:22.320 | and use compiled code,
01:57:23.440 | but there's still user story, use cases,
01:57:25.240 | where you want to write Python,
01:57:27.440 | but then have it still be fast.
01:57:28.880 | You still need to write a for loop.
01:57:30.720 | Like before Numba, it was always don't write a for loop.
01:57:33.920 | You know, write it in a vectorized way,
01:57:35.800 | you know, put it in an array.
01:57:37.220 | And often that can make a memory trade-off.
01:57:39.600 | Like quite often you can do it,
01:57:41.040 | but then you make maybe use more memory
01:57:42.680 | because you have to build this array of data
01:57:44.880 | that you don't necessarily need all the time.
01:57:46.640 | So Numba was, it started from a desire
01:57:49.440 | to have kind of a vectorize that worked.
01:57:52.800 | Vectorize was a tool in NumPy, it was released.
01:57:56.240 | You give it a Python function
01:57:57.760 | and it gave you a universal function,
01:57:59.640 | a Ufunc that would work on arrays.
01:58:01.080 | So you get a function that just worked on a scalar.
01:58:03.600 | Like you could make a,
01:58:04.840 | like the classic case was a simple function
01:58:06.680 | that had an if then statement in it.
01:58:08.280 | So a sine X over X function, sinc function.
01:58:12.020 | If X equals zero, return one, otherwise do sine X over X.
01:58:15.040 | The challenge is you don't want that loop
01:58:17.720 | to happen in Python.
01:58:18.720 | So you want a compiled version of that.
01:58:21.480 | But the Ufunc, the vectorize in NumPy
01:58:23.120 | would just give you a Python function.
01:58:24.840 | So it'd take the array of numbers
01:58:26.720 | and at every call do a loop back into Python.
01:58:29.560 | So it was very slow.
01:58:30.440 | It gave you the appearance of a Ufunc, but it was very slow.
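
The sinc example described above can be sketched with `np.vectorize`. This is an illustrative reconstruction, not code from the conversation; the function names are assumptions. `np.vectorize` returns a callable that broadcasts over arrays like a ufunc, but the NumPy documentation notes that it is provided for convenience rather than performance, since it is essentially a Python-level loop.

```python
import math
import numpy as np

def sinc_scalar(x):
    """sin(x)/x for a single number, with the x == 0 case handled explicitly."""
    if x == 0:
        return 1.0
    return math.sin(x) / x

# Wrap the scalar function so it can be called on whole arrays.
# Each element still goes through a call back into Python.
sinc = np.vectorize(sinc_scalar, otypes=[float])

xs = np.array([0.0, math.pi / 2, math.pi])
ys = sinc(xs)
print(ys)
```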
01:58:32.840 | So I always wanted a vectorize
01:58:34.600 | that would take that Python scalar function
01:58:36.280 | and produce a Ufunc working on binary native code.
01:58:39.480 | So in fact, I had somebody work on that with PyPy
01:58:42.780 | and see if PyPy could be used to produce a Ufunc like that
01:58:45.620 | early on in 2009 or something like that, 2010.
01:58:50.580 | It didn't work that well.
01:58:51.460 | It was kind of pretty bulky.
01:58:52.860 | But in 2012, Peter and I had just started Anaconda.
01:58:57.000 | We had, I had just, I'd learned to raise money.
01:59:00.700 | That's a different topic, but I'd learned to, you know,
01:59:03.060 | raise money from friends, family, and fools, as they say.
01:59:05.980 | (laughing)
01:59:07.180 | - That's a good line.
01:59:09.000 | - So we were trying to do something.
01:59:10.720 | We were trying to change the world.
01:59:11.840 | Peter and I are super ambitious.
01:59:13.040 | We wanted to make array computing,
01:59:14.840 | and we had ideas for really what's still,
01:59:16.640 | what's still the energy right now.
01:59:17.840 | How do you do at scale data science?
01:59:20.600 | And we had a bunch of ideas there, but one of them,
01:59:23.000 | I had just talked to people about LLVM,
01:59:24.960 | and I was like, there's a way to do this.
01:59:27.240 | I just, I went, I heard about my friend Dave Beazley
01:59:30.040 | at a compiler course.
01:59:31.280 | So I was looking at compilers, like,
01:59:32.920 | and I realized, oh, this is what you do.
01:59:34.960 | And so I wrote a book about it.
01:59:36.040 | And I was like, oh, this is what you do.
01:59:37.660 | And so I wrote a version of Numba
01:59:40.060 | that just basically mapped Python bytecode to LLVM.
01:59:45.060 | - Nice.
01:59:46.500 | - Right, so, and the first version is like, this works,
01:59:49.220 | and it produces code that's fast.
01:59:50.860 | This is cool for, you know,
01:59:51.940 | obviously a reduced subset of Python.
01:59:53.420 | I didn't support all the Python language.
01:59:55.360 | There had been efforts to speed up Python in the past,
01:59:57.500 | but those efforts were, I would say,
01:59:59.220 | not from the array computing perspective,
02:00:00.820 | not from the perspective of wanting
02:00:01.820 | to produce a vectorized improvement.
02:00:03.580 | They were from a perspective of speeding up
02:00:05.140 | the runtime of Python, which is fundamentally hard,
02:00:07.520 | because Python allows for some constructs
02:00:10.520 | that aren't, you can't speed up.
02:00:12.160 | Like, it's generic, you know, when it does this variable.
02:00:15.540 | So I, from the start, did not try
02:00:17.160 | to replicate Python's semantics entirely.
02:00:20.280 | I said, I'm gonna take a subset of the Python syntax
02:00:23.000 | and let people write syntax in Python,
02:00:25.080 | but it's kind of a new language, really.
02:00:27.400 | - So it's almost like for loops, like focusing on for loops.
02:00:30.440 | - For loops, scalar arithmetic, you know, typed,
02:00:34.400 | you know, really typed language, a typed subset.
02:00:37.220 | That was the key.
02:00:39.360 | So, but we wanted to add inference of types.
02:00:41.880 | So you didn't have to spell all the types out,
02:00:43.400 | because when you call a function, so Python is typed.
02:00:47.000 | It's just dynamically typed.
02:00:48.040 | So you don't tell it what the types are,
02:00:49.380 | but when it runs, every time an object runs,
02:00:52.060 | there's a type for the variables.
02:00:53.360 | You know what it is.
02:00:54.560 | And so that was, the design goals of Numba
02:00:56.800 | were to make it possible to write functions
02:00:59.180 | that could be compiled and have them used for NumPy arrays.
02:01:03.440 | Like they needed to support NumPy arrays.
02:01:05.520 | - And so how does it work?
02:01:07.040 | Do you add a comment within Python that tells it to do,
02:01:10.200 | like how do you help out a compiler?
02:01:11.880 | - Yeah, so there isn't much actually.
02:01:15.840 | You don't, it's kind of magical in the sense
02:01:17.760 | that it just looks at the type of the objects
02:01:19.600 | and then does type inference to determine
02:01:21.320 | any of the other variables it needs.
02:01:23.440 | And then it was also, because we had a use case
02:01:26.080 | that could work early, like one of the challenges
02:01:29.040 | of any kind of new development is if you have something
02:01:31.520 | that to make it work, it was gonna take you
02:01:33.200 | a long time, it's really hard to get it off the ground.
02:01:35.960 | If you have a project where there's some incremental story,
02:01:39.200 | it can start working today and solve a problem,
02:01:42.280 | then you can start getting it out there, getting feedback.
02:01:44.640 | Because Numba today, now Numba is nine years old today,
02:01:47.640 | right, the first two, three versions were not great, right?
02:01:52.120 | But they solved a problem and so people could try it
02:01:54.080 | and we could get some feedback on it.
02:01:55.560 | - Not great in that it was very focused.
02:01:57.520 | - Very fragile, very, the subset,
02:02:00.600 | the subset it would actually compile was small.
02:02:03.040 | And so if you wrote Python code and said,
02:02:05.000 | so the way it worked is you write a function
02:02:06.880 | and you say @jit, use decorators.
02:02:09.000 | So decorators are just these little constructs
02:02:11.040 | that let you decorate code with an @ and then a name.
02:02:15.040 | The @jit would take your Python function
02:02:17.760 | and actually just compile it and replace the Python function
02:02:20.240 | with another function that interacts
02:02:23.200 | with this compiled function.
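
A minimal sketch of the decorator pattern described here, assuming Numba is installed. The `jit` decorator and its `nopython` option are part of Numba; the function `sinc_sum` and the stand-in decorator used when Numba is missing are assumptions made for this example.

```python
import math
import numpy as np

try:
    from numba import jit
except ImportError:
    # Stand-in so the sketch still runs (uncompiled) without Numba.
    def jit(**options):
        return lambda func: func

@jit(nopython=True)
def sinc_sum(xs):
    """Sum sin(x)/x over a 1-D float array using an explicit for loop."""
    total = 0.0
    for i in range(xs.shape[0]):
        x = xs[i]
        if x == 0.0:
            total += 1.0
        else:
            total += math.sin(x) / x
    return total

xs = np.array([0.0, 0.5, 1.0, 2.0])
result = sinc_sum(xs)   # with Numba, the first call triggers compilation
print(result)
```

No type annotations are given: with Numba, the argument types seen at the first call drive type inference for the rest of the function, which matches the design goal described in the conversation.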
02:02:24.920 | And it would just do that and we went from Python bytecode,
02:02:28.520 | then we went to AST, I mean, writing compilers
02:02:30.720 | actually, I learned a lot about why computer science
02:02:32.920 | is taught the way it is because compilers
02:02:35.520 | can be hard to write.
02:02:36.600 | They use tree structures, they use all the concepts
02:02:39.080 | of computer science that are needed.
02:02:40.480 | And it's actually hard to, it's easy to write a compiler
02:02:44.600 | and then have it be spaghetti code.
02:02:46.000 | Like the passes become challenging
02:02:47.600 | and we ended up with three versions of Numba, right?
02:02:49.940 | Numba got written three times.
02:02:51.540 | - What's, what programming language is Numba written in?
02:02:55.520 | - Python.
02:02:56.440 | - Wait, okay.
02:02:57.440 | - Yeah, Python.
02:02:58.600 | So. (laughs)
02:03:00.040 | - Really?
02:03:00.880 | - Yeah. - That's fascinating.
02:03:01.700 | - Yeah, so Python, but then the whole goal of Numba
02:03:03.520 | is to translate Python bytecode to LLVM.
02:03:07.480 | And so LLVM actually does the code generation.
02:03:09.400 | In fact, a lot of times I'd say,
02:03:10.760 | yeah, it's super easy to write a compiler
02:03:12.760 | if you're not writing the parser nor the code generator.
02:03:15.840 | Right?
02:03:16.680 | - So for people who don't know, LLVM is a compiler itself.
02:03:19.440 | So you're compiling to--
02:03:20.280 | - Yeah, it's really badly named low-level virtual machine,
02:03:22.680 | which that part of it is not used.
02:03:24.480 | It's really low-level.
02:03:25.320 | - Chris, he doesn't mean that.
02:03:26.140 | (laughs)
02:03:27.600 | - Yeah, love Chris.
02:03:29.280 | But the name makes you imply that the virtual machine
02:03:31.680 | is what it's all about.
02:03:32.520 | It's actually the IR and the library,
02:03:34.560 | the code generation.
02:03:36.040 | That's the real beauty of it.
02:03:37.720 | The fact that, what I love about LLVM
02:03:39.400 | was the fact that it was a platform
02:03:41.740 | you could collaborate on.
02:03:43.240 | Right?
02:03:44.080 | Instead of the internals of GCC
02:03:45.920 | or the internals of the Intel compiler,
02:03:47.520 | like how do I extend that?
02:03:49.160 | And it was a place where you could collaborate.
02:03:51.040 | And we were early, I mean, people had started before.
02:03:54.040 | It's a slow compiler.
02:03:55.280 | Like it's not a fast compiler.
02:03:56.880 | So for some kind of JITs,
02:03:59.520 | like JITs are common in a language
02:04:01.040 | because one, every browser has a JavaScript JIT.
02:04:04.760 | It does real-time compilation
02:04:06.560 | of the JavaScript to machine code.
02:04:09.080 | - For people who don't know,
02:04:09.920 | JIT is just-in-time compilation.
02:04:11.520 | - Thank you, yeah, just-in-time compilation.
02:04:13.240 | They're actually really sophisticated.
02:04:14.840 | In fact, I got jealous of how much effort
02:04:17.100 | was put into the JavaScript JITs.
02:04:18.600 | - Yes, well, it's kind of incredible what they've done.
02:04:20.800 | - Yes. - JavaScript JITs, yeah.
02:04:22.000 | - I completely agree.
02:04:22.840 | I'm very impressed.
02:04:24.760 | But Numba was an effort to make that happen with Python.
02:04:29.320 | And so we used some of the money
02:04:30.960 | we raised from Anaconda to do it.
02:04:32.440 | And then we also applied for this DARPA grant
02:04:34.800 | and used some of that money to continue the development.
02:04:36.800 | And then we used proceeds from service projects we would do.
02:04:40.680 | We get consulting projects
02:04:41.840 | on that we would then use some of the profits
02:04:44.480 | to invest in Numba.
02:04:45.400 | So we ended up with a team of two or three people
02:04:47.160 | working on Numba.
02:04:48.880 | It was fits and starts, right?
02:04:50.720 | And ultimately, the fact that we had a commercial version
02:04:53.560 | of it also we were writing.
02:04:54.720 | So part of the way I was trying to fund Numba
02:04:56.640 | is say, well, let's do the free Numba
02:04:58.560 | and then we'll have a commercial version of Numba
02:04:59.920 | called Numba Pro.
02:05:00.840 | And what Numba Pro did is it targeted GPUs.
02:05:03.240 | So we had the very first CUDA JIT
02:05:05.520 | and the very first @jit compiler that in 2012, 2013,
02:05:10.520 | you could run not just a ufunc on CPU,
02:05:14.120 | but a ufunc on GPUs.
02:05:15.640 | And it would automatically parallelize it
02:05:17.480 | and get 1000x speedup.
02:05:18.800 | - And that's an interesting funding mechanism
02:05:21.120 | because large companies or larger companies care about speed
02:05:26.120 | in just this way.
02:05:30.120 | So it's exactly a really good way.
02:05:33.160 | - Yeah, there's been a couple of things
02:05:34.240 | you know people will pay for.
02:05:35.200 | One, they'll pay for really good user interfaces, right?
02:05:37.960 | And so I'm always looking for what are the things
02:05:40.160 | people will pay for that you could actually adapt
02:05:41.720 | to the open source infrastructure?
02:05:43.240 | One is definitely user interfaces.
02:05:45.560 | The second is speed, like a better runtime, faster runtime.
02:05:49.120 | - And then when you say people,
02:05:50.000 | you mean like a small number of people pay a lot of money,
02:05:52.280 | but then there's also this other mechanism
02:05:54.200 | that a ton of people pay a little bit.
02:05:57.560 | First, I gotta, we mentioned Anaconda,
02:06:00.320 | we mentioned Friends, Family and Fools.
02:06:04.280 | So Anaconda is yet another.
02:06:06.800 | So there's a company, but there's also a project
02:06:09.720 | that is exceptionally impactful in terms of,
02:06:14.600 | for many reasons, but one of which is bringing
02:06:16.880 | a lot more people into the community
02:06:21.880 | of folks who use Python.
02:06:23.600 | So what is Anaconda?
02:06:26.880 | What is its goals?
02:06:28.920 | Maybe what is Conda versus Anaconda?
02:06:31.520 | - Yeah, I'll tell you a little bit of the history of that.
02:06:33.080 | 'Cause Anaconda, we wanted to do,
02:06:35.240 | we wanted to scale Python 'cause we, you know,
02:06:38.200 | that was the goal.
02:06:39.040 | Peter and I had the goal of when we started Anaconda,
02:06:40.680 | we actually started as Continuum Analytics
02:06:42.440 | was the name of the company that started.
02:06:43.960 | It got renamed Anaconda in 2015.
02:06:46.840 | But we said, we want to scale analytics.
02:06:49.720 | NumPy's great, Pandas is emerging,
02:06:52.520 | but these need to run at scale with lots of machines.
02:06:55.160 | The other thing we wanted to do was make user interfaces
02:06:57.760 | that were web.
02:06:59.200 | We wanted to make sure the web did not pass by
02:07:01.480 | the Python community, that we had a ways
02:07:03.480 | to translate your data science to the web.
02:07:05.840 | So those are the two kind of technical areas.
02:07:07.560 | We thought, oh, we'll build products in this space.
02:07:09.760 | And that was the idea.
02:07:12.360 | Very quickly in, but of course, the thing I knew how to do
02:07:14.520 | was to do consulting to make money
02:07:16.360 | and to make sure my family and friends
02:07:18.800 | and fools that had invested didn't lose their money.
02:07:21.560 | So it's a little different than if you take money
02:07:23.160 | from a venture fund.
02:07:24.240 | If you take money from a venture fund,
02:07:25.360 | the venture fund, they want you to go big or go home.
02:07:27.560 | They're kind of like expecting nine out of 10 to fail
02:07:30.120 | or 99 out of 100 to fail.
02:07:32.960 | It's different, I was in a barbell strategy.
02:07:35.360 | I was like, I can't fail.
02:07:37.120 | I mean, I may not do super well,
02:07:38.520 | but I cannot lose their money.
02:07:40.280 | So I'm gonna do something I know can return a profit,
02:07:43.440 | but I want to have exposure to an upside.
02:07:46.120 | So that's what happened at Anaconda.
02:07:48.160 | There was lots of things we did not well
02:07:50.160 | in terms of that structure.
02:07:51.120 | And I've learned from since and have it better.
02:07:53.560 | But we did a really good job
02:07:56.520 | of kind of attracting the interest around the area
02:07:58.960 | to get good people working
02:08:00.160 | and then funnel some money on some interesting projects.
02:08:02.920 | Super excited about what came out of our energy there.
02:08:05.040 | Like a lot did.
02:08:06.640 | - So what are some of the interesting projects?
02:08:08.120 | - So Dask, Numba, Bokeh, Conda.
02:08:11.960 | There was Datashader, Panel, HoloViz.
02:08:15.880 | These are all tools that are extremely relevant
02:08:19.040 | in terms of helping you build applications,
02:08:21.400 | build tools, build faster code.
02:08:24.280 | - So Bokeh is the plotting--
02:08:26.040 | - Oh, JupyterLab, JupyterLab came out of this too.
02:08:28.680 | - That's fascinating.
02:08:30.320 | Okay, so Bokeh does plotting?
02:08:33.160 | - Bokeh does plotting.
02:08:34.000 | So Bokeh was one of the foundational things to say,
02:08:35.880 | I want to do plot in Python,
02:08:37.400 | but have the thing show up in a web.
02:08:39.160 | - Right, that's right, that's right, that's right.
02:08:41.040 | And plotting to me still,
02:08:43.280 | with all due respect to Matplotlib and Bokeh,
02:08:46.480 | it feels like still an unsolved problem,
02:08:48.760 | not a solved problem. - Oh, it is.
02:08:50.240 | It is, it's a big problem.
02:08:52.160 | - Right, 'cause you're, I mean, I don't know.
02:08:55.640 | It's visualization broadly, right?
02:08:58.640 | - I think we've got a pretty good API story
02:09:00.960 | around certain use cases of plotting.
02:09:03.440 | But there's a difference between static plots
02:09:04.920 | versus interactive plots versus,
02:09:06.920 | I'm an end user, I just want to write a simple,
02:09:09.760 | pandas started the idea of here's a data frame,
02:09:12.040 | I'm gonna .plot.
02:09:12.880 | I'm just gonna attach plot as a method to my object,
02:09:16.400 | which was a little bit controversial, right?
02:09:18.280 | But works pretty well, actually,
02:09:20.200 | 'cause there's a lot less you have to pass in, right?
02:09:23.680 | You can just say, here's my object, you know what you are.
02:09:26.280 | You tell the visualization what to do.
02:09:29.000 | So that, and there's things like that
02:09:31.320 | that have not been super well-developed entirely.
02:09:33.720 | But Bokeh was focused on interactive plotting.
02:09:36.320 | So you could, it's a short path
02:09:38.400 | between interactive plotting and application,
02:09:41.080 | dashboard application.
02:09:42.680 | And there's some incredible work that got done there, right?
02:09:44.760 | And it was a hard project
02:09:45.800 | because then you're basically doing JavaScript and Python.
02:09:49.440 | So we wanted to tackle some of these hard problems
02:09:51.560 | and try to just go after them.
02:09:53.440 | We got some DARPA funding to help,
02:09:54.920 | and it was super helpful.
02:09:56.160 | It's a funny story there,
02:09:57.000 | we actually did two DARPA proposals,
02:09:58.320 | but one we were five minutes late for.
02:10:00.600 | And DARPA has a very strict cutoff window.
02:10:03.040 | And so we had two proposals, one for the Bokeh
02:10:05.560 | and one for actually Numba and the other work.
02:10:09.320 | - Which one were you late for?
02:10:10.920 | - The foundational numerical work.
02:10:12.920 | So Bokeh got funded. - Oh no.
02:10:14.880 | - Fortunately, Chris let us use some of the money to fund
02:10:17.120 | still some of the other foundational work.
02:10:19.280 | But it wasn't as, yeah, his hands were tied,
02:10:22.040 | he couldn't do anything about it.
02:10:23.880 | That was a whole interesting story.
02:10:25.840 | - So one of the incredible projects
02:10:27.720 | that you worked on is Conda.
02:10:29.200 | - Yes.
02:10:30.040 | - So what is Conda? - So how did that
02:10:30.880 | come about?
02:10:31.720 | Yeah, Conda, it was early on, like I said, with SciPy.
02:10:35.480 | SciPy was a distribution masquerading as a library.
02:10:37.880 | And you heard me talking about compiler issues
02:10:40.320 | and trying to get the stuff shipped
02:10:41.480 | and the fact that people can only use your libraries
02:10:43.320 | if they have them.
02:10:44.640 | So for a long time, we'd understood
02:10:46.000 | the packaging problem in Python.
02:10:47.800 | And one of the first things we did at Continuum Analytics,
02:10:50.680 | which became Anaconda, was organize the PyData ecosystem
02:10:54.200 | in conjunction with NumFOCUS.
02:10:56.160 | We actually started NumFOCUS
02:10:57.960 | with some other folks in the community
02:11:00.480 | the same year we started Anaconda.
02:11:02.840 | I said, we're gonna build a corporation,
02:11:04.200 | but we're also gonna reify the community aspect
02:11:07.040 | and build a nonprofit.
02:11:08.280 | So we did both of those.
02:11:09.400 | - Can we pause real quick and can you say
02:11:12.040 | what is PyPI, the Python Package Index,
02:11:14.720 | like this whole story of packaging in Python?
02:11:19.280 | - Yeah, that's what I'm gonna get to, actually.
02:11:20.880 | This is exactly the journey I'm on.
02:11:22.240 | It's to sort of explain packaging in Python.
02:11:24.200 | I think it's best expressed through the conversation
02:11:26.080 | I had with Guido at a conference, where I said,
02:11:28.520 | so, you know, packaging's kind of a problem.
02:11:31.280 | And Guido said, I don't ever care about packaging.
02:11:34.080 | I don't use it.
02:11:34.920 | I don't install new libraries.
02:11:36.320 | I'm like, I guess if you're the language creator
02:11:38.200 | and if you need something, you just put it
02:11:39.720 | in the distribution, maybe you don't worry about packaging.
02:11:42.520 | But Guido has never really cared about packaging, right?
02:11:45.200 | And never really cared about the problem of distribution.
02:11:47.400 | It's somebody else's problem.
02:11:48.480 | And that's a fair position to take, I think,
02:11:50.240 | as a language creator.
02:11:51.440 | In fact, there's a philosophical question about
02:11:54.160 | should you have different development packaging managers?
02:11:56.680 | Should you have a package manager per language?
02:11:58.400 | Is that really the right approach?
02:11:59.800 | I think there are some answers of
02:12:01.900 | it is appropriate to have development tools.
02:12:04.200 | And there's an aspect of the development tool
02:12:06.040 | that is related to packaging.
02:12:07.680 | And every language should have some story there
02:12:10.600 | to help their developers create.
02:12:12.120 | - So you should have language-specific development tools.
02:12:14.960 | - Development tools that relate to package managers.
02:12:17.080 | But then there's a very specific user story
02:12:19.500 | around package management that those language-specific
02:12:21.640 | package managers have to interact with
02:12:23.560 | and currently aren't doing a good job of that.
02:12:25.920 | That was one of the challenges of not seeing that difference
02:12:29.140 | and it still exists in the difference today.
02:12:31.720 | Conda always was a user,
02:12:34.480 | I'm gonna use Python to do data science.
02:12:36.560 | I'm gonna use Python to do something.
02:12:38.240 | How do I get this installed?
02:12:39.560 | It was always focused on that.
02:12:41.160 | So it didn't have a develop.
02:12:43.880 | Classic example is Pip has a pip develop.
02:12:45.960 | It's like, I wanna install this
02:12:47.480 | into my current development environment today.
02:12:49.160 | Now, Conda doesn't have that concept
02:12:51.520 | because it's not part of the story.
02:12:52.840 | - For people who don't know,
02:12:54.640 | Pip is a Python-specific package manager
02:12:59.640 | that's exceptionally popular.
02:13:04.640 | That's probably the default thing you learn.
02:13:06.160 | - It's the default user.
02:13:07.200 | Yeah, and so the story there emerged
02:13:08.840 | because what happened is in 2012,
02:13:11.440 | we had this meeting at the Googleplex
02:13:13.760 | and Guido was there to come talk about
02:13:15.120 | what are we gonna do?
02:13:15.960 | How are we gonna make things work better?
02:13:17.220 | And Wes McKinney, me, Peter.
02:13:19.960 | Peter has a great photo of me talking to Guido
02:13:21.880 | and he pretends we're talking about this story.
02:13:23.560 | Maybe we were, maybe we weren't.
02:13:24.400 | But we did at that meeting talk about it
02:13:26.320 | and ask Guido, "Guido, we need to fix packaging in Python.
02:13:29.840 | "People can't get the stuff."
02:13:31.040 | And he said, "Go fix it yourself.
02:13:32.320 | "I don't think we're gonna do it."
02:13:33.520 | (laughing)
02:13:34.920 | All right.
02:13:35.760 | - That's the origin story right there.
02:13:36.920 | - All right, you said, okay, you said to do this ourselves.
02:13:39.440 | So at the same time, people did start to work
02:13:42.840 | on the packaging story in Python.
02:13:44.600 | It just took a little longer.
02:13:45.680 | So in 2012, kind of motivated by our training courses
02:13:49.120 | we were teaching, like how do we,
02:13:50.200 | very similar to what you just mentioned about your mother.
02:13:52.240 | Like it was motivated by the same purpose.
02:13:54.120 | Like how do we get this into people's hands?
02:13:56.040 | And it's this big, long process.
02:13:57.160 | It's too expensive.
02:13:58.560 | It was actually hurting NumPy development
02:14:00.240 | because I would hear people were saying,
02:14:02.260 | "Don't make that change to NumPy
02:14:03.440 | "because I just spent a week getting my Python environment
02:14:05.440 | "and if you change NumPy, I have to reinstall everything.
02:14:09.120 | "And reinstalling is such a pain, don't do it."
02:14:10.920 | I'm like, wait, okay, so now we're not making changes
02:14:14.060 | to a library because of the installation problem
02:14:16.040 | that it'll cause for end users.
02:14:17.480 | Okay, there's a problem with installation.
02:14:19.440 | We gotta fix this.
02:14:20.540 | So we said, we're gonna make a distribution of Python.
02:14:23.800 | And we'd previously done that at Enthought.
02:14:26.920 | I wanted to make one that we gave away for free
02:14:28.520 | that everyone could just get.
02:14:29.820 | Like that was critical that we just get it.
02:14:32.040 | It wasn't tied to a product.
02:14:33.860 | It was just you could get it.
02:14:35.360 | And then we had constantly thought about,
02:14:36.960 | well, do we just leverage RPM?
02:14:39.120 | But the challenge had always been,
02:14:40.380 | we want a package manager that works on Windows,
02:14:42.240 | Mac OS X and Linux the same.
02:14:43.960 | And it wasn't there.
02:14:46.560 | Like you don't have anything like that.
02:14:47.960 | - And for people who don't know,
02:14:49.520 | RPM is an operating-system-specific package manager.
02:14:54.520 | - Correct, it's operating-system-specific, yes.
02:14:56.480 | - So do you create the design questions?
02:15:00.160 | Do you create an umbrella package manager
02:15:02.240 | that works across operating systems?
02:15:03.840 | - Yes, that was the decision.
02:15:05.720 | - And a neighboring design question is,
02:15:08.120 | do you also create a package manager
02:15:09.960 | that spans multiple programming languages?
02:15:11.880 | - Correct, exactly.
02:15:12.800 | That was the world we faced.
02:15:14.360 | And we decided to go multiple operating systems,
02:15:17.120 | and programming-language independent.
02:15:19.240 | Because even Python, and particularly what was important
02:15:21.840 | was SciPy has a bunch of Fortran in it, right?
02:15:24.960 | And scikit-learn has links to a bunch of C++.
02:15:27.800 | There's a lot of compiled code.
02:15:30.000 | And the Python package managers, especially early on,
02:15:32.960 | didn't even support that.
02:15:34.360 | So in 2000, so we released Anaconda,
02:15:38.520 | which was just a distribution of libraries,
02:15:39.960 | but we started to work on Conda in 2012.
02:15:42.480 | First version of Conda came out early 2013,
02:15:44.680 | summer of 2013.
02:15:46.640 | And it was a package manager.
02:15:47.840 | So you could say Conda install scikit-learn.
02:15:49.560 | In fact, scikit-learn was a fantastic project that emerged.
02:15:54.280 | It was the classic example of the scikits.
02:15:57.120 | I talked to you earlier about SciPy being too big
02:15:59.760 | to be a single library.
02:16:01.240 | Well, what the community had done is said,
02:16:02.680 | let's make scikits.
02:16:04.120 | And there's scikit-image, there's scikit-learn,
02:16:05.800 | there's a lot of scikits.
02:16:07.600 | And it was a fantastic move that the community did.
02:16:10.160 | I didn't do it.
02:16:11.440 | I was like, okay, that's a good idea.
02:16:12.520 | I didn't like the name.
02:16:13.520 | I didn't like the fact you typed scikit-image.
02:16:15.480 | I was like, it has got to be simpler, sklearn.
02:16:17.960 | We've got to make that smaller.
02:16:19.760 | I don't like typing all this stuff, it's important.
02:16:21.920 | So I was kind of a pressure that way.
02:16:23.200 | But I love the energy.
02:16:24.680 | I love the fact that they went out and they did it.
02:16:26.160 | And lots of people, Jarrod Millman,
02:16:27.520 | and then of course, Gaël,
02:16:29.400 | and there's people I'm not even naming.
02:16:31.240 | Scikit-learn really emerged as a fantastic project.
02:16:34.600 | - And the documentation around that is also incredible.
02:16:36.440 | - And the documentation was incredible, exactly.
02:16:37.800 | - I don't know who did that, but they did a great job.
02:16:40.080 | - A lot of people in INRIA,
02:16:41.400 | a lot of people, a lot of European contributors.
02:16:44.560 | Andreas, there's some Andreas in the US.
02:16:47.080 | There's a lot of people I just adore,
02:16:48.880 | I think are amazing people.
02:16:50.240 | Awesome use of SciPy, right?
02:16:52.440 | I love the fact that they were using SciPy
02:16:54.160 | effectively to do something I love,
02:16:55.240 | which is machine learning, but couldn't install it.
02:16:58.920 | - 'Cause there's so many pieces involved.
02:17:00.560 | - So many dependencies, right?
02:17:02.120 | So our use case of Conda was Conda install Scikit-learn.
02:17:06.040 | Right, and it was the best way to install Scikit-learn
02:17:09.400 | in 2013 to really 2018, '17, '18.
02:17:14.400 | Pip finally caught up.
02:17:16.720 | I still think you should Conda install Scikit-learn
02:17:19.320 | instead of Pip install Scikit-learn,
02:17:20.400 | but you can Pip install Scikit-learn.
02:17:22.200 | The issue is the package they created was wheels,
02:17:24.720 | and Pip does not handle the multi-vendor approach.
02:17:27.320 | They don't handle the fact you have C++ libraries
02:17:29.400 | you're depending on.
02:17:30.520 | They just stop at the Python boundary.
02:17:32.200 | And so what you have to do in the wheel world
02:17:34.120 | is you have to vendor.
02:17:36.080 | You have to take all the binary and vendor it.
02:17:38.480 | Now if a change happens in underlying dependency,
02:17:41.280 | you have to redo the whole wheel.
02:17:43.120 | So TensorFlow is a good example,
02:17:44.920 | but you should not Pip install TensorFlow.
02:17:47.320 | It's a terrible idea.
02:17:48.160 | People do it because the popularity of Pip,
02:17:51.440 | many people think, "Oh, of course that's how
02:17:52.560 | "I install everything in Python."
02:17:54.240 | - Yeah, this is one of the big challenges.
02:17:56.440 | You know, you take a GitHub repository
02:17:58.680 | or just a basic blog post,
02:18:00.640 | the number of times Pip is mentioned over Conda
02:18:03.560 | is like 100x to one.
02:18:05.400 | - Correct, correct.
02:18:06.240 | - So it just has to do with--
02:18:07.080 | - And that was increasing.
02:18:07.960 | It wasn't true early because Pip didn't exist.
02:18:10.240 | Like, Conda came first.
02:18:11.600 | - So but that's like the long tail
02:18:13.040 | of the internet documentation user-generated.
02:18:15.840 | So that, like you think, how do I install,
02:18:18.600 | you Google, "How do I install TensorFlow?"
02:18:20.400 | You're just not gonna see Conda in that first page.
02:18:22.800 | - It's not, correct, exactly.
02:18:24.160 | - And that--
02:18:25.000 | - Not today, you would have in 2016, 2017.
02:18:29.480 | - And it's sad because you saw the,
02:18:32.160 | Conda solves a lot of usability issues.
02:18:34.240 | - Correct.
02:18:35.080 | - Like for especially super challenging thing,
02:18:36.560 | I don't know, one of the big pain points for me
02:18:38.880 | was just on the computer vision side,
02:18:41.640 | OpenCV installation that--
02:18:43.720 | - Perfect example, yes.
02:18:44.720 | - I think Conda, I don't know if Conda has solved that one.
02:18:47.440 | - Conda has an OpenCV package.
02:18:49.120 | - I don't know, I certainly know Pip has not solved,
02:18:53.480 | I mean, there's complexities there because--
02:18:55.880 | - Right.
02:18:56.720 | - I actually don't know, I should probably know
02:18:58.360 | a good answer for this, but if you compile OpenCV
02:19:03.320 | with certain dependencies,
02:19:05.480 | you'll be able to do certain things.
02:19:07.480 | So there's this kind of flexibility of what you,
02:19:09.880 | like what options you compile with.
02:19:13.000 | - Yes.
02:19:13.840 | - And I don't think it's trivial to do that
02:19:16.200 | in a with Conda or with--
02:19:17.880 | - So Conda has a notion of variants of a package.
02:19:20.560 | You can actually have different compilation versions
02:19:23.160 | of a package, so not just the version's different,
02:19:24.720 | but oh, this is compiled with these optimizations on it.
02:19:26.920 | So Conda does have an answer.
02:19:27.760 | - Has those flavors, yeah.
02:19:28.920 | - Has flavors, basically.
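The notion of variants described here, the same package at the same version compiled in different ways, can be sketched as a lookup over a small index. This is a toy model under stated assumptions, not conda's solver or metadata format: the `Build` class, the `variant` labels, and the index entries are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Build:
    """Hypothetical record for one compiled build of a package."""

    name: str
    version: str
    variant: str  # assumed label for how this build was compiled


# Hypothetical index: one version of a package, compiled two different ways.
INDEX = [
    Build("opencv", "4.5.3", "cpu"),
    Build("opencv", "4.5.3", "cuda"),
    Build("opencv", "4.5.2", "cpu"),
]


def select(index, name, version, variant):
    """Return the builds that match name, version, and the requested variant."""
    wanted = (name, version, variant)
    return [b for b in index if (b.name, b.version, b.variant) == wanted]


matches = select(INDEX, "opencv", "4.5.3", "cuda")
print(matches)
```

The point of the sketch is that the version alone does not identify a build: two entries share `4.5.3` and differ only in how they were compiled.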
02:19:30.120 | - Well, Pip, as far as I know, does not have flavors.
02:19:32.880 | - No, Pip generally hasn't thought deeply
02:19:36.480 | about the binary dependency problem, right?
02:19:38.440 | That's why fundamentally it doesn't work
02:19:41.880 | for the SciPy ecosystem.
02:19:43.640 | It barely, you can sort of paper over it and duct tape it
02:19:46.240 | and it kind of works, until it doesn't
02:19:48.040 | and it falls apart entirely.
02:19:49.600 | So it's been a mixed bag.
02:19:52.600 | And I've been having lots of conversations
02:19:54.400 | with people over the years because, again,
02:19:56.160 | it's an area where if you understand some things,
02:19:58.400 | but not all the things, but they've done a great job
02:20:00.520 | of community appeal.
02:20:02.200 | This is an area where I think Anaconda,
02:20:04.960 | as a company, needed to do some things
02:20:07.040 | in order to make Conda more community-centric, right?
02:20:10.440 | And this is, I talk about this all the time,
02:20:13.080 | there's a balance between you have,
02:20:15.600 | every project starts with what I call
02:20:17.080 | company-backed open source.
02:20:18.280 | Even if the company is yourself, it's just one person.
02:20:20.360 | Just doing business as.
02:20:23.320 | But ultimately for products to succeed virally
02:20:26.040 | and become massive influencers,
02:20:28.280 | they have to get community people on board.
02:20:30.480 | They have to get other people on board.
02:20:32.080 | So it has to become community-driven.
02:20:33.680 | And a big part of that is engagement with those people,
02:20:35.520 | empowering people, governance around it.
02:20:38.560 | And what happened with Conda in the early days,
02:20:41.320 | Pip emerged and we did do some good things,
02:20:43.680 | conda-forge, the conda-forge community
02:20:46.360 | is sort of the community recipe creation community.
02:20:49.840 | But Conda itself, I still believe,
02:20:52.160 | and Peter is CEO of Anaconda, he's my co-founder.
02:20:55.120 | I ran Anaconda until 2017, 2018.
02:20:58.160 | - Is Peter still in it?
02:20:59.000 | - Peter's still in Anaconda, right?
02:21:00.000 | We're still great friends, we talk all the time.
02:21:02.560 | I love him to death.
02:21:03.600 | There's a long story there about why and how,
02:21:06.040 | and we can cover it in some other podcast perhaps.
02:21:08.640 | - Yeah. (laughs)
02:21:09.480 | - It's sort of a more, maybe a more business-focused one.
02:21:11.400 | But this is one area where I think Conda
02:21:15.160 | should be more community-driven.
02:21:17.280 | Like he should be pushing more
02:21:18.960 | to get more community contributors to Conda.
02:21:21.200 | And let the, Anaconda shouldn't be fighting this battle.
02:21:26.080 | - Yeah. - Right?
02:21:26.920 | It's actually, it's really a developer's.
02:21:28.600 | Like you said, help the developers,
02:21:30.440 | and then they'll actually move us the right direction.
02:21:32.480 | - That was the problem I have,
02:21:33.480 | is many of the cool kids I know don't use Conda.
02:21:36.520 | And that, to me, is confusing.
02:21:38.880 | - It is confusing.
02:21:39.800 | It's really a matter of, Conda has some challenges,
02:21:42.640 | first of all.
02:21:43.480 | Conda still needs to be improved,
02:21:44.300 | there's lots of improvements to be made.
02:21:45.400 | And it's that aspect of, wait, who's doing this?
02:21:47.600 | And the fact that then the PyPA really stepped up.
02:21:50.960 | Like they were not solving the problem at all.
02:21:53.400 | And now they kinda got to where they're solving it
02:21:55.620 | for the most part.
02:21:56.680 | And then effectively, you could get,
02:21:58.160 | like Conda solved a problem that was there,
02:22:00.360 | and it still does, and it's still,
02:22:02.160 | there's still great things it can do.
02:22:03.960 | But, and we still use it all the time at Quansight
02:22:07.160 | with other clients, but with,
02:22:09.000 | but you can kinda do similar things with Pip and Docker.
02:22:12.160 | Right?
02:22:13.000 | So, especially with the web development community,
02:22:15.280 | that part of it again is the,
02:22:17.080 | there's a lot of different kind of developers
02:22:19.160 | in the Python ecosystem.
02:22:20.200 | And there's still a lack of some clear understanding.
02:22:23.720 | I go to the Python conference all the time,
02:22:25.320 | and there's only a few people in the PyPA who get it.
02:22:28.280 | And then others who are just massively trumpeting
02:22:30.680 | the power of Pip, but just do not understand the problem.
02:22:32.840 | - Yeah, so one of the obvious things to me from a mom,
02:22:36.040 | from a non-programmer perspective,
02:22:37.840 | is the across operating system usability
02:22:41.760 | that's much more natural.
02:22:42.680 | So, there's people that use Windows,
02:22:44.960 | and just, it seems much easier to recommend Conda there.
02:22:49.040 | But then you should also recommend it across the board.
02:22:51.840 | So, I'll definitely sort of--
02:22:53.520 | - But what I recommend now is a hybrid.
02:22:55.280 | I do, I mean, I have no problem--
02:22:56.680 | - Is it possible to use--
02:22:57.760 | - Oh, it is, it is.
02:22:58.720 | What I, like, build the environment with Pip, with Conda,
02:23:01.560 | build an environment with Conda,
02:23:03.320 | and then Pip install on top of that, that's fine.
02:23:05.280 | Be careful about Pip installing OpenCV or TensorFlow,
02:23:09.360 | or, because if somebody's allowed that,
02:23:11.320 | it's gonna be most surely done in a way
02:23:13.280 | that can't be updated that easily.
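The hybrid workflow recommended here (build the environment with Conda, then Pip install on top) can be sketched as a small planner that only builds the command lines and does not run them. The function name `hybrid_install_plan`, the environment name, and the package `some-small-library` are hypothetical; the sketch assumes a conda version that provides `conda run`.

```python
def hybrid_install_plan(env_name, conda_packages, pip_packages):
    """Return the commands for a conda-first, pip-second environment.

    Nothing is executed; each command is returned as a list of arguments.
    """
    # Step 1: create the environment with the heavy, compiled packages from conda.
    plan = [["conda", "create", "--name", env_name, "--yes", *conda_packages]]
    # Step 2: layer pure-Python extras on top with pip, run inside that environment.
    if pip_packages:
        plan.append(
            ["conda", "run", "--name", env_name,
             "python", "-m", "pip", "install", *pip_packages]
        )
    return plan


plan = hybrid_install_plan(
    "vision",                               # hypothetical environment name
    ["python", "pip", "numpy", "opencv"],   # infrastructure packages via conda
    ["some-small-library"],                 # placeholder for a small pip-only package
)
for command in plan:
    print(" ".join(command))
```

Keeping the binary-heavy packages in the conda step and only small pure-Python packages in the pip step mirrors the caution above about Pip installing OpenCV or TensorFlow.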
02:23:15.080 | - So, install, like, the big packages,
02:23:17.640 | the infrastructure with Conda, and then the weirdos,
02:23:21.560 | that, like, the weird, like, implementation for some.
02:23:24.680 | I had, there's a cool library I used
02:23:28.400 | that, based on your location and time of day and date,
02:23:33.400 | tells you the exact position of the sun relative to the--
02:23:36.520 | - Oh, very cool. - To the earth.
02:23:38.120 | And it's just, like, a simple library,
02:23:39.680 | but it's very precise, and I was like, all right.
02:23:42.040 | But that was, and it's Pip, and it's very nice.
02:23:45.120 | - Well, the thing they did really well is,
02:23:46.920 | Python developers who wanna get their stuff published,
02:23:50.560 | you have to have a Pip recipe, right?
02:23:52.840 | I mean, even if it's, you know, the challenge is,
02:23:56.440 | and there's a key thing that needs to be added to Pip,
02:23:58.760 | just simply add to Pip the ability to defer
02:24:01.640 | to a system package manager.
02:24:03.400 | Like, 'cause it's, you know, recognize you're not
02:24:05.200 | gonna solve all the dependency problem.
02:24:07.240 | So, let, like, give up, and allow a system packager to work.
02:24:12.240 | That way, if Anaconda's installed, and it has Pip,
02:24:15.120 | it would default to Conda to install its stuff,
02:24:16.920 | but Red Hat RPM would default to RPM
02:24:19.200 | to install more things.
02:24:20.560 | Like, that's the, that's a key, not difficult,
02:24:23.440 | but somewhat work, some work, feature needs to be added.
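The feature being proposed, letting Pip defer to whatever system package manager is present, can be sketched as follows. This is a hypothetical illustration of the idea only: Pip has no such option as far as this sketch assumes, the `SYSTEM_INSTALLERS` table and `choose_installer` function are invented names, `dnf` stands in for the Red Hat case, and the sketch ignores the fact that package names differ between ecosystems.

```python
import shutil

# Hypothetical table: executable to look for, and the install command it implies.
SYSTEM_INSTALLERS = {
    "conda": ["conda", "install", "--yes"],
    "dnf": ["dnf", "install", "-y"],
}


def choose_installer(package, which=shutil.which):
    """Return the install command, deferring to a system packager if one is found.

    `which` is injectable so the decision can be exercised without touching
    the real machine.
    """
    for tool, prefix in SYSTEM_INSTALLERS.items():
        if which(tool):
            return [*prefix, package]
    # No system packager found: fall back to plain pip.
    return ["python", "-m", "pip", "install", package]


# Simulate a machine where only conda is on the PATH.
only_conda = lambda tool: "/opt/conda/bin/conda" if tool == "conda" else None
print(choose_installer("scipy", which=only_conda))
```

The design choice is the fallback order: the system packager gets the first chance to resolve binary dependencies, and Pip handles only what nothing else claims.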
02:24:25.920 | That's an example of something, like,
02:24:27.400 | I've known we need to re-do it.
02:24:28.600 | I mean, it's where I wish I had more money.
02:24:30.880 | I wish I was more successful in the business side,
02:24:33.440 | trying to get there, but I wish my, you know,
02:24:35.040 | my family, friends, and full community that I know--
02:24:37.280 | - Was larger.
02:24:38.120 | - Was larger, and had more money,
02:24:39.320 | 'cause I know tons of things to do, effectively,
02:24:42.680 | with more resources, but, you know,
02:24:46.280 | I have not yet been successful at channeling
02:24:48.720 | Tons of, you know, some, you know,
02:24:49.960 | I'm happy with what we've done.
02:24:51.500 | We've created again, at Quansight,
02:24:54.840 | what we created to get Anaconda started.
02:24:56.460 | We created Continuum Analytics to get Anaconda started.
02:24:58.160 | Done it again with Quansight, super excited by that.
02:25:00.480 | - By the way-- - It took three years
02:25:01.440 | to do it.
02:25:02.280 | - What is Quansight, what is its mission?
02:25:04.320 | We've talked a few times about different,
02:25:06.440 | fascinating aspects of it, but let's like,
02:25:08.320 | big picture, what is Quansight?
02:25:09.160 | - Big picture, Quansight.
02:25:10.200 | Quansight is, its mission is to connect data
02:25:13.480 | to an open economy, so it's basically consulting
02:25:16.120 | of the PyData ecosystem, right?
02:25:17.680 | It's a consulting company, and what I've said
02:25:20.220 | when I started it was we're trying to create products,
02:25:22.520 | people, and technology, so it's divided into two groups,
02:25:26.680 | and a third one as well.
02:25:28.300 | The two groups are a consulting services company
02:25:30.360 | that just helps people do data science,
02:25:31.960 | and data engineering, and data management better,
02:25:35.080 | and more efficiently.
02:25:35.920 | - Like full stack, like full thing.
02:25:36.760 | - Full stack, data science, full thing,
02:25:38.200 | which will help you build a infrastructure
02:25:40.000 | if you're using Jupyter, we do staff augmentation,
02:25:42.880 | need more programmers, help you use DAS more effectively,
02:25:44.880 | help you use GPUs more effectively,
02:25:46.520 | just basically a lot of people need help.
02:25:48.400 | So we do training as well to help people,
02:25:50.800 | both immediate help, and then learn from somebody.
02:25:55.000 | We've added a bunch of stuff too,
02:25:57.080 | we've kind of separated some of these other things
02:25:58.600 | into another company called OpenTeams
02:26:00.120 | that we currently started.
02:26:01.760 | One of the things I loved about what we did at Anaconda
02:26:03.400 | was create a community innovation team.
02:26:05.480 | And so I wanted to replicate that.
02:26:06.720 | This time we did a lot of innovation at Anaconda.
02:26:09.360 | I wanted to do innovation, but also contribute
02:26:12.320 | to the projects that existed, like create a place
02:26:15.220 | where maintainers, so that SciPy, and NumPy,
02:26:17.720 | and all these projects we already started
02:26:20.400 | can pay people to work on them and keep them going.
02:26:22.680 | So that's Labs, Quansight Labs is a separate organization,
02:26:25.960 | it's a non-profit mission,
02:26:28.080 | the profits of Quansight help fund it,
02:26:29.940 | and in fact, every project that we have at Quansight,
02:26:33.240 | a portion of the money goes directly to Quansight Labs
02:26:36.060 | to help keep it funded.
02:26:37.060 | So we've gotten several mechanisms
02:26:38.320 | we keep Quansight Labs funded.
02:26:40.040 | And currently, so I'm really excited about Labs,
02:26:41.960 | 'cause it's been a mission for a long time.
02:26:43.680 | - What kind of projects are within Labs?
02:26:45.240 | - So Labs is working to make the software better,
02:26:47.680 | like make NumPy better, make SciPy better,
02:26:49.480 | it only works on open source.
02:26:52.320 | So if somebody wants to, so companies do,
02:26:55.400 | we have a thing called a community work order, we call it.
02:26:57.440 | If a company says, I wanna make Spyder better,
02:26:59.980 | okay, cool, you can pay for a month
02:27:03.020 | of a developer of Spyder, or a developer of NumPy,
02:27:06.460 | or a developer of SciPy, you can't tell them
02:27:09.080 | what you want them to do, you can give them your priorities
02:27:10.980 | and things you wish existed, and they'll work
02:27:14.480 | on those priorities with the community
02:27:16.060 | to get what the community wants,
02:27:17.500 | and what emerges is what the community wants.
02:27:18.900 | - Is there some aspect on the consulting side
02:27:21.100 | that is helping, as we were talking about morphology
02:27:24.340 | and so on, is there specific applications
02:27:26.620 | that are particularly like driving,
02:27:29.100 | sort of inspiring the need for updates to SciPy?
02:27:31.980 | - Correct, absolutely, absolutely,
02:27:33.340 | GPUs are absolutely one of them,
02:27:34.820 | and new hardware beyond GPUs, I mean,
02:27:37.660 | Tesla's Dojo chip, I'm hoping we'll have a chance
02:27:39.700 | to work on that perhaps.
02:27:41.060 | Things like that are definitely driving it.
02:27:43.820 | The other thing that's driving it is scalable,
02:27:45.500 | like speed and scale, how do I write NumPy code
02:27:49.200 | or NumPy-like code if I want it to run across a cluster?
02:27:52.180 | You know, that's Dask, or maybe it's Ray,
02:27:54.220 | I mean, there's sort of ways to do that now,
02:27:56.340 | or there's Modin, and there's, so Pandas code,
02:27:59.700 | NumPy code, SciPy code, Scikit-learn code,
02:28:02.060 | that I want to scale, so that's one big area.
02:28:04.860 | - Have you gotten a chance to chat with Andrej and Elon
02:28:08.380 | about, 'cause like--
02:28:09.820 | - No, I would love to, by the way.
02:28:11.340 | I have not, but I'd love to.
02:28:12.260 | I just saw their Tesla AI Day video, super exciting.
02:28:16.260 | - So this one of the, you know, I love great engineering,
02:28:18.540 | software engineering teams, and engineering teams
02:28:20.580 | in general, and they're doing a lot of incredible stuff
02:28:22.460 | with Python, they're like--
02:28:23.620 | - They are. - Revolutionary,
02:28:25.020 | so many aspects of the machine learning pipeline
02:28:29.140 | that's operating in the real world,
02:28:30.540 | and so much of that is Python.
02:28:31.820 | Like you said, the guy running, you know,
02:28:33.980 | Andrej Karpathy running Autopilot is tweeting
02:28:37.340 | about optimization of NumPy versus--
02:28:41.140 | - I would love to talk to him.
02:28:42.820 | In fact, we have at Quansight, we've been fortunate enough
02:28:45.020 | to work with Facebook on PyTorch directly.
02:28:47.540 | So we have about 13 developers at Quansight.
02:28:49.860 | Some of them are in labs working directly on PyTorch.
02:28:52.540 | - On PyTorch, that's great. - On PyTorch, right?
02:28:54.100 | So I basically, when we started Quansight,
02:28:55.660 | I went to both TensorFlow and PyTorch and said,
02:28:57.140 | hey, I wanna help connect what you're doing
02:29:00.220 | to the broader SciPy ecosystem,
02:29:01.900 | because I see what you're doing,
02:29:03.220 | we have this bigger mission, we wanna make sure
02:29:04.740 | we don't, you know, lose energy here.
02:29:06.740 | So, and Facebook responded really positively,
02:29:09.860 | and I didn't get the same reaction.
02:29:12.380 | - Not yet, not yet. - Not yet.
02:29:13.980 | - I love the folks at TensorFlow.
02:29:15.660 | - I really love the folks at TensorFlow too,
02:29:17.460 | they're fantastic.
02:29:18.460 | I think it's the, just how it integrates
02:29:21.140 | with their business, I mean, like I said,
02:29:22.500 | there's a lot of reasons, just the timing,
02:29:24.660 | the integration with their business,
02:29:25.700 | what they're looking for.
02:29:27.140 | They're probably looking for more users,
02:29:28.740 | and I was looking to kind of cut some development effort,
02:29:31.580 | and they couldn't receive that as easily, I think.
02:29:33.820 | So I'm hoping, I'm really hopeful,
02:29:35.980 | and love the people there.
02:29:37.620 | - What's the idea behind OpenTeams?
02:29:39.740 | - So OpenTeams, I'm super excited about OpenTeams,
02:29:41.940 | because it's one of the, I mentioned my idea
02:29:44.620 | for investing directly in open source,
02:29:46.740 | so that's a concept called FairOSS.
02:29:48.860 | But one of the things we, when we started Quansight,
02:29:50.980 | we knew we would do, is we'd develop products and ideas,
02:29:53.660 | and new companies might come out.
02:29:55.420 | At Anaconda, this was clear, right?
02:29:57.660 | Anaconda, we did so much innovation,
02:30:00.260 | that like five or six companies could have come out of that.
02:30:02.940 | And we just didn't structure it so they could.
02:30:05.020 | But in fact, they have, you look at Dask,
02:30:07.240 | there's two companies coming out of Dask.
02:30:08.900 | You know, Bokeh could be a company.
02:30:10.060 | There's like lots of companies that could exist
02:30:11.700 | off the work we did there.
02:30:13.100 | And so I thought, oh, here's a recipe for an incubation,
02:30:16.420 | a concept that we could actually spawn new companies,
02:30:19.500 | and new innovations.
02:30:20.820 | And then the idea has always been, well,
02:30:23.100 | money they earn should come back to fund
02:30:25.540 | the open source projects.
02:30:26.540 | So labs, I think there should be a lot of things
02:30:29.860 | like Quansight Labs.
02:30:30.740 | I think this concept is one that scales.
02:30:32.540 | You could have a lot of open source research labs.
02:30:35.100 | Along the way, so in 2018, when the bigger idea came,
02:30:37.480 | how to make open source investable, I said,
02:30:38.820 | oh, I need to create a venture fund.
02:30:41.120 | So we created a venture fund called Quansight Initiate
02:30:43.860 | at the same time.
02:30:44.700 | It's an angel fund, really.
02:30:46.540 | We started to learn that process.
02:30:47.860 | How do we actually do this?
02:30:48.700 | How do we get LPs?
02:30:49.540 | How do we actually go in this direction and build a fund?
02:30:52.460 | And I'm like, every venture fund should have
02:30:54.260 | an associated open source research lab.
02:30:55.740 | There's just no reason not to.
02:30:56.580 | Like our venture fund, the carried interest portion of it
02:31:00.340 | goes to the lab.
02:31:01.840 | It directly will fund the lab.
02:31:03.300 | - That's fascinating, by the way.
02:31:04.120 | So you use the power of the organic formation of teams
02:31:06.780 | in the open source community, and then naturally,
02:31:10.660 | that leads to a business that can make money.
02:31:13.340 | - There are some, yeah, correct.
02:31:14.180 | - And then it always maintains and loops back
02:31:16.700 | to the open source. - Loops back to open source.
02:31:18.140 | Exactly, and to me, it's a natural fit.
02:31:19.700 | There's something, there's absolutely
02:31:20.980 | a repeatable pattern there.
02:31:22.380 | And it's also beneficial because, oh, I have natural
02:31:26.260 | connections to the open source,
02:31:27.460 | so I have an open source research lab.
02:31:29.220 | Like, they'll all be out there talking to people.
02:31:31.900 | And so we've had a chance to talk to a lot
02:31:34.980 | of early stage companies.
02:31:35.940 | And our fund focused on the early stage.
02:31:37.900 | So Quansight has the services, the lab, the fund, right?
02:31:41.900 | In that process, a lot of stuff started to happen.
02:31:44.220 | They're like, oh, you know, we started to do recruiting
02:31:46.340 | and support and training.
02:31:48.060 | And I was starting to build a bigger sales team
02:31:50.220 | and marketing team and people besides just developers.
02:31:52.900 | And one of the challenges with that is you end up
02:31:54.500 | with different cultural aspects.
02:31:55.980 | You know, developers, you know, there's a,
02:31:58.820 | in any company you go to, you kind of go, look,
02:32:00.780 | is this a business led company, a developer led company?
02:32:03.100 | Do they kind of coexist?
02:32:04.300 | Are they, what's the interface between them?
02:32:06.140 | There's always a bit of a tension there,
02:32:07.300 | like we were talking about before.
02:32:08.780 | You know, what is the tension there?
02:32:10.220 | With OpenTeams, I thought, wait a minute,
02:32:11.380 | we can actually just create, like this concept
02:32:13.900 | of Quansight plus Labs, it's, well, we're,
02:32:16.260 | well, it's specific to the PyData ecosystem.
02:32:18.460 | The concept is general for all open source.
02:32:20.820 | So OpenTeams emerged as a, oh, we can create
02:32:23.100 | a business development company for many, many Quansights,
02:32:27.020 | like thousands of Quansights.
02:32:28.420 | And it can be a marketplace to connect,
02:32:30.820 | essentially be the enterprise software company
02:32:33.460 | of the future.
02:32:34.460 | If you look at what enterprise software wants
02:32:36.740 | from the customer side, and during this journey,
02:32:38.620 | I've had the chance to work and sell to lots of companies,
02:32:42.340 | Exxon and Shell and JP Morgan, Bank of America,
02:32:45.220 | like the Fortune 100, and talk to a lot of people
02:32:47.660 | in procurement and see what are they buying
02:32:49.300 | and why are they buying?
02:32:50.380 | So, you know, I don't know everything,
02:32:51.780 | but I've learned a lot about, oh,
02:32:52.820 | what are they really looking for?
02:32:54.500 | And they're looking for solutions.
02:32:56.380 | They're constantly given products
02:32:58.140 | from the, from enterprise software.
02:33:01.140 | Here's open source, lead to enterprise software,
02:33:02.580 | now I buy it, and then they have to stitch it together
02:33:04.220 | into a solution.
02:33:05.060 | Open source is fantastic for gluing
02:33:07.380 | those solutions together.
02:33:08.780 | So, whereas they keep getting new platforms
02:33:11.500 | they're trying to buy, what most open source,
02:33:13.020 | what most enterprises want is tools
02:33:15.780 | that they can customize that are as inexpensive as they can.
02:33:18.900 | - Yeah, and so you always want to maintain
02:33:20.420 | the connection to the open source
02:33:21.540 | because that's-- - Yes, so OpenTeams
02:33:23.500 | is about solving enterprise software problems.
02:33:26.740 | - Brilliant, brilliant idea, by the way.
02:33:28.140 | - With a connect, but we do it honoring the topology.
02:33:30.940 | We don't hire all the people.
02:33:32.380 | We are a network connecting the sales energy
02:33:35.140 | and the procurement energy, and we're on the business side,
02:33:37.980 | get the deals closed, and then have a network of partners
02:33:40.580 | like Quansight and others who we hand the deals to, right?
02:33:44.100 | To actually do the work, and then we have to maintain,
02:33:46.500 | I feel like we have to maintain
02:33:47.340 | some level of quality control
02:33:48.780 | so that the client can rely on open teams
02:33:50.980 | to ensure the delivery.
02:33:52.100 | It's not just, here's a lead, go figure that out,
02:33:54.660 | but no, we're gonna make sure you get what you need.
02:33:56.620 | - Yeah, by the way, it's such a skill,
02:33:58.860 | and I don't know if I have the patience,
02:34:00.660 | or ever will have the patience to talk
02:34:02.180 | to the business people, or more specific,
02:34:05.300 | I mean, there's all kinds of flavors of business people,
02:34:07.460 | or like marketing people.
02:34:10.580 | (laughing)
02:34:11.940 | - There's a challenge, I hear what you're saying,
02:34:13.260 | because I've had the same challenge, and it's true.
02:34:15.580 | There's sometimes you think, okay, this is way overwrought.
02:34:18.460 | - Yeah, but you have to become an adult,
02:34:20.220 | and you have to, 'cause the companies have needs.
02:34:22.340 | They have ways to make money,
02:34:24.340 | and they also wanna learn and grow,
02:34:26.460 | and it's your job to kind of educate them
02:34:28.300 | in the best way, like the value of open source, for example.
02:34:30.980 | - Right, and I'm really grateful for all my experiences
02:34:32.940 | over the past 14 years, understanding that side of it,
02:34:35.700 | and still learning, for sure,
02:34:37.120 | but not just understanding from companies,
02:34:38.620 | but also dealing with marketing professionals,
02:34:40.540 | and sales professionals, and people that make a career
02:34:42.860 | out of that, and understanding what they're thinking about,
02:34:44.340 | and also understanding, well, let's make this better.
02:34:46.620 | Like, we can really make a place,
02:34:47.940 | like OpenTeams, I see as the transmission layer
02:34:50.440 | between companies and open source communities,
02:34:53.680 | producing enterprise software solutions.
02:34:55.460 | Like, eventually, we wanna, like today,
02:34:57.180 | we're taking on SAS and MATLAB,
02:34:59.300 | and tools that we know we can replace for folks.
02:35:01.700 | Really, any time you have a software tool
02:35:03.300 | in an organization, where you have to do
02:35:05.260 | a lot of customization to make it work for you.
02:35:07.300 | Like, it's not you're just buying this thing
02:35:08.480 | off the shelf, and it works.
02:35:09.360 | It's like, okay, you buy the system,
02:35:11.040 | and then you customize it a lot,
02:35:12.800 | usually with expensive consultants,
02:35:15.240 | to actually make it work for you.
02:35:17.160 | All of those should be replaced by open source foundations,
02:35:19.720 | with the same customization. - You're doing
02:35:21.200 | such important work, such important work
02:35:23.160 | in these giant organizations that do exactly that,
02:35:26.480 | taking some proprietary software,
02:35:28.320 | and hiring a huge team of consultants that customize it,
02:35:31.360 | and then that whole thing gets outdated quickly.
02:35:33.680 | - Correct.
02:35:34.520 | - And so, I mean, that's brilliant.
02:35:36.760 | - Right. - So, the one solution
02:35:38.320 | to that is kind of what Tesla's doing a little bit of,
02:35:43.240 | which is basically build up a software engineering team.
02:35:46.440 | - Yeah. - Like, build a team
02:35:47.600 | from scratch. - Build a team from scratch.
02:35:48.960 | And companies that are doing it well,
02:35:49.880 | that's what they're doing right now.
02:35:50.800 | - Yeah, exactly. - Right, and that's okay.
02:35:52.400 | - And you're creating a topology for some of that.
02:35:54.320 | - You're right, you just don't have to do it.
02:35:55.640 | That's not the only answer, right?
02:35:57.080 | And so, other companies can access this,
02:35:58.880 | be more accessible.
02:35:59.880 | We usually say, really,
02:36:01.120 | OpenTeams is the future of enterprise software.
02:36:03.920 | We're still early.
02:36:04.760 | Like, this idea just percolated over the past year
02:36:07.400 | as we've kind of grown Quansight
02:36:08.520 | and realized the extensibility of it.
02:36:10.440 | We just finished our seed round
02:36:12.240 | to help get more salespeople
02:36:15.160 | and then push the messaging correctly.
02:36:17.640 | And there's lots of tools we're building
02:36:19.160 | to make this easier.
02:36:20.000 | Like, we wanna automate the processes.
02:36:21.700 | We feel like a lot of the power
02:36:23.560 | is the efficiency of the sales process.
02:36:25.560 | There's a lot of wasted energy in small teams
02:36:29.380 | and the sales energy to get into large companies
02:36:31.600 | and make a deal.
02:36:32.640 | There's a lot of money spent on that process.
02:36:34.680 | - Creating the tools and processes for that sales.
02:36:36.520 | - So, make that super seamless.
02:36:38.120 | So, a single company can go,
02:36:39.560 | "Oh, I've got my contract with OpenTeams.
02:36:41.320 | "We've got a subscription they can get."
02:36:43.040 | They can make that procurement seamless.
02:36:45.200 | And then, the fact they have access
02:36:46.660 | to the entire open source ecosystem.
02:36:48.800 | And we have a, you know, so we have a part of our work
02:36:51.200 | that's embracing open source ecosystems
02:36:53.360 | and making sure we're doing things useful for them,
02:36:55.040 | we're serving them.
02:36:56.140 | And then, companies making sure
02:36:57.560 | they're getting solutions they care about.
02:36:59.240 | And then, figuring out which targets we have.
02:37:01.900 | You know, we're not taking on all of open source,
02:37:04.760 | all of enterprise software yet.
02:37:06.040 | But we're, you know, step by step.
02:37:07.480 | - Well, this feels like the future.
02:37:08.520 | The idea and the vision is brilliant.
02:37:10.600 | Can I ask you, why do you think Microsoft bought GitHub
02:37:14.440 | and what do you think is the future of GitHub?
02:37:16.560 | - Great point, great point.
02:37:17.400 | I thought it was a brilliant move.
02:37:18.220 | I think they did because Microsoft has always
02:37:20.620 | had a developer-centric culture.
02:37:22.680 | Like, they always have.
02:37:23.520 | Like, one of the things Microsoft's always done well
02:37:25.160 | is understand their power as developers, right?
02:37:27.440 | It's been, you know,
02:37:28.640 | Ballmer didn't necessarily make a good meme
02:37:31.600 | about how he approached that.
02:37:32.560 | But, they're broadening that.
02:37:34.520 | I think that's why.
02:37:35.360 | Because they recognize GitHub is where developers are at.
02:37:38.080 | Right?
02:37:38.920 | And so--
02:37:39.760 | - But do they have a vision, like,
02:37:40.580 | open teams type of situation?
02:37:42.000 | - I don't think so, yet.
02:37:43.600 | - Are they just basically throwing money at developers
02:37:46.640 | to show their support?
02:37:47.960 | - I think so.
02:37:48.800 | - Without a topology, like you put it.
02:37:50.800 | Like, a way to leverage that,
02:37:53.280 | like, to give developers actual money.
02:37:55.480 | - Right.
02:37:56.320 | I don't think so.
02:37:57.160 | I think they're still,
02:37:58.000 | it's an enterprise software company.
02:37:59.440 | And they make a bunch of money.
02:38:00.520 | They make a bunch of games.
02:38:01.360 | They have a, you know,
02:38:02.200 | they're a big company and they sell products.
02:38:03.760 | I think part of it is,
02:38:04.680 | they know there's opportunity to make money from GitHub.
02:38:07.760 | Right?
02:38:08.600 | There's definitely a business there, you know,
02:38:10.080 | to sell to developers,
02:38:11.340 | or to sell to people using development.
02:38:13.280 | I think there's part of that.
02:38:14.240 | I think part of it is also,
02:38:15.360 | there's, they had definitely wanted to recognize
02:38:18.120 | that you need to value open source
02:38:20.560 | to get great developers.
02:38:21.920 | Which is an important concept
02:38:23.360 | that was emerging over the past 10 years.
02:38:25.000 | That, you know,
02:38:26.160 | PyData, we were able to convince JP Morgan
02:38:29.880 | to support PyData because of that fact.
02:38:31.480 | Right?
02:38:32.320 | That was where the money for them
02:38:33.200 | putting a couple hundred thousand
02:38:34.280 | into supporting PyData for several conferences
02:38:36.660 | was they want developers.
02:38:37.800 | And they realized that developers
02:38:39.080 | want to participate in open source.
02:38:40.720 | So, enterprise software folks
02:38:42.520 | don't always understand how their software gets used.
02:38:44.600 | Having spent a lot of time on the floors
02:38:46.580 | at JP Morgan, at Shell, at ExxonMobil,
02:38:49.600 | you see, oh, these companies have large development teams.
02:38:52.880 | And then you're,
02:38:53.720 | they're kind of dealing with the,
02:38:55.280 | what's being delivered to them.
02:38:56.720 | So I really feel kind of a privilege
02:38:58.340 | that I had a chance to learn from some of these people
02:39:00.480 | and see what they're doing.
02:39:01.320 | And even work alongside them, you know, as a consultant,
02:39:05.080 | using my, using open source and trying to think,
02:39:07.640 | how do we make this work
02:39:08.480 | inside of our large organization?
02:39:09.960 | - Some of it is actually, for a large organization,
02:39:13.000 | some of it is messaging to the world
02:39:14.800 | that you care about developers
02:39:16.280 | and you're the cool, you care.
02:39:18.840 | Like, for example, like if Ford,
02:39:21.040 | 'cause I talked to them,
02:39:22.840 | like car companies, right?
02:39:23.880 | They want to attract, you know,
02:39:26.680 | you want to take on Tesla
02:39:28.080 | and Autopilot, you want to take that, right?
02:39:29.940 | And so what do you do there?
02:39:31.720 | You show that you're cool.
02:39:32.960 | Like you try to show off that you care about developers
02:39:36.480 | and they have a lot of trouble doing that.
02:39:39.040 | And like one way, I think like Ford should have bought GitHub
02:39:42.720 | but just to show off.
02:39:43.960 | - Yeah, that's a better, yeah.
02:39:45.160 | - Like these old school companies
02:39:46.880 | and it's in a lot of different industries.
02:39:49.980 | There's probably different ways.
02:39:51.080 | It's probably an art to show that you care to developers
02:39:54.080 | and the developers, it's exactly what you,
02:39:57.000 | like for example, just spitballing here,
02:40:00.520 | but like Ford or somebody like that
02:40:02.520 | could give a hundred million dollars
02:40:05.960 | to the development of NumPy
02:40:07.880 | and like literally look at like the top most popular projects
02:40:12.880 | in Python and just say, we're just gonna give money.
02:40:17.040 | - Right.
02:40:17.880 | - Like that's gonna immediately make you cool.
02:40:20.240 | - They could actually, yeah.
02:40:21.600 | And in fact, we set up NumFOCUS to make it easy.
02:40:24.400 | But the challenge was,
02:40:26.060 | is also you have to have some business development.
02:40:28.480 | Like it's a bit of a seeding problem, right?
02:40:31.280 | And you look at how I've talked to the folks
02:40:32.680 | at Linux Foundation, know how they're doing it.
02:40:34.360 | I know how, and starting NumFOCUS
02:40:36.600 | 'cause we had two babies in 2012.
02:40:39.400 | One was Anaconda, one was NumFOCUS, right?
02:40:41.120 | And they were both important efforts.
02:40:42.740 | They had distinct journeys
02:40:44.000 | and super grateful that both existed
02:40:46.240 | and still grateful both exist.
02:40:47.800 | But there's different energies in getting donations
02:40:51.920 | as there is getting, this is important to my business.
02:40:55.400 | Like I'm selling you something that,
02:40:57.300 | this is a, I'm gonna make money this way.
02:41:00.320 | Like if you can tie it,
02:41:01.160 | if you can tie the message to an ROI for the company,
02:41:04.080 | it becomes a no-brainer. - That's more effective.
02:41:05.240 | - It's much more effective, right?
02:41:07.000 | So, and there are rational arguments to make.
02:41:09.600 | I've tried to have conversations with marketing,
02:41:11.200 | especially marketing departments.
02:41:12.300 | Like very early on, it was clear to me that,
02:41:14.880 | oh, you could just take a fraction of your marketing budget
02:41:18.200 | and just spend it on open source development
02:41:20.280 | and you get better results from your marketing.
02:41:23.800 | Like, because--
02:41:24.640 | - What are those, can I, sorry,
02:41:26.040 | I'm gonna try not to go on a rant here.
02:41:27.960 | What have you learned from the interaction
02:41:29.840 | with the marketing folks on that kind of,
02:41:31.480 | 'cause you gave a great example of something
02:41:34.560 | that will obviously be a much better investment
02:41:37.280 | in terms of marketing is supporting open source projects.
02:41:40.400 | - The challenge is not dissimilar
02:41:41.900 | from the challenge you have in academia
02:41:44.520 | at the different colleges, right?
02:41:46.520 | Knowledge gets very specific and very channeled, right?
02:41:50.000 | And so people get, they get a lot of learning
02:41:52.280 | in the thing they know about.
02:41:53.940 | And it's hard then to bridge that
02:41:56.200 | and to get them to think differently enough
02:41:58.200 | to have a sense that you might have something to offer.
02:42:02.160 | 'Cause it's different.
02:42:03.000 | It's like, well, how do I implement that?
02:42:04.280 | How do I, what do I do with that?
02:42:06.120 | Do I, which budget do I take from?
02:42:07.840 | Do I slow down my spend on Google ads
02:42:10.320 | or my spend on Facebook ads?
02:42:11.600 | Or do I not hire a content creator?
02:42:13.440 | And so like, there's an operational aspect to that
02:42:16.160 | that you have to be the CMO, right?
02:42:19.080 | Or the CEO, you have to get the right level.
02:42:21.020 | - So you have to hire at a high position level
02:42:24.400 | where they care about this and they specialize.
02:42:26.080 | - Or they won't know how, right?
02:42:27.680 | And because you can also do it very clumsily, right?
02:42:30.480 | And I've seen, 'cause you can.
02:42:32.120 | You absolutely have to honor and recognize
02:42:33.800 | the people you're going to and the fact
02:42:36.680 | that if you just throw money at them,
02:42:37.840 | it could actually create more problems.
02:42:39.280 | - Can I just say, this is not you saying.
02:42:40.820 | Can I just, 'cause I just need, I need to say this.
02:42:44.440 | I've been very surprised how often marketing people
02:42:49.940 | are terrible at marketing.
02:42:51.800 | I feel like the best marketing is doing something novel
02:42:55.680 | and unique that anticipates the future.
02:42:58.320 | It feels like so much of the marketing practice
02:43:01.560 | is like what they took in school,
02:43:04.400 | or maybe they're studying for what was the best thing
02:43:06.720 | that was done in the past decade.
02:43:08.480 | And they're just repeating that over and over
02:43:10.840 | as opposed to innovating, like taking the risk.
02:43:13.820 | To me, marketing is taking the big risk.
02:43:17.120 | - That's a great point.
02:43:17.960 | - And being the first one to risk.
02:43:18.840 | - Yeah, there's an aspect of data observation
02:43:21.220 | from that risk, right, that's, I think,
02:43:23.420 | 'cause shared what they're doing already.
02:43:25.140 | But it absolutely, it's about, I think it's content.
02:43:27.700 | Like, there's this whole world on content marketing
02:43:30.220 | that you could almost say, well, yeah, it can get over,
02:43:33.580 | you can get inundated with stuff that's not relevant to you.
02:43:36.420 | Whereas what you're saying would be highly relevant
02:43:39.180 | and highly useful and highly beneficial.
02:43:41.540 | - Yeah, but it's a risk.
02:43:42.900 | I mean, that's why I sort of, there's a lot
02:43:45.060 | of innovative ways of doing that.
02:43:46.220 | Tesla's an example of people that basically
02:43:48.460 | don't do marketing.
02:43:49.940 | They do marketing in a very, like,
02:43:52.740 | it's like Elon hired a person who's just good at Twitter
02:43:55.740 | for running Tesla's Twitter account.
02:43:57.540 | (laughing)
02:43:58.380 | - No, right, right.
02:43:59.220 | - I mean, that's exactly what you wanna be doing.
02:44:00.820 | You want to be constantly innovating in--
02:44:03.100 | - Right, there's an aspect of telling,
02:44:04.260 | I mean, I've definitely seen people doing great work
02:44:06.900 | where you're not talking about it.
02:44:08.340 | Like, I would say that's actually a problem
02:44:09.540 | I have right now with Quansight Labs.
02:44:11.340 | Quansight Labs has been doing amazing work,
02:44:12.680 | really excited about it, but we have not been talking
02:44:14.380 | about it enough.
02:44:15.440 | We haven't been--
02:44:16.280 | - And there's different ways to talk about it.
02:44:17.740 | There's different ways to, there's different channels
02:44:19.580 | through which to communicate.
02:44:20.740 | There's also, like, I'll just throw some shade
02:44:25.540 | at companies I love.
02:44:26.880 | So for example, iRobot, I just had a conversation
02:44:30.140 | with them, they make Roombas.
02:44:31.820 | - Sure.
02:44:32.660 | - And I think I love, they're incredible robots,
02:44:35.380 | but like, every time they do, like, advertisement,
02:44:38.900 | not advertisement, but like, marketing type stuff,
02:44:41.860 | it just looks so corporate.
02:44:44.060 | And to me, the incredible, I may be wrong
02:44:48.540 | in the case of iRobot, I don't know,
02:44:50.240 | but to me, when you're talking about engineering systems,
02:44:53.980 | it's really nice to show off the magic of the engineering
02:44:56.980 | and the software and all the geniuses behind this product
02:45:01.980 | and the tinkering and like, the raw authenticity
02:45:06.100 | of what it takes to build that system
02:45:07.920 | versus the marketing people who want to have, like,
02:45:11.060 | pretty people, like, standing there all pretty
02:45:13.220 | with the robots, like, moving perfectly.
02:45:15.700 | So to me, there's some aspect, it's like,
02:45:18.060 | speaking to the hackers, you have to throw some bones,
02:45:22.140 | some care towards the engineers, the developers,
02:45:26.660 | because there's some aspect, one, for the hiring,
02:45:29.820 | but two, there's an authenticity to that kind
02:45:32.260 | of communication that's really inspiring
02:45:34.580 | to the end user as well.
02:45:36.100 | Like, if they know that brilliant people,
02:45:38.420 | the best in the world are working at your company,
02:45:40.660 | they start to believe that that product
02:45:42.660 | that you're creating is really good.
02:45:43.940 | - It's interesting, 'cause your initial reaction would be,
02:45:45.660 | wait, there's different users here, why would you do that?
02:45:48.260 | You know, my wife bought a Roomba,
02:45:50.660 | and she loves developers, she loves me,
02:45:52.540 | but she doesn't care about that hacker culture.
02:45:56.580 | So essentially what you said is actually the authenticity,
02:45:59.620 | 'cause everyone has a friend, everyone knows people,
02:46:01.140 | there's word of mouth, I mean, if you--
02:46:02.660 | - Word of mouth is so, so powerful.
02:46:04.180 | - Yeah, exactly, that's interesting.
02:46:05.020 | - And then--
02:46:05.840 | - 'Cause I think it's the lack of that realization,
02:46:07.580 | there's this halo effect.
02:46:08.580 | - Right, and also like--
02:46:09.420 | - That influences your general marketing, interesting.
02:46:11.740 | - For some stupid reason, I do have a platform,
02:46:14.660 | and it seems that the reason I have a platform,
02:46:16.980 | many others like me, millions of others,
02:46:19.540 | is like the authenticity,
02:46:21.260 | and like we get excited naturally about stuff.
02:46:23.980 | And like, I don't wanna get excited
02:46:25.780 | about that iRobot video, because it's boring,
02:46:29.380 | it's marketing, it's corporate,
02:46:30.820 | as opposed to, I wanted to do some fun,
02:46:33.620 | this is me, like a shout out to iRobot,
02:46:36.260 | is they're not letting me get into the robot.
02:46:39.380 | - Yeah, well, there's an aspect of,
02:46:40.900 | they could be benefiting from a culture of modularity,
02:46:44.860 | like add-ons, and that could actually dramatically help.
02:46:47.860 | You've seen that over history, I mean,
02:46:49.500 | Apple is an example of a company like that,
02:46:51.140 | or the, like, I can see what your point is,
02:46:54.380 | is that you have something that needs to be,
02:46:56.940 | it needs to be adopted broadly,
02:46:58.220 | the concept needs to be adopted broadly.
02:47:00.020 | And if you wanna go beyond this one device,
02:47:01.660 | you need to engage this community.
02:47:04.220 | - Yeah, and connecting to the open source, as you said.
02:47:07.540 | I gotta ask you, you're a programmer,
02:47:11.780 | one of the most impactful programmers ever.
02:47:14.820 | You've led many programmers, you lead many programmers.
02:47:18.540 | What are some, from a programmer perspective,
02:47:21.180 | what makes a good programmer?
02:47:23.380 | What makes a productive programmer?
02:47:25.020 | Is there advice you can give
02:47:27.140 | to be a great programmer this way?
02:47:27.980 | - That's a great, great question.
02:47:30.260 | And there are times in my life,
02:47:31.620 | I'd probably answer this even better
02:47:32.940 | than I hope maybe give an answer today.
02:47:35.060 | 'Cause I thought about this numerous times,
02:47:36.700 | like right now I've spent so much time
02:47:38.260 | recently hiring salespeople that--
02:47:40.100 | - That your mind is a little bit on something else.
02:47:43.420 | - On something else.
02:47:44.260 | But I reflected on the past, and also,
02:47:47.300 | you know, I have some really, the only way I can do this,
02:47:49.260 | I have some really great programmers that I work with
02:47:51.460 | who lead the teams that they lead.
02:47:53.260 | And my goal is to inspire them and hopefully help them,
02:47:56.580 | encourage them, and help them encourage with their teams.
02:47:59.620 | I would say there's a number of things, a couple things.
02:48:01.180 | One is curiosity.
02:48:03.860 | Like I think a programmer without curiosity is mundane.
02:48:08.860 | Like you'll lose interest, you won't do your best work.
02:48:12.220 | So it's an affect, it's sort of,
02:48:13.980 | are you, have some curiosity about things.
02:48:16.800 | I think two, don't try to do everything at once.
02:48:19.600 | Recognize that you're, we're limited as humans.
02:48:21.980 | You're limited as a human.
02:48:23.220 | And each one of us are limited in different ways.
02:48:24.940 | You know, we all have our different strengths and skills,
02:48:26.620 | so it's adapting the art of programming to your skills.
02:48:29.900 | One of the things that always works
02:48:31.260 | is to limit what you're trying to solve, right?
02:48:33.900 | So if you're part of a team,
02:48:36.700 | usually maybe somebody else has put the architecture
02:48:38.620 | together and they've given a portion to you
02:48:40.500 | if you're young.
02:48:41.780 | If you're not part of a team,
02:48:43.500 | it's sort of breaking down the problem into smaller parts
02:48:46.680 | is essential for you to make progress.
02:48:48.660 | It's very easy to take on a big project
02:48:50.740 | and try to do it all at once and you get lost
02:48:52.860 | and then you do it badly.
02:48:53.700 | And so thinking about, you know,
02:48:57.740 | very concretely what you're doing,
02:48:59.460 | defining the inputs and outputs,
02:49:01.460 | defining what you want to get done.
02:49:03.980 | Even just talking about that and like writing down
02:49:07.300 | before you write code, just what are you trying to accomplish?
02:49:09.480 | I mean, being very specific about it really, really helps.
02:49:12.820 | I think using other people's work, right?
02:49:17.040 | Don't be afraid that somehow you're,
02:49:20.060 | like you should do it all.
02:49:21.300 | Like nobody does.
02:49:23.260 | - Stand on the shoulders of giants.
02:49:24.100 | - Stand on the shoulders of giants.
02:49:25.260 | - And copy and paste from Stack Overflow.
02:49:26.100 | - Copy and paste from Stack Overflow.
02:49:27.660 | It's like, but don't just copy and paste.
02:49:30.100 | It's particularly relevant in the era of Codex
02:49:31.780 | and the auto-generated code,
02:49:34.260 | which is essentially I see as an indexing of Stack Overflow.
02:49:36.820 | - Right, exactly.
02:49:37.660 | - It's like--
02:49:38.480 | - It's a search engine.
02:49:39.320 | - It's a search engine over Stack Overflow basically.
02:49:41.300 | So it's not, I mean, we've had this for a while.
02:49:43.520 | But really you want to cut and paste, but not blindly.
02:49:47.340 | Like absolutely have cut and paste to understand,
02:49:51.060 | but then you understand, oh, this is what this means.
02:49:53.660 | Oh, this is what it's doing.
02:49:54.940 | And understand as much as you can.
02:49:56.700 | So it's critical, that's where the curiosity comes in.
02:49:59.100 | If you're just blindly cutting and pasting,
02:50:01.060 | you're not gonna understand.
02:50:01.980 | And so understand, and then be sensitive to hype cycles.
02:50:06.980 | Right, every so often there's always a,
02:50:10.940 | oh, test-driven development's the answer.
02:50:12.540 | Oh, object-oriented's the answer.
02:50:13.860 | Oh, there's always an answer.
02:50:15.740 | You know, agile's the answer.
02:50:18.420 | Be cautious of jumping onto a hype cycle.
02:50:20.860 | Like likely there's signal, like there's a thing there
02:50:23.460 | that's actually valuable you can learn from,
02:50:25.340 | but it's almost certainly not the answer
02:50:27.780 | to everything you need.
02:50:28.980 | - What lessons do you draw from you having created NumPy
02:50:32.820 | and SciPy, like in service of sort of answering
02:50:36.820 | the question of what it takes to be a great programmer
02:50:38.900 | and giving advice to people?
02:50:40.620 | How can you be the next person to create a SciPy?
02:50:42.980 | - Yeah, so one is listen.
02:50:44.860 | - To? - Listen.
02:50:46.620 | - To who?
02:50:47.460 | - To people that have a problem, right?
02:50:51.460 | Which is everybody, right?
02:50:52.520 | But listen and listen to many.
02:50:54.980 | And then try to, and then do.
02:50:57.460 | Like you're gonna have to do an experiment.
02:50:59.460 | You know, do, fall down.
02:51:00.580 | Don't be afraid to fall down.
02:51:01.940 | Don't be afraid, the first thing you do
02:51:04.240 | is probably gonna suck, and that's okay, right?
02:51:07.580 | It's honestly, I think iteration is the key to innovation.
02:51:11.220 | And it's almost that psychological hesitation we have
02:51:16.220 | to just iterate.
02:51:18.500 | Like yeah, we know it's not great,
02:51:20.540 | but next time it'll be better.
02:51:22.020 | I mean, just keep learning and keep improving
02:51:24.580 | and keep improving.
02:51:25.580 | So it's an attitude.
02:51:27.700 | And then it does take intense concentration, right?
02:51:31.820 | Good things don't happen just,
02:51:34.540 | it's not quite like TikTok or like Facebook.
02:51:38.180 | You can't scroll your way to good programming, right?
02:51:40.500 | There are sincere hours of deep,
02:51:44.720 | don't be afraid of the deep problem.
02:51:46.020 | Like often people will run away from something
02:51:47.700 | because oh, I can't solve this.
02:51:49.020 | And you might be right, but give it an hour.
02:51:51.340 | Give it a couple of hours and see.
02:51:53.340 | And just five minutes, not gonna give you that.
02:51:56.540 | - Was it lonely when you were building SciPy and NumPy?
02:52:00.500 | - Hugely, yeah, absolutely lonely in the sense of
02:52:04.020 | you had to have an inner drive.
02:52:05.780 | And that inner drive for me always comes from,
02:52:07.980 | I have to see that this is right in some angle.
02:52:11.620 | I have to believe it, that this is the right approach,
02:52:13.340 | the right thing to do.
02:52:14.620 | With SciPy, it was like, oh yeah,
02:52:16.380 | the world needs libraries in Python.
02:52:19.080 | Clearly Python's popular enough
02:52:20.740 | with enough influential people to start.
02:52:22.980 | And it needs more libraries.
02:52:24.660 | So that is a good in and of itself.
02:52:26.620 | So I'm gonna go do that good.
02:52:28.380 | So find a good, find a thing that you know is good
02:52:30.380 | and just work on it.
02:52:32.140 | So that has to happen, and it is.
02:52:34.700 | And you kind of have to have enough realization
02:52:37.020 | of your mission to be okay with the naysayer
02:52:40.260 | or the fact that not everybody joins you up front.
02:52:42.180 | In fact, one thing I've talked to people a lot,
02:52:43.500 | I've seen a lot of projects come and some fail.
02:52:45.500 | Not everything I've done has actually worked perfectly.
02:52:47.620 | I've tried a bunch of stuff that, okay,
02:52:49.140 | that didn't really work, or this isn't working and why.
02:52:51.940 | But you see the patterns.
02:52:53.660 | And one of the key things is you can't even know
02:52:57.340 | for six months, I say 18 months right now.
02:53:00.220 | If you're just starting a new project,
02:53:01.780 | you gotta give it a good 18-month run
02:53:03.200 | before you even know if the feedback's there.
02:53:05.300 | Like, you're not gonna know in six months.
02:53:07.860 | You might have the perfect thing,
02:53:08.740 | but six months from now, it's still kinda still emerging.
02:53:11.500 | So give it time, 'cause you're dealing with humans,
02:53:13.360 | and humans have an inertia energy
02:53:15.940 | that just doesn't change that quickly.
02:53:18.980 | - Let me ask a silly question.
02:53:20.900 | But, you know, like you said,
02:53:23.540 | you're focused on the sales side of things currently.
02:53:26.100 | But back when you were actively programming,
02:53:28.940 | maybe in the '90s, you talked about IDEs.
02:53:31.660 | What's your, a setup that you have that brings you joy?
02:53:36.180 | Keyboard, number of screens, Linux?
02:53:39.620 | - I do still like to program some,
02:53:40.900 | just not as much as I used to.
02:53:42.140 | I have two projects I'm super interested in,
02:53:44.500 | trying to find funding for 'em,
02:53:45.620 | trying to figure out some good teams for 'em,
02:53:47.140 | but I could talk about those.
02:53:48.980 | But what I, yeah, what, I'm an Emacs guy.
02:53:51.940 | - Great, thank the superior editor, everybody.
02:53:56.060 | I've got, I don't often delete tweets,
02:53:58.980 | but one of the tweets I deleted
02:54:00.540 | when I said Emacs was better than Vim,
02:54:02.780 | and then the hate I got from them.
02:54:04.580 | - It is.
02:54:05.420 | - I was like, I'm walking away from this.
02:54:07.100 | I bored.
02:54:07.940 | - I do too, I don't push it.
02:54:09.100 | I mean, I'm not.
02:54:09.940 | - I'm just joking, of course.
02:54:11.060 | - Yeah, exactly, it's kinda like,
02:54:12.140 | but people do take the editor seriously, right?
02:54:14.500 | - I did as a joke.
02:54:15.340 | - It's your life.
02:54:16.540 | - It is, but there's something beautiful to me about Emacs,
02:54:20.780 | but for people that love Vim,
02:54:22.380 | there's something beautiful to them about that.
02:54:23.220 | - There is, I mean, I do use Vim for quick editing,
02:54:26.260 | like command line, if I need quick editing,
02:54:27.860 | I will still sometimes use it, but not much.
02:54:30.300 | Like, it's simple, corrective,
02:54:31.660 | corrective single edit character.
02:54:32.780 | - So when you were developing SciPy,
02:54:34.100 | you were using Emacs?
02:54:34.940 | - Emacs, yep.
02:54:35.860 | SciPy and NumPy are all written in Emacs on a Linux box,
02:54:39.020 | and CVS, and then SVN, version control.
02:54:43.140 | Git came later.
02:54:44.820 | I love distributed branch stuff.
02:54:48.060 | I think Git is pretty complicated, but I love the concept.
02:54:51.620 | And also, of course, GitHub, and then GitLab,
02:54:55.220 | make Git definitely consumable, but that came later.
02:54:59.420 | - Did you ever touch Lisp at all?
02:55:01.220 | What were your emotional feelings about all the parentheses?
02:55:04.580 | - Great question.
02:55:05.420 | So I find myself appreciating Lisp today
02:55:08.180 | much more than I did early,
02:55:09.660 | 'cause when I came to programming,
02:55:10.940 | I knew programming, but I was a domain expert, right?
02:55:12.940 | And to me, the parentheses were in the way.
02:55:15.660 | It's like, "Wow, it's just all this."
02:55:17.740 | Like, it just gets in the way of my thinking
02:55:19.260 | about what I'm doing, so why would I have all these, right?
02:55:22.380 | That was my initial reaction to it.
02:55:24.500 | You know, and now as I appreciate kind of the structure
02:55:27.260 | that kind of naturally maps to a logical thinking
02:55:30.260 | about a program, I can appreciate them, right?
02:55:32.940 | And why it's actually, you could create editors
02:55:35.660 | that make it not so problematic, right, honestly.
02:55:40.740 | So I actually have a much more appreciation of Lisp
02:55:43.020 | and things like Clojure, and there's Hy,
02:55:44.700 | which is a Python, you know, a Lisp
02:55:46.180 | that compiles to Python bytecode.
02:55:48.540 | I think it's challenging.
02:55:50.300 | Like, typically, these languages are, you know,
02:55:53.140 | I even saw a whole data science programming system in Lisp
02:55:56.100 | that somebody created, which is, you know, cool.
02:55:58.500 | But again, I think it's the lack of recognition
02:56:00.840 | of the fact that there exists
02:56:02.020 | what I call occasional programmers.
02:56:04.060 | People that are never gonna be programmers for a living.
02:56:05.820 | They don't want to have all this cuteness in their head.
02:56:08.420 | They want just, you know, it's why BASIC, you know,
02:56:11.860 | Microsoft had the right idea with BASIC
02:56:14.460 | in terms of having that be the language of Visual Basic,
02:56:17.660 | the language of Excel and SQL Server.
02:56:21.260 | They should have converted that to Python 10 years ago.
02:56:23.500 | Like, the world would be a better place if they had, but.
02:56:27.180 | - There's also, there's a beauty and a magic
02:56:29.660 | to the history behind a language in Lisp.
02:56:31.620 | You know, some of the most interesting people
02:56:34.420 | in the history of computer science
02:56:35.860 | and artificial intelligence have used Lisp, so.
02:56:38.340 | - Yes.
02:56:39.180 | - You feel.
02:56:40.020 | - Well, it's back to that language.
02:56:41.220 | When you have a language, you can think in it.
02:56:43.060 | - Yeah.
02:56:43.900 | - And it helps you think better.
02:56:44.720 | - And it attracts certain kinds of people
02:56:45.660 | that think in a certain kind of way,
02:56:46.900 | and then that's there.
02:56:48.580 | Okay, so what about, like, small laptop
02:56:50.980 | with a tiny keyboard, or is there like three screens?
02:56:55.020 | - You know, good question.
02:56:55.860 | I've never gotten into the many screens, to be honest.
02:56:58.100 | I mean, and maybe it's because in my head,
02:57:00.700 | I kind of just, I just swap between windows.
02:57:03.460 | Like, partly because I guess I really can't process
02:57:07.480 | three screens at once anyway.
02:57:09.220 | Like, I just am looking at one, and I just flip.
02:57:12.580 | You know, I flip an application open.
02:57:14.460 | - So what about--
02:57:15.740 | - Where it's really helpful is actually
02:57:17.340 | when I'm trying to, you know, here's data,
02:57:19.060 | and I want to input it from here.
02:57:19.900 | Like, this is the only time I really need another screen.
02:57:22.260 | - So now, because you're both a developer,
02:57:24.860 | lead developers, but then there's also these businesses,
02:57:27.860 | and there's salespeople,
02:57:29.060 | and you're working with large companies.
02:57:30.860 | - Operations people, hiring people, yeah.
02:57:32.500 | - The whole thing.
02:57:33.380 | Which operating system is your favorite still,
02:57:35.700 | at this point?
02:57:37.260 | - So Linux was the early days.
02:57:38.940 | - So yeah, I love Linux as a server side,
02:57:41.460 | and it was early days I had my own Linux desktop.
02:57:44.340 | I've been on Mac laptops for 10 years now.
02:57:47.820 | - Yeah, this is what leadership looks like.
02:57:50.060 | (laughing)
02:57:50.900 | You switch to Mac.
02:57:52.780 | Okay, great.
02:57:53.780 | - Pretty much, I mean, just the fact that I had
02:57:56.460 | to do PowerPoints, I had to do presentations,
02:57:58.760 | and you know, plug in, I just couldn't mess
02:58:01.220 | with plugging in laptops.
02:58:02.260 | It wouldn't project, and yeah.
02:58:04.420 | - So you mentioned, so Quansight Labs, and things like that.
02:58:08.380 | Can you give advice on how to hire great programmers,
02:58:13.620 | and great people?
02:58:14.580 | - Yeah, I would say, produce an open source project.
02:58:18.000 | - Yeah.
02:58:19.980 | - Get people contributing to it, and hire those people.
02:58:21.540 | - Yeah.
02:58:22.540 | I mean, you're doing it sort of,
02:58:25.060 | you may be perhaps a little biased,
02:58:27.060 | but that's probably 100% really good advice.
02:58:30.300 | - I find it hard to hire.
02:58:31.820 | I still find it hard to hire.
02:58:32.860 | Like, in terms of, I don't think,
02:58:35.620 | it's not hard to hire if I've worked with somebody
02:58:37.500 | for a couple of weeks.
02:58:39.300 | But an hour or two of interviews, I have no idea.
02:58:43.580 | - So that instinct, that radar of knowing
02:58:47.300 | if you're good or not, you've found
02:58:49.700 | that you're still not able to really--
02:58:50.780 | - It's really hard, I mean, the resume can help,
02:58:53.220 | but again, the resume is like a presentation
02:58:55.540 | of the things they want you to see,
02:58:56.940 | not the reality of, and there's also,
02:59:01.920 | you have to understand what you're hiring for.
02:59:03.960 | There are different stages and different kinds of skills,
02:59:06.800 | and so it isn't just a, one of the things
02:59:09.740 | I talk a lot about internally at my company
02:59:12.600 | is just that the whole idea of measuring ourselves
02:59:16.100 | against a single axis is flawed,
02:59:18.620 | 'cause we're not, it's a multidimensional space,
02:59:20.620 | and how do you order a multidimensional space?
02:59:22.120 | There isn't one ordering.
02:59:23.440 | So this whole idea, you immediately have projected
02:59:26.160 | into a thing, and you're talking about hiring
02:59:28.200 | or best or worst or better or not better.
02:59:30.660 | So what is the thing you're actually needing,
02:59:33.500 | and you can hire for that.
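The point about ordering a multidimensional space can be illustrated with a small sketch. The names, skill axes, and scores below are made up for illustration and do not come from the conversation. The sketch shows that a ranking only appears once the skill vectors are projected onto a single axis with some choice of weights, and that a different choice of weights can reverse it.

```python
# Hypothetical skill scores (illustrative numbers only, not from the conversation).
candidates = {
    "alice": {"open_source": 9, "product": 3},
    "bob": {"open_source": 4, "product": 8},
}


def rank(weights):
    """Project each skill vector onto one axis using `weights`, then sort best-first."""
    def score(name):
        skills = candidates[name]
        return sum(weights[axis] * skills[axis] for axis in weights)
    return sorted(candidates, key=score, reverse=True)


def dominates(a, b):
    """True if `a` is at least as good as `b` on every axis and strictly better on one."""
    sa, sb = candidates[a], candidates[b]
    return all(sa[k] >= sb[k] for k in sa) and any(sa[k] > sb[k] for k in sa)


# Two different projections of the same two-dimensional data.
community_first = rank({"open_source": 0.8, "product": 0.2})
product_first = rank({"open_source": 0.2, "product": 0.8})

print(community_first)  # ['alice', 'bob']
print(product_first)    # ['bob', 'alice']

# Without a projection, neither candidate is "better": neither dominates the other.
print(dominates("alice", "bob"), dominates("bob", "alice"))  # False False
```

Under these assumed numbers, "best" depends entirely on the weights, which is one way to read the advice to decide what you actually need before comparing people.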
02:59:35.980 | There is such a thing, generally I really value people
02:59:39.040 | who have the affect, that care about open source.
02:59:42.920 | Like so in some cases, their affinity to open source
02:59:45.720 | is simply kind of a filter of an affect.
02:59:48.120 | However, I have found this interesting dichotomy
02:59:52.560 | between open source contributors and product creation.
02:59:57.120 | I don't know if it's fully true,
03:00:00.580 | but there does seem to be the more experience,
03:00:04.960 | the more affect somebody has to an open source community,
03:00:08.160 | the less ability to actually produce product that they have.
03:00:11.640 | But the opposite's kind of true too.
03:00:13.520 | The more product focused are, I find a lot of people,
03:00:16.000 | I've talked to a lot of people
03:00:16.840 | who produce really great products,
03:00:18.000 | and they have a, they're looking over
03:00:20.320 | the open source communities,
03:00:21.160 | kind of wanting to participate and play,
03:00:23.300 | but they've played here, and they do a great job here,
03:00:25.960 | and then they don't necessarily have some of the same.
03:00:29.520 | Now I don't think that's entirely necessary.
03:00:32.060 | I think part of it is cultural, how they've emerged.
03:00:34.860 | 'Cause one of the things that open source communities
03:00:36.300 | often lack is great product management,
03:00:39.140 | like some product management energy.
03:00:40.780 | - That's brilliant, but you want both of those energies
03:00:43.620 | in the same place together.
03:00:44.900 | - Yes, you really do.
03:00:45.860 | And so it's a lot of it's creating these teams of people
03:00:48.100 | that have these needed skills and attributes that are hard.
03:00:51.880 | And so one of the big things I look for
03:00:54.660 | is somebody that fundamentally recognizes
03:00:56.340 | their need to learn.
03:00:57.820 | Like one of the values that we have
03:00:59.580 | in all of the things we do is learning.
03:01:01.540 | If somebody thinks they know it all,
03:01:04.580 | they're gonna struggle.
03:01:06.220 | - And some of that is just,
03:01:07.920 | there's more basic things like humility,
03:01:10.580 | just being humble in the face of
03:01:13.340 | all the things you don't know,
03:01:14.420 | and that's step one of learning.
03:01:15.820 | - That's step one of learning, right?
03:01:16.940 | And I've spent a lot of time learning, right?
03:01:20.840 | Other people have spent a lot more time,
03:01:21.860 | but I've spent a lot of time learning.
03:01:23.260 | My whole goal was to get a PhD because I love school,
03:01:26.320 | and I wanted to be a scientist.
03:01:28.260 | And then what I found is what's been written
03:01:30.940 | about elsewhere as well is the more I learned,
03:01:32.620 | the more I didn't know.
03:01:33.780 | The more I realized, man, I know about this,
03:01:37.700 | but this is such a tiny thing in the global scope
03:01:40.060 | of what I might wanna know about.
03:01:41.220 | So I need to be listening a whole lot better
03:01:43.820 | than I am just talking.
03:01:46.340 | That's changed a little bit, actually.
03:01:48.820 | My wife says that I used to be a better listener.
03:01:50.620 | Now that I'm so full of all these ideas I wanna do,
03:01:52.860 | she kind of says, "You gotta give people time to talk."
03:01:55.500 | - So you've succeeded on multiple dimensions.
03:01:58.400 | So one is the tenure track faculty.
03:02:01.680 | The other is just creating all these products,
03:02:03.060 | then building up the businesses,
03:02:04.300 | then working with businesses.
03:02:06.860 | Do you have advice for young people today
03:02:09.220 | in high school and college of how to live a life
03:02:13.900 | as nonlinear and as successful as yours?
03:02:17.700 | - Nonlinear.
03:02:18.540 | - A life that they could be proud of?
03:02:21.220 | - Well, Lex, that's a super compliment.
03:02:22.980 | I'm humbled by that, actually.
03:02:24.220 | I would say a life they can be proud of,
03:02:27.980 | honestly, one thing that I've said to people is,
03:02:30.420 | first, find people you love and care about them.
03:02:33.540 | Family matters to me a lot.
03:02:36.060 | And family means people you love and have committed to.
03:02:38.940 | So it can be whatever you mean by that,
03:02:42.180 | but you need to have a foundation.
03:02:45.160 | So find people you love and wanna commit to and do that,
03:02:47.960 | 'cause it anchors you in a way that nothing else can.
03:02:52.220 | And then you find other things.
03:02:55.220 | And then from out there, you find other kinds of things
03:02:57.860 | you can commit to, whether it's ideas or people
03:03:01.900 | or groups of people.
03:03:03.260 | So especially in high school, I would say,
03:03:06.800 | don't settle on what you think you know.
03:03:08.840 | Give yourself 10 years to think about the world.
03:03:13.780 | I see a lot of high school students
03:03:15.460 | who seem to know everything already.
03:03:17.620 | I think I did, too.
03:03:18.700 | I think it's maybe natural.
03:03:20.340 | But recognize that the things you care about,
03:03:23.180 | you might change your perspective over time.
03:03:26.540 | I certainly have over time.
03:03:28.580 | I was really passionate about one specific thing
03:03:30.660 | and that's kind of softened.
03:03:32.540 | I was a big, I didn't like the Federal Reserve.
03:03:35.400 | We can have a longer conversation
03:03:38.460 | about monetary policy and finances,
03:03:40.100 | but I'm a little more nuanced
03:03:44.300 | in my perspective at this point.
03:03:46.740 | But that's one area where you learn about something
03:03:50.140 | and go, oh, I wanna attack it.
03:03:52.420 | Build, don't destroy.
03:03:54.120 | Build, so often the tendency is to not like something,
03:03:58.380 | they wanna go attack it.
03:03:59.980 | Build something, build something to replace it.
03:04:02.580 | Build up, attract people to your new thing.
03:04:05.040 | It'd be far better.
03:04:08.820 | You don't need to destroy something
03:04:10.220 | to build something else.
03:04:11.480 | So that's, I guess, generally.
03:04:14.520 | And then definitely curiosity.
03:04:19.140 | Follow your curiosity.
03:04:20.840 | And let it, don't just follow the money.
03:04:24.620 | - And all of that, like you said,
03:04:25.800 | is grounded in family, friendship, and ultimately love.
03:04:30.140 | - Yes.
03:04:31.180 | - Which is a great way to end it.
03:04:34.660 | Travis, you're one of the most impactful people
03:04:37.080 | in the engineering and the computer science
03:04:38.740 | in the human world.
03:04:39.900 | So I truly appreciate everything you've done.
03:04:43.520 | And I really appreciate that you would spend
03:04:45.780 | your valuable time with me.
03:04:46.980 | It was an honor.
03:04:47.820 | - It was a real pleasure for me.
03:04:48.820 | I appreciate that.
03:04:50.500 | - Thanks for listening to this conversation
03:04:52.100 | with Travis Oliphant.
03:04:53.980 | To support this podcast,
03:04:55.340 | please check out our sponsors in the description.
03:04:57.900 | And now, let me leave you with something
03:05:00.200 | that in the programming world is called Hodgkin's Law.
03:05:04.060 | Every sufficiently advanced Lisp application
03:05:08.120 | will eventually be re-implemented in Python.
03:05:11.700 | Thank you for listening, and hope to see you next time.
03:05:15.460 | (upbeat music)