
The Array Cast: Jeremy Howard


Chapters

0:00
1:15 Dyalog Problem-Solving Contest
2:40 Jeremy Howard
4:30 APL Study Group
10:20 A.T. Kearney
12:33 MKL (Intel)
13:00 BLAS
13:11 Perl BQN
14:06 Raku
15:45 Kaggle
16:52 R
18:50 Neural Networks
19:50 Enlitic
20:01 Fast.ai
21:02 NumPy
21:26 Leading Axis Theory
21:31 Rank Conjunction
21:40 Einstein notation
22:55 CUDA
28:51 NumPy: Another Iverson Ghost
30:11 Pivot Tables
30:36 SQL
31:25 Larry Wall: "The three chief virtues of a programmer are: Laziness, Impatience and Hubris."
32:00 Python
36:25 Regular Expressions
36:50 PyTorch
37:39 Notation as a Tool of Thought
37:55 Aaron Hsu Co-dfns
38:40 J
39:06 Eric Iverson on Array Cast
40:18 Triangulation Jeremy Howard
41:48 Google Brain
42:30 RAPIDS
43:40 Julia
43:50 LLVM
44:07 JAX
44:21 XLA
44:32 MLIR
44:42 Chris Lattner
44:53 TensorFlow
49:33 TorchScript
50:09 Scheme
50:28 Swift
51:10 DragonBox Algebra
52:47 APL Glyphs
53:24 Dyalog APL
54:24 Jupyter
55:44 Jeremy's tweet of Meta Math
56:37 Power function
63:06 Reshape
63:40 Stallman 'Rho, rho, rho'
64:20 APLcart
66:12 J for C programmers
67:54 Transpose episode
70:00 APLcart video
72:28 Functional Programming
73:00 List Comprehensions
73:30 BQN to J
78:15 Einops
79:30 April Fools APL
80:35 Flask library
81:22 JuliaCon 2022
88:05 Myelination
89:15 Sanyam Bhutani interview
91:27 Jo Boaler Growth Mindset
93:45 Discovery Learning
97:05 Iverson Bracket
99:14 Radek Osmulski Meta Learning
100:12 Top Down Learning
101:20 Anki
103:50 Lex Fridman Interview

Whisper Transcript

00:00:00.880 | Welcome to another episode of ArrayCast. I'm your host, Connor. And today we have a very
00:00:05.440 | exciting guest, which we will introduce in a second. But before we do that, we'll do brief
00:00:09.440 | introductions and then one announcement. So first we'll go to Bob and then we'll go to Adam who has
00:00:13.120 | the one announcement. And then we will introduce our guest. I'm Bob Therriault. I'm a J enthusiast
00:00:18.400 | and I do some work with the J Wiki. We're underway and trying to get it all set up for the fall.
00:00:24.800 | I'm Adám Brudzewsky, full-time APL programmer at Dyalog Ltd. Besides actually programming
00:00:30.640 | APL, I also take care of all kinds of social things, including the APL Wiki. And then for my
00:00:37.680 | announcements, part of what we do with Dyalog is arrange a yearly user meeting or a type of
00:00:42.880 | conference. And at that user meeting, there is also a presentation by the winner of the APL
00:00:51.840 | problem solving competition. That competition closes at the end of the month. So hurry up if
00:00:59.360 | you want to participate. It's not too late even to get started at this point. And also at the end of
00:01:03.520 | the month is the end of the early bird discount for the user meeting itself. Awesome. And just
00:01:10.560 | a note about that contest. I think, and Adam can correct me if I'm wrong, there's two phases in
00:01:15.360 | the first phase. It's just 10 short problems. A lot of them are just one-liners. And even if
00:01:20.560 | you only solve one of the 10, I think you can win a small cash prize just from answering one.
00:01:26.960 | Is that correct? I'm not even sure. You might need to solve them all. They're really easy.
00:01:36.160 | So the point being though is that you don't need to complete the whole contest in order to be
00:01:39.680 | eligible to win prizes. No, for sure. There's a certain amount that if you get to that point,
00:01:44.160 | you hit a certain threshold and you can be eligible to win some free money, which is always
00:01:48.240 | awesome. And yeah, just briefly, as I introduce myself in every other episode, I'm your host,
00:01:54.640 | Connor, C++ professional developer, not an array language developer in my day-to-day,
00:01:59.680 | but a huge array language and combinator enthusiast at large, which brings us to introducing our
00:02:06.240 | guest who is Jeremy Howard, who has a very, very, very long career. And you probably have heard him
00:02:13.920 | on other podcasts or have been giving other talks. I'll read the first paragraph of his
00:02:19.200 | three-paragraph bio because I don't want to embarrass him too much, but he has
00:02:22.880 | a very accomplished career. So Jeremy Howard is a data scientist, researcher, developer,
00:02:28.320 | educator, and entrepreneur. He is the founding researcher at FastAI, a research institute
00:02:34.320 | dedicated to making deep learning more accessible and is an honorary professor at the University of
00:02:38.800 | Queensland. That's in Australia, I believe. Previously, Jeremy was a distinguished research
00:02:43.520 | scientist at the University of San Francisco, where he was the founding chair of the Wicklow
00:02:47.600 | artificial intelligence and medical research initiative. He's also been the CEO of
00:02:53.120 | Enlitic and was the president and chief scientist of Kaggle, which is basically the data science
00:02:59.760 | version of LeetCode, which many software developers are familiar with. He was the CEO of two
00:03:04.400 | successful Australian startups, Fastmail and Optimal Decisions Group. And before that,
00:03:08.400 | in between doing a bunch of other things, he worked in management consulting at McKinsey,
00:03:12.960 | which is an incredibly interesting start to the career that he has had now, because for those of
00:03:18.240 | you that don't know, McKinsey is one of the three biggest management consulting firms alongside,
00:03:22.720 | I think, Bain & Co. and BCG. So I'm super interested to hear how he started in management
00:03:27.280 | consulting and ended up being the author of one of the most popular AI libraries in Python and also
00:03:33.520 | the course that's attached to it, which I think is, if not, you know, the most popular, a very,
00:03:38.800 | very popular course that students all around the world are taking. So I will stop there,
00:03:42.800 | throw it over to Jeremy, and he can fill in all the gaps that he wants, jump back to however far
00:03:47.440 | you want to, to tell us, you know, how you got to where you are now. And I think the one thing I
00:03:53.120 | forgot to mention, too, is that he recently tweeted on July 1st, and we're recording this on July 4th,
00:03:58.720 | that, quote, the tweet reads, "Next week, I'm starting a daily study group on my most loved
00:04:03.440 | programming language, APL." And so obviously interested to hear more about that tweet and
00:04:08.560 | what's going to be happening with that study group. So over to you, Jeremy.
00:04:11.040 | Well, the study group is starting today as we record this. So depending on how long it takes to
00:04:19.280 | get this out, it'll have just started. And so definitely time for people to join in. So we'll,
00:04:26.640 | I'm sure we'll include a link to that in the show notes. Yeah, I definitely feel kind of like I'm
00:04:32.480 | your least qualified array programming person ever interviewed on this show. I love APL and J,
00:04:43.520 | but I've done very, very little with them, particularly APL. I've done a little,
00:04:48.960 | little bit with J mucking around, but like, I find a couple of weeks here and there every
00:04:54.480 | few years, and I have for a couple of decades. Having said that, I am a huge enthusiast of
00:05:04.320 | array programming, as it is used, you know, in a loopless style in other languages, initially in
00:05:12.480 | Perl, and nowadays in Python. Yeah, maybe I'll come back to that, because I guess you wanted to get a
00:05:18.400 | sense of my background. Yeah, so I actually started at McKinsey. I grew up in Melbourne, Australia. And
00:05:28.640 | I didn't know what I wanted to do when I grew up at the point that you're meant to know when you
00:05:34.240 | choose a university, you know, major. So I picked philosophy on the basis that it was like,
00:05:39.920 | you know, the best way of punting down the road what you might do, because with philosophy,
00:05:45.360 | you can't do anything. And honestly, that kind of worked out in that I needed money,
00:05:54.480 | and I needed money to get through university. So I got, like, a one-day-a-week kind of IT
00:05:59.680 | support job at McKinsey, the McKinsey Melbourne office during university from first year,
00:06:07.600 | I think that's from first year. But it turned out that like, yeah, I was very curious, and so I'm
00:06:15.280 | so curious about management consulting. So every time consultants would come down and ask me to
00:06:18.720 | like, you know, clean out the sticky coke they built in their keyboard or whatever, I would
00:06:24.800 | always ask them what they were working on and ask them to show me and I've been really interested in
00:06:31.760 | like doing analytics see kind of things for a few years at that point. So during high school,
00:06:38.080 | basically every holidays, I kind of worked on stuff with spreadsheets or Microsoft access or
00:06:44.000 | whatever. So it turned out I knew more about like, stuff like Microsoft Excel than they did. So
00:06:50.320 | within about two months of me starting this one day a week job, I was working 90 hour weeks,
00:06:57.120 | basically doing analytical work for the consultants. And so that, you know, that actually worked out
00:07:05.920 | really well, because I kind of did a deal with them where they would, they gave me a full time
00:07:11.920 | office, and they would pay me $50 an hour for whatever time I needed. And so suddenly, I was
00:07:17.760 | actually making a lot of money, you know, working, working 90 hours a week. And yeah, it was great
00:07:28.560 | because then I would come up with these solutions to things they were doing in the projects,
00:07:33.120 | and I'd have to present it to the client. So next thing I knew I was basically on the client side
00:07:37.040 | all the time. So I ended up actually not going to any lectures at university. And I somehow kind
00:07:45.280 | of managed this thing where I would take two weeks off before each exam, go and talk to all my
00:07:50.720 | lecturers and say, Hey, I was meant to be in your university course. I know you didn't see me, but I
00:07:55.040 | was kind of busy. Can you tell me what I was meant to have done? And I would do it. And so I kind of
00:08:01.440 | scraped by with a BA in philosophy, but, yeah, you know, I don't really have much of an academic
00:08:08.080 | background. But that did give me a great background in like applying stuff like, you know,
00:08:15.280 | linear regression and logistic regression and linear programming and, you know,
00:08:19.760 | the basic analytical tools of the day, generally through VBA scripts in Excel, or, you know,
00:08:27.280 | access, you know, the kind of stuff that a consultant could chuck out, you know, on to their
00:08:32.560 | laptop at a client site. Anyway, I always felt guilty about doing that, because it just seemed
00:08:40.800 | like this ridiculously nerdy thing to be doing when I was surrounded by all these very important,
00:08:46.000 | you know, consultant types who seemed to be doing much more impressive strategy work. So I tried to
00:08:53.920 | get away from that as quickly as I could, because I didn't want to be the nerd in the company. And
00:09:00.480 | yeah, so I ended up spending the next 10 years basically doing strategy consulting. But throughout
00:09:06.080 | that time, I did, you know... because I didn't have the same background that they did, that expertise,
00:09:12.320 | the MBA that they did, I had to solve things using data and analytically intensive approaches.
00:09:18.320 | So although in theory, I was a strategy management consultant, and I was working on problems like,
00:09:23.680 | you know, how do we fix the rice industry in Australia? Or, you know, how do we, you know,
00:09:29.360 | like, you know, how do we deal with this new competitor coming into this industry or whatever
00:09:33.680 | it was, I always did it by analyzing data, which actually turned out to be a good niche, you know,
00:09:40.000 | because I was the one McKinsey consultant in Australia who did things that way. And so I
00:09:44.640 | was successful, and, I think... I ended up moving to A.T. Kearney, which is the other of the two
00:09:50.800 | original management consulting firms. I think I became like the youngest manager in the world.
00:09:56.800 | And, you know, through this, there was a parallel path I was doing. And then through that, I learned about
00:10:03.840 | the insurance industry and discovered like the whole insurance industry is basically pricing
00:10:09.120 | things in a really dumb way. I developed this approach based on optimized
00:10:17.600 | pricing, launched a company with my university friend who had a PhD in operations research.
00:10:25.360 | And, yeah, so we built this new approach to pricing insurance, which is, it was kind of fun.
00:10:34.320 | I mean, it's, you know, it went well commercially. It took about 10
00:10:41.600 | years of doing that. And at the same time, I was running an email company called Fastmail,
00:10:46.960 | which also went well. Yeah, we started out basically using C++. And I would say that was
00:10:55.920 | kind of the start of my array programming journey in that in those days, this is like 1999,
00:11:00.480 | the very first expression templates based approaches to C++ numeric programming were appearing.
00:11:07.840 | And so I, you know, was talking to the people working on those libraries doing stuff like
00:11:14.960 | particularly stuff doing the big kind of high energy physics experiments that were going on in Europe.
00:11:21.040 | It was ultimately pretty annoying to work with, though, like the amount of time it took to
00:11:28.960 | compile those things, it would take hours. And it was quirky as all hell, you know, it's still
00:11:35.600 | pretty quirky doing metaprogramming in C++. But in those days, it was just a nightmare. Every
00:11:40.800 | compiler was different. So I ended up switching to C sharp shortly after that came out. And, you know,
00:11:49.280 | in a way it was disappointing because that was much less expressive as a kind of array
00:11:55.760 | programming paradigm. And so instead, I ended up basically grabbing Intel's MKL library, which is
00:12:04.960 | basically BLAS on steroids, if you like, and writing my own C sharp wrapper to give me,
00:12:12.640 | you know, kind of array programming ish capabilities, but not with any of the features one
00:12:17.840 | would come to expect from a real array programming language around kind of
00:12:21.040 | dealing with rank sensibly, and, you know, not much in the way of broadcasting,
00:12:26.320 | which reminds me, we should come back to talking about BLAS at some stage, because a lot of the
00:12:32.480 | reasons that most languages are so disappointing at array programming is because of our reliance on
00:12:37.360 | BLAS, you know, as an industry. Fastmail, on the other hand, was being written in Perl,
00:12:45.360 | which I really enjoyed as a programming language and still do, I still love Perl a lot.
00:12:50.240 | But the scientific programming in Perl I didn't love at all. And so at the time, Perl 6,
00:13:01.840 | you know, the idea of it was just starting to be developed. So I ended up
00:13:06.560 | running the Perl 6 working group to add scientific programming capabilities, or, kind of, you know,
00:13:14.400 | as I described them at the time, APL-inspired programming capabilities, to Perl. And so I
00:13:20.560 | did an RFC around what we ended up calling hyper operators, which is basically the idea that any
00:13:27.200 | operator can operate on arrays and can broadcast over any axes that are mismatched or whatever.
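For readers unfamiliar with the idea: the hyper-operator behaviour described here, an ordinary operator applied to whole arrays with mismatched shapes stretched to fit, is essentially what NumPy later popularised as broadcasting. A minimal NumPy sketch of the concept (my illustration, not the Perl 6/Raku feature itself):

```python
import numpy as np

prices = np.array([10.0, 20.0, 30.0])   # shape (3,)
rates = np.array([[0.05], [0.10]])      # shape (2, 1)

# One operator, no explicit loops: the (2, 1) and (3,) shapes are
# stretched ("broadcast") against each other to give a (2, 3) result.
adjusted = prices * (1 + rates)
print(adjusted.shape)   # (2, 3)
```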
00:13:35.600 | And those RFCs all ended up getting accepted. And Damian Conway and Larry Wall kind of expanded
00:13:42.640 | them a little bit. Perl 6 never exactly happened. It ended up becoming a language called Raku.
00:13:51.680 | With the butterfly logo. Yeah. And that, you know, and the kind of the performance ideas,
00:13:58.400 | I really worked hard on, never really happened either. So that was a bit of a,
00:14:01.760 | yeah, that was all a bit of a failure. But it was fun, and it was interesting.
00:14:05.920 | I, you know, so after running these companies for 10 years, one of the big problems with running a
00:14:16.000 | company is that you're surrounded by people who you hired, and they, you know, have to make
00:14:21.600 | you like them if they want to get promoted and not get fired. And so you could never trust
00:14:25.120 | anything anybody says. So I had, you know, very low expectations about my capabilities,
00:14:32.960 | analytics-wise. I hadn't, like, you know... I'd basically been running companies for 10 years.
00:14:37.920 | I did a lot of coding and stuff, but it was in our own little world. And so after I sold those
00:14:47.920 | companies, yeah, I, one of the things I decided to do was to try actually to become more competent,
00:14:56.640 | you know, I had lost my, to some extent, I had lost my feeling that I should hide my nerdiness,
00:15:06.240 | you know, and try to act like a real business person. And I thought, no, I should actually
00:15:11.840 | see if I'm actually any good at this stuff. So I tried entering a machine learning competition
00:15:18.720 | at a new company that had just been launched called Kaggle with this goal of like, not coming last.
00:15:26.880 | So basically, the, you know, the way these things work is you have to make predictions on a data
00:15:37.760 | set. And at the end of the competition, whoever's predictions are the most accurate wins the prize.
00:15:46.080 | And so my goal was, yeah, try not to come last, which I wasn't convinced I'd be able to achieve.
00:15:52.800 | Because as I say, I didn't feel like this is, I'd never had any technical training,
00:15:59.600 | you know, and everybody else in these competitions were PhDs and professors or whatever else. So it
00:16:03.840 | felt like a high bar. Anyway, I ended up winning it. And that, that changed my life, right? Because,
00:16:12.000 | yeah, it was like, oh, okay, I am, you know, empirically good at this thing. And people
00:16:23.520 | at my local R user group, we used R quite a bit as well. You know, I told them, I'm going to try
00:16:32.560 | entering this competition. Anyone want to create a team with me? I want to learn to use R properly.
00:16:37.360 | And I kind of went back to the next user group meeting and people were like, I thought you were
00:16:41.040 | just learning this thing. How did you win? I was like, I don't know. I just used common sense.
00:16:47.840 | Yeah, so I ended up becoming the chief scientist and president of Kaggle. And Kaggle, as you know,
00:16:54.320 | anybody in the data science world knows, has kind of grown into this huge, huge thing, ended up
00:16:59.760 | selling it to Google. So I ended up being an equal partner in the company. I was the first
00:17:04.080 | investor in it. And that was great. That was like, I just dove in, we moved to San Francisco for 10
00:17:11.760 | years. You know, surrounded, surrounded by all these people who are just sort of role models
00:17:18.400 | and idols, and partly getting to meet all these people in San Francisco was this experience of
00:17:24.880 | realizing all these people were actually totally normal, you know, and they weren't like some
00:17:30.160 | super genius level, like they're just normal people who, yeah, as I got to know them,
00:17:38.720 | it gave me, I guess, a lot more confidence in myself as well. So maybe they were just normal
00:17:44.720 | relative to you. I think in Australia, we all feel a bit, you know, intimidated by the rest of the
00:17:53.840 | world in some ways, or a long way away, you know; our only neighbor really is New Zealand.
00:17:59.680 | It's very easy to feel, I don't know, like, yeah, we were not very
00:18:07.280 | confident about capabilities over here, other than in sport, perhaps.
00:18:13.040 | Yeah, so one of the things that happened while I was at Kaggle was, I had played around with neural
00:18:20.480 | networks a bit, a good bit, you know, like 20 years earlier. And I always felt like neural networks
00:18:26.720 | were one day going to be the thing. It's like, you know, they are at a theoretical level,
00:18:34.080 | infinitely capable. But, you know, they never quite did it for me. And
00:18:41.760 | but then in 2012, suddenly, neural networks started achieving superhuman performance for
00:18:49.120 | the first time on really challenging problems, like recognizing traffic signs, you know,
00:18:54.080 | like recognizing pictures. And I'd always said to myself, I was going to watch for this moment,
00:19:00.160 | and when it happened, I wanted to like, jump on it. So as soon as I saw that, I tried to jump on
00:19:04.800 | it. So I started a new company. After a year of research into, like, you know, what a
00:19:12.320 | neural network's going to do, I decided medicine was going to be huge. I knew nothing about medicine.
00:19:18.160 | And I, yeah, I started a medicine company to see what we could do with deep learning in medicine.
00:19:23.200 | So that was Enlitic. Yeah, that ended up going pretty well. And yeah, eventually, I kind of got
00:19:33.200 | like a bit frustrated with that, though, because it felt like deep learning can do so many things,
00:19:39.120 | and I'm only doing such a small part of those things. So deep learning is like neural networks
00:19:44.000 | with multiple layers. I thought the only way to actually help people really, you know, make the
00:19:51.520 | most of this incredibly valuable technology is to teach other people how to do it, and to help
00:19:56.800 | other people to do it. So my wife and I ended up starting a new, I'd call it kind of a research
00:20:02.560 | lab, fast AI, to, to help, to help do that, basically, initially focus on education,
00:20:09.760 | and then increasingly focus on research and software development to basically make it
00:20:15.520 | easier for folks to use deep learning. And that's, yeah, that's where I am now. And now
00:20:23.280 | everything in deep learning is all Python. And in Python, we're very lucky to have,
00:20:30.080 | you know, excellent libraries that behave pretty consistently with each other,
00:20:36.160 | basically based around this NumPy library, which treats arrays very, very similarly to how
00:20:45.440 | J does, except rather than leading axis, it's trailing axis. But basically, you get,
00:20:51.920 | you know, you get loop free, you get broadcasting, you know, you don't get things like a rank
00:20:57.760 | conjunction, but there's very easy ways to permute axes. So you can do basically the same thing.
00:21:05.200 | Things like Einstein notation, you know, are built into the libraries, and then, you know, it's,
00:21:11.040 | it's trivially easy to have them run on GPUs or TPUs or whatever, you know. So for the last
00:21:20.400 | years of my life, nearly all the code I write is array programming code, even though I'm not
00:21:28.400 | using a purely array language. All right, so where do we start now with the questions?
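A small illustration of the NumPy features he's describing: operations are loop-free, broadcasting lines shapes up from the trailing axis (where J works from the leading axis), there is no rank conjunction but permuting axes gets a similar effect, and Einstein notation is built in as einsum. This is just a sketch of those features, not code from fast.ai:

```python
import numpy as np

a = np.arange(24).reshape(2, 3, 4)   # a rank-3 array
v = np.arange(4)                     # a rank-1 array

# Loop-free, with broadcasting matched up from the trailing axis:
# (2, 3, 4) combined with (4,) gives (2, 3, 4).
print((a + v).shape)

# No rank conjunction, but permuting axes gives similar control:
# add a length-3 vector along what was axis 1 by moving that axis last.
print((np.moveaxis(a, 1, -1) + np.arange(3)).shape)   # (2, 4, 3)

# Einstein notation is built in: sum over the shared index k,
# i.e. an ordinary matrix product.
x = np.ones((3, 4))
y = np.ones((4, 5))
print(np.einsum('ik,kj->ij', x, y).shape)   # (3, 5)
```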
00:21:35.760 | I'll let Bob and Adam go first if they want. And if they don't have a... Okay, Bob, you go ahead.
00:21:44.080 | I've got a quick question about neural networks and stuff. Because when I was going to
00:21:49.360 | university all those years ago, people were talking about neural networks, and then they just sort of
00:21:54.240 | dropped off the face. And as you said, around 2010, suddenly they resurfaced again. What do you think
00:21:59.520 | was the cause of that resurfacing? Was it hardware? Was it somebody discovered a new method or what?
00:22:04.480 | Yeah, mainly hardware. So what happened was people figured out how to do GP GPU, so general purpose
00:22:12.480 | GPU computing. So before that, I tried a few times to use GPUs with neural nets, I felt like that would
00:22:18.560 | be the thing. But GPUs were all about like creating shaders and whatever. And it was a whole jargon
00:22:25.840 | thing. I didn't even understand what was going on. So the key thing was Nvidia coming up with this
00:22:31.680 | CUDA approach, which, it's all loops, right? But it's much easier than the old way. Like, the
00:22:42.080 | loops... well, it's kind of loops, at least. You basically say to CUDA, this is my kernel,
00:22:48.640 | which is the piece of code I want to basically run on each symmetric multiprocessing unit.
00:22:52.960 | And then you basically say launch a bunch of threads. And it's going to call your kernel,
00:23:00.080 | you know, basically incrementing the x and y coordinates and passing it to your kernel,
00:23:06.000 | making them available to your kernel. So it's kind of, it's not exactly a loop,
00:23:09.440 | it gets more like a map, I guess. And so when CUDA appeared, yeah, very quickly,
00:23:16.320 | neural network libraries appeared that would take advantage of it.
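The kernel-and-launch model he's describing, "here is the code to run per element, now launch a grid of threads over it", looks roughly like this. A hedged sketch using Numba's CUDA bindings (my choice of Python library, not something from the episode; it assumes a CUDA-capable GPU and the numba package):

```python
import numpy as np
from numba import cuda

@cuda.jit
def add_one(out, x):
    # The "kernel": this body runs once per thread, and the thread's
    # global index plays the role of the loop variable.
    i = cuda.grid(1)
    if i < x.size:
        out[i] = x[i] + 1.0

x = np.arange(1_000_000, dtype=np.float32)
out = np.zeros_like(x)

# The "launch": pick a thread/block layout and let CUDA map the kernel
# over the whole index space, closer to a map than to a written-out loop.
threads = 256
blocks = (x.size + threads - 1) // threads
add_one[blocks, threads](out, x)
print(out[:5])   # [1. 2. 3. 4. 5.]
```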
00:23:21.680 | And then suddenly, you know, you get orders of magnitude more performance. And it's cheaper.
00:23:28.240 | And you get to buy an Nvidia graphics card with a free copy of Batman, you know, on the excuse that
00:23:34.880 | actually this is all for work. So it was mainly that. There was also, just at the
00:23:41.920 | same time, the thing I'd been doing for 25 years suddenly got a name, data science. You know, we were
00:23:49.440 | like this very small industry of people applying data driven approaches to solving
00:23:54.960 | business problems. And we were always looking for a name. Not many people know this, but back in the
00:24:00.800 | very early days, there was an attempt to call it industrial mathematics. Sometimes people would
00:24:06.480 | like shoehorn it into operations research or management science, but that was almost exclusively
00:24:11.680 | optimization people and specifically people focused more on linear programming approaches.
00:24:17.440 | So yeah, once data science appeared, and also like, you know, basically every company had
00:24:23.360 | finally built their data warehouse and the data was there. So yeah, it's like more awareness
00:24:32.560 | of using data to solve business problems and for the first time availability of the hardware that
00:24:37.520 | we actually needed. And as I say, in 2012, it just reached the point... like, it had been growing
00:24:44.400 | since the first neural network was built in 1957, I guess, at this kind of gradual
00:24:53.040 | rate, but once it passed human performance on some tasks, it just kept going. And so now,
00:25:00.400 | in the last couple of months, you know, it's now like getting decent marks on MIT math tests and
00:25:08.800 | stuff. It's on an amazing trajectory. Yeah, it's kind of a critical mass kind of thing,
00:25:16.080 | you get a certain amount of information and the ability to process that information, I guess, and
00:25:22.800 | as you say, it's an exponential curve. And humans and exponential curves,
00:25:28.720 | I think we're finding over and over again, we're not really great at understanding an exponential.
00:25:34.080 | No, no, we're not. And that's like why I promised myself that as soon as I saw neural net starting
00:25:41.440 | to look like they're doing interesting things, I would drop everything and jump on it, because I
00:25:45.360 | wanted to jump on that curve as early as possible. And we're now in this situation where people are
00:25:50.960 | just making huge amounts of money with neural nets, which they then reinvest back into making the
00:25:57.360 | neural nets better. And so we are also seeing this kind of bifurcation of capabilities where there's
00:26:03.920 | a small number of organizations who are extremely good at this stuff and invested in it and a lot
00:26:09.680 | of organizations that are, you know, really struggling to figure it out. And because of the
00:26:17.680 | exponential nature, when it happens, it happens very quickly, it feels like you didn't see it
00:26:22.240 | coming. And suddenly, it's there. And then it was past you. And I think you're all experiencing that
00:26:26.720 | now. Yeah, and it's happened in so many industries, you know, back in my medical startup, you know,
00:26:34.800 | we were interviewing folks around medicines, we interviewed a guy finishing his PhD in
00:26:42.160 | histopathology. And I remember, you know, he came in to do an interview with us. And he basically
00:26:49.440 | gave us a presentation about his thesis on kind of graph cut segmentation approaches for pathology
00:26:54.960 | slides. And at the end, he was like, anyway, that was my PhD. And then yesterday, because I knew I
00:26:59.920 | was coming to see you guys, and I heard you like neural nets, I just thought I'd check out neural nets.
00:27:04.000 | And about four hours later, I trained a neural net to do the same thing I did for my PhD. And
00:27:11.360 | way outperformed my PhD thesis, I'd spent the last five years on and so that's where I'm at, you know,
00:27:17.360 | and we hear this a lot. Existential crisis in the middle of an interview. Yes.
00:27:24.960 | So I kind of have, I don't know, this is like a 1A, B and C. And I'm not sure if I should ask them
00:27:34.000 | all at once. But so you said sort of at the tail end of the 90s is when your array language journey
00:27:40.880 | started. But it seems from the way you explained it that you had already at some point along the
00:27:45.280 | way heard about the array languages, APL and J, and have sort of alluded to, you know, picking up
00:27:52.640 | some knowledge about the paradigm and the languages. So my first part of the question is sort of,
00:27:58.240 | you know, at what point were you exposed to the paradigm in these languages? The second part is
00:28:04.000 | what's causing you in 2022 to really dive into it? Because you said you feel like maybe a bit of an
00:28:11.600 | imposter or the least qualified guest, which probably is you just being very modest. I'm sure
00:28:16.160 | you know still quite a bit. And then the third part is, do you have thoughts about, and I've
00:28:21.680 | always sort of wondered, how the array language paradigm sort of missed out on like, and Python
00:28:28.160 | ended up being the main data science language, while like there's like an article that's floating
00:28:34.480 | around online called "NumPy: Another Iverson Ghost", where, sort of, you can see in the
00:28:40.640 | names and the design of the library that there is a core of APL, and even the documentation
00:28:45.760 | acknowledges that it took inspiration greatly from J and APL. But that like the array languages clearly
00:28:53.040 | missed what was a golden opportunity for their paradigm. And we ended up with libraries and
00:29:00.080 | other languages. So I just asked three questions at once. Feel free to tackle them in any order.
00:29:04.800 | I have a pretty bad memory. So I think I've forgotten the second one already. So you can
00:29:09.680 | feel free to come back to any or all of them. So my journey, which is what you started with,
00:29:18.560 | was I always felt like we should do more stuff without using code. Because I, or at least like
00:29:31.440 | kind of traditional, what I guess we'd call nowadays, imperative code. There was a couple
00:29:38.800 | of tools in my early days, which I've got huge amounts of leverage from because nobody else
00:29:45.760 | in at least the consulting firms or generally in our clients knew about them. So that was SQL and
00:29:52.240 | pivot tables. And so pivot tables, if you haven't come across it, was basically one of the earliest
00:29:58.240 | approaches to OLAP, you know, slicing and dicing. There was actually something slightly earlier
00:30:02.480 | called Lotus Improv, but that was actually a separate product. Excel was basically the first
00:30:07.200 | one to put OLAP in the spreadsheet. So no loops. You just drag and drop the things you want to group
00:30:12.560 | by and you right click to choose how to summarize. And same with SQL, you know, you declaratively
00:30:19.920 | say what you want to do. You don't have to loop through things. SAS actually had something similar.
00:30:25.600 | You know, with SAS, you could basically declare a prop that would run on your data. So yeah, I
00:30:32.080 | kind of felt like this was the way I would rather do stuff if I could. And I think that's what led
00:30:39.840 | me when we started doing the C++ implementation of the insurance pricing stuff of being much more
00:30:46.320 | drawn to these metaprogramming approaches. I just didn't want to be writing loops in loops and
00:30:55.200 | dealing with all that stuff. I'm too lazy, you know, to do that. I think I'm very driven by laziness,
00:31:04.400 | which as Larry Wall said is one of the three virtues of a great programmer. Then yeah, so I think
00:31:14.080 | as soon as I saw NumPy had reached a level of some reasonable
00:31:22.400 | confidence in Python, I was very drawn to that because it was what I've been looking for.
00:31:28.400 | And I think maybe that actually is going to bring us to answering the question of like what happened
00:31:32.480 | for array languages. Python has a lot of problems, but at its heart, it's a very well-designed
00:31:41.680 | language. It has a very small, flexible core. Personally, I don't like the way most people
00:31:48.880 | write it, but it's so flexible I've been able to create almost my own version of Python,
00:31:54.640 | which is very functionally oriented. I basically have stolen the type dispatch ideas from Julia,
00:32:01.600 | created an implementation of that in Python. My Python code doesn't look like
00:32:08.080 | most Python code, but I can use all the stuff that's in Python. So there's this very nicely
00:32:15.200 | designed core of a language, which I then have this almost this DSL on top of, you know, and
00:32:21.440 | NumPy is able to create this kind of DSL again because it's working on such a flexible
00:32:28.960 | foundation. Ideally, you know, I mean, well, okay, so Python also has another DSL built into it,
00:32:36.320 | which is math. You know, I can use the operators plus times minus. That's convenient. And
00:32:41.280 | in every array library, NumPy, PyTorch, TensorFlow, and Python, those operators work
00:32:47.680 | over arrays and do broadcasting over axes and so forth and, you know, accelerate on an accelerator
00:32:54.960 | like a GPU or a TPU. That's all great. My ideal world would be that I wouldn't just get to use
00:33:03.280 | plus times minus, but I get to use all the APL symbols. You know, that would be amazing.
00:33:10.080 | But given a choice between a really beautiful language, you know, at its core like Python,
00:33:18.480 | in which I can then add a slightly cobbled together DSL like NumPy, I would much prefer
00:33:24.720 | that over a really beautiful notation like APL, but without the fantastic language underneath,
00:33:32.480 | you know, like, I don't feel like... there's nothing about APL or J or K as, like, a
00:33:40.960 | programming language that attracts me. Do you know what I mean? I feel like in terms of like
00:33:47.840 | what I could do around whether it be type dispatch or how OO is designed or, you know, how I package
00:33:57.280 | modules or almost anything else, I would prefer the Python way. So I feel like that's basically
00:34:06.160 | what we've ended up with. You kind of either compromise between, you know, a good language
00:34:10.720 | with, you know, slightly substandard notation or amazingly great notation with the substandard
00:34:17.840 | language or not just language, but ecosystem. Python has an amazing ecosystem.
00:34:25.600 | I think I hope one day we'll get the best of both, right? Like here's my, okay, here's my
00:34:35.200 | controversial take and it may just represent my lack of knowledge. What I like about APL is its
00:34:42.960 | notation. I think it's a beautiful notation. I don't think it's a beautiful programming language.
00:34:50.480 | I think some things, possibly everything, you know, some things work very well as a notation,
00:35:00.160 | but to get to raise something to the point that it is a notation requires some years of study
00:35:07.680 | and development and often some genius, you know, like the genius of Feynman diagrams or the genius
00:35:15.040 | of juggling notation, you know, like there are people who find a way to turn a field into a
00:35:23.040 | notation and suddenly they blow that field apart and make it better for everybody.
00:35:29.360 | For me, like, I don't want to think too hard all the time. Every time I come across something that
00:35:36.320 | really hasn't been turned into a notation yet, you know, sometimes I just like, I just want to
00:35:43.040 | get it done, you know, and so I would rather only use notation when I'm in these fields
00:35:50.480 | that either somebody else had figured out how to make that a notation or I feel like it's really
00:35:55.520 | worth me investing to figure that out. Otherwise, you know, there are, and the other thing I'd say
00:36:02.080 | is we already have notations for things that aren't APL that actually work really well,
00:36:06.000 | like regular expressions, for example. That's a fantastic notation and I don't want to
00:36:12.320 | replace that with APL glyphs. I just want to use regular expressions.
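As a concrete instance of the "regular expressions are already a good notation" point, one dense line of regex replaces what would otherwise be hand-written scanning code (purely illustrative):

```python
import re

log = "2022-07-04 12:31:05 ok  2022-07-05 09:02:44 fail"

# One line of notation instead of a loop and a pile of string tests.
stamps = re.findall(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}", log)
print(stamps)   # ['2022-07-04 12:31:05', '2022-07-05 09:02:44']
```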
00:36:20.720 | So, yeah, my ideal world would be one where we, where I can write PyTorch code, but maybe instead
00:36:28.320 | of like Einstein operations, Einstein notation, I could use APL notation. I think that's where
00:36:39.600 | I would love to get to one day and I would love that to totally transparently run on a GPU or TPU
00:36:47.920 | as well. That would be my happy place. It has nothing to do with the fact that
00:36:54.000 | I work at NVIDIA that I would love that. Interesting. I've never heard that before,
00:37:00.240 | the difference between basically appreciating or being in love with the notation, but not the
00:37:08.000 | language itself and that. And, you know, it started out as a notation, right? Like, Iverson,
00:37:14.640 | you know, it was a notation he used for representing state machines or whatever on
00:37:20.080 | early IBM hardware, you know, when he did his Turing Award essay, he chose to talk about his
00:37:27.040 | notation. And, you know, you see with people like Aaron with his Co-dfns stuff that
00:37:37.680 | if you take a very smart person and give them a few years, they can use that notation to solve
00:37:43.840 | incredibly challenging problems like build a compiler and do it better than you can
00:37:50.320 | without that notation. So I'm not saying like, yeah, APL can't be used to almost anything you
00:37:58.000 | want to use it for, but a lot of the time we don't have five years to study something very closely.
00:38:04.400 | We just want to, you know, we've got to get something done by tomorrow.
00:38:11.360 | Interesting. There's still a question, again, that you didn't get an answer to.
00:38:15.680 | Oh, yeah. When did you first, well, when did you first meet APL or how did you even find APL?
00:38:20.480 | I first found J, I think, which obviously led me to APL. And I don't quite remember where I saw it.
00:38:34.880 | Yeah. And actually, when I got to San Francisco, so that would be I'm trying to remember
00:38:45.760 | 2010 or something, I'm not sure. I actually reached out to Eric Iverson and I said, like,
00:38:54.640 | oh, you know, we're starting this machine learning company called Kaggle. And I kind of feel like,
00:39:02.240 | you know, everybody does stuff in Python, and it's kind of in a lot of ways really disappointing.
00:39:06.000 | I wish we're doing stuff in J, you know, but we really need everything to be running on the GPU,
00:39:12.240 | or at least everything to be automatically using SIMD and multiprocessor everywhere.
00:39:18.000 | Here's kind of enough to actually jump on a Skype call with me, not just jump on a Skype call,
00:39:23.440 | it's like, how do you want to chat? It's like, how about Skype? And he created a Skype account.
00:39:27.760 | Like, oh, yeah, we chatted for quite a while. We talked about, you know, these kinds of hopes and
00:39:35.600 | yeah, but I just, you know, never really because neither J or APO is in that space yet.
00:39:46.880 | There was just never a reason for me to do anything other than like,
00:39:51.200 | it kind of felt like each time I'd have a bit of a break for a couple of months,
00:39:54.800 | I'd always been a couple of weeks fiddling around with J just for fun. But that's as far as I got,
00:40:02.000 | really. Yeah, I think the first time I'd heard of you was in an interview that Leo Laporte did with
00:40:08.240 | you on triangulation, and you were talking about Kaggle. That was a specific thing. But I think
00:40:13.280 | I was riding my bike along some logging or something and suddenly he said, oh, yeah, but
00:40:17.120 | a lot of people use J. I like J. It's the first time I'd ever heard anybody on a podcast say
00:40:22.960 | anything about J. It was just like, wow, that's amazing. And the whole interview about Kaggle,
00:40:31.120 | there was so much of it about the importance of data processing, not just having a lot of
00:40:36.640 | data, but knowing how to filter it down, not over filtering all those tricks. I'm thinking,
00:40:41.600 | wow, these guys are really doing some deep stuff with this stuff and this guy is using J.
00:40:47.280 | I was actually very surprised at that point that somebody, I guess not somebody who was
00:40:54.080 | working so much with data would know about J, but just that it would be,
00:40:58.080 | I guess just suddenly popped onto my headsets and I'm just, wow, that's so neat.
00:41:04.720 | And I will say, in the array programming community, I find there's essentially a common misconception
00:41:11.200 | that the reason people aren't using array programming languages is because they don't
00:41:16.160 | know about them or don't understand them, which there's a kernel of truth of that,
00:41:22.240 | but the truth is nowadays there's huge massively funded research labs at places like Google Brain
00:41:31.920 | and Facebook AI Research and OpenAI and so forth where large teams of people are literally writing
00:41:39.520 | new programming languages because they've tried everything else and what's out there is not
00:41:44.080 | sufficient. In the array programming world, there's offered a huge underappreciation of
00:41:52.720 | what Python can do nowadays, for example. As recently as last week, I heard it described in
00:41:59.440 | a chat room, it's like people obviously don't care about performance because they're using Python.
00:42:04.160 | And it's like, well, a large amount of the world's highest performance computing now is done with
00:42:10.800 | Python. It's not because Python's fast, but if you want to use RAPIDS, for example, which literally
00:42:19.040 | holds records for the highest performance recommendation systems and tabular analysis,
00:42:26.000 | you write it in Python. So this idea of having a fast kernel that's not written in the language
00:42:38.160 | and then something else talking to it in a very flexible way, I think is great. And as I say,
00:42:43.200 | at the moment, we are very hamstrung in a lot of ways that we, at least until recently, we very
00:42:48.880 | heavily relied on BLAS, which is totally the wrong thing for that kind of flexible high-performance
00:42:57.680 | computing because it's this bunch of somewhat arbitrary kind of selection of linear algebra
00:43:05.920 | algorithms, which, you know, things like the C# work I did, you know, they were just RAPIDS on
00:43:11.120 | top of BLAS. And what we really want is a way to write really expressive kernels that can do
00:43:18.240 | anything over any axes. So then there are other newer approaches like Julia, for example, which
00:43:31.360 | is kind of like got some rispy elements to it and this type dispatch system. But because it's,
00:43:36.720 | you know, in the end, it's on top of LLVM. What you write in Julia, you know, it does end up
00:43:45.840 | getting optimized very well. And you can write pretty much arbitrary kernels in Julia and often
00:43:52.320 | get best-in-class performance. And then there's other approaches like JAX. And JAX sits on top
00:44:02.480 | of something totally different, which is it sits on top of XLA. And XLA is a compiler, which is
00:44:09.280 | mainly designed to compile things to run fast on Google's TPUs. But it also does an okay job of
00:44:17.040 | compiling things to run on GPUs. And then really excitingly, I think, you know, for me is the MLIR
00:44:26.240 | project, and particularly the affine dialect. So that was created by my friend, Chris Latner,
00:44:34.240 | who you probably know from creating Clang and LLVM and Swift. So he joined Google for a couple
00:44:45.040 | of years. And we worked really closely together on trying to like, think about the vision of
00:44:49.920 | really powerful programming on accelerators that's really developer friendly. Unfortunately,
00:44:58.480 | didn't work out. Google was a bit too tight to TensorFlow. But one of the big ideas that did
00:45:04.240 | come out of that was MLIR, and that's still going strong. And I do think there's, you know, if
00:45:09.040 | something like APL, you know, could target MLIR and then become a DSL inside Python, it may yet win,
00:45:18.800 | you know. I've heard Yeah, I've heard you in the past say that, on different podcasts and talks,
00:45:24.960 | that you don't think that Python, even in light of, you know, just saying, people don't realize how
00:45:31.200 | much you can get done with Python, that you don't think that the future of data science and AI and
00:45:35.200 | neural networks and that type of computation is going to live in the Python ecosystem. And I've
00:45:41.040 | heard on some podcasts, you've said that, you know, Swift has a shot based on sort of the way that
00:45:44.400 | they've designed that language. And you just mentioned, you know, a plethora of different
00:45:48.160 | sort of, I wouldn't say initiatives, but you know, JAX, XLA, Julia, etc. Do you have like a sense
00:45:53.600 | of where you think the future of, not necessarily sort of array language computation, but this kind
00:45:59.680 | of computation is going with all the different avenues? I do. You know, I think we're certainly
00:46:08.560 | seeing the limitations of Python, and the limitations of the PyTorch, you know,
00:46:15.520 | lazy evaluation model, which is the way most things are done in Python at the moment,
00:46:25.280 | for kind of array programming is you have an expression, which is, you know, working on
00:46:31.200 | arrays, possibly of different ranks with implicit looping. And, you know, that's one line of Python
00:46:37.200 | code. And generally, that then gets your, you know, on your computer, that'll get turned into,
00:46:43.280 | you know, a request to run some particular optimized pre written operation on the GPU or
00:46:52.000 | TPU, that then gets sent off to the GPU or TPU, where your data has already been moved there.
00:46:58.960 | It runs, and then it tells the CPU when it's finished. And there's a lot of latency in this,
00:47:06.800 | right? So if you want to create your own kernel, like your own way of doing, you know, your own
00:47:12.480 | operation effectively, you know, good luck with that. That's not going to happen in Python.
00:47:19.600 | And I hate this, I hate it as a teacher, because, you know, I can't show my students what's going
00:47:26.080 | on, right? It kind of goes off into, you know, kind of CUDA land and then comes back later.
00:47:33.520 | I hate it as a hacker, because I can't go in and hack at that, I can't trace it, I can't debug it,
00:47:39.280 | I can't easily profile it. I hate it as a researcher, because very often I'm like,
00:47:44.400 | I know we need to change this thing in this way, but I'm damned if I'm going to go and write my own.
00:47:49.680 | CUDA code, let alone deploy it. So JAX is, I think, a path to this. It's where you say, okay, let's not
00:47:58.160 | target pre-written CUDA things, let's instead target a compiler. And, you know, working with
00:48:07.360 | Chris Latner, I'd say he didn't have too many nice things to say about XLA as a compiler. It was not
00:48:13.040 | written by compiler writers, it was written by machine learning people, really. But it does the
00:48:19.760 | job, you know, and it's certainly better than having no compiler. And so JAX is something which,
00:48:26.080 | instead of turning our line of Python code into a call to some pre-written operation,
00:48:32.400 | it instead is turning it into something that's going to be read by a compiler. And so the compiler
00:48:37.280 | can then, you know, optimize that as compilers do. So, yeah, I would guess that JAX probably has
00:48:46.560 | a part to play here, particularly because you get to benefit from the whole Python ecosystem,
00:48:54.320 | package management, libraries, you know, visualization tools, et cetera.
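The contrast he's drawing, handing a compiler a whole traced expression rather than dispatching one pre-written GPU operation per line, is what jax.jit does. A tiny sketch using the standard JAX API (nothing episode-specific):

```python
import jax
import jax.numpy as jnp

@jax.jit
def step(w, x, y):
    # The whole expression is traced and handed to the XLA compiler as
    # one program, so it can be fused and optimised as a unit rather
    # than dispatched line by line to pre-written GPU operations.
    pred = jnp.tanh(x @ w)
    return jnp.mean((pred - y) ** 2)

w = jnp.ones((4, 1))
x = jnp.ones((8, 4))
y = jnp.zeros((8, 1))
print(step(w, x, y))   # compiled on the first call, cached afterwards
```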
00:49:04.560 | But, you know, longer term, it's a mess, you know, it's a mess using a language like Python which
00:49:10.640 | wasn't designed for this. It wasn't really even designed as something that you can chuck
00:49:16.880 | different compilers onto. So people put horrible hacks. So, for example, PyTorch,
00:49:21.440 | they have something called TorchScript, which is a bit similar. It takes Python and kind of compiles
00:49:26.800 | it. But they literally wrote their own parser using a bunch of regular expressions. And it's
00:49:34.080 | it's, you know, it's not very good at what it does. It even misreads comments and stuff.
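For reference, the TorchScript mechanism he's criticising is exposed as torch.jit.script, which parses a restricted subset of Python and compiles it to its own graph representation, for example:

```python
import torch

@torch.jit.script
def relu_sum(x: torch.Tensor) -> torch.Tensor:
    # TorchScript parses this Python source into its own intermediate
    # representation instead of executing it eagerly.
    return torch.relu(x).sum()

print(relu_sum(torch.randn(3, 3)))
print(relu_sum.graph)   # the graph TorchScript compiled from the source
```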
00:49:39.120 | So, you know, I do think there's definitely room for, you know, a language of which Julia would
00:49:47.520 | certainly be the leading contender at the moment to come in and do it properly. And Julia's got,
00:49:54.800 | you know, Julia is written on a scheme basis. So there's this little scheme kernel
00:50:01.440 | that does the parsing and whatnot. And then pretty much everything else after that is written in
00:50:06.560 | Julia. And, of course, leveraging LLVM very heavily. But I think that's what we want, right?
00:50:14.000 | Is that something which I guess I didn't love about Swift. When the team at Google wanted to
00:50:19.840 | add differentiation support into Swift, they wrote it in C++. And I was just like, that's not a good
00:50:26.960 | sign. You know, like, apart from anything else, you end up with this group of developers who are,
00:50:35.040 | in theory, Swift experts, but they actually write everything in C++. And so they actually don't have
00:50:40.800 | much feel for what it's like to write stuff in Swift. They're writing stuff for Swift. And Julia,
00:50:45.760 | pretty much everybody who's writing stuff for Julia is writing stuff in Julia. And I think that's
00:50:52.880 | something you guys have talked about around APL and J as well, is that there's the idea of writing
00:50:59.920 | J things in J and APL things in APL is a very powerful idea.
00:51:04.080 | Yeah, I always wonder about it.
00:51:08.240 | Yeah, sorry, go on. I just remembered your third question. I'll come back to it.
00:51:11.200 | No, no, no, you go ahead. You had.
00:51:12.320 | Oh, you asked me why now am I coming back to APL and J, which is
00:51:16.160 | totally orthogonal to everything else we've talked about, which is I had a daughter,
00:51:23.520 | she got old enough to actually start learning math. So she's six.
00:51:27.680 | And oh, my God, there's so many great educational apps nowadays. There's one called Dragonbox
00:51:36.800 | Algebra. It's so much fun. Dragonbox Algebra five plus. And it's like five plus algebra,
00:51:42.640 | like what the hell? So when she's, I think she actually says still four, I gave, you know,
00:51:46.640 | I let her play with Dragonbox Algebra five plus. And she learned Algebra, you know, by helping
00:51:52.080 | Dragon eggs hatch. And she liked it so much, I let her try doing Dragonbox Algebra 12 plus.
00:52:00.480 | And she loved that as well and finished it. And so suddenly I had a five year old kid that liked
00:52:05.440 | Algebra. Much, much surprised. Kids really can surprise you. And so, yeah, she struggled with
00:52:16.320 | a lot of the math that they were meant to be doing at primary school, like,
00:52:20.880 | like the vision and modification, but she liked Algebra. And we ended up homeschooling her.
00:52:28.240 | And then one of our, her best friend is also homeschooled. So this, this year I decided I'd
00:52:35.440 | try tutoring them in math together. And so my daughter's name's Claire, so her friend Gabe,
00:52:44.400 | so her friend Gabe discovered on his Mac the world of alternative keyboards. So he would
00:52:49.280 | start typing in the chat in, you know, Greek characters or Russian characters. And one day
00:52:55.760 | I was like, okay, check this out. So I like typed in some APL characters and they were just like,
00:53:01.520 | wow, what's that? We need that. So initially we installed dialogue APL so that they could
00:53:08.480 | type APL characters in the chat. And so I explained to them that this is actually
00:53:16.000 | this like super fancy math that you're typing in. And they really wanted to try it. So,
00:53:22.480 | and that was at the time I was trying to teach them sequences and series,
00:53:28.800 | and they were not getting it at all. It was my first total failure time as a, as a math tutor
00:53:35.440 | with them, you know, they'd been zipping along, fractions, you know, greatest common denominator,
00:53:42.240 | factor trees. Okay, everything's fine. It makes sense. And then we hit sequences and series. And
00:53:47.040 | it's just like, they had no idea what I was talking about. So we put that aside. Then we spent like
00:53:55.280 | three one hour lessons doing the basics of APL, you know, the basic operations and doing stuff
00:54:03.840 | with lists and dyadic versus monadic, but still, you know, just primary school level math.
00:54:11.360 | And we also did the same thing in NumPy using Jupyter. And they really enjoyed all that,
00:54:16.080 | like they were more engaged than our normal lessons. And so then we came back to like,
00:54:23.200 | you know, sigma i equals one to five of i squared, whatever. And I was like, okay,
00:54:29.680 | that means this, you know, in APL and this in NumPy. And they're like, oh, is that all?
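One way to write the example he mentions, the sum of i squared for i from 1 to 5, in both notations he showed the kids; the exact expressions from the lesson aren't in the episode, so these are representative:

```python
import numpy as np

# "Sigma, i = 1 to 5, of i squared", written over an explicit range:
#   APL:  +/(⍳5)*2
print((np.arange(1, 6) ** 2).sum())   # 55
```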
00:54:38.720 | Fine. Whereas, you know, before, that was the problem. This idea of, like, Tn equals Tn
00:54:45.680 | minus one plus blah, blah, blah, blah. It's like, what is this stuff? But when you're actually
00:54:50.160 | indexing real things and can print out the intermediate values and all that, and you've
00:54:56.480 | got iota or a range, they were just like, oh, okay. You know, I don't know why you explained it this
00:55:03.440 | dumb way before. And I will say, given a choice between doing something on a whiteboard or doing
00:55:09.760 | something in NumPy or doing something in APL, now they will always pick APL because the APL version
00:55:15.760 | is just so much easier. You know, there's less to type, there's less to think about,
00:55:20.800 | there's less boilerplate. And so it's been, it's only been a few weeks, but like yesterday,
00:55:26.240 | we did the power operator, you know, and so we literally started doing the foundations of
00:55:32.320 | metamathematics. So it's like, okay, let's create a function called capital S, capital S arrow,
00:55:38.880 | you know, plus jot one, right? So for those Python people listening, jot is,
00:55:46.400 | if you give it an array or a scalar, it's the same as partial in Python or bind in C++.
00:55:59.920 | So, okay, we've now got something that adds one to things. Okay. I said, okay,
00:56:02.800 | this is called the successor function. And so I said to them, okay, what would happen if we go
00:56:06.960 | SSS zero? And they're like, oh, that would be three. And so I said, okay, well, what's,
00:56:14.400 | what's addition? And then one of them's like, oh, it's, it's repeated S. I'm like, yeah,
00:56:19.520 | it's repeated S. So how do we say repeated? So in APL, we say repeated by using this
00:56:24.720 | star dieresis. It's called power. Okay. So now we've done that. What is multiplication?
00:56:30.800 | And then one of them goes after a while. Oh, it's repeated addition. So we define addition,
00:56:36.880 | and then we define multiplication. And then I'm like, okay, well, what about, you know, exponent?
00:56:43.440 | Oh, that's just, now this one, they've heard a thousand times. They both are immediately like,
00:56:47.760 | oh, that's repeated multiplication. So like, okay, we've now defined that. And then, okay, well,
00:56:52.640 | subtraction, that's a bit tricky. Well, it turns out that subtraction is just, you know, is the
00:56:58.160 | opposite of something. What's it the opposite of? They both know that. Oh, that's the opposite of
00:57:01.680 | addition. Okay. Well, opposite of, which in math, we call inverse is just a negative power. So now
00:57:08.480 | we define subtraction. So how would you define division? Oh, okay. How would you define roots?
00:57:13.600 | Oh, okay. So we're kind of, you know, designing the foundations of mathematics here in APL,
00:57:22.560 | you know, with a six year old and an eight year old. And during this whole thing at one point,
00:57:27.840 | we're like, okay, well, now I can't remember why, but we're like, okay, now we got to do one divided
00:57:32.000 | by a half. And they're both like, we don't know how to do that. So, you know, in APL, this stuff that's
00:57:38.880 | considered like college level math suddenly becomes easy. And, you know, at the same point, stuff that's still
00:57:45.360 | primary school level math, like one divided by a half, is considered hard. So it definitely made
00:57:50.720 | me rethink, you know, what is easy and what is hard and how to teach this math stuff. And I've
00:57:58.880 | been doing a lot of teaching of math with APL and the kids are loving it. And I'm loving it. And
00:58:04.480 | that's actually why I started this study group, which will be on today: today as we record this,
00:58:11.680 | a few days ago by the time you put it out there. As I kind of started saying on Twitter to people like,
00:58:17.920 | oh, it's really been fun teaching my kids, you know, my kid and a friend math using APL and a lot of
00:58:23.120 | adults were like, ah, can we learn math using APL as well? So that's what we're going to do.
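The successor-function construction Jeremy describes maps cleanly onto plain Python. Below is a minimal, illustrative sketch using only the standard library; the APL spellings in the comments (S ← +∘1, the power operator ⍣) are how the same ideas are written in Dyalog, as discussed above, and the helper names here are made up for the example.

```python
from functools import partial
import operator

# Successor: in APL this is S <- "plus jot one", i.e. S ← +∘1 (bind 1 to plus).
S = partial(operator.add, 1)

def power(f, n):
    """A rough analogue of APL's power operator (star-dieresis): apply f n times."""
    def repeated(x):
        for _ in range(n):
            x = f(x)
        return x
    return repeated

# S S S 0 is 3: three applications of the successor function.
assert power(S, 3)(0) == 3

# Addition is repeated succession, multiplication is repeated addition.
def add(a, b):
    return power(S, b)(a)

def mul(a, b):
    return power(partial(add, a), b)(0)

assert add(2, 3) == 5
assert mul(4, 3) == 12
```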
00:58:32.320 | Well, and that's the whole notation thing, isn't it? It's the notation you get away from the
00:58:36.000 | sigmas and the pis and all that, you know, subscripts. I know, right? This is exactly
00:58:40.560 | what Iverson wanted. Yeah, exactly. I mean, who wants this, you know, why should capital pi be
00:58:47.440 | product and capital sigma be sums? Like, you know, we did plus slash and it's like, okay,
00:58:54.320 | how do we do product? They're like, oh, it's obviously times slash. And I show them backslash,
00:58:58.000 | it's like, how do we do a cumulative product? And so it's obviously times backslash. Yeah,
00:59:02.960 | this stuff. And but, you know, a large group of adults can't handle this because I'll put stuff
00:59:09.040 | on Twitter. I'll be like, here's a cool thing in APL. And like half the replies will be like,
00:59:13.440 | well, that's line noise. That's not intuitive. It's like, how do you say that? It's this classic
00:59:21.280 | thing that I've always said, it's like the difference between: is it that you don't
00:59:25.520 | understand it, or is it that it's hard? And, you know, kids don't know; for kids, everything's new.
00:59:32.720 | So that, you know, they see something they've never seen before. They're just like, teach me
00:59:36.640 | that. Whereas adults, or at least a good chunk of adults, are just like, I don't immediately understand
00:59:42.000 | that. Therefore, it's too hard for me. Therefore, I'm gonna belittle the very idea of the thing.
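For readers following along in Python, the plus-slash and times-slash reductions and the backslash scans mentioned a moment ago have direct NumPy counterparts. A small illustrative sketch (nothing here is specific to the lessons described, just the standard NumPy calls):

```python
import numpy as np

x = np.arange(1, 6)        # 1 2 3 4 5, roughly APL's iota 5

print(np.sum(x))           # +/x  -> 15
print(np.prod(x))          # x/x with times: ×/x  -> 120
print(np.cumsum(x))        # +\x  -> 1 3 6 10 15
print(np.cumprod(x))       # ×\x  -> 1 2 6 24 120
```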
00:59:47.760 | I did a tacit one-liner on the APL Farm the other day. And somebody said,
00:59:54.160 | that looks like Greek to me. I said, well, Greek looks like Greek to me, because I don't know Greek.
00:59:58.640 | I mean, sure. If you don't know it, absolutely, it looks silly. But if you know it, then it's,
01:00:04.480 | it's not that hard. Yeah, I will say like, you know, a lot of people have put a lot of hard work into
01:00:12.160 | resources for APL and J teaching. But I think there's still a long way to go. And one of the
01:00:20.400 | challenges is, it's like when I was learning Chinese, I really wanted to I like the idea of
01:00:26.240 | learning Chinese new words by looking them up in a Chinese dictionary. But of course, I didn't know
01:00:31.280 | what the characters in the dictionary meant. So I couldn't look them up. So when I learned Chinese,
01:00:35.840 | I really spent the first 18 months just focused on learning characters. So I got through 6000
01:00:41.760 | characters in 18 months of very hard work. And then I could start looking things up in
01:00:47.040 | dictionary. My hope is to do a similar thing for APL, like for these study groups,
01:00:53.120 | I want to try to find a way to introduce every glyph in an order that never refers
01:01:01.040 | to glyphs you haven't learned yet. Like that's something I don't feel like we really have. And
01:01:05.280 | so that then you can look up stuff in the Dyalog documentation. Because now still, I don't know
01:01:11.520 | that many glyphs. So like most of the stuff in the documentation, I don't understand because it
01:01:17.680 | explains glyphs using glyphs I don't yet know. And then I look those up. And those are used,
01:01:21.840 | explain things with glyphs I don't yet know. So, you know, step one for me is I think we're just
01:01:27.120 | going to go through and try to teach what every glyph is. And then I feel like we should be able
01:01:32.480 | to study this better together, because then we could actually read the documentation, you know.
01:01:38.080 | Are you going to publish these sessions online? Yeah, so the study group will be recorded as videos.
01:01:45.840 | But I also then want to actually create, you know, written materials using Jupyter,
01:01:52.320 | which I will then publish. That's my goal. So what you said very much resonates with me,
01:01:58.720 | that I often find myself, when teaching people, in this bind that to explain everything
01:02:06.240 | I need to already have everything explained. And especially it comes down to,
01:02:12.960 | in order to explain what many of these glyphs are doing, I need some fancy arrays. If I restrict
01:02:18.400 | myself to simple vectors and scalars, then I can't really show their power. And I cannot create these
01:02:24.800 | higher rank arrays without already using those glyphs. And so, hopefully: there is this long running
01:02:30.960 | project, since like 2015 I think it is, to add a literal array notation to APL.
01:02:37.840 | And then there is a way in, then you can start by looking at an array, and then you can start
01:02:45.280 | manipulating and see the effects of the glyphs and intuit from there what they do.
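The "introduce every glyph in an order that never refers to glyphs you haven't learned yet" idea mentioned earlier is, in effect, a topological sort of a dependency graph. Here is a minimal sketch; the glyph names and the dependencies in the example graph are invented purely for illustration.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical "this glyph's explanation uses these glyphs" edges.
depends_on = {
    "iota": [],
    "rho": ["iota"],
    "reduce": ["iota"],
    "rank": ["rho", "reduce"],
}

# One valid teaching order: every glyph appears after its prerequisites.
print(list(TopologicalSorter(depends_on).static_order()))
# e.g. ['iota', 'rho', 'reduce', 'rank']
```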
01:02:49.680 | Yeah, no, I think that'll be very, very helpful. And in the meantime, you know,
01:02:54.160 | my approach with the kids has just been to teach rho quite early on. So rho is the equivalent of
01:03:00.560 | reshape in Python, most Python libraries. And yeah, so once you know how to reshape,
01:03:09.440 | you can start with a vector and shape it to anything you like. And it's, you know,
01:03:13.120 | it's not a difficult concept to understand. So I think that yeah, basically, the trick at the
01:03:17.200 | moment is just to say, okay, in our learning of the dictionary of APL, one of the first things
01:03:22.240 | we will learn is rho. And that was really fun with the kids doing monadic rho, you know,
01:03:29.760 | to be like, okay, well, what's rho of this? What's rho of that? And okay, what's rho of rho of this?
01:03:34.880 | And then what's rho of rho of rho, which then led me to the Stallman poem about
01:03:44.240 | what is it, rho rho rho is one, etc, etc, which they loved as well.
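For readers following along in NumPy, monadic rho (shape) and dyadic rho (reshape) correspond to shape and reshape, and the "rho rho rho" line of the poem falls out directly. A small sketch; the array contents are arbitrary:

```python
import numpy as np

v = np.arange(12)                       # like iota 12 (zero-origin)
m = v.reshape(3, 4)                     # dyadic rho: 3 4 ⍴ v

print(np.shape(m))                      # monadic rho          -> (3, 4)
print(np.shape(np.shape(m)))            # rho rho (the rank)   -> (2,)
print(np.shape(np.shape(np.shape(m))))  # rho rho rho          -> (1,), "always equals 1"
```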
01:03:52.160 | Yeah, we'll link that in the show notes. Also, too, while you were saying all that,
01:03:56.480 | that really resonated me with me when I first started learning APL is like one of the first
01:04:03.200 | things that happened when I was like, okay, you can, you can fold, you can map. So like,
01:04:08.640 | how do you filter, you know, what are the classic, you know, three functional things? And the problem
01:04:13.120 | with APL and array languages is they don't have an equivalent filter that takes a predicate function,
01:04:18.240 | they have a filter that is called compress that takes a mask that, you know, drops anything that
01:04:23.520 | corresponds to a zero. And it wasn't until a few months later that I ended up discovering it. But
01:04:28.240 | for both APL and the newer array language BQN, there's these two sites, Adam was the one that wrote the APL one,
01:04:35.440 | aplcart.info, and bqncrate.info as well, I think. And so you can basically
01:04:41.200 | semantically search for what you're trying to do. And it'll give you small expressions that do that.
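The compress-versus-filter point just described is easy to see in NumPy, where filtering is likewise done with a boolean mask rather than a predicate function. An illustrative sketch with made-up data:

```python
import numpy as np

data = [3, -1, 4, -1, 5, -9, 2, 6]

# Functional-style filter: a predicate applied element by element.
kept = list(filter(lambda n: n > 0, data))   # [3, 4, 5, 2, 6]

# Array-style compress: build a boolean (0/1) mask, then use it to select.
x = np.array(data)
mask = x > 0                                 # array of booleans
kept_np = x[mask]                            # in APL: (x>0)/x

print(kept)
print(kept_np)                               # [3 4 5 2 6]
```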
01:04:46.560 | So if you type in the word filter, which is what you would call it coming from, you know,
01:04:52.000 | a functional language, or even I think Python calls it filter, you can get a list of small
01:04:57.440 | expressions. And really, really often, sometimes you need to know the exact thing that it's called,
01:05:03.280 | like one time I was searching for, you know, all the combinations or permutations. And really,
01:05:07.280 | what I was looking for was power set. And so until you have that, you know, the word power set,
01:05:11.920 | it's, you know, it's a fuzzy search, right? So but it's still a very, very useful tool when it's like
01:05:18.000 | you said, you're trying to learn something like Chinese. And it's like, well, where do I even
01:05:21.040 | start I don't I don't know the language to search the words to search for. But yeah, it is. I agree
01:05:29.520 | that there's a lot of room for improvement in how to onboard people without them immediately going,
01:05:35.520 | like you said, this looks like hieroglyphics, which I think Iverson considered a compliment,
01:05:39.760 | like there's some anecdote I've heard where someone was like, this is hieroglyphics. And he says,
01:05:42.960 | yes, exactly. And then the other thing like that I want to do is help in particular Python programmers
01:05:52.080 | and maybe also do something for JavaScript programmers, which are the two most popular
01:05:55.680 | languages, like at the moment, like a lot of the tutorials for stuff like J or whatever,
01:06:01.680 | like J for C programmers, you know, great book, but most people aren't C programmers. And also
01:06:07.680 | a lot of the stuff like, you know, it'd be so much easier if somebody just like said to me early on,
01:06:14.000 | oh, you know, just the same as partial in Python, you know, or it's like, you know, putting things
01:06:23.040 | in a box, what the hell's a box? If somebody basically said, oh, it's basically the same
01:06:26.400 | as a reference. It's like, oh, okay. You know, I think in one of your podcasts, somebody said,
01:06:30.720 | oh, it's like void stars. Oh, yeah, okay. You know, this is kind of like lack of just saying,
01:06:36.160 | like, this is actually the same thing as in Python and JavaScript. So I do want to do some kind of
01:06:42.320 | yeah, mapping, yeah, like that, particularly for kind of NumPy programmers and stuff, because a
01:06:50.080 | lot of it's so extremely similar. Be nice to kind of say like, okay, well, this is, you know, J
01:06:56.960 | maps things over leading axes, which is exactly the same as NumPy, except it doesn't have trailing
01:07:02.240 | axes. So if you know the NumPy rules, you basically know the J rules. Yeah, I think I think at the
01:07:09.520 | basic level, you're absolutely right. And that that would certainly be really useful. When we've
01:07:14.080 | talked this over before, some of the challenges are in the flavors and the details. If you send
01:07:21.040 | somebody down the wrong road with a metaphor that almost works in some of these areas, it can really
01:07:26.560 | be challenging for them, because they see it, you know, through the lens of their experience.
01:07:33.520 | But that lens would say that in this area it works one way, when it actually works differently. So there is a
01:07:39.760 | challenge in that. And we find it even between APL, BQN and J. I'm trying to think of what we were
01:07:46.240 | talking about. Oh, it was transpose: the languages' dyadic transposes,
01:07:51.600 | they handle them differently. Functionally, you can do the same things, but you have to be
01:07:56.400 | aware that they are going to do it differently, according to the language. Absolutely. But that's
01:08:01.520 | not a reason to throw out the analogy, right? Like, I think everybody agrees that that it's easier for
01:08:06.480 | an APL programmer to learn J, than for a C or JavaScript programmer to learn J, you know,
01:08:14.800 | because there are some ideas you understand. And you can actually say to people like, okay, well,
01:08:19.760 | this is the rank conjunction in J. And you may recognize this as being like the rank, you know,
01:08:24.320 | operator in APL. So if we can do something like that and say like, oh, well, okay, this would do
01:08:29.840 | the same thing as, you know, dot permute, dot blah in PyTorch. It's like, okay, I see it.
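The "J maps over leading axes, like NumPy without the trailing-axis rules" remark, and the .permute analogy for dyadic transpose, can both be sketched in NumPy. This is only an illustration of the mechanics; the shapes and numbers are arbitrary:

```python
import numpy as np

a = np.arange(24).reshape(2, 3, 4)   # a rank-3 array

# Applying an operation to each cell along the leading axis,
# the way J's rank conjunction would "map over leading cells".
row_sums = np.array([cell.sum() for cell in a])   # one result per leading cell
same     = a.sum(axis=(1, 2))                     # the idiomatic NumPy spelling
assert (row_sums == same).all()

# Dyadic transpose / .permute: reorder the axes explicitly.
b = a.transpose(2, 0, 1)             # like PyTorch's tensor.permute(2, 0, 1)
print(b.shape)                       # (4, 2, 3)
```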
01:08:40.000 | Well, as the maintainer of APLcart, I'd like to throw in a little call to the listeners. Like
01:08:45.920 | what Connor mentioned, I do fairly often get people saying, well, I couldn't find this and
01:08:51.200 | ask them, what did you search for? So do let me know, contact me by whatever means, say, if you
01:08:56.080 | couldn't find something, either because it's altogether missing, and I might be able to add it,
01:08:59.840 | or tell me what you search for and couldn't find, or maybe you found it later by searching for
01:09:04.720 | something else. And I'll add those keywords for future users. And I have put in a lot of like
01:09:11.200 | function names from other programming languages so that you can search for those and find the
01:09:15.920 | APL equivalent. Yeah, I will say, I feel like either I'm not smart enough to use aplcart.info,
01:09:24.320 | or I haven't got the right tutorial yet. Because I, I went there, I've been there a few times.
01:09:30.240 | And there's this like whole lot of impressive looking stuff. And I just don't know
01:09:36.400 | what to do with it. And then I sometimes click things and it sends me over to this tio.run that
01:09:40.880 | tells me like real time 0.02 seconds code. Like, I find it, you know, I
01:09:50.080 | don't yet know how to use it. And so, you know, I guess given hearing you guys say
01:09:57.520 | this is a really useful tool that a lot of people put a lot of time into, I should obviously invest
01:10:02.160 | time learning how to use it. And maybe after doing that, I should explain to people how to use it.
01:10:07.840 | I do have a video on it. And there's also a little question mark icon one can click on and get to.
01:10:12.960 | I have tried the question mark icon as well. As I say, it might just you know, I think this often
01:10:21.120 | happens with APL stuff. I often hit things and I feel like maybe I'm not smart enough to understand
01:10:25.840 | this. Clearly, don't think that; we disagree. Yeah, I do recall you saying a few minutes ago
01:10:37.440 | that you managed to teach your, you know, four year old daughter like grade 12, or age 12, algebra.
01:10:43.200 | No, I didn't. I just gave her the app, right? It's like it's I've heard other parents have given it
01:10:49.600 | to their kids. They all seem to handle it. It's it's just this fun game where you hatch dragon eggs
01:10:54.240 | by like dragging things around on the iPad screen. And it just it so happens that the things you're
01:10:59.120 | doing with dragon's eggs are the rules of algebra. And after a while, it starts to switch out some of
01:11:06.320 | the like monsters with symbols like x and y, you know, and it does it gradually, gradually. And at
01:11:12.400 | the end, it's like, oh, now you're doing algebra. So I can't get any credit for that. That's
01:11:17.600 | some very, very clever people wrote a very cool thing. It really is an amazing program. I homeschooled
01:11:23.120 | my son as well. And we used that for algebra. Great. Yeah, it was a bit more age appropriate,
01:11:28.160 | but it's I, I looked at that and said that that really is well put together. It's it's an amazing
01:11:35.120 | program. I will say there'll be a DragonBox APL one day. It's not a bad idea. Not a bad idea at all.
01:11:43.920 | I was going to say when you're teaching somebody, one of the big challenges when you're sort of
01:11:47.360 | trying to get a language across to a general audience is who is the audience? Because as you
01:11:53.440 | say, if you're if you're dealing with kids or people who haven't been exposed to programming
01:11:58.640 | before, that's a very different audience than somebody might have been exposed to some other
01:12:03.600 | type of programming. Functional programming is a bit closer, but if you're a procedural programmer
01:12:08.480 | or imperative programmer, it's going to be a stretch to try and bend your mind in the different
01:12:13.120 | ways that, you know, APL or J or BQN expect you to think about things. Yeah, I think the huge rise
01:12:20.800 | of functional programming is very helpful for coming to array programming, you know,
01:12:26.400 | both in JavaScript and in Python. It's, you know, I think most people are doing stuff,
01:12:34.240 | particularly in the machine learning and deep learning world, are doing a lot of functional
01:12:38.480 | stuff. Often that's the only way you can do things, particularly in deep learning. So I think, yeah,
01:12:44.240 | I think that does help a lot. Like, like Connor said, like you've probably come across, you know,
01:12:49.360 | map and reduce and filter and certainly in Python, you'll have done list comprehensions and dictionary
01:12:56.880 | comprehensions. And a lot of people have done SQL. So it's, yeah, I think a lot of people come into it
01:13:04.720 | with some relevant analogies, if we can help connect for them. Yeah, one of the things that,
01:13:12.720 | you know, this really is reinforcing my idea that, or it's not my idea, I think it's just an idea
01:13:19.840 | that multiple people have had, but the tool doesn't exist yet. Because we'll link to some
01:13:25.760 | documentation that I use frequently when I'm going sometimes between APL and J on the BQN website,
01:13:31.280 | they have BQN to Dyalog APL dictionaries and BQN to J dictionaries. So sometimes I'll like,
01:13:38.320 | if I'm trying to convert between the two, the BQN docs are so good. I'll just use BQN as like an
01:13:43.040 | IR to go back and forth. But I've mentioned on previous podcasts that really what would be amazing
01:13:48.480 | and it would only work to a certain extent is something like a multidirectional array language
01:13:55.040 | transpiler and adding NumPy to that list would probably be, you know, a huge, I don't know what
01:14:00.960 | the word for it is, but beneficial for the array community. If you can type in some NumPy expression,
01:14:06.240 | you know, like I said, it's only gonna work to an extent, but for simple, you know, rank one vectors
01:14:10.960 | or arrays that you're just reversing and summing and doing simple, you know, reduction and scan
01:14:15.840 | operations, you could translate that pretty easily into APL, J and BQN. And it's, I think that would
01:14:22.560 | make it so much easier for people to understand, aka the hieroglyphics or the Greek or the Chinese
01:14:28.080 | or whatever metaphor you want to use. Because yeah, this is, it is definitely challenging at times
01:14:34.640 | to get to a certain point where you have enough info to keep the snowball rolling, if you will.
01:14:39.760 | And it's very easy to hit a wall early on. Yeah. That's a project I've been thinking about is
01:14:47.280 | basically rewrite NumPy in APL. It doesn't seem like a whole lot of work, where just take all those
01:14:55.600 | names that are available in NumPy and just define them as APL functions. And people can explore that
01:15:00.480 | by opening them up and seeing how they're defined. Oh, so not actually you're saying like,
01:15:07.120 | it wouldn't be a new thing. You're just saying like, rename the symbols, what they're known as
01:15:12.560 | in NumPy so that you'd still be in a, like an APL. Yeah. I mean, you could use it as a library,
01:15:19.440 | but I was thinking of it more as an interactive exploring type thing, where you open up this
01:15:24.320 | library and then you write the name of some NumPy functionality and open it up in the
01:15:34.640 | editor and see, well, how is this defined in APL? And then you could use it obviously, since it's
01:15:40.320 | defined. Interesting. Then you could slowly, you could use these library functions. And then as
01:15:47.360 | you get better at APL, you can start actually writing out the raw APL instead of using these
01:15:52.080 | covers for it. Well, I guess, Jeremy, that's interesting. Do you think that, because you've
01:15:57.680 | mentioned about sort of the notation versus the programming language and where do you think the,
01:16:04.400 | like, in your dream scenario, are you actually coding in sort of an Iversonian like notation?
01:16:11.280 | Or is it at the end of the day, does it still look like NumPy, but it's just all of the expressivity
01:16:19.280 | and power that you have in the language like APL is brought to and combined with what NumPy
01:16:25.600 | sort of currently looks like? I mean, well, it'd be a bit of a combination, Connor, in that, like,
01:16:30.400 | you know, my classes and my type dispatch and my packaging and, you know, all the, you know,
01:16:40.800 | my function definitions and whatever, that's Python. But, you know, everywhere I can use
01:16:49.040 | plus and times and divide and whatever, I could also use any APL glyph. And so it'd be, you know,
01:16:59.760 | basically an embedded DSL for kind of high dimensional notation. It would work automatically
01:17:09.360 | on NumPy arrays and TensorFlow tensors and PyTorch tensors. I mean, one thing that's interesting is,
01:17:16.480 | to a large degree, APL and PyTorch and friends have actually arrived at a similar place
01:17:27.120 | with the same, you know, grandparents, which is, Iverson actually said his inspiration
01:17:36.640 | for some of the APL ideas was tensor analysis. And a lot of the folks, as you can gather from
01:17:43.440 | the fact that in PyTorch, we don't call them arrays, we call them tensors. A lot of the folks
01:17:47.280 | working on deep learning, their inspiration was also from tensor analysis. So it comes from
01:17:51.600 | physics, right? And so I would say, you know, a lot more folks have worked on PyTorch. We're
01:17:57.280 | familiar with tensor analysis and physics than we're familiar with APL. And then, of course,
01:18:03.680 | there's been other notations, like explicitly based on Einstein notation, there's a thing
01:18:10.080 | called einops, which, like, takes a very interesting kind of approach of taking Einstein
01:18:15.280 | notation much further. And like Einstein notation, if you think about it, is the kind of the loop
01:18:21.120 | free programming of math, right? The equivalent of loops in math is indices. And Einstein notation
01:18:28.240 | does away with indices. And so that's why stuff like einops is incredibly powerful because you can
01:18:33.840 | write, you know, an expression in einops with no indices and no loops. And it's all implicit
01:18:42.160 | reductions and implicit loops. I guess, yeah, my ideal thing would be, we wouldn't have to use einops,
01:18:49.520 | we can use APL, you know, and it wouldn't be embedded in a string. They would actually be
01:18:55.680 | operators. Yeah, that's what it is. They'd be operators in the language. The Python operators
01:19:00.160 | would not just be plus times minus slash, that would be all the APL glyphs would be Python
01:19:12.320 | operators. And they would work on all Python data types, including all the different tensor and
01:19:18.000 | array data types. Interesting. Yeah. So it sounds like you're describing a kind of hybrid language.
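To make the einops point concrete, the index-free, loop-free style looks roughly like this. A minimal sketch assuming the einops package is installed; the axis names in the pattern strings are just arbitrary labels:

```python
import numpy as np
from einops import rearrange, reduce

x = np.random.rand(8, 3, 32, 32)                 # batch, channels, height, width

# No indices, no loops: the reduction over h and w is implicit in the pattern.
channel_means = reduce(x, 'b c h w -> b c', 'mean')

# Reordering and merging axes, again without writing any loops.
flat = rearrange(x, 'b c h w -> b (h w) c')

print(channel_means.shape, flat.shape)           # (8, 3) (8, 1024, 3)
```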
01:19:24.960 | JavaScript too. I would love the whole DSL to be in JavaScript as well. You know,
01:19:28.720 | that'd be great. And I feel like I saw that somewhere. I feel like I saw somebody actually
01:19:34.640 | do an ECMAScript, you know, RFC with an implementation. Yeah, it was an April Fools joke.
01:19:44.240 | Yeah, but it actually worked, didn't it? Like, it's just there was actually an implementation.
01:19:48.800 | I don't think they had the implementation. It was just very, very well-specced. It could
01:19:54.480 | actually work kind of thing. No, I definitely read the code. I don't know how complete it was,
01:19:59.920 | but there was definitely some code there. I can't find it again. If you know where it is.
01:20:04.080 | There's a JavaScript implementation of APL by Nick Nickolov. But my problem with it,
01:20:12.480 | it's not tightly enough connected with the underlying JavaScript.
01:20:17.280 | It shouldn't just stay an April Fools joke, should it? It's like, Gmail was an April Fools joke,
01:20:24.000 | right? Gmail came out on April 1st and totally destroyed my plans for FastMail because it was
01:20:29.440 | an April Fools joke that was real. And Flask, you know, the Flask library, I think, was originally
01:20:35.040 | an April Fools joke. We shouldn't be using frameworks because I created a framework that's
01:20:40.480 | so stupidly small that it shouldn't be a framework. And now that's the most popular web framework in
01:20:45.120 | Python. So, yeah, maybe this should be an April Fools joke that becomes real.
01:20:52.000 | How close? This is maybe an odd question, but because from what I know about Julia,
01:20:56.800 | you can define your own Unicode operators. And I did try at one point to create a small
01:21:05.760 | composition of two different symbols, you know, square root and reverse or something,
01:21:11.600 | and it ended up not working and asking me for parentheses. But do you think Julia could evolve
01:21:17.360 | to be that kind of hybrid language? Maybe. I'm actually doing a keynote at JuliaCon in a couple
01:21:26.320 | of weeks, so maybe I should raise that. Just at the Q&A section, say, any questions? But first,
01:21:34.800 | I've got one for the community at large. Here's what I'd like. I think my whole talk is going to
01:21:38.880 | be kind of like what Julia needs to be, you know, to move to the next level. I'm not sure I can
01:21:45.840 | demand that a complete APL implementation is that thing, but I could certainly put it out there as
01:21:50.320 | something to consider. It always bothers me, though, that if you try to extend those languages
01:21:57.200 | like this or you could do some kind of pre-compiler for it, then their order of execution ends up
01:22:05.440 | messing up APL. I think APL very much depends on having a strict one-directional order of functions,
01:22:12.800 | otherwise it's hopeless to keep track of. That is a big challenge because currently
01:22:18.880 | the DSL inside Python, which is the basic mathematical operations, do have the BODMAS
01:22:26.720 | or PEMDAS order of operations. So there would need to be some way. So in Python, that wouldn't be
01:22:32.960 | too hard, actually, because in Python, you can opt into different kind of parsing things by adding a
01:22:42.320 | from __future__ import blah. You could have a from __future__ import APL precedence.
01:22:49.360 | And then from then on, everything in your file is going to use right-to-left precedence.
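A concrete illustration of the precedence difference being discussed; the from __future__ switch for APL-style precedence is hypothetical, only the arithmetic below is real Python:

```python
# Python / school precedence (BODMAS/PEMDAS): multiplication binds tighter.
print(3 * 2 + 1)        # (3*2) + 1 = 7

# APL-style strict right-to-left reading of the same expression is
# 3 x (2 + 1) = 9, spelled out with parentheses in Python:
print(3 * (2 + 1))      # 9
```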
01:22:54.480 | That's really interesting and cool. I didn't know that.
01:23:00.400 | Yeah, that's awesome. I've been spending a lot of time thinking about
01:23:08.240 | function precedence and just the differences and different languages. I'm not sure if any other
01:23:13.760 | languages have this, but something that I find very curious about BQN and APL is that they have
01:23:19.920 | functions basically that have higher precedence than other functions. So operators in APL and
01:23:27.920 | conjunctions and adverbs in J, they have higher precedence than your regular functions that apply to arrays.
01:23:35.200 | I'm simplifying a tiny bit, but this idea that in Haskell, function application always has
01:23:40.880 | the highest precedence. You can never get anything that has a higher function precedence than that.
01:23:46.080 | And it always, having stumbled into the array world now, it seems like a very powerful thing
01:23:51.360 | that these combinator-like functions have, just by default, that higher precedence. Because if
01:23:56.160 | you have a fold or a scan or a map, you're always combining that with some kind of binary operation
01:24:01.760 | or unary operation to create another function that you're then going to eventually apply to
01:24:05.600 | something. But the basic right to left, putting aside the higher order functions or operators,
01:24:15.840 | as they're known in APL, the basic right to left path, again, for teaching and for my own brain,
01:24:22.240 | gosh, that's so much nicer than in C++. Oh my God, the operator precedence there.
01:24:30.320 | There's no way I can ever remember that. And there's a good chance when I'm reading somebody
01:24:34.800 | else's code that they haven't used parentheses because they didn't really need them and that I
01:24:40.160 | have no idea where they have to go and then I have to go and look it up. It's another of these things
01:24:45.200 | that with the kids, I'm like, okay, you remember that stuff we spent ages on about like, first you
01:24:51.040 | do exponents and then you do times. It's like, okay, you don't have to do any of that in APL.
01:24:56.160 | You just go right to left and they're just like, oh, that's so much better.
01:24:59.680 | This literally came up at work like a month ago, where I was giving this mini APL thing. We had 10 minutes
01:25:07.440 | at the end of a meeting, and then I just made this offhand remark that of course, the evaluation
01:25:11.680 | order in APL is a much simpler model than what we learned in school. And that upset people. There were,
01:25:16.960 | I don't know, 20 people in the meeting and it was the most controversial thing I had said.
01:25:23.200 | I almost had like an out of body experience because I thought I was saying something that
01:25:27.360 | was like objectively just true. And then I was like, wait a second, what I'm clearly missing,
01:25:32.720 | like, is there? Yeah, well, you were wrong. Like, how do you communicate? No, I mean,
01:25:36.480 | most adults are incapable of like new ideas. It's just, it's, it's, that's what I should have said
01:25:44.240 | in the meeting. What, I mean, this is a reason that I, another reason I like doing things like
01:25:50.400 | APL study groups, because it's a way of like self-selecting that small group of humanity who's
01:25:55.280 | actually interested in trying new things, despite the fact that they're grownups, and then try to
01:25:59.920 | surround myself with those people in my life. But isn't it sad then? I mean, what has happened
01:26:04.560 | to those grownups? Like when you mentioned teaching these people and trying to like,
01:26:08.240 | map their existing knowledge onto APL things, what does it mean to box and so on? I find that
01:26:12.640 | to children and non-programmers, expanding their array model and how the functions are applied and
01:26:19.120 | so on, is almost trivial. Meets no resistance at all. And it's all those adults that have either
01:26:26.560 | learned their, their primitives or BODMAS or whatever the rules are, and all the computer
01:26:31.440 | science people that know their precedence tables and their lists of lists and so on.
01:26:36.000 | Those are the ones that are really, really struggling. It's not just resisting. They're
01:26:40.800 | clearly struggling. They're really trying and, and, and it's a lot of effort. So there is actually,
01:26:47.520 | I mean, that is a known thing in educational research. So yeah, I mean, so I spent months
01:26:55.120 | earlier this year and late last year reading every paper I could find about, you know, education,
01:27:02.720 | because I thought if I'm going to be homeschooling, then I should try to know what I'm doing.
01:27:06.240 | And yeah, what you describe, Adám, is absolutely a thing, which is that the,
01:27:12.640 | you know, the research shows that trying, you know, when you've got a, you know, an existing idea,
01:27:18.480 | which is an incorrect understanding of something, and you're trying to replace it with a correct
01:27:23.200 | understanding, that is much harder than learning the correct version directly. So which is obviously
01:27:31.520 | a challenge when you think about analogies and analogy has to be good enough to lead directly
01:27:38.160 | to the, to the correct version. But I think, you know, the important thing is to find the people
01:27:43.040 | who are who have the curiosity and tenacity to be prepared to go over that hurdle, even though it's
01:27:50.480 | difficult, you know, because yeah, it is like, that's just, that's just how human brains are.
01:27:55.920 | So so be it, you know. Yeah, unlearning is really hard work, actually. And if you think about it,
01:28:02.240 | it probably should be because you spend a lot of time and energy to put some kind of a pattern
01:28:06.720 | into your brain. Right. You don't want to have that evaporate very quickly. Right. And our,
01:28:12.080 | you know, myelination occurs around what, like ages 8 to 12 or something. So like our brains
01:28:17.520 | are literally trying to stop us from having to learn new things, because our brains think that
01:28:23.760 | they've got stuff sorted out at that point. And so they should focus on keeping long term memories
01:28:27.680 | around. So yeah, it does become harder. But, you know, a little bit, it's still totally doable.
01:28:34.640 | The solution is obvious. Teach APL in primary school.
01:28:37.200 | That's what I'm doing. What was the word you mentioned? Myelination?
01:28:43.520 | Myelination. M-Y-E-L-I-N-A-T-I-O-N. Interesting. I'd not heard that one before.
01:28:51.680 | So it's a physical coating that I can't remember goes on the dendrites.
01:28:56.160 | I think it's on the axons, isn't it?
01:28:57.760 | That sounds right. These fat layers or cholesterol layers. I never took any biology courses in my
01:29:06.800 | education. So clearly, I've missed out on that aspect. You're myelinated anyway.
01:29:13.840 | Isn't that an APL function? Myelinate.
01:29:18.480 | You also mentioned the word tenacity, Jeremy. Yeah.
01:29:24.240 | And I was watching an interview with Sanyam Bhutani.
01:29:29.600 | And you were talking about how, it sounds like, you spotted at an early point in his
01:29:38.400 | working with Kaggle that he was something probably different. And the thing you said
01:29:41.840 | was that tenacity to keep working at something. Yeah.
01:29:45.760 | I think that's a really important part about educating people
01:29:49.440 | that they shouldn't necessarily expect learning something new to be easy.
01:29:53.280 | Yeah. But you can do it.
01:29:55.520 | Oh, yeah. I mean, I really noticed that when I started learning Chinese.
01:30:00.240 | Like I went to, you know, just some local class in Melbourne.
01:30:08.480 | And everybody was very, very enthusiastic, you know, and everybody was going to learn Chinese.
01:30:14.560 | And we all talked about the things we were going to do.
01:30:19.920 | And yeah, each week, there'd be fewer and fewer people there.
01:30:22.800 | And, you know, I kind of tried to keep in touch with them.
01:30:26.160 | But after a year, every single other person had given up and I was the only one still doing it.
01:30:32.240 | You know, so then after a couple of years, people would be like,
01:30:34.320 | wow, you're so smart, you learned Chinese. And it's like, no, man.
01:30:39.440 | Like during those first few weeks, I was pretty sure I was learning more slowly than the other
01:30:45.120 | students. But everybody else stopped doing it. So of course, they didn't learn Chinese.
01:30:51.680 | And I don't know what the trick is, because, yeah, it's the same thing with, you know,
01:30:56.080 | like the fast.ai courses, they're really designed to keep people interested and get people doing
01:31:01.840 | fun stuff from day one. And, you know, still, I'd say most people drop out and the ones that
01:31:09.120 | don't I would say most of them end up becoming like actual world class practitioners and they,
01:31:16.480 | you know, build new products and startups and whatever else. And people will be like,
01:31:20.480 | oh, I wish I knew neural nets and deep learning. It's like, okay, here's the course.
01:31:25.440 | Just just do it and don't give up. But yeah, I don't know tenacity.
01:31:31.440 | It's not a very common virtue, I think, for some reason.
01:31:36.960 | It's something I've heard, I think it's Jo Boaler at Stanford, talk about the growth mindset.
01:31:41.840 | And I think that is something that, for whatever reason, some people tend to, and maybe it's
01:31:47.280 | myelination, at those ages, you start to get that mindset where you're not so concerned about
01:31:53.600 | having something happen that's easy to do well. But just the fact that if you keep working at it,
01:31:59.600 | you will get it. And not everybody, I guess, is maybe put in the situations that they
01:32:05.760 | get that feedback that tells you if I keep trying this, I'll get it. If it's not easy, they stop.
01:32:11.360 | Yeah, I mean, that area of growth mindset is a very controversial idea in education.
01:32:18.800 | Specifically the question of can you modify it? And I think it's certainly pretty well established
01:32:27.840 | to this point that the kind of stuff that schools have tended to do, which is put posters up around
01:32:32.480 | the place saying like, you know, make things a learning opportunity or don't give up, like they
01:32:37.680 | do nothing at all. You know, with my daughter, we do all kinds of stuff around this. So we've
01:32:46.640 | actually invented a whole family of clams. And as you can imagine, clams don't have a growth mindset,
01:32:52.960 | they tend to sit on the bottom of the ocean, not moving. And so the family of clams that we
01:33:00.400 | invented that we live with, you know, always at every point that we're going to have to like learn
01:33:05.360 | something new or try something new, always start screaming and don't want to have anything to do
01:33:10.880 | with it. And, you know, so we actually have Claire telling the clams how it's going to be okay. And,
01:33:17.280 | you know, it's actually a good thing to learn new things. And so we're trying stuff like that to try
01:33:22.480 | to like have have imaginary creatures that don't have a growth mindset and for her to realize how
01:33:29.520 | how silly that is, which is fun. And the things that you were talking about in terms of the
01:33:34.960 | meta-mathematics, you didn't say, oh, the successor, this is what plus is. You said,
01:33:40.320 | how would you use this? How would they start to put it together themselves?
01:33:46.720 | Which to me, that's the growth mindset that if you Yeah, you're creating that. But then like,
01:33:52.240 | you know, gosh, you're getting to all the most controversial things in education here, Bob,
01:33:56.720 | because that's the other big one is discovery learning. So this idea of having kids explore and
01:34:04.160 | find. It's also controversial, because it turns out that actually the best way to have people
01:34:11.280 | understand something is to give them a good explanation. So it is important, like, that
01:34:17.040 | you combine this, like, okay, how would you do this within like, okay, let me just tell you
01:34:23.200 | what you know why this is. It's easier for homeschooling with two kids, because I can make sure
01:34:28.560 | their exploration is short, and correct. You know, if you spend a whole class, you know,
01:34:36.720 | 50 minutes doing totally the wrong thing, then you end up with these really incorrect
01:34:42.640 | understandings, which you then have to kind of deprogram. So yeah, education's hard, you know.
01:34:51.040 | And I think a lot of people look for these simple shortcuts, and they don't really exist. So you
01:35:00.640 | actually have to have good, good explanations and good problem solving methods and yeah,
01:35:10.320 | all this stuff. That's a really interesting area, the notation and the tools. Yeah, and you know,
01:35:17.280 | notation, I mean, so I do a live coding, you know, video thing every day with a bunch of folks. And
01:35:28.320 | in the most recent one, we started talking about APL, why we're going to be doing APL this week
01:35:35.360 | instead. And I gave, you know, somebody actually said like, oh, my God, is it going to be like
01:35:40.560 | regexes? And, you know, I kind of said like, okay, so regexes are a notation for doing stuff. And we
01:35:49.760 | spent an hour solving the problem with regexes. And oh, my God, it was such a powerful tool for
01:35:59.680 | this problem. And you know, by the end of it, they were all like, okay, we want to like deeply
01:36:04.080 | study regexes. And obviously, that's a much less flexible and powerful tool notation than APL.
01:36:12.560 | But you know, we kind of talked about how once you start understanding these notations, you can build
01:36:19.680 | things on top of them. And then you kind of create these abstractions. And that's yeah, notation is
01:36:26.720 | how, you know, deep human thought kind of progresses, right, in a lot of ways. So, you know, it's like,
01:36:37.840 | I actually spoke to a math professor friend a couple of months ago about, you know, my renewed
01:36:42.800 | interest in APL. And he was like, and I kind of sent him some, I can't remember what it was,
01:36:48.480 | maybe doing the golden ratio or something, little snippet, and he was just like,
01:36:53.840 | yeah, something like that looks like Greek to me, I don't understand that. It's like,
01:36:57.280 | dude, you're a math professor, you know. Like, if, if I showed somebody who isn't in math,
01:37:03.040 | like a page of your, you know, research, what are they going to say? And, you know, it's interesting,
01:37:11.040 | I said, like, there's a bit of their ideas in here, like, Iverson brackets, for example,
01:37:16.160 | have you ever heard of Iverson brackets? He's like, well, of course, I've heard of it. Like,
01:37:19.040 | you know, it's a fundamental tool in math. It's like, well, you know, that's one thing that you
01:37:23.520 | guys have stolen from APL. You know, that's a powerful thing, right? It's like, fantastic,
01:37:28.640 | I'd never want to do without Iverson brackets. So I kind of tried to say like, okay, well, imagine,
01:37:32.960 | like, every other glyph that you don't understand here, has some rich thing like Iverson brackets,
01:37:38.400 | you could now learn about. Okay, maybe I should give it a go. I'm not sure he has.
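For readers who haven't met the Iverson bracket: [P] is 1 when the proposition P is true and 0 otherwise, which is exactly what a comparison returns in APL, and what a NumPy boolean array gives you once cast to integers. A tiny sketch with made-up numbers:

```python
import numpy as np

x = np.array([3, -1, 4, -1, 5, -9])

# Iverson bracket [x > 0] as an array of 0s and 1s.
indicator = (x > 0).astype(int)      # [1, 0, 1, 0, 1, 0]

# Counting the positive elements is just summing the brackets, APL's +/ x>0.
print(indicator.sum())               # 3
```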
01:37:46.960 | But I think that's a good example for mathematicians, is to show, like, here's one thing,
01:37:52.320 | at least that found its way from APL. That maybe gives you a sense that for a mathematician,
01:37:58.240 | that there might be something in here. On that note, because I know we are potentially,
01:38:05.760 | well, we've gone way over, but this has been awesome. But a question I think that might be
01:38:10.400 | a good question to end on is, is, do you have any advice for folks that want to learn something,
01:38:20.880 | whether it's Chinese, or an array language, or to get through your fast AI course? And
01:38:26.560 | is there because I think, you know, like you said, you like to self select for folks that are
01:38:32.560 | the curious types and that are want to learn new things and new ways to solve things. But like,
01:38:38.480 | is there any way, other than just being tenacious to, like, be tenacious, is there tips to, you know,
01:38:46.960 | approaching something with some angle, because I think a lot of the folks maybe listening to this
01:38:51.680 | don't have that issue. But I definitely know a ton of people that are the kind of folks
01:38:57.120 | that, you know, they'll join a study group, but then three weeks in they, you know, kind of
01:39:00.400 | lose interest, or they decide it's too much work or too difficult. As an educator, and you know,
01:39:07.040 | it seems like you operate in this space. Do you have advice to tell folks, you know,
01:39:13.760 | I mean, so much, Connor, I actually kind of embedded in my courses a lot. I can give you
01:39:19.680 | some quick summaries. But what I will say is, my friend Radek Osmulski, who's been taking my
01:39:25.120 | courses for like four years, has taken everything I've said, and his experience of those things and
01:39:33.520 | turned it into a book. So if you read, Osmulski's book is called Meta Learning, powerful mental
01:39:42.480 | models for deep learning. This is learning as in learning deeply. So yeah, check out his book,
01:39:49.280 | to get the full answer. I mean, there's just, gosh, there's a lot of things you can do to make
01:39:55.760 | learning easier. You know, and a key thing I do in my courses is I always teach top down. So like
01:40:06.400 | often people with like, let's take deep learning and neural networks, they'll be like, okay, well,
01:40:10.480 | first, I'm going to have to learn linear algebra and calculus and blah, blah, blah. And, you know,
01:40:16.480 | four or five years later, they still haven't actually trained a neural network. Our approach
01:40:21.760 | in our course is in lesson one, the very first thing you do in the first 15 minutes is you train
01:40:26.320 | a neural network. And it is more like how we learn baseball or how we learn music, you know,
01:40:36.640 | like you say, like, okay, well, let's play baseball: come here, you stand there, you stand there,
01:40:40.960 | I throw this to you, you're going to hit it, you're going to run, you know, you don't start by
01:40:45.520 | learning, you know, the parabolic trajectory of a ball or the, you know, history of the game or
01:40:53.440 | whatever, you just start playing. So that's, you know, you want to be playing. And if you're doing
01:40:59.760 | stuff from the start, that's fun and interesting and useful, then top down, doesn't mean it's
01:41:07.360 | shallow, you can then work from there to like, then understand like, what's each line of code
01:41:12.320 | doing? And then how is it doing it? And then why is it doing it? And then what happens if we do
01:41:16.560 | it a different way? And until eventually, with with our fast AI program, you actually end up
01:41:23.040 | rewriting your own neural network library from scratch, which means you have to very deeply
01:41:28.240 | understand every single part of it. And then we start reading research papers. And then we start
01:41:32.960 | learning about how to implement those research papers in the library we just wrote. So yeah,
01:41:37.600 | I'd say go top down, make it fun, make it applied. For things like APL or Chinese, where there's
01:41:45.040 | just stuff you have to remember, use Anki, use spaced repetition learning. You know, that's been
01:41:52.000 | around, Ebbinghaus came up with that, I don't know what, 250, 200 years ago, it works, you know,
01:42:02.080 | everybody, if you tell them something, will forget it in a week's time, everybody, you know, and so
01:42:08.800 | you shouldn't expect to read something and remember it. Because you're human, and humans don't do that.
01:42:15.120 | So spaced repetition learning will quiz you on that thing tomorrow. And then in four days
01:42:22.960 | time, and then in 14 days time, and then in three weeks time, and if you ever forget it, it will
01:42:29.280 | reset that schedule. And it'll make sure it's impossible to forget it, you know, so it's,
01:42:34.320 | it's depressing to study things that then disappear. And so it's important to recognize
01:42:40.960 | that unless you use Anki or super memo or something like that, unless you use it every day,
01:42:47.760 | it will, it will disappear. But if you do use spaced repetition learning, it's guaranteed not
01:42:53.120 | to. And I told this to my daughter, a couple of years ago, I said, I, you know, what if I told you
01:43:00.800 | there was a way you can guarantee to never ever forget something you want to know? It's just like,
01:43:06.800 | that's impossible. This is like some kind of magic. It's like, no, it's not magic. And like, I sat down
01:43:13.280 | and I drew out the Ebbinghaus forgetting curves and explained how it works. And I explained how,
01:43:20.640 | you know, if you get quizzed on it in these schedules, it flattens out. And she was just
01:43:25.200 | like, what do you think? I want to use that. So she's been using Anki ever since.
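The expanding review schedule Jeremy sketches (tomorrow, then a few days, then a couple of weeks, resetting whenever you forget) can be expressed in a few lines. This is only an illustrative toy, not how Anki's actual scheduler works; the intervals and the growth factor are made-up numbers:

```python
def next_interval(previous_days: float, remembered: bool) -> float:
    """Toy spaced-repetition rule: grow the gap on success, reset on a lapse."""
    if not remembered:
        return 1.0                          # forgot: see it again tomorrow
    return max(1.0, previous_days) * 2.5    # remembered: wait much longer next time

# A card reviewed successfully four times, then forgotten once:
gap = 0.0
for outcome in [True, True, True, True, False]:
    gap = next_interval(gap, outcome)
    print(f"next review in {gap:g} days")
# 2.5, 6.25, 15.625, 39.0625, then back to 1 after the lapse
```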
01:43:31.520 | So maybe those are just two, let's just start with those two. Yeah, so go top down and, and use
01:43:38.640 | Anki, I think could make your learning process much more fulfilling, because you'll be doing
01:43:44.400 | stuff with what you're learning and you'll be remembering it. Well, that is awesome. And yeah,
01:43:50.160 | definitely we'll leave links to not just Anki and the book, meta learning, but everything that we've
01:43:56.560 | discussed throughout this conversation, because I think there's a ton of really, really awesome
01:44:00.400 | advice. And obviously to your fast AI course in the library. And we'll also link to, I know you've
01:44:07.040 | been on, like we mentioned before, a ton of other podcasts and talks. So if you'd like to hear more
01:44:12.960 | from Jeremy, there's a ton of resources online. Hopefully, it sounds like you're going to be,
01:44:17.120 | you know, building some learning materials over the next however many months or years. And so
01:44:21.920 | in the future, if you'd love to come back and update us on on your journey with the array
01:44:26.000 | languages, that would be super fun for us, because I've thoroughly enjoyed this conversation. And
01:44:31.040 | thank you so much for waking up early all on the other side of the world from us, at least in
01:44:36.400 | Australia. Thanks for having me. And yeah, I guess with that, we'll say happy array programming.
01:44:41.840 | Happy programming.