fast.ai APL study session 7

Hello, everybody. Hello. Good morning, Jeremy. Hey guys. Hi. Doing good. I've got an APL thing I just pushed a couple of minutes ago, so let me see. Notebook 1 of the fast.ai Numerical Linear Algebra course, but APL-ified. Oh, it's not on the forum yet. Where do I find it?

I just posted in the chat here. I literally pushed it like two minutes ago, so I haven't even had a chance to post it on the forum. I just got it done. But I will make a forum post after the call. Okay. So this is about the numerical, well, computational linear algebra course, which we did, oh my God, five years ago.

Well, three days short. I think five, oh my gosh, that's amazing. I started going through it when you mentioned it just the other day, and it seems like a great course. I don't know why I didn't do it sooner. Oh, thanks. Fantastic. We like to think so. I mean, it's not as like, obviously, immediately applicable kind of a thing, but you know, it's interesting.

So, okay, you've taken the first notebook from it. And here we go. APL, look at that. Get those two, and then we can calculate the answer using matrix multiplication. I see, this is just a little empty thing to fill in the space, I guess.

Yeah, I'm not really sure how to make tables look good in APL. I cobbled some stuff together. One more section. A couple more sections here. I got to learn a bunch of new glyphs doing this. Cool. Nice, using the power operator. I'm glad we did that. Yeah.

Oh. And eigenvalues, my goodness. I don't know how to calculate them in a smart way, but I got them calculated, and hopefully I'll learn a smarter way later in the course. All right. Great. That's the last of it. This is my favorite Bloom filter check. It's also the only one I know.

Nice job, Isaac. That's cool. Thank you. Not going to get any easier. Well, I mean, in theory, this should be exactly what APL is good at, right? Yes, absolutely. I mean, the bits where we have to start opening JPEGs and stuff might get complicated, but. Oh yeah, I spent some time trying to figure out how to open images.

There isn't a super easy way of doing it as far as I could tell, so I'll have to take another stab at that later. And then Molly's posted something. You can talk, Molly, you don't have to put stuff in the chat. We like hearing from people. Oh no, the conversation was already going so I was waiting for it to be over.

Yeah, so a few videos ago, Euler's formula was talked about just a bit, and it was an identity. Yeah, the Khan Academy one shows, like, the power series of the... sorry, Maclaurin? I forget. A power series, is that like a Taylor? Oh yeah, I guess it must be, yeah, like a Taylor series.

Um, it shows that first for sine, cosine and e, and then, for the exponential one, it inserts i into it and shows how it's a combination of the sine and cosine with an i in it. Okay. Yeah. Thanks. And then it shows how you can split them up into the ones for sine and cosine, and how it ends up looking like.
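The series argument described here can be checked numerically. The sketch below is not from the session; it assumes Python, and the helper name exp_series and the angle 0.75 are made up for illustration. It sums the power series for e to the z with i times x substituted in, and compares the result with cos x plus i sin x.

```python
import math

def exp_series(z, terms=30):
    """Partial sum of the power series for e**z: the sum of z**n / n!."""
    total = 0
    term = 1  # z**0 / 0!
    for n in range(terms):
        total += term
        term = term * z / (n + 1)  # next term: z**(n+1) / (n+1)!
    return total

x = 0.75  # arbitrary angle in radians, chosen only for illustration
lhs = exp_series(1j * x)                  # the series with i*x inserted
rhs = complex(math.cos(x), math.sin(x))   # cos x + i sin x
print(abs(lhs - rhs) < 1e-12)
```

The even powers of i times x are real and line up with the cosine series, while the odd powers carry the factor i and line up with the sine series, which is the split described above.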

Yeah, Euler's formula. Excellent. Thank you. Anything else that's come up? I was just gonna say, Jeremy, I listened to some more of the Array Cast content. I listened to yours too, which was great. What shocked me was how deeply embedded APL is in Wall Street.

Yes, that's amazing. I didn't realize that there was such a long legacy there with trading. Yes. Yeah, I mean, mainly nowadays it's K and kdb, which I think most of the big hedge fund trading folks use. But that itself came from A+, which Arthur Whitney built at Morgan Stanley, and I discovered the other day that it actually exists as an open source thing nowadays.

Not exactly... yeah, of historical interest. I can't remember who was talking about it. One of the reasons that the APL community is a bit cloistered, as you said last time, is that a lot of them didn't open source their implementations. It was really built around proprietary applications, and that limited the exposure, which is really interesting compared to something like Python.

Yeah, exactly. And it's also interesting how, for proprietary trading shops, secrecy is so important to them, but also they don't care about following cultural trends, and so they do tend to pick things that are good regardless of whether they're popular. So, like, Jane Street for example uses OCaml, and Morgan Stanley had A+, and lots of them use APL.

Yeah, it is interesting. I know a lot of them have been moving towards using more Python in recent years though. And I think partly that might be because Python is much better for working with accelerators. This one with Aaron was like one of my favorite episodes if you haven't seen it.

And this one with Brooke was interesting. It doesn't actually talk about APL that much, but. Just, like, you're doing this so that you get us to talk. Whatever works. Great. Oh, so that means that whole discussion about what's in the chat. Yes. Isaac's thing, you didn't actually see it.

Oh, what, did I stop sharing the screen? I don't even remember pressing the stop sharing screen button. Okay, fine. Yeah, okay, so this is K. And this is A+. And this is the, yeah, the publicly available implementation. And, yeah, this is the one with Aaron, who built a GPU compiler in APL.

Yeah, this one with Brooke I thought was cool. Which other ones are good? There's also one with Eric Iverson which is good. Yeah, this one is good. It's kind of a weird podcast because the first few episodes kind of assume you don't know anything about array programming.

So it's like, why would you be listening to an array programming podcast if you didn't know anything about array programming? Yeah, anyway, I felt that the point where they talked to Eric and talked about tacit programming was where it first started getting interesting. I think, yeah, I didn't know about it until you mentioned you were going on.

Thanks for the tones. All right. No worries. Okay, should we do some calculus then. Okay, going to some calculus. Yeah, so this is where we got to yesterday right we were doing. Rise over run. The slope. So this is a numerical approximation of a derivative. And it's an approximation because like the smaller you get this the closer you get to the slope at this exact point, but it's never, you know, quite short enough to be perfect.
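To make the "smaller difference, closer slope" point concrete, here is a minimal Python sketch. It is not the notebook's APL code; the function f (x squared) and the names slope and d are assumptions picked for illustration, since the function used in the session isn't shown in the transcript.

```python
def f(x):
    # Hypothetical example function; its exact derivative at x is 2*x.
    return x ** 2

def slope(fn, x, d):
    """Rise over run: how much fn changes when x moves by d, divided by d."""
    return (fn(x + d) - fn(x)) / d

# The exact slope of x**2 at x = 3 is 6; the estimate approaches it as d shrinks.
for d in (1.0, 0.1, 0.01, 0.001):
    print(d, slope(f, 3, d))
```

For this particular f the estimate works out to 6 plus d, so the error is exactly the size of the difference used, up to floating-point rounding.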

So yeah, I thought it'd be nice if we could create something that would calculate this for any function, which we can do. And the way you do it is by creating a custom operator. So we could create something called gradient. And we could kind of copy and paste all this.

And let's say we put x on the left so that'll be our alpha. And our difference will be on the right. So that'll be our Omega. So that's going to be a gradient of a particular function. Right. So the gradient of f at three, or whatever that x, which is three, let's write it at three, with a difference of 0.01.

Oh, I've got to run everything. Didn't know I restarted my notebook. I'm surprised how long it takes to run, actually. Okay, it's happening. There we go. Okay. So that number is the same as that number. So the thing is, we want to replace f with. Oh, and let's simplify this.

We don't need these parentheses, right, because plus happens first. There we go. So in order to pass in a function, we can turn this into an operator. So if you look at the help for an operator, like star dieresis, you've got up to five things around it. You've got the two arguments of the function it creates, and one or two arguments of the functions that you pass to the operator.

So there's five things. So if you want to create a custom operator, this thing's going to be omega. This thing's going to be alpha. And then there's two more things. This thing is going to be called omega omega. And this thing is going to be called alpha alpha. So if we replace f with double alpha, we've now created an operator.

And so that means we now have to tell it what function to take the derivative of. Oh, omega omega. So I'm going to put it on the right. OK, what did I do wrong? I'm not aware of needing to put it in parentheses, but what did I do wrong?

Why do you still keep the f? Why is there an f here? Because I've got to say, what am I taking the gradient of? So I'm taking the gradient of this function. So that's the thing that omega omega is going to be replaced with. So this is kind of where I find quad really nice, because right in the function, you can add your print statements. So, like, take that first alpha that it's pointing at and assign that to quad in the function.

And then it'll print out where it's there, hopefully. OK, which is that should run before most everything else. OK, so it should print that before it fails. That didn't work. All right, let's try making something simpler. We're going to create an operator which just calls the function. OK, so there's the world's dumbest operator.

So we should be able to go G of plus, which would be plus, apply that to two. OK. G plus. All right, so that did not work how I expected. Does it need to be alpha alpha the other way around? So normally you do like plus slash. So it goes on the right.

I don't think this is right. OK. So maybe if we search for this. I haven't normally found this search very useful, but let's give it a try. Dops. Yes. OK, so it does expect to have just alpha alpha if it's monadic. So that means. Oh, it goes, I think, the opposite way around to the way I expected.

All right. So let's change this to alpha alpha. And that would mean I think it's plus G. Ah, OK. So I guess that makes sense. Other operators work that way, like plus slash. Oh, of course they do. I'm an idiot. All right. Yes.

Yeah, somehow I had it backwards in my head. OK. All that. Fine. By the way, Isaac, for your flashcards, it occurred to me that a lot of these things don't really make sense as flashcards. And for those like it occurred to me that something that might be useful is if you.

Added tags to the ones that you want to be exported as cards, then you could go through in your script and just add cards for those that have like a card tag on it or something like that. That would be a way to avoid having lots of crap you don't need.

Yeah, I quickly learned the shortcut to suspend the card, but that would probably be a better way to do it to not have it generated in the first place. Yeah. OK. Great. Yes, that's a bit. I don't know. It's a bit weird in some ways, but I guess it kind of makes sense.

This is how you create an operator. So this is a monadic operator because it only has alpha alpha. It doesn't have omega omega. And it's a monadic operator that creates a dyadic function because it's got an alpha and omega. And so I don't think we actually need the parentheses anymore.
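For readers more at home in Python, a monadic operator that yields a dyadic function behaves roughly like a higher-order function: it takes one function and returns a new function of two arguments. The sketch below is an analogy only, not the APL from the session; gradient, g and square are illustrative names.

```python
import operator

def gradient(fn):
    """Operator analogue: fn plays the role of alpha alpha (the operand)."""
    def derived(x, d):
        # x plays the role of alpha (left argument), d of omega (right argument).
        return (fn(x + d) - fn(x)) / d
    return derived

def g(fn):
    """Analogue of the simplest possible operator: just call the operand."""
    return lambda *args: fn(*args)

def square(x):
    return x ** 2  # hypothetical function to differentiate

print(gradient(square)(3, 0.01))  # roughly 6.01
print(g(operator.neg)(2))
```

In the APL version the operand is written to the left of the operator name, the way plus sits to the left of slash, whereas in Python it is passed in parentheses.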

Yeah, we don't because operators bind more tightly. So it's as if this is parenthesized. Does that make sense? So a monadic operator takes stuff from the left. If you give it an alpha alpha, it would take stuff on the left. I mean, I assume we could go omega omega.

Although, as Isaac said, that's not quite what you would expect given how other ones work. Let's see if it works. No, you can't. OK, so yeah, it goes on the left if you say alpha alpha. And if it's on the left and the right, then you would do you would do both.

OK, so that's our custom. Derivative. And that's a numeric approximation of a derivative to be more precise. All right, so. OK, we've got a whole list of operators here. Wait, so left arrow is considered an operator. Has anybody figured out what the curly brackets means yet, by the way?

I haven't. I'll tell you an operator I'd quite like to do is this one, tilde dieresis. I think it can save a parenthesis in the one we just did. Correct. Tilde dieresis. OK, which is a monadic operator. So it's going to take one function on its left and it produces a dyadic function.

Hence, there's the one function on its left and that results in the dyadic function. It's got a bit of a strange name, commute, but all it does is it takes x and y and it returns a function that actually calls y f x rather than x f y. So if we do. OK, what's the letter for that?

Yeah. Shift T. And that's called tilde dieresis. Monadic, shift T. Dyadic, shift T. Oh, there is no monadic. OK. So then there's commute. And. You could say, yeah, three minus two, so that would be putting x on the left and y on the right. So it's three minus two.

But if we do it the other way around, two minus three, we could also write it like this: two minus, sorry, three minus. What was the letter again? T. Yeah. And then commute means switch the order of the arguments. Does that make sense? So that flips them around? Oh, just one moment.
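As a rough Python analogy for commute (not APL, and with illustrative names), the operator can be sketched as a function that takes a two-argument function and returns one that calls it with the arguments swapped.

```python
import operator

def commute(fn):
    """Return a function that calls fn with its two arguments swapped."""
    return lambda x, y: fn(y, x)

minus_commuted = commute(operator.sub)

print(operator.sub(3, 2))    # 3 - 2
print(minus_commuted(3, 2))  # evaluated as 2 - 3
```

Applying commute twice gets back the original argument order, which is a quick sanity check on the definition.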

I don't know what it wants to be. OK. Sorry, we had a missing computer problem. So Marty found a link for brackets. Curly braces. Great. Let's take a look. I thought they might be optional arguments, but that didn't make sense for results. So it can indicate shy results.

And did you find out what a shy result is? I've heard that word before. APL shy result. Ah, OK. By default, functions print the result unless they're shy. There you go. OK, so that's an optional argument. And that's a shy result. Great. How do you define a shy result in a function?

No idea. OK. So this is a dyadic tilde dieresis. And so we can now redefine gradient. Like so. So because the right hand side is handled first, we can now say, and I find it's really helpful to find a way to say this, which is I would say omega.

Divided into the right hand side. So I wouldn't say divide. Commute, I would say divided into like normally there's some way you can like express the idea of these things being backwards in a reasonable math or English expression. So that does make it a bit more clean, which is nice.

And then there was another version of commute, of tilde dieresis, which is constant. And so constant just always returns its operand. So we could create a function called zero. And so this. This is a function. And so we can apply it to anything we like. And I believe we can even do it dyadically.

So that's just a function that returns zero. And that's it for tilde dieresis. This form I see a lot. People use it very frequently in APL. Tell me, when are they using it? They use it for exactly this kind of purpose. APLers hate parentheses. And they hate unnecessary symbols, which I kind of get.

Certainly by having less stuff to read, I find it easier to read. You know, the other one I think we might want to do is each. Which one of these is each, does anybody remember? Yeah, this one. Okay, so this is just dieresis. And here it is.
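The constant form can be pictured in Python as a function factory that ignores whatever it is later called with. This is an analogy under that assumption, not the APL definition; constant and zero are illustrative names.

```python
def constant(value):
    """Return a function that ignores its arguments and always gives value."""
    def fn(*args):
        return value
    return fn

zero = constant(0)

print(zero(5))      # one argument, like a monadic call
print(zero(1, 2))   # two arguments, like a dyadic call
```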

This one. It's a monadic operator. Oh, this word here means can be either monadic or dyadic. It's not ambivalent as in I don't care, but it's ambivalent as in either valence. Valence is handedness. Yes, this is. Okay. This is a list of. Oops. Okay, this is a list. This is an array of arrays.

So it's an array with two elements. And if we try to do plus slash of that. It's going to get upset because it's trying to do. It's trying to insert plus between its arguments, which would be the same as typing one, two, three, four, plus five, six, seven. The each operator takes the previous function, which in this case is itself being defined with an operator.

And it means sum, and it applies it over each of its arguments. So plus slash each means apply plus slash to this and then apply plus slash to this. Thus giving us the results 10 and 18. Does that make sense? And I think that might work for like. We're going to get a matrix, which is a two row by three column array.

Iota six. Rho's got to go between them. Cool. Thank you. Okay. So if I tried to do. Two, three plus mat. Something like that can work in NumPy. It would broadcast the. Maybe like this, or broadcast this over each row. But it doesn't add it. Also, I think it can work in J, but I don't think by default it works in APL.

But I think if we say each, it applies to each element of mat. Or each column. Something like that. I think the problem is that it's going through each of two, three, four separately. What does this look like? Plus slash. Okay, so it doesn't actually work that way correctly on a matrix.

On a matrix. I think this might be related to when we were looking at iota before. When we were using it to find values in the matrix, you wouldn't find individual values, you'd find whole rows at a time. Yeah, I think the issue is it's not going over cells of an array, it's going over items.
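A Python picture of plus slash each, using the same two-item nested array from the discussion (1 2 3 4 and 5 6 7), might look like the sketch below. The names each and plus_reduce are illustrative, and Python lists stand in for APL arrays.

```python
from functools import reduce
import operator

def plus_reduce(vector):
    """Insert + between the elements, like plus slash on a simple vector."""
    return reduce(operator.add, vector)

def each(fn):
    """Operator analogue: apply fn separately to every item of a nested list."""
    return lambda items: [fn(item) for item in items]

nested = [[1, 2, 3, 4], [5, 6, 7]]
print(each(plus_reduce)(nested))  # [10, 18]
```

Without each, the reduction would try to add the two inner lists to each other, which is the length mismatch described above.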

So I'm guessing if we did it like this. Okay, yeah, so that's going to go over each of these. It's going to do two plus this and then three plus this. And I assume there must be some way to make that apply over a rank two array, but I don't know what it is.

So I guess, if anybody figures that out, let us know. Otherwise I guess we'll probably come to it at some point. I put the syntax for defining a shy function in the chat. Okay. I don't want to disrupt your flow, but: shy, and this is not shy. Let me copy them. I don't really have much of a flow, I gotta be honest.
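The pairing behaviour described here, two plus the first item and three plus the second, can be sketched in Python with zip. The right-hand data is an assumption (the nested array from earlier in the session), since the transcript doesn't show what was typed.

```python
left = [2, 3]
right = [[1, 2, 3, 4], [5, 6, 7]]  # assumed: the two-item nested array from earlier

# Pair each left item with the matching right item, then add within the pair.
paired_sums = [[l + x for x in r] for l, r in zip(left, right)]
print(paired_sums)  # [[3, 4, 5, 6], [8, 9, 10]]
```

This only works when both sides have the same number of items, which fits the observation that each walks over items rather than over the rows of a matrix.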

Okay. Seems like you're on a roll to me. Copy this. That's more like it. Okay. So, I'm going to get shy, is it. Yep. It's not shy. So what do you actually add to the function that makes it not print out. Cool. Thanks. Yeah. Not sure when I would want things not to print out but.

Okay, so none of their examples are using matrices. The other place that can be helpful to look at is the APL Wiki. It's only defined in nested APLs. I think that means ones where you can have an array in an array. Okay, I don't know what else that means there. Let's try searching APLcart. It'd be nice if you can search for symbols.

So I assume there's going to be some magic incantation that basically turns a matrix into an array of arrays of rows, and that you would do it that way, I assume. Okay. So you can search for an APL symbol on APLcart and it will give you everything that it's in. Most of them are gibberish to me, but I mean, it's not a bad idea to learn how to use this thing, because when I was on the podcast they seemed to think it was a thing worth learning about.

Okay, here it is. Each. How do I see. So typing comma each ensures all items are vectors. I see, these are like idioms, I guess. Yeah, so, like, one that I found when I was working on this computational linear algebra: you can type in, like, calculate the determinant, and they've got a big long thing for that.

And so some of them. I think for most of them, the ones at the top seem to be simpler than the ones down below. Okay, they're sorted. And actually this table lives in a text file in a GitHub repository. Here's Conway's Game of Life. That's great. I'm gonna be really happy when I can read all this.

Yeah, this is intense. That's cool. I already see one thing that mentions a matrix. Can you try adding matrix to the search? I think that works too. Pepto diagonal matrices. I believe he does have additional tags and stuff that's not shown here to help with the searching.

I don't know how good they are. I'm going on matrix of shape matrix. So if I copy this and saying at the top that what each thing is, M is a matrix and capital N is a numeric array. I was just saying it's a numeric array which is a matrix I guess.

So I guess that means in theory, we could type mat here. We can. Okay. This is. Okay, this is the H, which is flipped. Okay, I'm not going to try to do this just now. All right. I see slash bar and slope bar a lot. I didn't know this is called slope.

I always call it backslash. And I have no idea what they are. So maybe we should learn. It says an operator. It's a monadic operator. And we type it. Oh, the slash. Makes sense. Slash. And this is called. Replicate first. No, it's called reduce first. Oh, my daughter wants me again.

Sorry. All right. So if you do the sum, plus slash, on a matrix versus this one, one will sum column-wise and one will sum row-wise. Okay. Cool. And we had a matrix ready to go then. Okay. And is that literally all it does? This is okay. So in J.

J has a rank operator, which is actually the double quote sign, where basically you can just always say what axis you want to use, so that would be reduce over the zero axis, and that would be reduce over the one axis. But I'm not sure if you can do anything quite the same.

There is a thing called rank. Maybe we should see how this is different. What's this one here, which is called. I assume. J. J. Shift J. Classic edition. It doesn't have the same usual information. Anyway, it's called rank. I already forgot what letter I said. I guess this is called jot dieresis.

I didn't see a thing for it. Monadic. Rank. And if I say. Do it over this axis. Well, that sure didn't work. Oh, that might be this. Hey, look, it is the same. So, wait, no, that's the same as sum. And what if we put a zero. Okay, I don't know what I'm doing.
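The two reductions can be pictured in plain Python on the two by three matrix built from iota six, assuming 1-based indexing so that it holds 1 to 6. As far as I know, plus slash works along the last axis (one sum per row) and plus slash bar along the first axis (one sum per column). The variable names are illustrative.

```python
mat = [[1, 2, 3],
       [4, 5, 6]]  # assumed contents: 2 rows by 3 columns of 1..6

# Reduce along the last axis: one total per row.
row_sums = [sum(row) for row in mat]

# Reduce along the first axis: one total per column.
col_sums = [sum(col) for col in zip(*mat)]

print(row_sums)  # [6, 15]
print(col_sums)  # [5, 7, 9]
```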

Let's come back to that one and make sure we okay so it sounds like slash bar there's not much to learn, which is it's just the same as slash. But it does it over a different axis. I assume is going to be the same for backslash bar. Except they didn't call it backslash they called it slope.
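If the guess above holds and the backslash forms are running (cumulative) versions of the slash forms, the difference between the two axes would look like this in Python. This is a sketch under that assumption, using itertools.accumulate and a two by three matrix of 1 to 6; none of it is from the session's notebook.

```python
from itertools import accumulate

mat = [[1, 2, 3],
       [4, 5, 6]]  # assumed contents: 2 rows by 3 columns of 1..6

# Running sums along each row (last axis).
scan_rows = [list(accumulate(row)) for row in mat]

# Running sums down each column (first axis), transposed back to row layout.
scan_cols = [list(row) for row in zip(*(accumulate(col) for col in zip(*mat)))]

print(scan_rows)  # [[1, 3, 6], [4, 9, 15]]
print(scan_cols)  # [[1, 2, 3], [5, 7, 9]]
```

The last entry of each running sum matches the corresponding full reduction, which ties the two families of glyphs together.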

And that's probably going to be back tick backslash I'm guessing. Nope, it's not right. Okay, so we can, we can specify the axis of plus slash. Yep. By adding like bracket one, right after the operator. I believe, and then bracket. No square brackets, immediately after the operator, so just like that.

Okay, so does that apply for, like, everything, or is that just this particular one? I feel like I've seen that mentioned in the docs. Yeah, it's called function axis. Like operators, can you put it on all functions? I think it's all functions.

Okay, so why on earth do we need slash bar? Axis. This is what I'm looking at here. It must be a monadic primitive mixed function taken from these, or a function derived from slash or slope. There's a bunch of these. And then you've also got axis with dyadic operand, and it must be a dyadic primitive scalar function.

Oh, well, that's good. So any dyadic scalar function primitive works. Or a mixed function, which I see means one with an operator where they've used one of these. Okay, so actually it does sound like you can do a lot of things with it. That's good. Yeah, there's a wiki page I put in the chat about function axis specifically, and it kind of covers both.

I think it's kind of a combination of these two. Okay, cool. I see a few other people have messaged in the chat, by the way. Is there anything we wanted to talk about or ask about, if anybody wants to speak up? Function axis. Mine was not specifically related to the content we were discussing, so I didn't want to bring it up.

Oh, please do. We're so, like, not at all focused, Molly, as you'll see. This is just, like, yeah, hanging out chatting about whatever seems interesting. This formula is definitely interesting. Oh yeah, so I first came across Euler's formula in a paper that I was reading on positional embeddings.

So, I had no idea what it was. I actually use this to figure out the fundamentals I was missing. Yeah, it was rotational positional embeddings what it was called. Okay. So, for those of you who are interested in transformer models. There's like literally no sense of like the order of things in a vector.

And so it's like literally impossible for it to learn anything that requires order and language is ordered. So we use these things called positional embeddings. This rotary position. It's been a while. Yes, that's it. And then there's a proof. Yeah, this was the actual proof I was going through, trying to figure out what they were talking about.

All right. Yeah. And then I was missing a lot of the fundamentals, so I just go down to where they're doing the actual proof. And then you can just click on whatever you do not understand, like the proof for what you don't understand. So if you don't understand complex cosine, like how they're able to get the complex cosine function. You can find yourself clicking endlessly.

Come back to where you started. Sometimes that can happen. But yeah. And then I just start looking up the various things they're talking about here until I understand it. So, okay, that's cool. It's just one way I was able to understand papers, like the math behind papers and stuff.

Thanks. That's great. Okay, we might stop now and probably talk about axis next time. Axis. Something's going to be there, put a number. We can do that next time. That's going to be super handy. All right. Thanks everybody. See you next time. Bye.
