back to index

The Fastlite DB library - Answer.AI dev chat #2


Whisper Transcript | Transcript Only Page

00:00:00.000 | OK, welcome to our dev chat with Jono, Alexis, Luke.
00:00:06.860 | We are going to talk about this library
00:00:12.280 | that I just created over the weekend called Fastlight.
00:00:16.800 | It was meant to be called FastSQLite,
00:00:19.520 | but there's already a FastSQLite,
00:00:21.160 | so it became Fastlight.
00:00:24.480 | I think it just took me a day to write this, which
00:00:27.040 | I'm quite pleased with how quickly it went
00:00:29.760 | and it's really because it's just pasting together
00:00:34.480 | two other people's work on the whole.
00:00:40.720 | So let me describe what this is about.
00:00:44.360 | So I'm kind of on a bit of a mission
00:00:49.320 | at the moment to make web application programming easy
00:00:53.720 | again, like put a PHP file in your home directory kind
00:00:58.680 | of easy, or stick a .pl file in your CGI bin directory
00:01:04.200 | kind of easy.
00:01:07.160 | So I've been looking into like, OK,
00:01:11.160 | where do we stick our data for our web application?
00:01:14.960 | And two very popular options at the moment
00:01:19.000 | are SQLAlchemy, which is kind of the original ORM,
00:01:24.560 | Object Relational Mapping, and the very trendy SQLModel
00:01:35.240 | from Sebastian, the amazing author of FastAPI,
00:01:39.320 | amongst other things.
00:01:40.320 | And yeah, because I think Sebastian's work tends
00:01:47.680 | to be extremely good, I kind of assumed
00:01:49.320 | I would go with SQLModel.
00:01:50.600 | And I went through the kind of documentation,
00:01:53.600 | the original tutorial.
00:01:56.080 | And I kind of very quickly early on got to this bit
00:01:59.960 | where it's like, oh, OK, you have
00:02:02.040 | to create these five different things
00:02:03.760 | with different inheritance hierarchies
00:02:06.920 | and passing things to constructors
00:02:09.320 | and a rather lengthy--
00:02:12.920 | and I was like, wow, that's--
00:02:15.960 | I'm not sure I'm clever enough to remember or do all that.
00:02:19.440 | I think I'm probably too lazy to do it anyway.
00:02:23.840 | So I was a bit turned off of that.
00:02:25.320 | So then I started--
00:02:29.760 | I've used SQLAlchemy before, but I always
00:02:32.520 | forget how to use it, which is never a good sign.
00:02:34.800 | And there's been a version 2, so I wanted
00:02:37.040 | to use the new version properly.
00:02:39.080 | And again, I kind of started looking at their quick start.
00:02:43.880 | And you have to have a subclass and then
00:02:46.880 | a subclass of that subclass.
00:02:48.720 | And magic names in them and both annotations and defaults.
00:02:57.560 | And again, I was like, I don't think I can do all that.
00:03:05.600 | So also, I know how to write SQL.
00:03:11.440 | And I know how to use GUIs.
00:03:13.360 | I really like GUIs for databases.
00:03:15.600 | You can kind of see them all there and move things around
00:03:18.640 | and--
00:03:20.080 | I thought, like, hmm.
00:03:21.240 | I started to link these two together and understand,
00:03:26.480 | like, well, why can't I just use SQL?
00:03:30.720 | And you can.
00:03:31.720 | Like, SQLAlchemy also has this SQLAlchemyCore piece,
00:03:38.440 | which is part of this unified tutorial, which
00:03:46.120 | is this foundational Alchemy database toolkit, whatever.
00:03:51.040 | So I read quite a bit of that.
00:03:52.320 | And I also got an O'Reilly book about SQLAlchemy and read that.
00:03:57.080 | I still felt not smart enough to do that either.
00:04:03.320 | So I kind of kept looking around a bit
00:04:07.320 | and remembered that my friend Simon Willison had created
00:04:17.600 | a thing called SQLiteUtils, which I remembered
00:04:21.600 | was this command line application for using SQLite
00:04:29.600 | for doing stuff like this.
00:04:35.000 | And you could pipe data to it.
00:04:36.560 | And you could do CSV files and all this.
00:04:39.720 | And I remember when Jono and I were
00:04:41.800 | looking at Simon's LLM library, one of the things
00:04:47.720 | we discovered was that, like, actually the CLI was just
00:04:51.560 | a kind of a thin wrapper for a really nicely designed Python
00:04:55.880 | And I thought, like, oh, I wonder
00:04:56.800 | if that was the case for this, too.
00:04:57.880 | And I looked.
00:04:58.380 | I was like, oh, it is also the case here.
00:05:00.760 | And I was like, oh, OK.
00:05:03.840 | So then I kind of started again.
00:05:06.320 | And this is the getting started.
00:05:10.960 | I was like, OK, even I can do this.
00:05:14.160 | Database, name of file, table, insert, a list of dictionaries,
00:05:22.760 | loop through them like this, get back.
00:05:24.840 | And I'm like, OK, I think I'm done.
00:05:28.840 | I'm ready to work now.
00:05:31.000 | So that's why I leant towards this.
00:05:36.360 | So I don't know.
00:05:37.120 | Do you guys have any thoughts about should I be--
00:05:40.800 | is this a mistake?
00:05:41.640 | Should I be using an ORM?
00:05:43.480 | Or does this seem like a reasonable place to start?
00:05:45.640 | So I've had this experience.
00:05:51.920 | And it's been mixed for me where I see something and say,
00:05:54.960 | that looks way too complicated.
00:05:56.320 | I'm sure I can figure out something simpler.
00:05:58.960 | And then I either write or find a thing that looks simpler.
00:06:01.880 | And then you hit a point where it's like, oh,
00:06:03.880 | it would be really good if it did this or that.
00:06:06.800 | And so then you start tacking on those bits.
00:06:08.640 | And you end up with something that ends up just
00:06:10.600 | as complicated as the end product.
00:06:12.280 | But I'm assuming, because you're showing us this thing
00:06:14.480 | and you're happy with it, that we're hopefully
00:06:16.400 | not going to end up back at the convoluted Jeremy's
00:06:18.640 | version of SQLchemy.
00:06:20.840 | No, I mean, I don't think so.
00:06:22.160 | Because you just passed SQL, which I guess--
00:06:28.440 | OK, it's a whole other language designed out
00:06:31.720 | of a whole area of math called relational algebra
00:06:33.960 | to be especially around dealing with relational data.
00:06:38.560 | So I don't think you could ever hit a point where it's like,
00:06:41.800 | oh, we can't do stuff.
00:06:42.880 | Because this is just like, no.
00:06:45.040 | When I say just use SQL, for somebody who doesn't know SQL,
00:06:48.680 | that's a huge just.
00:06:50.600 | But I don't know.
00:06:52.480 | It feels like in the past, when I did a lot of stuff
00:06:58.160 | with C#, that's what we wrote Kaggle in.
00:07:01.120 | We were always talking about SQL versus ORMs.
00:07:04.040 | And I don't know.
00:07:07.560 | They're both fine, I guess.
00:07:09.800 | But in this particular case, I think--
00:07:12.000 | I don't know what it was about it.
00:07:13.440 | But I guess Python being a dynamic language,
00:07:17.640 | ORMs tend to be quite a good fit for more static languages,
00:07:22.200 | maybe.
00:07:23.440 | But even there, when I used to use F#,
00:07:25.800 | F# had this really cool thing called type providers,
00:07:28.320 | where it actually looked more like this,
00:07:35.600 | which is pretty neat.
00:07:37.600 | So yeah, I don't think--
00:07:39.480 | I'm pretty sure you'd never get to a situation where it's like,
00:07:42.100 | oh, we're going to have to add a whole lot of complexity.
00:07:44.200 | You might get to a situation where it's like,
00:07:46.200 | oh, I would love to add a few more shortcuts
00:07:48.080 | to make some of this SQL a bit easier to work with.
00:07:51.400 | But even there, I feel like the way we do SQL is like--
00:07:59.000 | you just pass parameters, you know?
00:08:00.880 | Question mark, insert, or do it by name.
00:08:06.860 | I quite like the look of it.
00:08:08.240 | I agree with your instinct here, because I
00:08:12.000 | think it's often a sign of a good taste
00:08:14.160 | when something doesn't require you to learn a lot to use it.
00:08:17.520 | So if it doesn't require you to learn its own concepts,
00:08:20.120 | its own type names--
00:08:21.240 | Yeah, also, I trust Simon.
00:08:23.960 | Simon, he's written lots of stuff in this,
00:08:27.160 | and he's been working on this since 2018.
00:08:32.160 | So presumably, if it wasn't doing what he needed it to do,
00:08:36.320 | he would have noticed by now.
00:08:40.160 | So OK, so that's the background.
00:08:42.920 | So then I just started playing with it.
00:08:48.720 | And so when I spoke to Sebastian, who writes SQL Model
00:09:01.080 | and FastAPI and whatnot, he told me
00:09:04.040 | that a big driver of how he writes things and what he writes
00:09:07.400 | is he wants autocomplete.
00:09:09.680 | And I feel the same way.
00:09:11.920 | And I think this is actually often what's going on here,
00:09:14.560 | is why not just use a create table statement?
00:09:22.840 | It's like, well, in VS Code or whatever,
00:09:26.400 | you need something like this for it
00:09:28.240 | to be able to give you autocomplete,
00:09:29.760 | because otherwise it doesn't know what fields are in a user
00:09:32.760 | and what their types are and so forth.
00:09:36.720 | And it's not just the autocomplete,
00:09:38.280 | but also because this is working with identic,
00:09:41.360 | which checks things.
00:09:42.880 | That's why there's all this extra stuff here
00:09:44.760 | to make sure at each time you're validating correctly.
00:09:50.880 | And I guess if I was using a static language,
00:09:53.600 | maybe that's what I'd be wanting.
00:09:55.360 | But in a dynamic language, I feel
00:09:58.800 | like I would like to take advantage
00:10:00.200 | of the dynamic features of the language.
00:10:02.480 | That's kind of why I use Python to a large degree.
00:10:08.000 | So the kind of thing that that means,
00:10:11.920 | then, is if you--
00:10:13.320 | I don't know if you guys like Brett Victor.
00:10:16.000 | Brett Victor has this idea of never
00:10:20.880 | work in a dead environment.
00:10:23.680 | He's got this talk called "Stop Drawing Dead Fish."
00:10:29.160 | Always work in a live environment.
00:10:30.600 | A live environment is something where you're directly
00:10:32.800 | manipulating the information or the content
00:10:37.000 | or whatever in real time.
00:10:38.520 | You can see it.
00:10:40.640 | And yeah, you can't get that in VS Code.
00:10:43.360 | I mean, VS Code has notebooks.
00:10:44.560 | But if you're not using notebooks,
00:10:46.000 | the rest of the notebook, it's a live environment, right?
00:10:49.200 | So if I start writing a few cells here,
00:10:56.240 | so I, from sqliteutils, I've imported database.
00:11:02.360 | And then this database, Chinook, is something you can download.
00:11:06.440 | I've got it in the readme, how to download it.
00:11:11.200 | And that's now something I can--
00:11:13.800 | if I hit Tab, I know these are the actual things in DB
00:11:18.800 | because it's actually checking dynamically.
00:11:23.400 | It's actually behind the scenes doing that.
00:11:28.480 | And the ones that start in underscore,
00:11:30.320 | it doesn't show me because they're considered private.
00:11:32.520 | So it'll show me these.
00:11:35.600 | So you can check that, db.a tab.
00:11:40.000 | There they are, right?
00:11:41.040 | But having said that, I can do like this to see the tables.
00:11:54.080 | I can do this to get a particular table.
00:12:01.200 | Sorry, this to get a particular table.
00:12:07.880 | But this particular library isn't
00:12:10.720 | designed to take advantage of Python's dynamic features.
00:12:16.080 | So I can't hit Tab here or anything and see the tables.
00:12:20.760 | So I have to copy and paste or go to tables, find it,
00:12:25.360 | double-click it, come back, do this, et cetera.
00:12:31.520 | So I don't like that.
00:12:34.040 | So yeah, basically, the main thing I wanted to do
00:12:37.600 | is make it so I can use this as easily as I can use files
00:12:43.240 | and folders in Bash or ZSH by hitting Tab and seeing
00:12:47.280 | all my options and stuff.
00:12:48.720 | So I'm going to show you what it looks like to use it.
00:12:56.680 | And yeah, tell me if you think this--
00:12:59.520 | yeah, if any of it doesn't make sense
00:13:01.520 | or doesn't seem interesting or whatever.
00:13:05.120 | So yeah, so in the--
00:13:08.760 | this is the-- so because this is an nbdev project,
00:13:11.920 | that means that the index notebook
00:13:14.640 | is the same as the readme, is the same
00:13:17.680 | as the documentation.
00:13:21.400 | So yeah, so this is the repo.
00:13:27.120 | You can click up here to get to the documentation.
00:13:30.600 | And the documentation is the same as the index.ipamd,
00:13:37.920 | which is the same as the--
00:13:41.120 | which is the same as the readme.
00:13:45.720 | And so it's nice that you can open up
00:13:48.040 | that in Jupyter or Colab or whatever and try it.
00:13:53.720 | So my database now with Fastlight--
00:14:01.240 | Fastlight adds a T property.
00:14:03.520 | SQLAlchemy has a C property, which is similar.
00:14:05.960 | So I'm going to try to borrow some ideas from SQLAlchemy
00:14:08.280 | as well, make things not too weird.
00:14:11.120 | So db.t.
00:14:13.000 | And that has autocomplete.
00:14:18.040 | And so I can click it, or I can tab, enter,
00:14:25.800 | and I've got the actual thing.
00:14:30.320 | I'm too lazy to press B, full stop.
00:14:33.760 | So you can store that in a thing.
00:14:38.160 | So that's the database tables.
00:14:39.760 | And also, I've tried to--
00:14:40.880 | another thing I think is nice when
00:14:42.880 | you're working in a notebook is to make sure things
00:14:45.000 | have a nice representation.
00:14:46.920 | So when I show the database tables object, which
00:14:50.520 | I remember is the thing I've got this tab completion on it,
00:14:54.200 | it lists out the tables.
00:14:58.920 | So yeah, so I can grab a particular table.
00:15:04.200 | I can store it somewhere.
00:15:08.080 | Or I can-- another thing I added is
00:15:09.680 | you can grab multiple tables by using an indexer, which
00:15:15.040 | I think is quite Pythonic.
00:15:16.280 | And yeah, as mentioned, we can also do our tab completion.
00:15:32.760 | So that's piece number one of this.
00:15:38.320 | And then, yeah, piece number two is
00:15:39.880 | once you've got a table, which remember we got by doing this.
00:15:44.280 | So from the database tables, we got the artist table.
00:15:47.480 | So from that, we can get the columns with C.
00:15:50.680 | And so you won't be surprised to hear that, first of all,
00:15:54.440 | that column list has a nice representation
00:15:57.560 | and that we can autocomplete it.
00:16:06.480 | So columns are less interesting than tables.
00:16:11.040 | Table is actually a particular type
00:16:19.600 | that has things like primary keys and various methods
00:16:27.920 | and whatever else.
00:16:29.400 | Columns are nothing fancy, actually.
00:16:32.360 | In SQLite Utils, so the only thing a column basically
00:16:49.320 | has of interest is its type and its name.
00:17:01.800 | So since my thinking was like, yeah, let's just use SQL,
00:17:07.320 | that means I want to have tab completion
00:17:10.000 | as I write SQL queries.
00:17:11.840 | So this is kind of nice.
00:17:13.160 | If I do an fstring, then I've got tab completion
00:17:20.520 | of any tables I've created.
00:17:22.720 | Or I could do it this way as well.
00:17:27.760 | So in this case, I've already got it in a variable.
00:17:31.200 | And ditto-- so for my album columns,
00:17:37.480 | I've got completion of that as well.
00:17:39.080 | Now, what's that doing?
00:17:44.560 | All it's doing, if I say-- if I stringify that column name,
00:17:49.640 | is this is what it appears as.
00:17:51.040 | It appears as the artist in double quotes
00:17:53.360 | dot name in double quotes, which in SQL--
00:17:56.880 | most dialects of SQL, including SQLite,
00:17:59.280 | that's how you represent a name that
00:18:05.280 | could have spaces in it, or slashes in it,
00:18:07.640 | or whatever else.
00:18:09.120 | So this basically ensures that your names
00:18:11.440 | will work correctly.
00:18:12.880 | Ditto for your table name here.
00:18:15.160 | So it's not doing very much.
00:18:16.520 | It's just giving you this convenient tab completion
00:18:20.240 | and correct behavior.
00:18:25.320 | Can I ask a question?
00:18:26.200 | Yeah.
00:18:27.520 | So the reason that you were able to get a tab completion
00:18:30.600 | on the first field, artist with a lowercase a,
00:18:33.680 | that's just because you happened to have
00:18:35.640 | defined that variable in the notebook environment, right?
00:18:38.480 | It wasn't--
00:18:39.960 | Exactly.
00:18:41.280 | So the things that get tab completed in Jupyter
00:18:47.960 | are the keys to this dictionary.
00:18:53.280 | This dictionary, called globals in Python,
00:18:56.800 | is all the stuff you have access to.
00:18:59.480 | So for example, there's a--
00:19:00.560 | so if I go g equals globals, and then I look up,
00:19:11.680 | here is all of the things.
00:19:14.160 | So I should find that there's a thing called open, for instance.
00:19:22.560 | There it is, right?
00:19:23.920 | And that would be exactly the same as saying globals open.
00:19:30.000 | So this is just the--
00:19:31.200 | this is what Python's syntax sugar is, you know?
00:19:36.320 | So that's the top level--
00:19:37.960 | You could write it with no bare words except the word globals,
00:19:41.360 | and for, and if, and def, you know?
00:19:45.920 | So if you, say, were manipulating two SQLite databases,
00:19:50.320 | and then you were still just authoring that query string,
00:19:53.680 | and you wanted to get autocomplete that
00:19:55.680 | was scoped to the tables in this database versus that database,
00:19:59.920 | then you'd need to do something like db1.tables.
00:20:03.080 | or something, and then it might--
00:20:04.400 | Well, it's not tables, it's t.
00:20:05.600 | But yeah, so you could-- exactly.
00:20:07.200 | So you'd have db1 and db2, and you'd have db.t.blah.
00:20:11.200 | Or you could have, earlier on, if you're
00:20:12.960 | using it multiple times, you would
00:20:15.160 | have said, like, visual artists equals db visual.t.artist,
00:20:27.840 | or whatever, and then there'd be other artists.
00:20:30.560 | And yeah, so you can name them whatever you like.
00:20:33.800 | OK, cool.
00:20:35.120 | So yeah, that is just a regular variable
00:20:39.120 | that happens to contain a symbol of type db.table.
00:20:47.840 | And it stringifies like so.
00:20:51.760 | And that's why that works.
00:20:56.800 | So yeah, so I added a little .q thing,
00:21:00.400 | which you just pass a query to, and it just
00:21:02.520 | returns the result of executing that query on that database.
00:21:09.080 | So just like t is tables, v is views.
00:21:16.680 | So this particular database doesn't come with any views.
00:21:19.720 | So I created a very important view,
00:21:22.120 | which is all of the albums in the database created by ACDC,
00:21:27.920 | or as we call it in Australia, ACADACA.
00:21:31.000 | So there is our ACDC.
00:21:33.960 | So here, I've created my album variable.
00:21:36.820 | And again, I don't have to do that.
00:21:38.320 | I could just write dt.album.
00:21:41.000 | And then I've got a SELECT statement here.
00:21:44.000 | And if you want to, you could have a look at that.
00:21:54.040 | So that's all it's done, right?
00:21:58.640 | Again, album just adds the quotes around it.
00:22:02.280 | AC.name fully qualifies it.
00:22:05.200 | So that's the view that we created.
00:22:09.680 | I don't know how common this is, if it's just a SQLite thing,
00:22:12.240 | but this is just a shortcut for saying onAlbum.artistId
00:22:15.720 | equals artist.artistId.
00:22:17.840 | It's just a convenient way of doing an inner join when
00:22:21.200 | they both have the same name.
00:22:23.720 | Anyway, the main thing there is to say we created a view.
00:22:28.000 | And then we can query the view.
00:22:30.600 | And so now, bump.
00:22:35.240 | There's only one thing in db.v, because there's just one view.
00:22:42.400 | So this next bit's kind of magic.
00:22:46.040 | You can go Diagram, db.tables.
00:22:50.040 | And I find this super helpful.
00:22:51.560 | I'm a very visual person.
00:22:53.680 | I don't really understand a database until I see it.
00:22:57.840 | And so you can see here this Chinook database
00:23:01.360 | that we downloaded.
00:23:02.520 | There are employees.
00:23:04.160 | There's a self join on reports, too.
00:23:06.160 | There are artists that have albums.
00:23:13.640 | Albums that have-- I should probably
00:23:16.240 | join these arrows in the opposite direction,
00:23:18.600 | shouldn't I?
00:23:20.880 | So it should be an album, one to many has--
00:23:24.120 | yeah, so I might--
00:23:25.160 | I don't know, which way around does it make sense for the--
00:23:27.640 | should it point to the primary key,
00:23:29.240 | or should the primary key point to the many side?
00:23:32.560 | What did access do?
00:23:33.520 | Oh, they didn't have arrows.
00:23:37.840 | That's right.
00:23:38.400 | They had to do one indicators and many indicators.
00:23:42.160 | That doesn't quite help.
00:23:45.560 | Anyway, we'll think about whether we
00:23:50.040 | should turn those around.
00:23:51.720 | So an artist has album, an album has tracks.
00:23:54.520 | A track is-- could be a number of tracks in an invoice line.
00:24:01.360 | So I guess this is a many to many.
00:24:04.160 | Invoice has many tracks.
00:24:05.640 | Track has many invoices.
00:24:06.960 | An invoice is for a customer.
00:24:08.200 | A customer has a support rep.
00:24:09.600 | A support rep has someone they report to.
00:24:14.480 | And tracks can be on playlists.
00:24:16.560 | And playlists can have tracks.
00:24:18.680 | Anyway, so that's that.
00:24:19.680 | And then this is just a list of tables.
00:24:22.400 | And so we can combine that with that convenient indexing thing
00:24:26.760 | we saw earlier to pick a subset of tables
00:24:29.240 | to diagram if you want to.
00:24:31.760 | And you can give it a size and a ratio.
00:24:33.800 | Yeah, so that's about it.
00:24:39.360 | So I'm planning to--
00:24:41.280 | I think this will make life easier for me
00:24:42.920 | as I try to build up my web application from a database.
00:24:51.560 | Yeah, I don't know.
00:24:52.520 | Do you guys have any thoughts?
00:24:54.920 | It's like watching this, it feels
00:24:59.520 | like there's some new libraries that you
00:25:01.480 | write that are entirely around some piece of functionality
00:25:04.000 | that doesn't exist.
00:25:04.840 | I want to create this type of file from that type of file.
00:25:07.600 | And then the user interface is like a single command line
00:25:11.600 | script or whatever that gets run.
00:25:13.360 | And this is like the complete inverse.
00:25:15.640 | A lot of the functionality of--
00:25:16.920 | I haven't got any functionality, except the diagram.
00:25:20.720 | But the user interface is what's important.
00:25:22.840 | And it's not that Simon's library doesn't
00:25:24.720 | have a nice user interface.
00:25:25.800 | It has an excellent user interface
00:25:27.160 | for the command line, which is where he likes to work.
00:25:31.560 | And it's just a different user interface
00:25:33.400 | that works best with--
00:25:34.720 | Just stop for a moment.
00:25:35.840 | I just-- let me--
00:25:39.040 | I just saw Vic saying something in Discord.
00:25:41.800 | I don't want that to appear in our dev chat.
00:25:44.920 | Can you restart your last sentence?
00:25:47.880 | And I will edit over it.
00:25:49.880 | [AUDIO OUT]
00:25:52.080 | So it's not that Simon's user interface
00:25:54.640 | for this functionality is bad.
00:25:55.880 | It's just that it's very optimized for the command line.
00:25:58.360 | It's excellent for the command line, which
00:26:00.160 | is where he likes to work.
00:26:01.320 | And VS Code, probably.
00:26:02.680 | Yeah, and VS Code and others.
00:26:04.840 | This is not just basically creating a new UI
00:26:07.480 | or translation of that to the type of Jupyter interactive
00:26:11.360 | coding that suits you.
00:26:13.560 | Yeah, yeah.
00:26:14.440 | And Sebastien's like SQL Model and SQL Acme, in particular,
00:26:18.080 | really focused on getting autocomplete and stuff
00:26:22.240 | into VS Code.
00:26:24.760 | And yeah, that's totally fine.
00:26:30.600 | Do you know what I mean?
00:26:31.640 | But it does mean that you have to-- at some point,
00:26:33.760 | you either have to--
00:26:35.000 | instead of using Create Table in SQL,
00:26:38.320 | you have to define it as a class and then
00:26:40.400 | have some process that creates the tables from the class.
00:26:45.880 | Or you have to have something that kind of codegens class
00:26:49.000 | files that VS Code can read from the database.
00:26:51.360 | And SQL Acme provides both of those things.
00:26:55.040 | But yeah, I like this kind of dynamic approach
00:26:57.960 | where I feel like SQL is a language that I'm
00:27:03.000 | quite comfortable with and is--
00:27:05.920 | I can use everywhere.
00:27:08.200 | And I don't have to learn the particular quirks, quirks
00:27:12.960 | not in a bad way, but each ORM has
00:27:14.760 | its own way of doing things.
00:27:17.960 | Yeah, that's why I don't really have to learn that.
00:27:20.160 | Yeah.
00:27:22.680 | It's funny to think of the--
00:27:25.640 | it almost feels like an oxymoron
00:27:27.960 | to have this database interaction in the notebook.
00:27:31.000 | Because the way I often think about it,
00:27:33.160 | or at least I have friends who work on companies
00:27:35.320 | where you sit and write down the database on paper
00:27:38.200 | or in a diagramming software.
00:27:40.360 | And it becomes the static thing of this
00:27:42.480 | is the types of data that we're committed to using.
00:27:45.320 | We have these rows.
00:27:46.680 | We can't change them.
00:27:48.520 | This database is this fixed object.
00:27:49.960 | And then my code would now use this static definition.
00:27:55.600 | And then I'd want to follow that very rigidly.
00:27:57.800 | And so then it really makes sense to have, OK,
00:28:00.400 | I define my database.
00:28:01.720 | Maybe it's in SQL that we write it,
00:28:03.600 | or maybe it's in something that spits out SQL.
00:28:05.840 | But then also I have a way to interact with that
00:28:07.800 | for my code that's very known in advance and defined.
00:28:14.000 | Yeah, I guess that's maybe--
00:28:15.560 | That's our way.
00:28:16.720 | But also a lot of DBAs do it a very different way, which
00:28:20.840 | is they live in SQL admin applications.
00:28:26.480 | And they tend to have auto-completion of column names
00:28:30.760 | and table names and stuff.
00:28:32.960 | This was the original intention.
00:28:34.280 | SQL was supposed to be human-friendly and interactive.
00:28:36.680 | It just ended up being maybe not quite as
00:28:39.800 | human-friendly and interactive as we'd all
00:28:42.000 | like it to be because of language design failures
00:28:46.120 | or just it's a difficult domain.
00:28:50.000 | I quite like it.
00:28:50.960 | I have one thought and one question.
00:28:54.800 | The thought is, I think, that this interface that you've
00:28:58.880 | presented has the same merits as the library assignments
00:29:03.760 | that you're building on top of, which is it doesn't require you
00:29:06.260 | to learn a new thing because it's working
00:29:07.920 | with what's already there.
00:29:09.600 | Simon's thing just requires you to know about Python dictionaries
00:29:12.440 | and iteration in order to interact with SQL
00:29:14.400 | rather than learn a bunch of maybe cockamamie new things
00:29:20.960 | that aren't new enough to justify existing.
00:29:23.200 | And here, you're working with the autocomplete that's
00:29:25.760 | already in Jupyter.
00:29:26.880 | You're not requiring a new Jupyter extension
00:29:30.200 | and a new key bind.
00:29:31.240 | That's the charm of it.
00:29:32.480 | It'll also work in IPython.
00:29:34.280 | It'll also work in pretty much any other interactive
00:29:36.540 | environment.
00:29:36.900 | Yeah, that was my question.
00:29:37.740 | Because you're saying, well, this is good for notebooks.
00:29:39.340 | But it's really good for anyone who doesn't
00:29:40.460 | want an interactive prompt.
00:29:41.580 | So if I'm just using, as I often do,
00:29:43.900 | the Python prompt in Emacs where there's tab completion
00:29:47.620 | and that's based on read line, I think
00:29:49.380 | I'd still get the better--
00:29:50.460 | Could work fine in Emacs as well, yeah.
00:29:52.220 | Yeah.
00:29:52.900 | Yeah.
00:29:53.400 | Yeah, so let me show you how it's implemented.
00:29:55.580 | Yeah, that was my next question.
00:29:56.860 | I'm kind of curious how this works because you've
00:29:58.980 | emphasized, well, you'd need to use all these static types
00:30:01.400 | to get autocomplete in VS Code, but you're
00:30:03.740 | getting autocomplete without building a static type
00:30:07.580 | superstructure.
00:30:08.220 | So I imagine you're--
00:30:09.500 | And this is like--
00:30:11.900 | the interesting thing is, Alexis,
00:30:13.660 | I know a lot of your background has been Swift,
00:30:17.940 | which is a static, largely a static language.
00:30:20.340 | It has dynamic extensions.
00:30:23.900 | And it was created originally as a static language.
00:30:25.940 | Python was created as a dynamic language.
00:30:29.180 | And it's funny.
00:30:32.180 | People are trying to turn it into a static language.
00:30:34.380 | It's almost as if people feel like they're not
00:30:35.980 | proper grown-up software engineers
00:30:37.460 | unless they have strong typing and all this stuff.
00:30:41.860 | But actually, as you'll see, as we dig into how this works,
00:30:46.020 | Python is very much designed to have a lot of flexibility
00:30:50.100 | for the developer to provide dynamic behavior that
00:30:53.980 | works for you.
00:30:55.700 | So in fact, if we look at the actual Python file that
00:31:03.460 | gets built, this is it.
00:31:07.700 | All of the functionality in this,
00:31:13.300 | if you think about how many blank lines there are,
00:31:15.340 | it's about 100 lines of code.
00:31:16.500 | And it's 15 before we get past the comments
00:31:24.260 | and the imports.
00:31:27.700 | So in practice, it might be more like 80 lines of code or less.
00:31:31.980 | So yeah, it really is just using the functionality
00:31:37.860 | that Python provides, that Guido created when he created Python.
00:31:43.220 | And this is functionality that's been in Python
00:31:45.140 | pretty much since day one.
00:31:47.420 | And it goes back to this basic idea of Python
00:31:49.820 | is this very small kernel, which everything else is
00:31:54.380 | built with that kernel, really.
00:31:56.700 | So let me show you what I mean.
00:31:58.660 | So as with pretty much everything I write,
00:32:02.460 | the implementation, the source code of Fastlight
00:32:06.260 | is a notebook.
00:32:07.060 | And so as I built this, because I'd never
00:32:12.860 | used Simon's thing before, I was trying to use Simon's thing.
00:32:18.660 | And each time I found myself not having access
00:32:21.740 | to something I wanted, I wrote that thing,
00:32:23.860 | and then I kept moving on.
00:32:25.020 | So that's kind of how this got built.
00:32:28.020 | So what happens when we do db.t?
00:32:34.340 | So that is an object with a particular type.
00:32:38.540 | And its type is tables getter.
00:32:41.620 | So here's the definition of tables getter.
00:32:43.900 | Now tables getter derives from getter plus one thing.
00:32:49.460 | So it might be easier to pretend that this dir was actually
00:32:57.380 | over here, and that this thing was called tables getter.
00:33:01.540 | That's the same thing as what the inheritance does, right?
00:33:04.620 | It just sticks it in there.
00:33:08.380 | So remembering that basically this can be thought of as all
00:33:12.100 | one thing, because that's how inheritance works,
00:33:17.220 | what happens when we just write dt on its own?
00:33:21.620 | How did it end up with this comma separated list of tables?
00:33:27.060 | So that's because pretty much every part
00:33:32.300 | of the behavior of classes in Python
00:33:34.860 | is defined by methods of that class.
00:33:38.580 | And what is displayed in a notebook, or in IPython,
00:33:42.300 | or in lots and lots and lots of places in Python
00:33:44.540 | and things around that, is defined by this dunder repra.
00:33:50.380 | So something with two underscores on either side
00:33:52.780 | means this is a magic method, which
00:33:55.540 | will be called by Python or some other thing in some situation
00:33:59.220 | automatically.
00:34:00.380 | So this is the thing that will get called-- specifically,
00:34:03.700 | it's called when this function called repra is called.
00:34:09.020 | And that function called repra is called by Python a lot.
00:34:13.100 | And it's called by Jupyter any time something is displayed.
00:34:15.620 | It provides the representation of that item.
00:34:17.980 | So when we say dt, it actually calls dunder repra.
00:34:22.940 | And that returns commas joining up der self.
00:34:30.100 | So obviously, that means we want to know what is der dt.
00:34:35.300 | And somehow, that's become a list of the things
00:34:37.500 | we wanted to join.
00:34:39.460 | Der is used, again, in lots of places in Python.
00:34:44.020 | And specifically, any time you do an autocomplete,
00:34:48.160 | it's actually calling der.
00:34:50.900 | And when you call der behind the scenes,
00:34:53.100 | it actually calls dunder der.
00:34:55.380 | So you'll see that is der is exactly the same as me going
00:34:58.820 | dt dot der.
00:35:05.500 | Same thing.
00:35:07.220 | So here's der.
00:35:10.940 | And so that just returns the table names.
00:35:13.980 | And that's just part of SQLite utils.
00:35:19.260 | So OK, so a getter joins up the table names
00:35:26.100 | when you get a representation of it.
00:35:28.580 | So that's how that works.
00:35:29.660 | Does that make sense so far?
00:35:33.860 | Makes sense.
00:35:35.100 | How come this works?
00:35:36.380 | Well, when you square bracket index into something in Python,
00:35:40.300 | it calls this special thing, dunder get item.
00:35:44.660 | So dunder get item, if you pass a single thing,
00:35:53.540 | then this will be just a string.
00:35:57.220 | And if you pass a bunch of things,
00:35:59.140 | then this will be a tuple.
00:36:02.300 | So if you passed in a single thing, for simplicity,
00:36:05.460 | I'll just make it a list with one thing in it.
00:36:08.140 | And so then I just want to return
00:36:10.860 | that table for each table that you requested.
00:36:15.460 | So that's how that works.
00:36:17.540 | So we want to do the same thing for columns.
00:36:26.060 | So I created a columns getter.
00:36:30.060 | It's got a der.
00:36:31.660 | It's got a repre.
00:36:33.820 | And this one's got a bit of extra magic, which
00:36:41.740 | is that I create--
00:36:45.700 | because SQLiteUtils doesn't have a column class.
00:36:50.060 | Columns are just strings.
00:36:51.660 | But I wanted each column to remember
00:36:53.140 | what table it came from so that I could have it spit out
00:36:57.940 | table.name when it's stringified.
00:37:00.740 | So cols getter, when you--
00:37:12.420 | how do I explain this?
00:37:13.340 | So let's say, OK, so I go artist.c.
00:37:15.620 | That's going to give me a cols getter for the artist table.
00:37:26.980 | So here's the list of columns.
00:37:28.900 | That's created by calling repre.
00:37:31.180 | This is exactly the same as before.
00:37:32.820 | Repre joins up the der.
00:37:36.020 | And this time, the der just calls repre
00:37:42.820 | on each thing on self parentheses.
00:37:48.820 | So when you call a class like a function,
00:37:53.340 | it uses this special one, dunder call.
00:37:56.340 | So this is actually the list of things.
00:37:58.140 | So it's going to go through each column in the table.
00:38:00.300 | And it's going to create a _col with the table
00:38:03.580 | name and the column name, which it then saves away.
00:38:08.860 | And then it then calls repre on each one of those columns,
00:38:17.980 | which calls col.dunderrepre, as you see.
00:38:22.860 | And that just returns--
00:38:26.380 | well, that's totally pointless.
00:38:27.820 | That is exactly the same as that, I guess.
00:38:37.500 | Right.
00:38:39.780 | So the representation of a column is just the column name.
00:38:44.300 | So you had the generic getter class above.
00:38:47.780 | Is there a reason this is not using that?
00:38:50.340 | Is it because you have to have the special functionality?
00:38:52.780 | Yeah, this was for table and view getters.
00:38:55.780 | This one doesn't have any ability
00:38:57.420 | to pass in a column name.
00:38:58.540 | The only thing that's shared is kind of that, I guess.
00:39:10.580 | Even that's not quite shared.
00:39:13.260 | Oh, OK.
00:39:14.020 | The only thing that's shared is that.
00:39:16.220 | So you could put that in a super class.
00:39:17.940 | But at that point, it's like you're not really saving it.
00:39:20.260 | Yeah, OK.
00:39:25.580 | So when you have an f string, and you put something
00:39:30.980 | in curly brackets inside it, it calls--
00:39:37.740 | it's the same as calling this function.
00:39:39.380 | And when you call that function, behind the scenes,
00:39:46.540 | it calls dunderstr.
00:39:50.300 | So this is-- people often get confused about dunderstr
00:39:53.340 | versus dunderrepre.
00:39:54.380 | So this is called by Jupyter when getting
00:39:57.180 | the representation of something.
00:39:58.580 | This is called when stringifying it,
00:40:00.740 | such as putting it inside an f string.
00:40:03.300 | So in that case, we have the table in quotes, and then
00:40:06.580 | the dot, and then the column in quotes.
00:40:08.300 | So that's how come this works.
00:40:14.420 | And then I think we've probably talked about patch before.
00:40:21.460 | But just in case people missed that last time,
00:40:25.180 | this is part of FastCore.
00:40:27.420 | And it's just going to add this method or property
00:40:32.060 | to this class.
00:40:33.660 | And since we said as prop, it makes it a property.
00:40:38.100 | So that's how come database now has a dot T.
00:40:44.580 | I think this is a lot more ergonomic than having
00:40:49.700 | a whole separate fastlite.tables bracket database or something.
00:40:56.980 | Generally speaking, the way I think about patches,
00:40:59.020 | if there's something which I think to myself,
00:41:01.740 | like, oh, I feel like I would have liked this library better
00:41:05.260 | if this was built into it, then I use patch.
00:41:08.180 | And now it is built into it.
00:41:15.020 | So again, I do the same for table and view,
00:41:17.820 | which are both parts of SQLite utils.
00:41:20.140 | So I patch in a stru for those.
00:41:22.660 | And that's actually exactly the same.
00:41:24.420 | And so now I can do the whole SQL statement.
00:41:31.380 | So that's not much code, obviously.
00:41:35.500 | But it kind of gets us quite a long way.
00:41:42.180 | Now, this is super lazy, of which I'm not ashamed.
00:41:49.580 | There's already a dot query in database.
00:41:54.340 | And you can pass it a query.
00:42:03.340 | And it returns a generator.
00:42:05.140 | And doing that is enough to make me sad.
00:42:11.140 | So queue is a very common thing for queries.
00:42:15.380 | Like, for example, in URLs, queue is normally the query.
00:42:19.820 | So I just thought, OK, let's add queue.
00:42:21.660 | So that just does a list of a query.
00:42:23.740 | So again, just for interactive use, slight convenience.
00:42:27.820 | And if you're someone who prefers the type query,
00:42:30.340 | there's nothing wrong with that.
00:42:31.860 | You could also patch in a thing called query
00:42:33.620 | if you wanted to just do the listing.
00:42:35.300 | The whole thing with this is that it's malleable and flexible.
00:42:38.300 | So this is like, oh--
00:42:39.660 | Yeah, and I didn't want to change the behavior that's
00:42:42.100 | already there.
00:42:42.580 | Because if you want to use a generator,
00:42:43.860 | there's good reasons to use generators sometimes.
00:42:45.980 | Yeah, so if you want to--
00:42:47.780 | paging through a 1 million row table,
00:42:50.220 | then you don't have to put the whole thing in memory.
00:42:52.540 | But that's, like, much less common.
00:42:55.020 | Most of the time, I'm not working
00:42:56.340 | with million row tables.
00:42:57.620 | So I have the short one be the thing which
00:43:00.220 | is the thing I want to do most of the time.
00:43:02.180 | OK, so this is just doing the view thing,
00:43:08.700 | which we've already seen.
00:43:09.700 | So you can see now I've just added views getter, which
00:43:12.180 | is exactly the same as tables getter, but it's view names.
00:43:16.460 | And so v is just that.
00:43:19.220 | OK, so we can now check our Akadeka albums.
00:43:25.580 | And there we are.
00:43:27.860 | All right.
00:43:28.360 | So do we want to move on to diagrams
00:43:35.620 | or anything else to discuss there first?
00:43:39.300 | I guess I have a little question.
00:43:44.740 | I don't know much about how VS Code autocomplete works.
00:43:47.860 | But given that Python does offer dynamic access to namespaces
00:43:54.420 | that are associated with objects,
00:43:58.540 | what other constraint or goal is causing people
00:44:04.180 | not to make use of that when they do autocomplete?
00:44:07.060 | The code has to be executed, right?
00:44:08.860 | So if you're in a .py file and you're halfway through,
00:44:12.180 | it hasn't executed the code above,
00:44:14.140 | which is different to a book order.
00:44:16.940 | That makes sense.
00:44:18.060 | And in general, you wouldn't want VS Code
00:44:20.300 | to automatically execute the code.
00:44:24.820 | I mean, I do think there's room to create an extension, which
00:44:31.380 | kind of, like, maybe there's some special comments
00:44:35.940 | or something, something at the top of a file
00:44:38.020 | that describes what objects to create and how to create them
00:44:42.500 | so that you then do get dynamic autocomplete.
00:44:45.500 | But yeah, the thing that you're getting autocomplete
00:44:47.780 | for in VS Code is classes, not objects on the whole.
00:44:52.980 | And so that's why it relies on--
00:44:56.820 | people generally put type annotations in here
00:44:59.700 | and return annotations in here, because they
00:45:03.060 | don't have that dynamic environment to use.
00:45:09.220 | Yeah.
00:45:15.140 | So yeah, OK, so I looked around for database schema diagrams,
00:45:34.140 | et cetera, et cetera.
00:45:39.860 | And I didn't find great options.
00:45:46.500 | Either they were kind of expensive
00:45:48.140 | or they were very heavyweight things.
00:45:51.780 | And my biggest complaint was that most of them
00:46:04.140 | looked like this.
00:46:05.900 | And my problem is that in this type of diagram,
00:46:12.060 | you can't tell what is joined to what.
00:46:15.740 | Like, the lines just go to random places on the boxes.
00:46:18.780 | So what the hell is it?
00:46:23.580 | What readings dot what is connected to where in sensors?
00:46:28.220 | So I don't know why this is such a common way of doing things.
00:46:32.300 | But I grew up on Microsoft Access, which I have always
00:46:40.060 | felt like is the best at this stuff.
00:46:45.020 | And Microsoft Access has always shown you exactly what field
00:46:59.180 | gets mapped to what.
00:47:02.700 | So I really wanted that.
00:47:04.260 | And also, I wanted it in a notebook,
00:47:06.620 | like jumping in and out to different programs
00:47:09.180 | and kind of annoying.
00:47:10.700 | Having said that, this seemed like way too difficult a task
00:47:17.140 | for me to bother with, given the amount of time it would save.
00:47:22.500 | I thought maybe I'd spend the $200 a year
00:47:24.620 | on the license, whatever.
00:47:26.460 | But then I remembered that there's
00:47:29.060 | this amazing program called GraphViz, which
00:47:32.580 | has been around for 33 years.
00:47:35.220 | And it converts a small piece of text into a diagram.
00:47:47.300 | I wonder if I've got an easy example.
00:47:56.220 | Good lord, they don't make it easy.
00:47:57.660 | Just jump straight into the full specification.
00:48:03.540 | I feel like GraphViz is right up there with FFmpeg,
00:48:06.060 | where the correct way to use it is something like chatGBT
00:48:09.140 | and/or find an example that looks kind of like what you
00:48:11.780 | want and then copy it.
00:48:13.420 | No one is expected to know it.
00:48:16.660 | This is very much the case.
00:48:18.020 | And of course, the galleries are ridiculously overcomplicated.
00:48:24.460 | I'm not quite ready for that yet.
00:48:26.180 | Thank you.
00:48:27.860 | Oh my god, and I never want to be ready for that.
00:48:29.900 | Thank you.
00:48:30.660 | Oh, look, here we go, finally.
00:48:32.500 | A not-too-complicated one.
00:48:34.060 | I mean, even easier would be nice.
00:48:44.460 | Here we go.
00:48:46.460 | So basically, they're called .files.
00:48:53.020 | There's a program called dot.
00:48:54.260 | And you create a digraph.
00:49:00.340 | You give it a name.
00:49:01.340 | Doesn't matter what you call it.
00:49:02.700 | And then you have a thing in curly brackets.
00:49:05.260 | And the thing in curly brackets has a string, an arrow,
00:49:10.900 | and another string.
00:49:12.580 | And that creates a picture of a directed graph
00:49:15.700 | from string A to string B. And that's actually it.
00:49:23.020 | So we could-- it's nice that we've got a playground now.
00:49:34.180 | Hello, Jono.
00:49:39.140 | Wow, real time.
00:49:45.820 | Hello, Alexis.
00:49:48.100 | So you can see that if it's the same word,
00:49:51.660 | it's the same node, you know.
00:49:53.820 | Bit morbid, never mind.
00:50:01.820 | You get the idea.
00:50:03.060 | So actually, the basics of GraphViz are super simple.
00:50:09.140 | And I think the problem is people often
00:50:10.740 | show overly complex examples.
00:50:12.860 | So with that said, let's now look
00:50:14.900 | at an overly complex example.
00:50:16.100 | [LAUGHTER]
00:50:24.980 | So our example is going to create this.
00:50:31.020 | Now, what we might do first is we
00:50:33.660 | could make it a little bit less complex.
00:50:35.620 | That would be crazy.
00:50:43.300 | Let's just do artist and album.
00:50:45.580 | Let's run that.
00:50:57.380 | So I added a thing to this called
00:51:00.340 | render, which if you set it to false,
00:51:03.100 | it just returns the string.
00:51:05.460 | So then we can print that.
00:51:09.820 | So maybe we'll look at the thing that comes out of it as well.
00:51:13.740 | And you know what we could do?
00:51:17.620 | We could just grab that and just go s equals and print--
00:51:29.820 | not print, display.
00:51:37.100 | So now we can fiddle around a bit
00:51:39.900 | and get rid of stuff that's not really needed.
00:51:42.700 | There we go.
00:51:49.780 | So here we are taking advantage of a quite interesting
00:52:01.220 | and curious addition to GraphViz that instead of just having--
00:52:05.620 | so here, this is the thing I just showed you, right?
00:52:09.700 | But if I just had those strings, remember,
00:52:16.340 | they just appear like this, right?
00:52:20.260 | But you can define ahead of time what a thing is,
00:52:25.180 | as we've done here.
00:52:26.820 | And in this case, we have defined ahead of time
00:52:28.900 | that actually album is a table.
00:52:33.940 | And the album table, it's just-- as you can see,
00:52:37.220 | it's HTML.
00:52:39.380 | It's an HTML table.
00:52:40.580 | It's not really HTML.
00:52:42.140 | GraphViz just so happens to have borrowed this HTML-like dialect
00:52:48.780 | for describing table nodes.
00:52:53.180 | So a table has rows, and a row has cells, data.
00:53:00.620 | And they can have attributes.
00:53:05.260 | So one of the interesting attributes
00:53:07.820 | is port, which is basically giving a name to this row.
00:53:13.660 | So when we say album colon artist ID,
00:53:16.420 | it finds the album node, and it finds the artist ID port.
00:53:21.300 | And that's going to be the start of its error.
00:53:24.180 | And then this is just a Unicode key I put next to each one.
00:53:33.580 | And then I also added a--
00:53:37.140 | I believe this is called blanched almond, this color,
00:53:39.820 | blanched almond background on each one.
00:53:43.740 | And without this left to right rank direction,
00:53:49.420 | you can see it goes top to bottom, which is not great.
00:53:53.100 | So I feel like most of the time in GraphViz,
00:53:55.660 | I want left to right.
00:53:56.580 | I don't know.
00:53:57.420 | It's just me.
00:53:58.220 | It probably isn't, right?
00:53:59.260 | Because our screens are wider than they are tall.
00:54:01.300 | So if it's tough for a screen, you normally want that.
00:54:03.980 | You can make it a graph instead of a digraph.
00:54:11.140 | Digraph means directed graph.
00:54:13.260 | So if it's an undirected graph, then you
00:54:15.660 | can't give it an arrow anymore.
00:54:18.860 | I don't really know of a reason not to use digraph,
00:54:22.740 | because you can add annotations to these edges
00:54:25.980 | to say to add an arrow on both sides, or just on one side,
00:54:29.140 | or on no sides.
00:54:30.460 | So I think always you just write digraph, g curly,
00:54:35.580 | rank to lr, then your definition, and then curly.
00:54:41.220 | Well, that's a quick guide to GraphViz.
00:54:44.860 | Does that make sense so far?
00:54:47.740 | OK, so we just have to build up this string.
00:54:52.220 | So yeah, the way I did it was I just wrote one kind of by hand
00:54:56.700 | and tried to make it look nice.
00:54:58.580 | And then once it looked nice, I just
00:55:00.340 | did basic string manipulation until the string
00:55:02.780 | looked the same.
00:55:03.620 | And then once the string looked the same,
00:55:05.340 | I tried rendering it.
00:55:06.700 | So I'll leave Neato for a moment.
00:55:11.900 | So yeah, basically, I'm going to need a list of edges
00:55:14.540 | at the bottom.
00:55:17.180 | That's going to be these bits down here.
00:55:20.940 | So my edges just go through each of my tables
00:55:26.260 | and grab the edges.
00:55:27.140 | So an edge is a foreign key.
00:55:29.820 | So this is why I don't use auto formatters like Black,
00:55:38.260 | by the way.
00:55:39.060 | When you've got two things that are doing the same thing,
00:55:42.020 | I like everything to line up.
00:55:43.700 | No auto formatter is going to put the extra space in there
00:55:46.460 | for you.
00:55:49.340 | So edges-- yeah, it calls edge on my table.
00:55:53.780 | So that's going to go through all the tables' foreign keys.
00:55:56.260 | So you can see up here, I was just
00:55:58.220 | checking to see how that worked for myself.
00:56:00.060 | It's like, OK, yeah, it's just a list of foreign key.
00:56:02.540 | And that has a definition.
00:56:06.660 | So yeah, go through each foreign key.
00:56:09.620 | And then I just want an fString that
00:56:11.380 | goes from table.column to other table.otherColumn.
00:56:18.260 | So it's nice how simple it feels when it comes out.
00:56:22.980 | So then my table nodes, which is these bits,
00:56:31.580 | basically consist of the table part and the rows part.
00:56:36.860 | So my table nodes is going to be a _tnode applied
00:56:43.420 | to each table.
00:56:45.500 | So my tnode applied to a table is
00:56:48.860 | going to have all my rows in the middle.
00:56:51.140 | I'm calling _row.
00:56:52.780 | And then around that will be table.
00:56:57.260 | And then this is the name of the table at the top in light gray.
00:57:04.700 | And then all my rows.
00:57:06.900 | So my row, I'm going to have a row for each column.
00:57:11.180 | So if it's a primary key, it's going to have that on the end.
00:57:15.660 | Also, if it's a primary key, it's
00:57:17.100 | going to have this in there.
00:57:21.540 | And so the port's just the code name.
00:57:24.620 | There's that background color.
00:57:27.660 | And then there's the actual text that appears.
00:57:29.820 | And then there's the primary key bit.
00:57:32.660 | That's it.
00:57:33.660 | Yeah, so it's just jing manipulation.
00:57:37.540 | So that's that.
00:57:45.020 | And so then, yeah, so when I export that,
00:57:49.220 | that then ends up in the Python file I showed you.
00:57:52.300 | And yeah, in this case, to create the home page/readme,
00:57:56.260 | all I did was I actually just made a copy of this notebook,
00:58:01.060 | deleted all of the actual source code definitions.
00:58:07.260 | And most of it otherwise is basically the same.
00:58:10.580 | nbdev uses Quarto behind the scenes.
00:58:17.940 | So these screenshots look huge.
00:58:22.700 | And that's because of, I think, the retina rendering
00:58:27.100 | or whatever they have where it's not pixel for pixel,
00:58:30.300 | but it's kind of sub-pixel rendered.
00:58:33.100 | So you have to do something special
00:58:37.420 | to make that appear correctly, which Jupyter Notebook does not
00:58:40.860 | currently do by default. That'd be a nice thing to add,
00:58:44.060 | actually.
00:58:45.940 | So I just add manually something to tell Quarto
00:58:51.100 | how wide I want it to be, which does not appear correctly here.
00:58:55.100 | But it does appear correctly once it's here.
00:58:59.420 | So that's fine.
00:59:01.380 | And I think that's the first slide.
00:59:07.820 | Cool.
00:59:08.320 | That's very neat.
00:59:15.100 | Thank you.
00:59:17.580 | I mean, it was literally a day's work, I guess.
00:59:24.220 | I think because Simon's library was so--
00:59:29.140 | it's one of those things, it's like, oh,
00:59:30.780 | there's no other way you could have done this.
00:59:32.220 | It's the obvious way.
00:59:33.100 | It's the correct way.
00:59:34.620 | But it probably took seven years to get to the point
00:59:37.260 | that it feels that way, which I think is a good sign.
00:59:43.980 | Yeah, you're getting a lot of leverage out of the fact
00:59:46.380 | that his library just works in plain old data, plain old Python
00:59:51.700 | data structures.
00:59:53.020 | And then you're very familiar with some
00:59:59.340 | of the slightly meta stuff around Python
01:00:02.140 | that lets you hook into autocomplete using
01:00:04.220 | plain old data.
01:00:05.220 | There's a great documentation page in the Python docs
01:00:09.660 | called the Python Data Model, which lists
01:00:14.100 | all of the magic methods.
01:00:15.620 | And it's good reading, because it's actually
01:00:22.740 | how Python works.
01:00:24.620 | I mean, there's some low-level C bit
01:00:27.740 | which makes that all happen.
01:00:29.660 | But it's like, given that machinery,
01:00:33.100 | here's how all the Python syntax falls out of it.
01:00:37.220 | All this Python syntax is a syntax sugar,
01:00:41.140 | like a for loop is syntax sugar for calling next
01:00:45.380 | on an iterator, stuff like that.
01:00:48.820 | So yeah, I think it's not something everybody needs
01:00:52.740 | to know, but it's something that probably more people would
01:00:55.980 | find useful than people who actually read
01:01:00.620 | that page of the documentation.
01:01:02.460 | At least I feel like I've done that.
01:01:04.540 | - It's a shame to be using a dynamic language and not
01:01:06.660 | leveraging what that gives you.
01:01:08.020 | - I know, right?
01:01:08.860 | I know.
01:01:09.460 | - Yeah.
01:01:10.580 | - Yeah.
01:01:12.100 | - Speaking of, the repro function is interesting.
01:01:17.060 | I remember once upon a time, I think--
01:01:19.540 | maybe this isn't something people think about anymore--
01:01:22.060 | but encountering the idea that the representation of an object
01:01:25.740 | was supposed to be also a representation that
01:01:29.260 | could be read later.
01:01:30.820 | - Yeah.
01:01:31.540 | - I don't know if that--
01:01:32.540 | - That is true.
01:01:33.060 | - It doesn't seem like that's actually observed in Python.
01:01:34.780 | - I'm not observing it.
01:01:36.860 | Right.
01:01:37.740 | And I think that is, strictly speaking, wrong.
01:01:41.860 | But it is how Jupyter chooses to display it.
01:01:48.420 | And my main purpose of it is to display it.
01:01:53.140 | If I want something I can read back later,
01:01:54.940 | I would use Pickle.
01:01:56.340 | And Pickle has its own set of magic methods
01:01:59.420 | that define how an object serializes and writes itself.
01:02:03.860 | So I've, for quite a few years now,
01:02:09.220 | quietly ignored that repro rule.
01:02:12.100 | I don't think I've been called out before.
01:02:13.980 | So here we are.
01:02:16.100 | - Yeah, well, the thing is, before I was a Swift nerd,
01:02:18.940 | I was a Lisp nerd.
01:02:21.540 | - Isn't there a repro markdown that Jupyter
01:02:24.140 | does if it's present, that supersedes it?
01:02:26.660 | If you do want to have full--
01:02:28.300 | - Yes, absolutely.
01:02:29.820 | There is.
01:02:30.420 | And I use that quite a bit.
01:02:32.460 | But then I also want it to look nice in IPython,
01:02:39.180 | for example, which doesn't have a markdown wrapper.
01:02:42.540 | So yeah.
01:02:44.620 | I use repromarkdown quite a bit.
01:02:47.060 | - So repromarkdown gives you the representation in markdown.
01:02:49.860 | Is that as simple as it sounds?
01:02:51.500 | - And so there's a--
01:02:53.060 | yeah, basically, Jupyter has a system by which it
01:02:55.660 | decides what to display.
01:02:57.820 | - Which one to use.
01:02:59.180 | - And so I think if there's a repromime bundle, it uses that.
01:03:02.820 | And if there isn't, it uses repromarkdown.
01:03:05.460 | If that doesn't exist, it uses repromarkdown.
01:03:08.020 | If that doesn't exist, it uses repromarkdown.
01:03:10.380 | - OK.
01:03:14.980 | So John was suggesting that, in theory, one
01:03:16.780 | could use only the repromarkdown.
01:03:19.620 | I think no one actually observes rep or read compatibility--
01:03:23.500 | - I've never seen it matter.
01:03:25.060 | - Yeah.
01:03:25.940 | I think it's kind of-- the idea that that should exist
01:03:28.140 | is a holdover from another language other times.
01:03:30.820 | - Possibly.
01:03:32.700 | Anyway, a very interesting point.
01:03:35.540 | Well, thank you, Jono.
01:03:36.660 | - Well, thanks for sharing that, Jeremy.
01:03:38.300 | I look forward to seeing it in use,
01:03:40.180 | seeing it get used maybe to teach some other people how
01:03:42.500 | to wrangle data sets and what gets added.
01:03:44.660 | - I'm going to use it next to build a web application
01:03:49.420 | to show you guys.
01:03:50.500 | So hopefully, I'll be able to show you that tomorrow.
01:03:53.580 | - I'm very excited as part of my developing,
01:03:56.380 | like rediscovering my affection for Python.
01:03:59.740 | And maybe this will line up for the talk
01:04:01.900 | I can do on my scripting utility.
01:04:04.300 | - Great.
01:04:05.500 | Coming back from the lands of Swift.
01:04:10.220 | - Python just moved a lot when I wasn't paying attention.
01:04:12.540 | It's much nicer now.
01:04:13.700 | And anyway, it's what everyone uses.
01:04:14.980 | So you don't want to put yourself in a ghetto.
01:04:16.900 | You need to use the real things.
01:04:19.300 | - All right, gang.
01:04:20.580 | Thank you.
01:04:21.300 | - Bye-bye.
01:04:21.780 | - Bye-bye.
01:04:23.340 | [BLANK_AUDIO]