Agentic Coding with Claude Code

00:00:00.000 | I think this should be the live post.

00:00:09.360 | Bringing should be live.

00:00:18.640 | Okay, I have no idea if this works. This is an experiment.

00:00:29.280 | If you are, if you can hear me or something, write something into the stream chat.

00:00:39.280 | Because I don't know.

00:00:42.400 | Actually, let me open the stream. There can be some sort of live update on

00:00:53.840 | what I'm doing, so like some sort of feedback.

00:00:58.560 | So yeah, I have created a small little test project here just to

00:01:02.320 | maybe show a little bit how I'm doing this.

00:01:04.960 | Okay, so

00:01:08.240 | what I'm usually doing is I'm using Claude Code. There are a bunch of other options too.

00:01:14.800 | There's OpenCode, there is Amp, and a few others. And what all of these have in common is that they

00:01:22.960 | currently primarily run on the command line and detached from your editor. So I'm actually using now

00:01:32.160 | VS Code for the most part again. I used to use Cursor, but the way I'm working is I have Claude running here

00:01:40.080 | and I have an editor here. And I use the editor to review, and I'm using the editor primarily to

00:01:48.880 | make some small changes to it, right. So I have created a small little project here. And I just

00:01:55.440 | want to show the agentic setup. So what am I doing for agents to work at all, right?

00:02:02.320 | So the most important part here is to have a CLAUDE.md file. This is actually auto-generated with

00:02:10.960 | /init. And I just did some minor modifications to it. For almost every single project that I'm doing,

00:02:19.520 | I use a Makefile. In fact, I would say for 100% of projects I'm doing, I have a Makefile. And the Makefile

00:02:25.760 | acts as the main entry point for the agent to run things. So the agent, Claude Code in particular,

00:02:33.920 | will use Bash, it will use Python, it will write some code, it will run the code, and it can dig itself

00:02:40.240 | out of a bunch of nasty situations this way. But I want to steer it in certain directions, right?

00:02:47.280 | And so what I usually have is a project overview. It just tells it what it is. I don't actually know

00:02:53.440 | if it's necessary. It will read the README too. But it gives you some context. And then I give it

00:02:58.720 | immediately next the commands that it should use. The most important command here is make dev.

00:03:04.000 | And the reason this is so important is because I actually don't really want it to run this command.

00:03:11.920 | I usually run this command myself. This brings up all the services.

00:03:16.720 | So this particular project has two services, a front-end service and a back-end service. And

00:03:22.720 | I tell it here that this brings up the server. It starts both the front-end and the back-end.

00:03:30.400 | I also tell it that it auto-reloads and it auto-compiles. And I also tell it it should never

00:03:36.640 | stop the server. And that's actually one of the first important things here is that

00:03:42.160 | because it can do anything. And in particular, because I run it in YOLO mode and it will just do anything,

00:03:47.040 | I also wanted to encourage it never to run in the wrong direction.

00:03:55.040 | So it's kind of annoying to explain, but it can, for instance, stop the server and restart the server.

00:04:05.680 | And I really don't want it to do that.

00:04:08.720 | Okay. So if the audio doesn't quite work, let me see if I can fix something here.

00:04:13.840 | I might have, I might have overdone this. Let me see.

00:04:24.480 | Let me maybe remove here. Let me, let me know if the audio fixes itself a little bit, if this is better.

00:04:32.480 | Okay. I did, I did the change here. Let me see if it works. But

00:04:38.720 | basically, I want it to stay on the path of most success. And

00:04:49.840 | for this to work, I have to basically juggle me and the agent. So what I want to do is I want

00:04:56.560 | the dev environment to be in front of me. I always want to see it, right? So I have Claude up

00:05:02.880 | here doing stuff. And I always have the dev server running things and I can see what it's doing, right?

00:05:07.600 | So if I, for instance, go now to this website and I load it. Well, so far, nothing has happened

00:05:14.560 | because it doesn't log requests. But for instance, if I were to hit the backend service, let's say,

00:05:20.560 | go to /api/health. I can see that a request was made, right? So I want to see

00:05:26.800 | how the server runs. The server is always there for me.
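
As an illustration of the health check and request logging described above, here is a minimal Go sketch. It is not the project's actual server code: the `/api/health` path is taken from the spoken "slash API health", while the handler names, the JSON body, and the logging middleware are assumptions made for this example.

```go
package main

import (
	"encoding/json"
	"log"
	"net/http"
	"net/http/httptest"
)

// logRequests wraps a handler and logs the method and path of every request,
// so each hit shows up in the terminal where the dev server runs.
func logRequests(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		log.Printf("%s %s", r.Method, r.URL.Path)
		next.ServeHTTP(w, r)
	})
}

// healthHandler answers with a small JSON document (hypothetical shape).
func healthHandler(w http.ResponseWriter, r *http.Request) {
	w.Header().Set("Content-Type", "application/json")
	json.NewEncoder(w).Encode(map[string]string{"status": "ok"})
}

// newMux wires the health route behind the logging middleware.
func newMux() http.Handler {
	mux := http.NewServeMux()
	mux.HandleFunc("/api/health", healthHandler)
	return logRequests(mux)
}

// probeHealth issues an in-memory request against the mux and returns the
// status code and body, without opening a network port.
func probeHealth() (int, string) {
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodGet, "/api/health", nil)
	newMux().ServeHTTP(rec, req)
	return rec.Code, rec.Body.String()
}

func main() {
	code, body := probeHealth()
	log.Printf("status=%d body=%s", code, body)
}
```

In a real server the mux would be passed to `http.ListenAndServe`; the in-memory probe is only there so the sketch runs without a port.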

00:05:29.120 | So that's the first, most important thing. I want the server to run in my terminal. I don't want

00:05:38.720 | Claude to run it in the background.

00:05:41.440 | The next thing is that I want to have a consistent log of all the things that are happening. I've

00:05:46.080 | talked about this a couple of times, but this basically allows me to see

00:05:49.040 | frontend and backend requests simultaneously, even though they're from different services, right?

00:05:53.200 | One is the Vite server. The other one is my Go server.

00:05:56.160 | But I also want to give this visibility to the agent, right? So one of the tools in here

00:06:02.560 | is this make tail-log command. So if I run this here, basically, I see the same thing as here.

00:06:11.040 | Why is this important? Well, it is important because I want to get the agent to always understand how it

00:06:20.800 | establishes the context. Like if I'm talking about a bug with the agent, I want the agent to understand

00:06:26.000 | what's going on, right? So I will show in a minute how I'm doing this, but this is one of the reasons

00:06:32.240 | why this tail-log command is here. And then the other commands there, as you would expect, how to

00:06:37.040 | run the linter, how to run format, how to run clean. It's just that's the entry point of the tools that it

00:06:43.520 | should use. And then here really is the most important part about this. I'm telling it

00:06:49.760 | where the log files are. So I want to prevent it from guessing a bunch of other ways to establish

00:06:56.240 | the context that it needs. Again, I reaffirm that this is the command it should use

00:07:01.120 | to read the log file. And I also tell it to never stop the server. I also tell it that this

00:07:07.840 | server auto compiles and auto reloads, right? That's the most important part. So how do I have the

00:07:13.920 | auto reload setup, the auto compile setup? Well, I use, so there's a program called Foreman.

00:07:20.080 | I'm not using that. And yeah, I think the problem is I probably don't have the right

00:07:29.280 | setup here for the webcam. So I might, this might be confusing because I noticed earlier that the audio

00:07:39.120 | desyncs a little bit. I need to get the setup working better, but I hope it's not distracting.

00:07:45.840 | Otherwise I can just disable the webcam. Maybe this makes it less awkward. I'll just turn it off for now

00:07:51.680 | because I think the webcam is, is trailing a little bit here. Okay. So I'm using a fork of Foreman or a

00:07:59.920 | re-implementation of Foreman called shoreman. You can find this on the internet: Foreman clone, shoreman.

00:08:08.800 | It's this one here. And the reason I'm using this is because this is basically a very small shell script,

00:08:16.560 | which does the same thing as Foreman, but I had to make some changes to it. And it was easier for me

00:08:21.040 | to do the change in shoreman. So what does it do? So I have a file here called Procfile. And it says,

00:08:29.360 | this is the command to run for the front end. And this is the command to run for the backend.
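
To make the Procfile idea concrete, here is a small Go sketch that parses the `name: command` format and prints the result. The two commands in `exampleProcfile` are hypothetical stand-ins, not the project's actual Procfile, and shoreman itself is a shell script rather than Go; this only illustrates the shape of the data.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// process is one Procfile entry: a label and the command to run for it.
type process struct {
	Name    string
	Command string
}

// exampleProcfile is a made-up Procfile in the usual "name: command" format.
const exampleProcfile = `# hypothetical commands, for illustration only
frontend: npm run dev
backend: watchexec -r -e go,sql -- go run .
`

// parseProcfile splits each non-empty, non-comment line at the first colon.
func parseProcfile(text string) []process {
	var procs []process
	sc := bufio.NewScanner(strings.NewReader(text))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		name, cmd, ok := strings.Cut(line, ":")
		if !ok {
			continue // not a "name: command" line
		}
		procs = append(procs, process{
			Name:    strings.TrimSpace(name),
			Command: strings.TrimSpace(cmd),
		})
	}
	return procs
}

func main() {
	for _, p := range parseProcfile(exampleProcfile) {
		fmt.Printf("%-8s | %s\n", p.Name, p.Command)
	}
}
```

A supervisor in the Foreman family starts every listed command and prefixes each output line with the entry's name, which is how output from both services can end up interleaved in one terminal.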

00:08:34.240 | And the front end is a basic Vite application, um, which auto reloads all the time, right? So as I'm

00:08:42.160 | making changes to the front end, it automatically appears because it recompiles. And I have set up

00:08:47.200 | something very similar for the backend. I basically told it to use this watchexec tool. And this is

00:08:55.600 | this one here. All that this does is it watches these files, um, specifically Go and SQL files,

00:09:03.680 | and, um, it watches recursively this entire folder, and then it runs this, um, this go run command, um,

00:09:12.880 | to, uh, to compile this. And so when I make a change here on my server, let's say, um,

00:09:21.520 | I think this might actually be unused at the moment. So let me completely delete this file. I

00:09:26.480 | don't want to use it. Right. It's gone. The server has recompiled, right? That's, that's, that's all that is. Um,

00:09:34.560 | then, um, go back to this. So, so then I have this sort of very basic setup where at least

00:09:48.560 | these things log in, um, into this project, um, what in this project is not set up yet is that the

00:09:56.000 | front end also logs into it. So if I go here and I issue, um, console log whatever,

00:10:03.440 | I don't see this here, right? So that's actually one of the changes that I want to do. And I want

00:10:09.040 | to show why this is useful. Um, but basically that's the next thing that I want to set up is also get this,

00:10:15.440 | to, to, to, to log into the server. Um, I want to say one last thing about the,

00:10:23.520 | the make command here. So I give it instructions in the CLAUDE.md file, right? I'm, I'm telling it,

00:10:29.520 | these are the commands that you should run. There's some extra stuff down here about how I wanted this,

00:10:34.480 | the structure of the project to be, but this alone is not enough to actually get it

00:10:41.840 | to work reliably, um, to work correctly, right? So the reason I made changes to shoreman is because

00:10:50.480 | I actually discovered, and it doesn't take long to discover this, that, um, the tools work

00:10:58.720 | better if they are more descriptive in their error messages about, um, what, what would, what actually

00:11:07.600 | happened. Let's put it this way, right? So one of the things that I changed in shoreman is that when

00:11:12.560 | shoreman runs, it writes this file called shoreman.pid, and this is basically the, um, the, the, this

00:11:20.000 | master process running as a shell script. And if I run it a second time, shoreman checks if it's already

00:11:27.040 | running, and then errors. But many tools do that, right? Many tools error when they're running. What this,

00:11:34.240 | what I changed here is that I error in a way where if the agent reads the error, it is more likely to

00:11:42.000 | understand what happened, right? So every once in a while, say the agent tries to do an HTTP request,

00:11:50.720 | but it makes the HTTP request right at the moment where the server restarts, right? And then it might

00:11:54.960 | see, oh, the server's not running, right? And then it gets the idea to run make dev. But the reality

00:12:00.000 | is that the server was actually running, right? And so when the agent now goes in and intentionally

00:12:04.960 | tries to start the server, it gets a slightly different error message than it would otherwise

00:12:08.320 | get, right? It now gets the error message. Service is already running. That's good. We auto-reload.

00:12:12.400 | No need to do anything, right? So it reinforces to the agent that it doesn't, it shouldn't stop the

00:12:19.600 | server. It shouldn't kill a bunch of processes, right? Because the agent will, it will start killing a

00:12:24.480 | bunch of stuff and restart it. And I don't want it, right? I, I, I want this service here to be very

00:12:29.360 | reliable. And when it tries to start it again, I want it to get exactly the error message that it

00:12:34.000 | should get to not go off the beaten path, right? It should, if it does accidentally run it, now it

00:12:38.960 | will, it will realize, oh, it's actually running, right? Um, so, and you can see this, right? So if I

00:12:45.360 | tell the agent, um, I want you to start the dev services, right? It will run make dev.

00:12:56.720 | It gets an error, but it says no action needed, right? So this is, this is, this works better

00:13:05.040 | than if it just errors out and says, um, shoreman already running or whatever the default is, right?

00:13:11.200 | So getting better error messages specifically for the agentic loop is one key part here.
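
The pid-file check with an agent-friendly error message can be sketched as follows. This is a Go illustration under stated assumptions, not shoreman's implementation (shoreman is a shell script): the file name `.shoreman.pid`, the helper names, and the exact wording of the message are all hypothetical. The liveness probe via signal 0 is a Unix convention and will not behave the same on Windows.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"strconv"
	"strings"
	"syscall"
)

// pidFile is a hypothetical location for the supervisor's pid file.
const pidFile = ".shoreman.pid"

// alreadyRunningMessage spells out what happened and what to do next, so an
// agent reading the error is nudged away from killing or restarting things.
func alreadyRunningMessage(pid int) string {
	return fmt.Sprintf("services already running (pid %d); they auto-reload on changes, no action needed", pid)
}

// processAlive probes a pid with signal 0, which checks for existence
// without delivering a signal (Unix only).
func processAlive(pid int) bool {
	if pid <= 0 {
		return false
	}
	proc, err := os.FindProcess(pid)
	if err != nil {
		return false
	}
	err = proc.Signal(syscall.Signal(0))
	return err == nil || errors.Is(err, syscall.EPERM)
}

// checkPidFile returns a descriptive error if a live process owns the pid
// file, and nil if the file is missing, unreadable, or stale.
func checkPidFile(path string) error {
	data, err := os.ReadFile(path)
	if err != nil {
		return nil
	}
	pid, err := strconv.Atoi(strings.TrimSpace(string(data)))
	if err != nil || !processAlive(pid) {
		return nil
	}
	return errors.New(alreadyRunningMessage(pid))
}

func main() {
	if err := checkPidFile(pidFile); err != nil {
		fmt.Fprintln(os.Stderr, "error:", err)
		os.Exit(1)
	}
	if err := os.WriteFile(pidFile, []byte(strconv.Itoa(os.Getpid())), 0o644); err != nil {
		fmt.Fprintln(os.Stderr, "error:", err)
		os.Exit(1)
	}
	defer os.Remove(pidFile)
	fmt.Println("starting services (a real supervisor would launch them here)")
}
```

The design point is in the message text rather than the check itself: many tools refuse to start twice, but few say that nothing needs to be done.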

00:13:17.520 | The second thing is of course that I changed shoreman to write these, these log files correctly. So if we

00:13:23.440 | look into shoreman, but I can actually show this in a different way, right? So, um, there's, um,

00:13:29.520 | there's a dev log. This is the one that it writes. And as you can see, it only contains the messages

00:13:36.800 | from the current run. So if I restart this here, it starts out fresh. Um, so this is one of the changes

00:13:46.000 | that I found to work really, really well is I'm hiding away information that it will not need

00:13:50.960 | because otherwise, if I have this ever growing log file, then sometimes it picks up new work and it

00:13:58.640 | sees unrelated changes from yesterday, for instance, right? Um, so that's, um, that's one of the other

00:14:06.640 | changes that I landed in shoreman. Um, the other thing is since there was already a question about

00:14:14.640 | Docker Compose, I do not actually use Docker Compose. Um, I don't use any Docker here and this is all

00:14:20.640 | running on my machine. If I were to put anything into Docker, then maybe I would care about Docker

00:14:24.720 | compose, but this is just the most basic setup that I can have. Um, okay. So, uh, let's make a change,

00:14:31.920 | right? Let's put the, uh, console forward plugin in. Um, the reason I want this plugin is because it

00:14:38.400 | makes just iteration generally much easier. I just haven't set it up here yet. So that is this plugin.

00:14:43.520 | Um, so I want it to set this up. Um, so let's say, please set up this plugin via npm.

00:14:58.000 | And my English just sucks. So it doesn't read me. Um, okay. Let's see if it can do that, right?

00:15:04.720 | So it should hopefully read this. Um, I, I can optimize a lot of these things, right? This probably

00:15:15.680 | also could have done it manually. I just want to see if it works. Um, one of the things, for instance,

00:15:20.320 | that slows down Claude a lot right now is that it actually from scratch always tries to figure out

00:15:25.520 | what we're using here. So, uh, this might be one of the things we should put into this project

00:15:30.320 | is that we're using npm. Um, so I didn't write this yet. So we can do this here. We always use npm.

00:15:40.560 | Right? This in theory should prevent it from using pnpm or something else that it might have.

00:15:47.760 | Um, it will only pick up on that when we, uh, start from scratch, but, um,

00:15:55.040 | in it for future iterations, maybe it will, will improve this slightly. Um, another thing it probably

00:16:05.280 | read the instructions incorrectly. I noticed the other day that I'm not documenting this correctly. I'm

00:16:10.960 | actually importing, or maybe I do it correctly, but I think I've done this once before and it always

00:16:17.200 | imports this incorrectly. So I will manually fix this now because this should actually be like this.

00:16:22.880 | But in theory, uh, plug in

00:16:27.760 | this should work now. Um, I noticed that this before that it always gets this wrong. Um,

00:16:41.760 | but in theory now what should happen is that if we log an error here,

00:16:46.960 | we see it in the log, right? That that's what we want to accomplish here. And

00:16:56.320 | the whole point of this is that future iterations where, uh, where we're coming, where we're running

00:17:04.720 | into issues on the front end will also show up in the same log, right? So now we have at least this,

00:17:10.000 | this running. Um, and let's see, there's some changes here. Um,

00:17:16.400 | let's do this, um, update. So, uh, what do I want to say here?

00:17:25.680 | Set up console forward plugin and remove old pagination.

00:17:30.160 | I got very lazy and used a lot of dictation now. Um, okay. So

00:17:39.760 | let's try to set up some code here, right? That's, that's really what we're here for.

00:17:43.120 | How can we make some changes to this?

00:17:44.640 | Um, I actually don't find

00:17:48.560 | agentic from the start to work particularly well.

00:17:51.840 | So I did actually bootstrap this with Claude Code, but I did already make some changes so that it has

00:17:56.480 | an infrastructure that I like. So in particular, for instance, um,

00:18:00.560 | at the very least, I picked my web framework or the router that I want to use.

00:18:06.240 | I set up some utilities in here to respond with errors. It's just the most basic kind of infrastructure

00:18:14.560 | that I wanted to use, uh, for building our API.

00:18:19.440 | The second thing that I did is I created a plan and this is, this is basically all the things

00:18:24.000 | that I want to implement here, right? I, I want to build a small bulletin board that's modeled after

00:18:28.160 | phpBB mostly, but also 4chan. So I don't want to have user authentication. The idea is just that I'm using,

00:18:35.680 | what's called trip codes to authenticate, um, admins can fill in boards. So basically I have this,

00:18:40.960 | this whole plan here that I wanted to implement. And I will not tell it to do this in one go,

00:18:48.960 | but what I want to do is I want to have it look at the plan and tell me if it needs something else.

00:18:55.920 | So I created a plan in plan.md. I want you to ultrathink about it and see if there are omissions in

00:19:04.000 | the plan that we need to fill in. So, um, so it will now read this file and

00:19:13.920 | ultrathink is basically a hard-coded value in Claude that also extends the thinking context window.

00:19:24.880 | So it will, um, use, um, more tokens to reason. Um, one question that came up is which

00:19:33.760 | dictation tool I'm using. I'm using two different ones. I'm trying, um, Flow. The other one I'm using

00:19:42.240 | is called VoiceInk. I use them both for different things. I'm just trialing different things right now.

00:19:48.400 | Um, uh, yeah, that's, that's basically the answer to that. Um,

00:19:54.080 | the reason I wanted to read through my plan is that it's actually quite good at telling me if there are

00:20:01.920 | omissions that will help it later. I don't usually use, um, this with Claude. Instead, what I usually

00:20:12.560 | do is I copy paste this entire thing into o3. So let's do this here. Um, I have a plan here. I want

00:20:19.760 | you to think hard about this plan and tell me if there are omissions to this plan that we should

00:20:24.800 | look into before we implement it. Let's put the plan and see what o3 is doing. Um,

00:20:32.960 | so let's see what it came up with. Several omissions, admin authentication. So how admin

00:20:41.520 | privilege is granted verified. That's actually a good point. We didn't mention that. So, uh, let's do

00:20:49.600 | both here. Actually, we already have a section up here. So, um, admin_scan_well_reports, admin_commissions

00:20:59.200 | are hardcoded in nflr. Okay, we, we don't really have authentication. Maybe we just use, um,

00:21:10.560 | very, very, very basic, um, HTTP basic auth. Well, we'll see. Um,

00:21:17.920 | so most of this we will not actually do. Um, this doesn't really matter.

00:21:26.320 | The indexes, I think, like a lot of this stuff it will figure out along the way anyways. Mostly I want

00:21:34.080 | to see if there is some, um, some very clear omission that we have, um, that we should clarify.

00:21:42.000 | Now, so far this looks good. See if, if, uh, this came up with something. Uh, thank you for giving me

00:21:52.480 | two. Uh, do I have to pick one? Uh, just want to quickly look through. Uh, okay. This one tells me that

00:22:02.560 | there are no deletions in it. That's a good point. Um, we will not do this for now. Uh, okay. So,

00:22:11.520 | so far, if we go to this bulletin board, there's nothing, right? This, um, uh, we haven't set up anything

00:22:21.680 | yet. Um, I get a warning here that the, we're kind of using outdated packages for

00:22:32.400 | for the dev tools. I will leave this for now. I just don't want to spend too much time on the

00:22:35.360 | stream on the wrong things, but, um, yeah, the, the idea is basically that we are going to implement

00:22:41.440 | the feature now.

00:22:42.000 | The only endpoint that I have right now is actually the, uh, health check endpoint. So we don't have much.

00:22:49.440 | We don't have any database code yet, other than setting up the database once in the server. I think

00:22:54.560 | it's here somewhere. Um, so let's see. So we'd like one API endpoint, and I think it's best to start with

00:23:02.640 | listing all the boards that exist. And because you cannot create a board right now because there's no

00:23:10.480 | admin panel, we're just going to hard code a bunch of boards in the database. Um, so here we have migrations.

00:23:18.480 | So we ask it to make a new migration with two boards that it can just make up. And then we're going to

00:23:25.040 | list the boards. I want you to make an API endpoint that lists all the boards that exist because we do

00:23:31.840 | not yet have an API to create the boards. I want you to make a migration and create two default boards,

00:23:38.000 | one called general and one called water cooler. Um, let's see if this is good enough.

00:23:46.480 | Um, I think I already wrote what the response for APIs largely should be. I think this is all, um,

00:24:00.640 | yeah, so let's do this. Each board in the response should contain

00:24:08.640 | the most recent topic and the most recent post in addition to just the title and description.

00:24:13.760 | Okay, let's see what it does.

00:24:19.280 | So let's see what the questions are in the meantime.

00:24:23.200 | Have I tried using Claude to generate a file mapping, and using VoiceInk's AI post-processing for prompt

00:24:32.400 | generation? Um, I have tried that. So far I haven't, a lot of the things I'm doing at the moment are

00:24:40.640 | basically based on does it actually make anything work better. And I know, I know that a lot of these

00:24:45.760 | AI tools can do quite impressive things, but very often it doesn't make me any more productive. So I don't

00:24:52.480 | really like using VoiceInk or something like this to generate prompts to then have another prompt. So I

00:24:58.560 | much rather have commands set up. Um, but yeah, I haven't, I haven't tried that so much.

00:25:06.080 | So it, it kind of, it came up with a migration here, uh, for defaults. So, um, it will run this.

00:25:14.720 | One of the things you will notice is that in this project and in fact, all of the code I'm writing now,

00:25:19.360 | I'm, I'm, I'm asking it to write custom SQL. I do not use an ORM. This is really because I always

00:25:27.280 | liked writing SQL manually. In fact, I really just like SQL. It's not that I enjoy SQL, but I like having

00:25:33.600 | as little indirection as possible between me and the database. The main reason I don't do it as much when I don't

00:25:41.520 | use agentic coding is because it is annoying to write SQL, but now I have a machine write it for me.

00:25:46.800 | This beats to me having, um, like another indirection in place. So let's see what it does here.

00:25:55.360 | Um, I think it already does some things I don't like, but let's see. Um, most likely what's going

00:26:03.840 | to spit out is code I don't like. And then rather than it making more code like this, I just want to

00:26:10.400 | stick with the initial one. I want to fix it up because the more code exists that looks like what

00:26:15.600 | I want, the more likely it is that future API generations will kind of fit into this. Right.

00:26:20.640 | That's sort of the idea. Um, yeah. And so, as you know, I basically, I gave it all the permissions.

00:26:29.840 | I just let it write. I, I don't, I don't do anything here. Right. It's like, I, I just let it go.

00:26:37.200 | It has all the permissions to do everything on the system, which in parts could be a terrible idea,

00:26:41.760 | but seemingly Claude Code does really well. Um, right. So it, it managed to, to run the API.

00:26:50.800 | Uh, it sees that there is a, there's a response coming back from the API. So, so it is working.

00:26:58.560 | Uh, we can also go to the browser now and sort of test this, um, I think it called it boards. Right.

00:27:04.720 | And so we see, we see that there is a board and it actually has test posts in it. I'm assuming it has

00:27:12.000 | test posts in it because it's just went to the database and created some. This is my guess. Uh,

00:27:16.880 | I didn't actually see where it did it, but this might be an interesting moment to look into the database.

00:27:22.800 | Uh, so we have a database here called miniDB. This was empty when we started earlier.

00:27:28.560 | And it has created some posts here. Um,

00:27:34.800 | I wonder when it, when it created them. I didn't look. So, uh, when did it create them? Did it make me a test?

00:27:43.760 | So let's, let's do this. Um, let's check quickly which files, uh, we have here. So it must have created

00:27:51.520 | these manually through, at which point did it create them? So it created some handlers. Um,

00:27:59.600 | when did it create? This is one of the reasons why the terminal interface is not very great,

00:28:06.320 | because I don't have search here. I have to quickly go through this and see. Um,

00:28:12.400 | I don't actually know when it made the, when it made the posts, but it clearly created some,

00:28:21.200 | some content in the database here. We'll just leave it now. Um, this is, this works well enough. Let's check

00:28:26.240 | the changes, right? So now we can see sort of how I do that. So I know that I changed these files,

00:28:31.280 | right? Because they're all modified. So we have a new route here. Um, boards. This is okay. I'm,

00:28:39.360 | it's fine. I have a list boards. And so all of this is new, right? We only had the health check before.

00:28:45.600 | And now we have, uh, this. So it calls this get boards, which is down here. Um, I really don't

00:28:55.440 | like this, right? The, it, it should not do this. All the database code should go into separate module.

00:29:00.480 | So let's start with this here, right? Um, we need some changes. So let's see.

00:29:07.680 | All the database queries should go into models/boards.go. Actually, we'll do board.go.

00:29:17.520 | Um, so that's the first that we want. So we want this to go somewhere else.

00:29:25.280 | And this is okay. So the boards response is okay. So it will be a list of boards. Each board will be a

00:29:32.000 | database model. But this kind of thing here will be kind of weird because I want the model to represent

00:29:38.480 | a singular row only. So the models should only represent a singular row, not any, uh, joined records.

00:29:49.680 | Uh, so we need to figure out how to best, um, query this board then to have this in two. Um,

00:29:59.920 | what is it doing here anyways? It is, it's running another query. So this is an n plus one query anyways.
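
The N+1 pattern mentioned here, combined with models that each represent a single row, can be illustrated with a Go sketch. It uses an in-memory stand-in that counts "queries" instead of a real database; the struct fields, method names, and sample data are assumptions made for this example, not the generated code from the stream.

```go
package main

import "fmt"

// Board mirrors a single row of a hypothetical boards table.
type Board struct {
	ID          int64
	Slug        string
	Title       string
	Description string
}

// Topic mirrors a single row of a hypothetical topics table.
type Topic struct {
	ID      int64
	BoardID int64
	Title   string
}

// fakeStore stands in for the database and counts every lookup as a query.
type fakeStore struct {
	boards  []Board
	topics  []Topic // ordered oldest to newest
	queries int
}

// AllBoards is the "1" in N+1: one query for the whole board list.
func (s *fakeStore) AllBoards() []Board {
	s.queries++
	return s.boards
}

// MostRecentTopic is the "N": one extra query per board. It returns nil
// when the board has no topics yet.
func (s *fakeStore) MostRecentTopic(boardID int64) *Topic {
	s.queries++
	for i := len(s.topics) - 1; i >= 0; i-- {
		if s.topics[i].BoardID == boardID {
			return &s.topics[i]
		}
	}
	return nil
}

// newDemoStore seeds the two default boards named in the stream plus one topic.
func newDemoStore() *fakeStore {
	return &fakeStore{
		boards: []Board{
			{ID: 1, Slug: "general", Title: "General", Description: "Anything goes"},
			{ID: 2, Slug: "water-cooler", Title: "Water Cooler", Description: "Off-topic chat"},
		},
		topics: []Topic{{ID: 1, BoardID: 1, Title: "Hello"}},
	}
}

// listWithRecent loads the boards, then looks up the newest topic per board.
func listWithRecent(s *fakeStore) map[string]*Topic {
	recent := make(map[string]*Topic)
	for _, b := range s.AllBoards() {
		recent[b.Slug] = s.MostRecentTopic(b.ID)
	}
	return recent
}

func main() {
	s := newDemoStore()
	recent := listWithRecent(s)
	fmt.Printf("boards=%d queries=%d\n", len(recent), s.queries)
}
```

With only a handful of boards the extra lookups are cheap, which is consistent with treating this as good enough for now; a single joined query would be the usual fix if the list grew.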

00:30:08.160 | That's probably good enough for now. Um,

00:30:13.920 | So let's just say that it should move this over there. Um, the topic should go into models/topic.go.

00:30:26.160 | And then we have the post. The post should go else.

00:30:37.280 | Um, let's just see if it's, if it manages to refactor this a little bit.

00:30:41.040 | Um, and then we see from there what we need to do.

00:30:44.880 | "Does the Claude Code VS Code extension work if you're outside of the VS Code terminal?"

00:30:50.000 | Um, yes. So if the, if the, um, if the integration is set up correctly, it works even if it's running

00:30:56.800 | on the side, right? I can also start Claude in here, but I don't really like it. I prefer this terminal

00:31:01.600 | on the outside. Um, but yeah, it's, these changes, they still show up. Um, although I think that this

00:31:07.760 | comes actually from the Git plugin. Um, but we'll see. What are the questions there? Um,

00:31:16.720 | so maybe I should explain this because I didn't do this, but "Claude YOLO",

00:31:26.480 | which I write here, is just an alias for this impossible-to-pronounce argument called --dangerously-skip-permissions,

00:31:33.200 | right? Basically I run this all the time. Is it a good idea?

00:31:38.560 | I don't know. I'm not strongly advocating for it, but I can tell you that I'm using it this way all the

00:31:45.040 | time. Um, so that's why it doesn't ask me for anything. It's just, it just edits. Um, what are my

00:31:53.840 | thoughts on Gemini CLI? I will reevaluate it last time I was using it. The problem basically is that

00:32:02.880 | any model other than the Anthropic family of models is not overly amazing at tool usage. So

00:32:10.240 | I want to see that these agentic loops work. So that's why I'm playing with it. I have most success with

00:32:18.240 | Claude. I also think that Claude is the cheapest option because the 100 euro, sorry, 100 dollars a

00:32:25.280 | month package in Sonnet only mode is enough. Um, and it's kind of hard to beat for the price right now,

00:32:34.880 | right? And I don't know how long this price is going to stick here, but that's really why I'm not trying

00:32:41.600 | Gemini much. I have Gemini on the system. I sometimes give Claude access to Gemini to read through a code base,

00:32:47.360 | but it's, um,

00:32:50.080 | I'm, I'm going to get this working first, working well, and then I will try other tools again. I, I also tried Amp. I tried a bunch of

00:32:58.960 | other ones, but, um, this is the one that, um, it's just,

00:33:03.760 | I think it has the highest chance of sticking around also in part because the people that write

00:33:07.920 | the tool are also the people that write the model or create the model. And so they go hand in hand.

00:33:12.160 | Um, okay. So now we have a board.go get all boards. This looks, this looks quite a bit better. We don't

00:33:20.720 | need pagination here because we don't expect that many boards. So that will be quite good. Um, now it uses

00:33:26.880 | the scan to feed this and we have a board by ID. Uh, this is also quite okay. And the board by slug.

00:33:35.680 | I am quite okay with all of these. Um,

00:33:41.760 | one of the consequences now is that all of these methods can return null or a board. So if the board

00:33:52.000 | doesn't exist, it returns null or nil. Um, do I like this? I don't know. Um, so we have this most recent

00:34:03.040 | post by board ID. Um,

00:34:05.920 | okay. So, so one of the things for sure

00:34:11.360 | that is not amazing is that it looks like the board doesn't have a pointer to the most recent

00:34:21.120 | topic. But the topic as opposed to the most recent post. So we, maybe this is okay. I, I, I will not

00:34:29.920 | judge the database structure too much right now. Okay. So I think we can stick with this. In theory,

00:34:36.560 | if we now go to here, it should look more or less the same. So we have the most recent topic,

00:34:40.880 | the most recent post. Um, let's actually remove the most recent post from, um,

00:34:51.200 | of the API because

00:34:52.640 | I don't think we need it. Um,

00:34:57.440 | so wonder what is not the office here. Okay. Let's leave it for now. Let's leave it for now. But I

00:35:04.000 | think we will throw it away. So what I usually do when I program with this is I create myself a to-do

00:35:08.880 | file. Um, where I basically keep track of all the stuff that I still need to do. So one is, um,

00:35:15.920 | I'll call this nets. Um, we should remove the most recent post from the board listing.

00:35:24.800 | Okay. So we'll think of this later. So let's have a look at what the API response looks like so far. So we

00:35:32.320 | have a list boards route, um, which is hooked up to the router. Um, and it creates this boards response

00:35:44.960 | and then responds with JSON, and get boards with recent is what it calls, which now gets all the

00:35:52.640 | boards and then it gets the most recent topic and post. Um,

00:36:02.880 | yeah, uh, not overly amazing, but kind of, okay. But one of the things I do not like is this part

00:36:08.400 | here, right? It does http.Error. Um, and we have this utility here called internal server error. So we'll

00:36:17.760 | actually use this. So we'll call, uh, utils dot internal server error, w and err. Then we remove the other one.

00:36:28.080 | So we want to do this and hopefully going forward, we will actually start using this utility instead.

00:36:36.240 | Why do I want to use this utility? Well, for one, because it logs the error and it returns

00:36:41.280 | with a standardized message. So that's why I want this. And, um, and then this is okay. And so board with recent

00:36:52.000 | is an extended struct that has the board in it plus the extra things here. So this is, this is okay. Um,

00:36:59.040 | so let's say we commit this, we'll leave this for later.
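
[Editor's sketch] The error utility described above (log the real error, answer with a standardized message) might look roughly like this. This is a hedged sketch: the function name `InternalServerError`, its signature, the `listBoards` handler, and the `/api/boards` path are assumptions modeled on what is said in the stream, not the project's actual code. Only `http.Error`, `http.StatusText`, and the `httptest` helpers are real standard-library calls.

```go
package main

import (
	"errors"
	"fmt"
	"log"
	"net/http"
	"net/http/httptest"
)

// InternalServerError logs the underlying error and answers with a generic,
// standardized message so that internal details never reach the client.
// Name and signature are assumptions based on the call described above.
func InternalServerError(w http.ResponseWriter, err error) {
	log.Printf("internal server error: %v", err)
	http.Error(w, http.StatusText(http.StatusInternalServerError), http.StatusInternalServerError)
}

// listBoards is a hypothetical handler whose data access always fails,
// so that the error path can be demonstrated.
func listBoards(w http.ResponseWriter, r *http.Request) {
	err := errors.New("database is locked")
	InternalServerError(w, err)
}

// runDemo exercises the handler in memory and returns status code and body.
func runDemo() (int, string) {
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodGet, "/api/boards", nil)
	listBoards(rec, req)
	return rec.Code, rec.Body.String()
}

func main() {
	code, body := runDemo()
	fmt.Printf("%d %q\n", code, body)
}
```

Having one helper like this also gives the agent a single pattern to copy, instead of a different ad hoc `http.Error` call in each handler.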

00:37:04.080 | Um, so let's say added basic board API response. So the next thing we want to do is we want to hook up the end,

00:37:15.920 | the front end, right? So if we go here, we don't see anything. So let's say we want to have the board

00:37:23.760 | show up. I want you to change the front end to show all the boards. Um, for now, I want you to

00:37:31.520 | make sure that we create components for each row on the listing so that we can reuse this later.

00:37:39.120 | These rows should be reused for topics in a board as well as for the board listing overall.

00:37:48.880 | Um, we might need a parameter to change the, actually, I don't want this. Let me, let me do this differently. Um,

00:37:55.760 | I want you to now show all the boards with the most recent topic in the overall, uh, in the index page,

00:38:03.360 | on the index page. Um,

00:38:09.840 | And let's ignore the, um,

00:38:14.000 | I want to now, this, this might, so the problem with whenever it creates a front end from nothing,

00:38:20.880 | it turns into a mess. Since there's basically no real front end, this might be incredibly messy.

00:38:26.960 | And I'm a little bit afraid that it doesn't even manage to set up the router.

00:38:29.920 | Um, so I'll see what it does. If I can, we can watch it. In the meantime,

00:38:36.160 | I can look at the questions here.

00:38:37.680 | Yeah. So for how to put the browser logs in a terminal, I used a Vite plugin that I wrote.

00:38:45.280 | Um, you can also do this yourself with an API endpoint. The Vite plugin was this one here.

00:38:50.800 | Um, and this is what it does.

00:38:54.960 | Uh, one other question is what the font, the font I'm using is MonoLisa.

00:39:02.000 | I think all the time. This one here. Uh, that's the font. What other question?

00:39:10.880 | Yeah. So one question is if you manually edit the code like that, do you have the problem that the

00:39:17.760 | model has unedited versions in the context? And yes, this is a problem. One of the problems with this

00:39:23.760 | is that it will recall things that you have already thrown away. This is actually a pretty big problem.

00:39:28.800 | Um, this is one of the reasons why I clear the context all the time. Um, the same problem,

00:39:34.560 | by the way, also comes up if you do code formatting. It's quite often that the linter and the formatter

00:39:39.360 | edit the file in a certain way and sometimes they go back and forth.

00:39:44.000 | I don't have a good solution for this, but it is a problem. Um,

00:39:50.960 | I can't really recommend anything here other than I do want to do these commits.

00:39:56.080 | Then I want to clear the context. Sometimes I maintain a to-do list. So before I run out of context,

00:40:03.120 | for instance, I tell the agent to summarize everything that it did into a file and I can look at this file

00:40:08.880 | later and then continue from there. Um, so let's see what it did. It probably has created something here.

00:40:17.440 | Um, so this is actually an interesting thing. It has not managed to run this, right? And so now,

00:40:22.160 | now we can probably see that our tooling comes in helpful, hopefully.

00:40:26.800 | When I navigate to the page, I get a bunch of errors. Please check the log and see what's going on.

00:40:35.040 | Right? So it should now read the log, which it does. And hopefully see what it broke. Um, okay, cool. So it managed.

00:40:50.880 | Probably it wouldn't have needed the log, but having the log now means that it was just able to go back

00:40:57.680 | there and figure this out. And at least we have something now, right? So I, not that I like how

00:41:04.000 | this looks at the moment. You can't even click on it or anything, but, um,

00:41:07.520 | yeah, we, we see something. Um, let's make two changes here.

00:41:14.320 | I want these to be rows. So one below the other, not next to each other. And I also want to not show

00:41:21.680 | the most recent post. I only want to show the most recent topic. So I just want to make this change

00:41:28.400 | and then we're going to figure out how to make it less crappy. Um, cause it probably doesn't look very

00:41:33.600 | nice. Um, the way I do front end code at the moment is I let it write a whole bunch of stuff and then I

00:41:41.120 | ask it to extract components that usually sort of works. Um, but front end code, unfortunately,

00:41:47.760 | turns out to be very sloppy very quickly. Okay. So, um, okay, this at least is getting somewhere.

00:41:59.360 | Um, so let's see what it wrote.

00:42:02.080 | So it created

00:42:05.600 | an index route.

00:42:11.520 | Um, so this is already, we're already sort of done. Um, if I say, please lint everything.

00:42:18.320 | This is, it's already going to be annoying because it clearly left a bunch of nonsense behind. And so

00:42:26.800 | the linter will immediately complain, hopefully that, um, there's unused stuff. So let's see.

00:42:33.840 | Um, by the way, in this project, I'm not using any hooks. Um, I do use some hooks in other ones,

00:42:40.320 | but I want to start with the basics here. Uh, okay. So we, we got rid of some unused stuff.

00:42:47.760 | Um, I don't know what page is this. We're throwing this away for now.

00:42:51.520 | And then it created this API.ts.

00:42:56.080 | And this is already messy. I don't, this is already too big. So the API client, I'm actually okay with,

00:43:01.840 | it can leave that, but I don't like the types on the same file. So, um, let's do this.

00:43:09.200 | Move the types from API.ts into a separate file.

00:43:12.960 | API.ts, into a separate file. Um,

00:43:19.920 | Actually, other than the API client itself.

00:43:31.200 | Um, let's see. Just kind of want to move this out.

00:43:36.320 | So the types are here now. The API is here. One of the most important things is to make sure that the

00:43:44.000 | files don't grow too large. The larger the files, the harder it is for the, for the system to work with it.

00:43:50.080 | Um, so this is, this is, this is okay for now. So we're going to just have

00:43:55.040 | not the nicest thing here. I'm going to manually remove this welcome thing, which I think,

00:44:02.160 | where do we have this?

00:44:03.840 | Where is this? Let's see here.

00:44:07.920 | Um, let's throw all of this away. So we have only the boards.

00:44:14.160 | Um, okay. So we have a starting point.

00:44:17.520 | The frontend so far, probably a little bit messy, but, um,

00:44:24.320 | initial display of the boards in the frontend.

00:44:32.560 | The problem immediately here now is going to be that, um,

00:44:35.920 | we don't have the router. We have the router set up, but we don't have query set up. So I think

00:44:42.480 | it uses, no, it does use query.

00:44:45.680 | Okay. It does use query, which is, that's good.

00:44:49.520 | Um, then it uses this get boards function.

00:44:54.640 | It might be okay. Oh, well, we'll see. We'll see how messy it gets as we continue.

00:45:01.280 | Um, okay. So what should we do next? I think next we're going to show

00:45:04.880 | each individual board. So the next thing we need to do is we need to create these boards.

00:45:10.240 | So let's do this. Um,

00:45:12.720 | um,

00:45:16.240 | I wonder if I should continue the session or not. Maybe we'll continue the session. Might be a bad idea, but

00:45:23.920 | might actually help. Now, please add an endpoint to show all the topics in one board.

00:45:31.120 | We will also actually, no, I will, I will, I will do it from scratch here because I want to set up

00:45:37.280 | pagination. Now we need an API endpoint to list all the topics in a board. Note for this, we will need

00:45:43.520 | a pagination helper. We want to use cursor-based pagination. That means not offset, but to continue

00:45:50.080 | from a specific starting point. And we want to take the cursor to continue to the next page from

00:46:00.640 | the URL parameter, because it's going to be a get request.

00:46:03.840 | It's actually going to be a shitty user experience because it means you can't jump to a specific page.

00:46:12.800 | So we will not use cursors here.

00:46:15.280 | I'm going to use offset-based pagination.

00:46:25.280 | We want to take the, take the page and per page parameter from the URL.

00:46:37.040 | Um, or else it's going to be a get request. Default per page 250.

00:46:42.880 | Um, also return the total number of pages that exist.

00:46:50.720 | Don't hook this up to the front end yet.

00:46:52.480 | Okay. In the meantime, I have more questions.

00:46:56.240 | I do not use compact at all. Never, ever use compact. If you run out of context,

00:47:02.880 | compact is basically a command that just screws up everything.

00:47:06.880 | Um, I don't know what happens if you compact. It's going to be a gamble. It's, it is already

00:47:14.080 | random enough. It happens out of the box, but I never, never, never run compact. Instead, if I

00:47:19.360 | notice that I'm running out of context, I'm asking Claude to summarize what it did into a markdown file.

00:47:25.760 | I review the markdown file, start a new session, and then read back from the markdown file.

00:47:32.080 | Um, because then at the very least, I know what it pulls in the context. Compact, I have no idea

00:47:38.720 | what it does. Um, I don't think the tool even shows you what it did after compacting. So it's, it's,

00:47:44.480 | it's a gamble. It's a pure gamble. Um, never do that.

00:47:48.720 | Basically, um, auto compact is, I'd rather have Claude stop than auto compact because it's just so,

00:47:59.920 | so random. It's, it's absolutely random. I also only use Sonnet. Um, I kind of wish I could use Opus more,

00:48:09.840 | but even with a $200 subscription, I run out of Opus. So I just stick to Sonnet and I use o3 for

00:48:19.920 | planning a lot where I basically go on. Um, I take what I'm working on, copy paste that into just ChatGPT,

00:48:28.640 | pick o3 and have a conversation about my architecture there. Um, and I can maybe show this later. I kind of

00:48:34.880 | want to hook up the trip codes to, to post something and then we'll see, um, how well this works. So

00:48:41.920 | what I should have received here now in theory is, uh, what did we get here? We got the pagination.

00:48:49.200 | So these are the parameters for the pagination. We get an offset. Interesting. Why do we have an offset?

00:48:58.720 | Um, okay. So we're pulling this from the query page and per page, the offset is calculated. Um,

00:49:05.440 | and then this pagination meta

00:49:09.680 | is probably used in the API response.

00:49:14.400 | So what did it do? So it lists the topics and this is a list of topics and the pagination meta.
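
[Editor's sketch] The offset-based pagination described above (page and per-page taken from the URL, an offset calculated from them, and metadata with the total number of pages in the response) can be sketched like this. It is an illustration under assumptions, not the generated helper: the names `ParsePage`, `Offset`, `NewPagination`, the JSON field names, and the default and maximum per-page values are all made up for the example.

```go
package main

import (
	"fmt"
	"strconv"
)

const (
	defaultPerPage = 10  // assumed default page size
	maxPerPage     = 100 // assumed upper bound to keep responses small
)

// Pagination is the metadata returned next to the list of topics.
type Pagination struct {
	Page       int `json:"page"`
	PerPage    int `json:"per_page"`
	Total      int `json:"total"`
	TotalPages int `json:"total_pages"`
}

// parsePositive converts a raw query value into a positive int,
// falling back when the value is missing, malformed, or below 1.
func parsePositive(raw string, fallback int) int {
	n, err := strconv.Atoi(raw)
	if err != nil || n < 1 {
		return fallback
	}
	return n
}

// ParsePage interprets the raw "page" and "per_page" query values.
func ParsePage(rawPage, rawPerPage string) (page, perPage int) {
	page = parsePositive(rawPage, 1)
	perPage = parsePositive(rawPerPage, defaultPerPage)
	if perPage > maxPerPage {
		perPage = maxPerPage
	}
	return page, perPage
}

// Offset is the number of rows to skip, as used in SQL LIMIT ... OFFSET.
func Offset(page, perPage int) int {
	return (page - 1) * perPage
}

// NewPagination builds the metadata; total pages is rounded up.
func NewPagination(page, perPage, total int) Pagination {
	return Pagination{
		Page:       page,
		PerPage:    perPage,
		Total:      total,
		TotalPages: (total + perPage - 1) / perPage,
	}
}

func main() {
	// In a handler these strings would come from r.URL.Query().Get(...).
	page, perPage := ParsePage("2", "")
	fmt.Printf("offset=%d %+v\n", Offset(page, perPage), NewPagination(page, perPage, 1))
}
```

Requesting a page past the end simply produces an offset beyond the last row, which is why the later pages come back as empty lists rather than errors.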

00:49:24.960 | Um, do I like this? Do I like this? I don't know if I like this. Um,

00:49:31.520 | do I like this? I don't know if I like this. We'll figure this out if I like this. Um, I think I might

00:49:39.760 | want to rename this to pagination probably, but maybe this is good enough. Uh, so let's see. So in theory,

00:49:46.800 | there should be an API now for me to hit, uh, API boards, which one test. This has stuff in it.

00:49:55.120 | Uh, what the slash? No. What's, what's the API?

00:50:00.000 | Um, board. What's a board? Board ID. Is it a board ID or is it a board slug? Let's try.

00:50:11.520 | What was the other API?

00:50:12.640 | So that's boards. Uh, board ID one is a test board.

00:50:18.560 | You get something here? No, we don't get anything. Oh, slash topics.

00:50:24.560 | With this. Okay. Board not found. So we need a test probably. Okay. Um,

00:50:35.840 | So we get this total one, total pages one per page 10. So if we do, uh, page equals two,

00:50:42.960 | we get an empty list. Three empty list. Okay. This, this is okay. Um,

00:50:51.120 | I do want to change the meta to pagination though. So I think we're just going to do it manually.

00:51:03.520 | I'm going to call this pagination. Um, okay. So that's up. I changed meta to pagination.

00:51:16.640 | So I just gave it this context and immediately interrupted it, just so that hopefully, um,

00:51:23.440 | it doesn't get confused later, uh, through a manual edit. Um, there are some questions. I will quickly

00:51:31.120 | go to them. Have I used Context7? Yes. Um, I don't have good experiences with it. I don't use

00:51:37.600 | any MCP servers other than Playwright. And I try to not use Playwright either. Um,

00:51:42.560 | then the other question, uh, yeah, in general, I don't like it looking up docs. I much rather give it

00:51:52.320 | the docs myself, uh, if it needs them. I I'm very conservative on context usage. I don't like any tools

00:51:58.640 | that pull anything in automatically. I optimize everything for low context usage. Um,

00:52:03.680 | I want you to now register a URL for the board. So if a user goes to slash b slash slug,

00:52:13.200 | then we will show the most recent topics there and add a pagination for previous and next page and

00:52:20.560 | a basic overview of how many pages exist and a quick jump to a particular page.

00:52:24.720 | Did manage, um, the like slash b. What do we have here? Slash

00:52:33.600 | slash test. Maybe just do, um,

00:52:41.760 | yeah, let's do a slash b slash slug. I like this. Um, and show the, so the topics there.

00:52:49.200 | Most recent first. Um, and this is good.

00:52:57.120 | Note, we can always rely on monotonically increasing integer primary keys for ordering.

00:53:06.800 | Um, because we use SQLite here and I want to avoid, uh, it's using dates right now.
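
[Editor's sketch] Ordering by primary key instead of by date, as requested in the prompt above, amounts to `ORDER BY id DESC` on the database side. The sketch below shows the same idea in memory; the `Topic` type and `NewestFirst` helper are hypothetical names for illustration. One caveat worth keeping in mind: SQLite only guarantees that ids are never reused when the column is declared with `AUTOINCREMENT`.

```go
package main

import (
	"fmt"
	"sort"
)

// Topic is a hypothetical, minimal topic row.
type Topic struct {
	ID    int64 // auto-incrementing integer primary key
	Title string
}

// NewestFirst sorts topics in place by descending ID, which matches
// creation order as long as ids only ever increase.
func NewestFirst(topics []Topic) {
	sort.Slice(topics, func(i, j int) bool {
		return topics[i].ID > topics[j].ID
	})
}

func main() {
	topics := []Topic{
		{ID: 1, Title: "first"},
		{ID: 3, Title: "third"},
		{ID: 2, Title: "second"},
	}
	NewestFirst(topics)
	for _, t := range topics {
		fmt.Println(t.ID, t.Title)
	}
}
```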

00:53:13.440 | Um, also please use the link component to link from the index page to the board.

00:53:21.840 | Let's see.

00:53:28.400 | Okay. Let's see if it manages. Maybe in the meantime, there's some questions.

00:53:32.080 | Uh, again, that was another question what I use for speech to text. Um,

00:53:36.720 | right now this is using Wispr Flow. I also use VoiceInk, which is open source. Uh, both of them work.

00:53:43.360 | Um, I just, I'm trialing Wispr Flow this week. Um, normally I use VoiceInk. Um,

00:53:53.680 | I'd say they're give or take equivalent on a Mac.

00:53:55.600 | Um, are there any other questions that I can answer in the meantime? Because I'm pretty sure

00:54:01.920 | this is going to take like four or five minutes for it to produce something reasonable.

00:54:06.000 | Um,

00:54:08.400 | Earlier there was a question if I'm streaming, this is the first stream I've been doing in three years

00:54:19.840 | probably. So we'll see if I will do this again, but, um, yeah, that's kind of lazy.

00:54:27.360 | Um, yeah. So how do I write the logs automatically to the dev log file? This is what I'm doing with

00:54:39.600 | shoreman. So if you, let's do this in the meantime, let's put this on GitHub. Then you can sort of steal

00:54:46.400 | the shoreman fork that I have. Um,

00:54:49.760 | don't want to put the whole thing up there, minibb. Let's make a repo minibb. Create new repository.

00:55:01.120 | minibb.

00:55:05.120 | This is right through that, I put it in port or screen first.

00:55:11.360 | Um, bum, bum, create repo.

00:55:15.280 | And then put this up there.

00:55:20.320 | So, um, in here in scripts, this is my shoreman fork, I should probably,

00:55:31.680 | now that this is on GitHub, I should probably make sure that the licenses

00:55:35.360 | are handled correctly.

00:55:38.240 | Because I did not put this in.

00:55:45.040 | Uh, let's edit this quickly, shoreman.

00:55:53.360 | Oh.

00:55:53.680 | It's always kind of funny if you have a bunch of black pullet stuff.

00:56:06.720 | Because, um, you cannot really copyright any of this stuff.

00:56:10.880 | Um, very little. There's going to be a bunch of court cases.

00:56:15.440 | Um, added license to shoreman.

00:56:18.000 | Okay. Um, but yeah, shoreman is, is what I'm using for the, for the logs.

00:56:23.520 | Hey, it's this. Okay.

00:56:25.520 | So, let's see.

00:56:27.920 | Um, do we have a frontend now?

00:56:30.800 | So, I can click on this and I get not found.

00:56:34.560 | So, clearly it doesn't work.

00:56:36.720 | Um, I navigated to a board and it didn't work.

00:56:43.280 | Check the logs.

00:56:43.920 | Right. And this is again why the unified logging is so functional.

00:56:48.880 | It sees my browser logs, right?

00:56:51.600 | So, it doesn't just see the server.

00:56:53.120 | So, it should hopefully figure out what it did wrong.

00:56:58.320 | Um, I don't even know what happened.

00:56:59.520 | Because I, I'm assuming that this wasn't too wrong that it did, right?

00:57:04.080 | Because there's a B board TSX.

00:57:06.480 | This should work.

00:57:08.160 | But maybe it doesn't.

00:57:09.120 | Um.

00:57:11.360 | So, hold on.

00:57:16.720 | Could it be that we don't run the TanStack plugin for Vite?

00:57:20.720 | Um, that would be a problem.

00:57:26.960 | Like this is supposed to be a 10.

00:57:28.240 | Because the plugin for TanStack should do all of this.

00:57:34.080 | Um.

00:57:36.080 | So, you probably don't have this plugin in there.

00:57:41.200 | Does it work now?

00:57:44.720 | Not so far.

00:57:47.920 | Ah, because I didn't plug it in yet.

00:57:55.920 | I actually think that this is.

00:57:57.040 | Ah, ah, look at this.

00:57:58.480 | But isn't this nice?

00:58:00.320 | I didn't even have to figure out what's going on.

00:58:02.160 | Like, okay, I did give it a hint that it has to set up the plugin.

00:58:06.400 | But, I mean, this is, I love this.

00:58:09.840 | This is just so nice.

00:58:12.080 | Um.

00:58:13.840 | So, here you see one of the problems, right?

00:58:16.560 | So, it was in the wrong folder.

00:58:18.400 | So, it couldn't figure out how to make the tail.

00:58:20.880 | And then it immediately ran and went for the, for the log itself.

00:58:24.720 | And that's one of the big problems right now.

00:58:26.960 | Why I'm so careful about giving it, um, the right context.

00:58:30.720 | Because it actually went the wrong way.

00:58:32.640 | It still managed to succeed.

00:58:34.000 | But it should really have cd'd into the right folder.

00:58:38.160 | And run make tail log there.

00:58:39.920 | And it just didn't do it.

00:58:40.960 | And, and this, this is basically contributing to context rot, right?

00:58:45.440 | Now, now it has remembered that this didn't work.

00:58:47.680 | But this worked.

00:58:48.400 | And it shouldn't, right?

00:58:50.880 | It should not make these mistakes.

00:58:53.440 | And I'm, I'm, I'm trying to nudge it in the right direction.

00:58:56.560 | Um, so one of the things I can try here now is that, um, on the make command, maybe we can.

00:59:02.240 | Um, honestly, I, I only have partially good results with this.

00:59:08.080 | If you fail to run the make file, um, you have to remember that you have to run it from top level.

00:59:14.080 | Let's put it here.

00:59:20.400 | Um, the make commands.

00:59:24.320 | Right, so, so maybe this will, oops, maybe this will nudge it in the right direction.

00:59:31.280 | Um, but can't guarantee.

00:59:34.000 | But still, I mean, like, pretty cool.

00:59:36.800 | Okay, so, so we have this now.

00:59:39.360 | So the board roughly works.

00:59:41.120 | So let's double check quickly what it did.

00:59:43.120 | Um, so we have, this is auto-generated.

00:59:47.120 | We don't care.

00:59:48.640 | Uh, now we have here a link.

00:59:51.200 | So we can check this again.

00:59:52.880 | But it added the link component as instructed.

00:59:55.120 | It goes to the board.

00:59:57.120 | And the board itself.

01:00:01.040 | Um, it has pagination somewhere here.

01:00:06.960 | But we don't really see it.

01:00:08.480 | Because we don't have enough topics.

01:00:10.560 | And now, now let's do this.

01:00:12.000 | Uh, now let's be creative.

01:00:13.520 | To test this better, I want you to generate 120 different posts across 10 different topics.

01:00:23.120 | And put them into all the different boards that exist already.

01:00:30.400 | To make this easier, please write yourself a little test script.

01:00:33.040 | Um, and just...

01:00:37.440 | Actually.

01:00:38.640 | Do I want it to write a test script or figure it out itself?

01:00:43.120 | Um, just use Python for this.

01:00:48.480 | Use UV.

01:00:49.040 | And put it into scripts.

01:00:56.160 | We might need this later again.

01:00:58.720 | Right.

01:00:59.840 | So, um, and basically, um, actually hold on.

01:01:02.240 | One, one last thing.

01:01:03.040 | Um, but please use inference to generate a bunch of real sounding conversations.

01:01:10.080 | And pipe them into an input file that this script will then use.

01:01:13.680 | Okay.

01:01:17.040 | So, basically, I want to get it into a situation where it now generates me out an entire board.

01:01:21.760 | So I can test this better.

01:01:22.720 | Um...

01:01:24.720 | Mario, since you're writing, back to Wispr Flow.

01:01:28.720 | Um, I'm trialing them both.

01:01:31.360 | But the problem with VoiceInk at the moment is that the AI integration just adds too much latency.

01:01:37.920 | And I want a little bit of fix up.

01:01:40.160 | Um, so for the screencast, I opted for Wispr Flow.

01:01:43.520 | Um, it's all about latency for me.

01:01:47.760 | And Wispr Flow is, is, is just the lowest latency thing I found.

01:01:51.440 | Um, so this is, this is really, really why.

01:01:54.800 | Um...

01:01:56.480 | For me, one of the really big benefits of agentic coding is actually test data generation.

01:02:07.200 | Because I'm actually struggling a lot with traditional applications that all of my test data just looks

01:02:13.360 | not great enough.

01:02:14.240 | Um, and now you can just get an LLM to really create you pretty good-looking test data.

01:02:22.400 | And it makes it much easier to see the product, to feel the product, uh, and to experience what it looks

01:02:27.440 | like.

01:02:27.680 | Um, so that's just such a nice, uh, nice aspect of it.

01:02:32.960 | This is going to take a while.

01:02:34.080 | So maybe we go to questions.

01:02:35.360 | Um...

01:02:36.960 | Try using tmux for better log tailing and running servers.

01:02:42.880 | I don't know.

01:02:43.920 | I like what I have.

01:02:45.120 | Works good enough for me.

01:02:46.160 | Uh, what else is here?

01:02:49.360 | How do you disallow MCPs?

01:02:51.040 | I just don't load MCPs into my context in the first place.

01:02:54.560 | Um, one question is, do you have any experience with the amount of usage you get out of a $20

01:03:02.000 | Claude subscription?

01:03:05.600 | I don't know is the short answer.

01:03:07.600 | Um...

01:03:08.960 | I think that you don't get that much out of it.

01:03:14.000 | But I'm not sure.

01:03:15.760 | You can try it and see.

01:03:18.560 | I can tell you that with a $100

01:03:20.240 | Claude subscription, if you only use one agent at a time with Sonnet, you're not going to hit the limits.

01:03:26.240 | With two or three simultaneously, you can hit the limits.

01:03:31.360 | Um...

01:03:34.080 | With the $200 subscription on Sonnet, I don't think you can run into the limits.

01:03:38.080 | I don't think it's possible.

01:03:39.360 | Uh, but with the $20 one, I'm pretty sure that you can run out very quickly.

01:03:44.000 | Um...

01:03:46.640 | Do you use the plan mode?

01:03:48.880 | So because I use dangerously bypassing permissions, I don't really use the plan mode explicitly.

01:03:55.280 | And the problem for this is that it actually disables a bunch of things.

01:04:00.960 | So when it plans, it permanently asks for permissions for all the tools.

01:04:04.880 | So I basically ask it to plan without plan mode.

01:04:08.880 | Because the plan mode, as far as I can tell, at least in parts, auto activates just unprompting.

01:04:13.920 | Um...

01:04:15.040 | But that's really why I don't use the plan mode.

01:04:17.760 | Um...

01:04:19.760 | And that's sort of the answer.

01:04:21.840 | So...

01:04:22.080 | It's still generating.

01:04:24.800 | Um...

01:04:27.200 | Yeah.

01:04:27.680 | They're like...

01:04:28.640 | I think there are probably like 10, 15 different pretty decent voice-to-text things at the moment.

01:04:35.200 | Um...

01:04:36.080 | For all kinds of different setups.

01:04:37.840 | And I think it's a little bit ridiculous to pay for Wispr Flow.

01:04:41.360 | And I don't really like that because it is...

01:04:43.280 | The magic is happening on device anyways.

01:04:45.280 | Um...

01:04:46.320 | On...

01:04:46.720 | On the Whisper model, which is the open source one.

01:04:48.560 | So...

01:04:49.280 | Um...

01:04:50.400 | Yeah.

01:04:50.640 | I hope we just get to the point where, um...

01:04:52.960 | Something like Wispr Flow in an open source way, um...

01:04:57.760 | Becomes like a...

01:04:58.640 | Like a thing that everybody contributes to.

01:05:03.520 | Okay.

01:05:04.080 | So it's now generating words.

01:05:06.000 | Cool.

01:05:07.040 | So look at this.

01:05:08.960 | I have...

01:05:09.280 | We have stuff to look at.

01:05:11.120 | Is it not nice?

01:05:11.920 | It just auto generates all of it.

01:05:13.360 | Nice.

01:05:13.920 | Best setup for home office.

01:05:15.520 | Coffee setup for home office.

01:05:16.880 | Look at this.

01:05:17.440 | Um...

01:05:18.880 | Cool.

01:05:20.160 | So...

01:05:21.520 | We have...

01:05:24.640 | We have content.

01:05:26.400 | Which is cool.

01:05:28.560 | And it wrote me this little script here.

01:05:32.720 | Um...

01:05:33.360 | To populate the forum.

01:05:34.880 | Right?

01:05:35.120 | I will not even look at the script.

01:05:37.200 | Don't have to.

01:05:38.000 | I don't care.

01:05:38.560 | It did its job.

01:05:40.640 | Um...

01:05:41.360 | So...

01:05:43.440 | What I will do now is...

01:05:45.040 | I will commit this.

01:05:46.080 | Uh...

01:05:47.760 | We'll do...

01:05:48.240 | First we do...

01:05:49.440 | Web.

01:05:49.680 | And we'll format this quickly.

01:05:52.320 | And then we check in...

01:05:55.840 | Added...

01:05:58.480 | Board listing.

01:06:02.240 | No, topic listing in boards.

01:06:03.440 | Um...

01:06:06.080 | And now we add the scripts.

01:06:09.120 | Added...

01:06:11.280 | A pop...

01:06:14.720 | Populator script.

01:06:16.880 | So now that we have this.

01:06:19.600 | We can do one last thing.

01:06:21.040 | Where we do...

01:06:21.920 | Actually I should probably have...

01:06:24.480 | Checked that we had here.

01:06:27.600 | Um...

01:06:28.240 | Anyways.

01:06:29.200 | It doesn't matter.

01:06:29.680 | Next thing is we're going to enumerate the topics.

01:06:34.080 | Now I want you to make...

01:06:36.960 | Um...

01:06:37.440 | Um...

01:06:38.080 | A way to look at all the topics.

01:06:40.000 | So basically we are going to add an endpoint to see all the posts on a topic.

01:06:44.800 | With pagination.

01:06:46.400 | Same general API flow as we had for the...

01:06:49.360 | Board index page.

01:06:52.400 | And we also want to add the front-end component and the front-end page to...

01:06:56.480 | Show that too.

01:06:58.400 | And again also support pagination.

01:07:00.320 | I don't know if this will work.

01:07:05.680 | Let's see.

01:07:06.080 | And I'll go to the questions in the meantime.

01:07:09.280 | Um...

01:07:09.760 | Yeah.

01:07:17.840 | This monthly paying for basically local whisper models is nonsense.

01:07:23.840 | Um...

01:07:25.600 | I'm...

01:07:26.000 | I'm actually quite okay for paying for the API inference.

01:07:29.520 | But I also don't think that...

01:07:30.960 | I think it could actually fix up a lot of the little issues with

01:07:38.160 | voice input on a very well-trained local model too.

01:07:41.600 | So...

01:07:41.840 | Yeah.

01:07:43.920 | Um...

01:07:45.920 | ci is not a built-in alias for commit.

01:07:47.440 | That's an alias that I set up.

01:07:48.720 | Um...

01:07:49.360 | So I have an alias in my git config.

01:07:53.120 | This...

01:07:53.440 | Git config.

01:07:55.520 | CI.

01:07:56.720 | So I have a bunch of these ones here.

01:07:58.000 | Um...

01:07:59.520 | It was an alias on Mercurial, which I was using before git.

01:08:02.640 | And I got so used to it that when I moved to git, I set up this alias and never went back.

01:08:06.880 | I have no idea what Kimi is.

01:08:13.200 | Uh...

01:08:13.440 | You mean like Kimi K2, the...

01:08:15.200 | This new model?

01:08:15.920 | Is that...

01:08:16.240 | Is that...

01:08:17.600 | Is that the new huge model?

01:08:19.760 | Is that Kimi?

01:08:20.400 | I haven't tried it.

01:08:22.080 | I heard that it's...

01:08:22.880 | Pretty good on open code if you use it through...

01:08:26.960 | Uh...

01:08:29.200 | I guess OpenRouter or something.

01:08:30.720 | But I haven't tried it.

01:08:31.600 | Um...

01:08:32.960 | Didn't have the time.

01:08:33.680 | Um...

01:08:36.080 | Okay.

01:08:37.760 | So...

01:08:41.120 | Very slowly this will start working at one point.

01:08:44.640 | We'll see.

01:08:48.080 | We'll see.

01:08:48.480 | I mean this is not a very interesting screencast in many ways.

01:08:54.560 | Because this doesn't really show agentic coding all that much.

01:08:58.000 | Because there's really not that much to see.

01:09:00.000 | I'm just adding more of the same now.

01:09:01.680 | If I still have some time, I think I have 20 minutes left.

01:09:05.920 | If I still have some time, I will try to add some tests.

01:09:08.560 | Um...

01:09:10.240 | Which I think is more interesting.

01:09:11.680 | The next question.

01:09:13.360 | What do I usually do while waiting for Claude?

01:09:15.440 | Um...

01:09:15.840 | So this is the moment when I'm going to pitch VibeTunnel.

01:09:18.160 | This is a thing we built.

01:09:20.480 | Or actually I think I was barely involved at this point in this project.

01:09:24.160 | This is a...

01:09:25.120 | I think this is primarily now Mario's and Peter's project.

01:09:28.000 | Um...

01:09:28.880 | But it's a way to...

01:09:29.840 | Basically run all of your Claude instances through the browser.

01:09:35.760 | So I can go in for a coffee and see what it's doing.

01:09:39.040 | That's the...

01:09:40.960 | That's the general idea here.

01:09:42.080 | Um...

01:09:44.000 | But the answer of like what do you do while...

01:09:46.240 | Um...

01:09:46.720 | While waiting for Claude is you go to Twitter and you write stuff, I guess.

01:09:49.920 | Um...

01:09:50.400 | See the problem with the music is...

01:09:54.080 | Let's see.

01:09:54.400 | Let me see.

01:09:56.400 | I need to...

01:09:57.520 | I need to turn on the screen capture sound.

01:09:59.360 | So now...

01:09:59.840 | Hold on.

01:10:01.040 | Can you...

01:10:02.800 | Can you now hear the terrible music?

01:10:04.080 | No, it doesn't work.

01:10:07.280 | It doesn't work.

01:10:08.880 | Can you hear it now?

01:10:11.040 | Anyways, that's the music.

01:10:21.840 | Let's see if we see our topic.

01:10:26.720 | It's still been in front.

01:10:38.480 | Yeah, so maybe...

01:10:40.800 | Maybe here's an interesting thing.

01:10:42.000 | Why am I using Go?

01:10:42.960 | Uh...

01:10:43.280 | Go is...

01:10:43.920 | Go...

01:10:44.320 | Go is a language I don't like.

01:10:45.680 | Um...

01:10:46.240 | As a...

01:10:46.560 | As a...

01:10:46.800 | As a human writing code.

01:10:48.080 | Maybe now that I'm sort of writing it more indirectly, I don't mind it quite as much.

01:10:55.040 | But I kind of want to show why Go works so well for agentic coding.

01:11:02.160 | Um...

01:11:02.480 | Look at this.

01:11:03.040 | Okay, maybe this is the bad one.

01:11:06.160 | Maybe we are looking at the handlers.

01:11:08.240 | I mean, look at this.

01:11:08.960 | This you would not write in any other programming language than Go, right?

01:11:14.320 | You wouldn't say...

01:11:16.160 | If error not nil, return internal server error.

01:11:20.640 | Like I have one, two, three...

01:11:22.720 | Three branches just to handle server errors.

01:11:26.480 | And I know that there are ways in which I could do this differently.

01:11:29.520 | And then return an error and like handle some of it on a higher level.

01:11:32.160 | But...

01:11:32.320 | My point mostly is...

01:11:35.600 | In Python, you wouldn't write it because it's ugly code.

01:11:38.560 | In Rust, you wouldn't write it because it's ugly code.

01:11:42.480 | In Go...

01:11:43.040 | A lot of Go code looks like this.

01:11:45.760 | And it's perfectly fine.

01:11:48.000 | So the bar of error handling in Go is exactly that bar.

01:11:52.240 | And the agent is writing exactly that code.

01:11:55.120 | So it is not any worse.

01:11:58.320 | And one of the consequences of this is that all the handling is local.

01:12:02.000 | So it's very easy for the agent to understand what's going on.

01:12:07.920 | Because it doesn't have to look through so many layers of abstraction, right?

01:12:11.200 | It sees basically everything that's going on in this function is going on in this function.

01:12:17.520 | And not anywhere else, right?

01:12:18.720 | It doesn't have to understand complicated error handling patterns elsewhere.

01:12:21.760 | It's pretty straightforward.

01:12:22.720 | That's why Go is so good from a code writing perspective.

01:12:25.440 | The other thing is that...

01:12:27.120 | All of the meta shenanigans that this language has is pretty standard too.

01:12:34.160 | Like there's not a lot of complexity you need to understand.

01:12:36.880 | Yes, there's some attributes on it.

01:12:38.160 | But it's good enough at comprehending this.

01:12:40.240 | And the last part is if you run the Go tests, then it caches them.

01:12:45.280 | And so you can basically...

01:12:46.400 | And I don't have the test setup yet.

01:12:47.600 | But you can...

01:12:48.160 | With Go, you can basically tell it to run all the tests at all times.

01:12:51.520 | Without it slowing down the agentic loop.

01:12:55.280 | And that is so good.

01:12:56.720 | Because it means it never accidentally tests too narrow.

01:12:59.440 | So in Python, I have it that tests one function only.

01:13:02.960 | Because it explicitly only tests that function.

01:13:05.440 | And it completely forgets that it five minutes ago broke another function.

01:13:10.000 | And only at the very end, it discovers that it made a huge mess.

01:13:13.840 | And with Go it just doesn't happen.

01:13:15.120 | And I will show this in a bit.

01:13:16.960 | But now supposedly I can look at the topic.

01:13:22.880 | But I'm actually not sure if that is correct.

01:13:27.360 | Because it doesn't seem to work.

01:13:30.800 | So...

01:13:32.640 | Well, I can click on something and nothing happens.

01:13:37.840 | What's going on?

01:13:43.120 | So, when I click on a topic nothing happens.

01:13:46.320 | I don't actually see the topic.

01:13:47.760 | It will stay on the board page. What's going on?

01:13:51.120 | But we can in the meantime look at the code that it generated.

01:13:56.400 | So, we got more API to return posts.

01:14:00.720 | Then we get...

01:14:08.400 | What's here?

01:14:10.640 | What do we have here?

01:14:12.640 | Get posts for topic with pagination.

01:14:15.520 | I'm assuming this is probably okay.

01:14:17.200 | What did it do here?

01:14:21.280 | What did it do here?

01:14:25.200 | ParseInt.

01:14:25.760 | That's...

01:14:29.920 | Why?

01:14:30.880 | Why do I have a parseInt all this time?

01:14:35.360 | That's the kind of slop that should go.

01:14:39.200 | So, this will go into our

01:14:42.400 | to-do list.

01:14:44.400 | Get rid of parseInt.

01:14:50.560 | This slop should go away.

01:14:54.640 | Okay.

01:14:57.200 | So, so this...

01:14:59.040 | Okay.

01:15:01.200 | So, what do we have here?

01:15:02.560 | We have...

01:15:02.880 | What's going on?

01:15:13.760 | VoidT.

01:15:18.320 | I don't actually like to depict T.

01:15:20.080 | Why did it make this folder?

01:15:22.560 | I can delete this folder.

01:15:23.840 | Did it find the problem?

01:15:29.280 | Well, I don't think it...

01:15:37.840 | I think it's completely wrong on what it's trying to debug here.

01:15:42.000 | But look, it's checking the route tree if it's regenerated.

01:15:50.160 | So that's positive.

01:15:52.160 | Wait, I think it's the issue now.

01:15:54.400 | What's the issue now?

01:16:10.960 | Is that blah, blah, blah, blah.

01:16:12.160 | This is all wrong.

01:16:13.680 | Actually, I think the issue here is like...

01:16:16.480 | Let me try.

01:16:16.960 | I think the issue here is that

01:16:20.000 | You need to create $board here.

01:16:25.840 | Then this has to become index.tsx.

01:16:29.520 | Always.

01:16:33.520 | I think it has to go here.

01:16:35.120 | Move.

01:16:35.440 | And then this has to be...

01:16:38.880 | e.topic.

01:16:43.760 | I think that is how this works.

01:16:46.240 | Is this how it works?

01:16:49.920 | Or did I fuck up everything now?

01:16:51.200 | What's going on?

01:16:53.840 | Compare.

01:17:00.800 | Okay.

01:17:06.880 | What's going on?

01:17:09.440 | All right.

01:17:12.640 | Okay, I broke everything.

01:17:17.840 | Classic.

01:17:19.680 | But what did I break?

01:17:28.880 | I think I broke something.

01:17:32.800 | I'm always confused.

01:17:38.240 | So not only am I confused by TanStack Router.

01:17:40.800 | It's also that the LLM is confused by TanStack Router.

01:17:43.920 | But I've been in this situation before.

01:17:46.480 | And I think it's related to that.

01:17:47.760 | It has to be in this right structure here.

01:17:53.920 | Yeah, look at this.

01:17:55.040 | Now it works.

01:17:55.680 | Okay, cool.

01:17:58.720 | So welcome to the new forum.

01:18:00.320 | We see stuff here now.

01:18:01.360 | This works.

01:18:02.960 | Nice.

01:18:04.320 | Still slop though, but slightly better slop.

01:18:09.120 | So let's commit this and then try to make a test.

01:18:14.240 | Let's finish it off by adding a test.

01:18:15.840 | Make format. Web: added topic listing.

01:18:23.440 | And maybe one last thing we could do is like actually add support for.

01:18:28.160 | But I want to write a test.

01:18:30.480 | I think I want to write this.

01:18:31.440 | Let's see what else was written there.

01:18:36.880 | How do you feed from it?

01:18:38.480 | Yeah.

01:18:41.280 | So once more, the frontend log to Claude Code is basically a frontend.

01:18:46.000 | I have a plugin that forwards this.

01:18:47.360 | Yeah.

01:18:51.360 | This is nothing like Cursor.

01:18:52.720 | Like even the Cursor agent is nothing like this.

01:18:54.960 | Like this is a completely different experience.

01:18:56.560 | Okay.

01:18:57.920 | So let's write some tests.

01:18:59.040 | This is what we're here for.

01:19:00.960 | Let's write some tests.

01:19:01.920 | So we want to write some tests.

01:19:05.440 | But the problem with tests is that agents are not very good at writing tests.

01:19:09.040 | That's really the reality of all of this.

01:19:11.440 | So we're going to write one test.

01:19:13.600 | And I think we're going to...

01:19:18.160 | Actually, before we write a test, we will write a way to create posts.

01:19:26.800 | I want you to add an internal function to create posts,

01:19:32.720 | which we will then hook up to an API later, but we don't hook it up yet.

01:19:36.480 | And we also want the function to create topics.

01:19:42.960 | So that is basically creating a post plus a topic in one go.

01:19:50.320 | And then I want you to write a singular test that creates...

01:19:54.720 | No, no, no.

01:19:57.680 | I don't want it yet.

01:19:58.800 | Okay.

01:19:59.040 | Let's do these APIs and I will make a test plan here.

01:20:03.840 | Test plan.

01:20:05.680 | Because the thing with the test plan is that...

01:20:11.680 | Here's how usually I want tests to work.

01:20:13.600 | All the database tests should use rollbacks.

01:20:22.800 | That's actually the biggest problem.

01:20:26.000 | Because the way it wrote the test right now is it wrote it against the underlying SQLite code.

01:20:33.280 | And the problem with this is that this doesn't have enough abstraction to allow you to...

01:20:37.600 | Basically, have implicit rollbacks.

01:20:41.440 | The way I really like my code to work is that you can do something like this.

01:20:44.800 | That you can do...

01:20:45.440 | The way I like it is that you can write tests that...

01:20:55.680 | Insert, insert, insert, insert, insert, but then the rollback.

01:20:57.840 | And for this to work...

01:21:01.200 | We need to change too much.

01:21:03.760 | Because we need to basically...

01:21:06.080 | If you go with post...

01:21:08.880 | Right, this here, for instance, it takes a SQLite DB.

01:21:18.880 | But when you do a transaction...

01:21:20.640 | When you basically do...

01:21:21.520 | Your tx, err equals...

01:21:24.320 | db.Begin, I think.

01:21:26.960 | Right.

01:21:27.200 | If err not nil, return nil, err.

01:21:31.200 | Right.

01:21:31.360 | It must be this.

01:21:33.600 | Right.

01:21:33.840 | This here is a different type.

01:21:36.960 | Yeah, I also want...

01:21:37.600 | I also want this different.

01:21:38.720 | Come on.

01:21:40.320 | Yeah, there we go.

01:21:41.920 | So this here...

01:21:45.280 | This is not going to be a problem now.

01:21:48.720 | Because my parameter here can be either a database...

01:21:51.440 | Or it can be a transaction.

01:21:52.400 | Right.

01:21:52.640 | So for my test setup to work...

01:21:58.480 | We basically have to refactor the entire code base.

01:22:00.480 | And there are just no amazing ways, I think, to do that.

01:22:06.960 | So this might be annoying.

01:22:10.640 | Let's see.

01:22:11.120 | So create topic.

01:22:13.760 | Right.

01:22:17.840 | So here we have this one.

01:22:18.800 | Right.

01:22:19.120 | So it creates a...

01:22:20.000 | It creates a transaction.

01:22:21.120 | And now for this to work with my intended rollback strategy, this also has to use savepoints.

01:22:26.160 | So...

01:22:27.040 | So this might be annoying.

01:22:33.200 | So this is going to be the point where maybe we're going to ask Gemini.

01:22:45.840 | Because I think...

01:22:58.960 | Actually, I think that Sonnet might not be able to do this in a good way.

01:23:03.360 | So let's see.

01:23:10.000 | How do we do this.

01:23:13.280 | So let's commit these creators.

01:23:14.960 | Added functionality.

01:23:17.200 | I'm going to add up here.

01:23:17.680 | See, I can't type them down.

01:23:20.320 | To create posts and topics.

01:23:22.800 | So now we should...

01:23:25.120 | We should come up with this test plan.

01:23:28.160 | Um...

01:23:29.600 | Let's...

01:23:32.960 | Let's see if Sonnet can do it.

01:23:34.240 | Um...

01:23:38.400 | I want to write some tests.

01:23:40.000 | But the way we're doing database transactions right now doesn't work for how I want tests to work.

01:23:45.280 | Please ultrathink how to re-architect the code to support this better.

01:23:54.000 | Um...

01:23:57.360 | Let's let it do this thing.

01:24:06.720 | I just don't want it any more complicated.

01:24:08.720 | But there's just one way to test databases and that's rollbacks.

01:24:14.160 | And...

01:24:14.480 | I think...

01:24:16.160 | I think we might need to do this refactor.

01:24:18.560 | Uh...

01:24:20.000 | Any other questions in the meantime?

01:24:21.280 | Uh...

01:24:25.600 | No.

01:24:25.920 | No other questions.

01:24:26.720 | So...

01:24:26.960 | At least I think there are no other questions.

01:24:32.800 | Uh...

01:24:32.960 | So the git...

01:24:34.720 | The git repo is on git already.

01:24:36.960 | Uh...

01:24:37.680 | On github.

01:24:38.160 | It's...

01:24:39.840 | Here.

01:24:42.480 | Mini db.

01:24:44.880 | Why did I call it mini db?

01:24:46.000 | Uh...

01:24:48.160 | It should be mini bb.

01:24:49.200 | It's mini bb.

01:24:54.320 | There you go.

01:25:00.320 | Uh...

01:25:00.640 | Yeah, yeah, yeah, yeah.

01:25:01.120 | There you go.

01:25:03.680 | Um...

01:25:05.600 | The other thing is like for this agentic coding

01:25:14.400 | with streaming,

01:25:17.520 | I don't...

01:25:19.440 | I don't quite work like I work normally because

01:25:23.040 | first of all, I don't talk all the time.

01:25:26.000 | But I also don't stay engaged with the agent as much as I do right now.

01:25:29.440 | Um...

01:25:30.560 | There's a lot of waiting involved.

01:25:33.040 | So I try to parallelize work.

01:25:34.640 | I try to do other things in the meantime.

01:25:37.040 | So it's a little bit...

01:25:37.760 | A little bit different.

01:25:39.360 | So let's see what it did.

01:25:42.320 | It...

01:25:43.600 | Decided that...

01:25:46.240 | We are going to use an interface

01:25:49.440 | called Querier.

01:25:50.400 | Huh.

01:25:56.240 | Really?

01:26:04.400 | That's what we're going to do?

01:26:17.120 | I don't think it's going to work.

01:26:18.960 | I don't think it's going to work, man.

01:26:20.160 | Ah, come on, come on, come on.

01:26:23.920 | Think hard for this problem.

01:26:27.440 | Does this actually work with nested transactions and savepoints correctly?

01:26:31.840 | Question mark.

01:26:33.360 | Because I don't think it works because it will have to

01:26:39.680 | know how deep it is.

01:26:41.760 | How do we distinguish between Claude and Gemini?

01:26:50.720 | What do we use Gemini for?

01:26:51.920 | Gemini, the model,

01:26:55.040 | is excellent at programming.

01:26:57.760 | It is also excellent at

01:27:01.760 | thinking, if you can call it this way,

01:27:04.400 | and creating architecture and back and forth for this.

01:27:07.600 | Gemini CLI, the command line tool.

01:27:09.840 | It's not amazing, and Gemini, the model, is not very good at tool usage.

01:27:15.840 | So, for the agentic loop, I still haven't found anything better than Sonnet and Opus.

01:27:22.720 | But this is also why, and I mentioned this earlier, I use o3, and sometimes I use Gemini

01:27:29.280 | to plan larger changes, and then I give the output of that to Sonnet.

01:27:33.920 | And I do the planning of larger changes just in the UI, in ChatGPT or in AI Studio, for the most part.

01:27:39.600 | Then there was a question of vector.

01:27:45.600 | I tried OpenTelemetry and a bunch of other things.

01:27:48.320 | It creates too much nonsense, too much output with all of the spans that it produces.

01:27:55.360 | And it didn't work quite as well as just a simple thing of logging everything into one file.

01:28:02.800 | I actually struggle to make this work.

01:28:06.160 | And I find it also to be quite involved.

01:28:09.440 | And also Gemini, sorry Gemini, Claude, to just not fully understand how OpenTelemetry works.

01:28:16.720 | So, right now at least, simple log files work incredibly well.

01:28:22.720 | Complicated OTel stuff.

01:28:25.360 | It doesn't work well enough for me.

01:28:28.800 | I would actually love to see someone show how to use OTel successfully for agentic workflows.

01:28:37.680 | Just, it didn't work for me is all I can say.

01:28:41.760 | So, what did it say?

01:28:47.200 | What did it say about my interjection?

01:28:54.240 | Did it say something?

01:29:05.680 | So, you're right to question.

01:29:06.640 | My current approach has fundamental flaws with nested transaction savepoints.

01:29:09.360 | What?

01:29:22.080 | How did you fix it then?

01:29:30.560 | I, the problem is like, I know how I set up this to normally work.

01:29:35.840 | And I don't know if the AI can actually one-shot this.

01:29:38.320 | Okay, so it has a nesting level now.

01:29:44.320 | It has savepoints.

01:29:45.440 | Maybe, maybe, maybe.

01:29:52.080 | Okay, so we have a board test.

01:29:53.120 | So, we're going to set up test DB.

01:29:55.920 | Okay, so it creates a SQLite memory database.

01:29:58.640 | Ah, this is slop, pure slop.

01:30:05.760 | Why do we do this?

01:30:06.800 | Why, why?

01:30:07.440 | Great.

01:30:10.160 | Should run the real migrations.

01:30:15.680 | So, you can already see at this point

01:30:20.400 | that this is it.

01:30:21.600 | I can already see

01:30:22.560 | that it's now no longer going to give good code.

01:30:26.240 | And it doesn't even have that much stuff in the context,

01:30:29.360 | but it's already, it's already making mistakes that it doesn't do on a smaller context.

01:30:33.280 | Like, it's, it's, it went too narrow on one specific problem.

01:30:38.880 | And this is the point where I no longer expect good output from this, actually.

01:30:45.040 | Does it even manage to run migrate?

01:31:04.480 | Migration for testing is run migration.

01:31:06.880 | Like, why?

01:31:08.000 | Why are we doing this?

01:31:11.680 | And, and what is begin tx here?

01:31:13.520 | Uh, now it's, now it's turning into full slop.

01:31:23.280 | Um, and this is all just for the test setup.

01:31:27.440 | So, what I will do now is I will, I will make a branch.

01:31:35.760 | Uh, testing setup.

01:31:37.360 | Because I don't like any of this.

01:31:42.000 | Um, make format.

01:31:45.280 | So, we're going to, um,

01:31:47.280 | pretty initial test setup.

01:31:51.680 | It doesn't quite work.

01:31:55.760 | So, we're, we're going to, we're going to, we're going to go back to the drawing board here.

01:32:00.000 | Um, so, um,

01:32:04.080 | I, I think this is, this is awful.

01:32:10.400 | Um, this, this might be really, really awful.

01:32:12.960 | So, let's do a diff to main.

01:32:17.600 | So, where, where did the slop start?

01:32:20.560 | Um, this is, might still be okay.

01:32:23.440 | So, the strategy now will be to unsloppify this and to get it to do something.

01:32:34.960 | And, maybe the way of doing this will be to get the tests to run.

01:32:41.760 | So, the board test, we're going to make this not terrible.

01:32:46.480 | Um, so, run migrations for testing should just be run migrations.

01:32:54.800 | Why is run migrations here in lowercase?

01:33:01.520 | Because we have init.

01:33:02.560 | Okay.

01:33:04.800 | So, we have init, which runs migrations.

01:33:07.120 | And, that is what the server is doing.

01:33:10.320 | Right.

01:33:12.320 | So, um...

01:33:20.640 | So, run migrations for testing does this instead.

01:33:23.440 | So, let's start with this.

01:33:26.480 | Um, take this and make a test utils

01:33:31.120 | package, which creates the test harness.

01:33:36.240 | Run all, add a test setup function, which takes a callback,

01:33:42.400 | which handles

01:33:45.440 | migrations database in memory

01:33:50.800 | setup and teardown.

01:33:53.120 | Then update the tests to use this.

01:33:59.680 | I also want you to reuse

01:34:02.240 | the test

01:34:04.560 | database

01:34:07.120 | between test runs, so we don't waste quite as much.

01:34:13.120 | Um, actually, I don't want to explain it, but I don't want to migrate all the time, basically.

01:34:23.440 | Um, and then we should test if this actually works, so create, let's review this.

01:34:29.360 | Create post now uses q exec.

01:34:36.480 | Um, and so where, let's see what we have to begin, dot begin.

01:34:47.920 | What do we do dot begin?

01:34:50.960 | In transaction goal, libigo, migration, word test.

01:34:56.400 | What?

01:35:04.960 | This is, this is just all nonsense.

01:35:07.120 | Tests are the worst because it doesn't understand how to create a test harness.

01:35:24.720 | Um, this entire thing should go, this transaction manager.

01:35:32.240 | So, set up test DB.

01:35:34.160 | Like, what is it doing now?

01:35:38.800 | What's it doing?

01:35:43.440 | All right, I might actually have to

01:35:46.080 | defer this to next time.

01:35:50.960 | But I would love it to at least set up the harness correctly.

01:35:56.800 | Um, so here is, here's my best recommendation here at this moment.

01:36:03.440 | Don't set up tests initially with Claude.

01:36:08.880 | Because it just doesn't understand what good tests should look like.

01:36:13.920 | And I don't know what it says about us as programmers, but the way it sets up tests is just bad.

01:36:20.720 | I can only assume that the bulk of people out there are writing horrible tests.

01:36:25.360 | Um, all, all of this is wrong.

01:36:29.040 | Like, all of this is wrong.

01:36:30.080 | What, what it should actually do is we should set up like a really good, um,

01:36:34.560 | transaction infrastructure in the beginning.

01:36:36.160 | Um, the pattern I like to use here is actually from Django.

01:36:39.760 | Django has these atomic blocks.

01:36:41.520 | They work quite well and they hide savepoints and transactions properly.

01:36:44.800 | So, I should actually do that first.

01:36:46.400 | Get this in a good spot and only then start writing tests.

01:36:49.040 | Because everything that is done here so far is really, really bad.

01:36:54.720 | Yeah, so, um, I might do the following.

01:37:10.160 | I might let this run, um, and actually set up the tests correctly in a way that I like.

01:37:15.920 | And then I will show either in a future stream or just in another, um, like a video or, or just

01:37:23.040 | like a follow-up post of how to run the tests.

01:37:25.120 | Because I don't think we're going to get to a reasonable point in the next 20 minutes.

01:37:30.000 | And I don't have that much time.

01:37:32.480 | I, I gave myself an hour and a half and I'm already over time.

01:37:35.120 | So maybe I will do two more minutes of last questions.

01:37:38.240 | Um, yeah.

01:37:44.720 | But I think I will leave it here.

01:37:46.320 | And then, uh, thank you so much for watching.

01:37:49.040 | See you next time.