AI in Action: Debugging an LLM-Powered Discord Bot

Chapters
0:00 Introduction and Session Goals
5:35 Overview of the AI in Action Bot
6:11 Improving the Contributing Guide
9:53 Debugging Bot Response and Intent Classification
15:49 Bot's Purpose: Coordinating Weekly Talks
23:48 Setting Up the Local Development Environment
26:50 Using Docker for Local MongoDB Setup
33:52 Proposing a New Debugging Feature
42:58 Implementing the "Debug" Intent
00:00:11.680 |
I could hear you giving me a thumbs up on that. 00:00:25.340 |
I got a chance to listen to a bit of the Twitter spaces that you hosted. 00:00:30.040 |
I didn't know you were doing like a full-on podcast. 00:00:32.840 |
Like you had a, like, what was it, the daily engineering thing? 00:00:37.940 |
Yeah, I don't, I mean, that's a, I'm one of a number of folks who do that. 00:00:48.460 |
Increasingly, they're sending me AI-related interviews, too, which is fun. 00:01:01.180 |
I don't do Twitter generally, but I did do a Twitter space. 00:01:05.120 |
Like, my Twitter's, like, all locked down because I left when Elon Musk took over, honestly. 00:01:10.500 |
But I keep it around occasionally, and I'm helping organize a conference, 00:01:13.820 |
and they needed somebody to emcee a discussion on a Twitter space, so I did that. 00:01:27.740 |
Yeah, Flo, I'm just going to let you drive this thing. 00:01:33.980 |
Let me move the Zoom thing, and then I should be able to do share screen this one. 00:01:52.860 |
Okay, I'm assuming you guys are seeing Cursor right now. 00:02:06.920 |
And I have the – so maybe we should start here. 00:02:09.820 |
This is the actual link, the URL to the AI in Action bot that David put together for us. 00:02:17.420 |
I think we've had a couple of commits over time, but the idea that I wanted to – I kind of – I'm bouncing a few ideas around, but I have them written down in the – my to-do list in Cursor. 00:02:29.940 |
So hopefully I'll pull up here in a second and get some feedback from you guys. 00:02:33.780 |
But what I mainly wanted to do is, like, add a couple of – what do they call them? 00:02:38.040 |
Quality of life features to the AI in Action bot. 00:02:40.520 |
And as I'm getting into trying to contribute myself, I think it would be nice to also put together a nice contributing doc for other people to contribute. 00:02:50.880 |
I think maybe there's, like, slight barriers that make it harder than it should be, and maybe that's why we haven't had anyone else, like, really contributing, even though we all kind of, like, use it from week to week. 00:03:02.280 |
So those are my two kind of, like, main things is adding a contributing guide, taking feedback from people in the community on what would make this better, or, like, what would make this easier to use or more likely to be used in our channel. 00:03:17.220 |
Because it seems like we do have a lot of engagement in the channel, but don't quite take advantage of the bot as much. 00:03:22.560 |
And then – this is a personal one, but I would like to brainstorm names so we can, like, anthropomorphize the bot. 00:03:28.820 |
I think instead of just calling it a bot, I think it would be nice to call it a – you know, like, by a name or something. 00:03:34.220 |
I don't have anything good off the top of my head, but – 00:03:44.140 |
So that's basically what I wanted this session to be about, is just those things. 00:03:50.940 |
So what I did, and so I can take notes on what would make it easier for other people to contribute, I basically just cloned the repo into a contributor folder on my desktop, 00:04:04.060 |
and added the original repo, which is, like, David Gutman slash AI in Action Bot, as, like, an upstream remote here on Git. 00:04:14.460 |
So I think once you do those two things, it should be easier. 00:04:17.780 |
As I was going around looking to see, because I'm on Windows, it's a little bit clunky to, like, use something like NVM and Node, but I was trying to see, like, what Node version we need, but it doesn't look like it's been specified. 00:04:31.680 |
So little things like that I would just like to – and then I also noticed that there was a discrepancy between the .env.example and what the README says should be in the .env. 00:04:41.240 |
So just, like, little things like that I think could be cleaned up in this hour, and then also just if we, you know, kind of crowdsource ideas for what would make this better, I think that would be cool. 00:04:52.840 |
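The .env.example versus README discrepancy mentioned above is the kind of thing a small script can catch. The following is a minimal sketch in Python (not the bot's own language), assuming the README's required variables have been copied into a list by hand; the function names and the variable names in the usage note are hypothetical and not taken from the repo.

```python
import re


def env_keys(text):
    """Return the variable names found in KEY=value lines of a dotenv-style file."""
    keys = set()
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        match = re.match(r"(?:export\s+)?([A-Za-z_][A-Za-z0-9_]*)\s*=", line)
        if match:
            keys.add(match.group(1))
    return keys


def env_drift(example_text, documented_keys):
    """Compare an example env file against the keys the docs say are required."""
    example = env_keys(example_text)
    documented = set(documented_keys)
    return {
        "missing_from_example": sorted(documented - example),
        "undocumented": sorted(example - documented),
    }
```

For instance, `env_drift(open(".env.example").read(), ["DISCORD_TOKEN", "GUILD_ID"])` would list which documented names are absent from the example file and which example names the docs never mention. Those two names are placeholders only.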
So the one thing that I have done on my fork is pull up jules.google.com. 00:04:59.800 |
I know everyone's, like, Claude Code crazy, you know, using Codex and Amp, but I thought this was a nice free way to, like, maybe ask a few questions or get something rolling at the beginning of the session, 00:05:11.580 |
and then maybe we come back in in the next 20 minutes or so and see what it was able to come up with. 00:05:17.680 |
I'm curious, though, like, what's going on in the minds of the peanut gallery right now? 00:05:23.040 |
Like, what do you guys think we should add to the – off the top, based on what I've said so far? 00:05:35.980 |
And if you guys don't have anything, I'll probably just proceed with, like, the two questions I asked, like, right off the bat to the model. 00:05:43.200 |
And, like, Gemini Pro did a terrible job; it, like, literally stopped a couple words into the response. 00:05:51.980 |
But I basically said, give me a tour of the code base. 00:05:55.920 |
And it kind of came up with this, you know, Mermaid chart, and then came up with its own – I don't want to say index, but just, like, architecture overview. 00:06:05.240 |
And then I asked about the Node version, and it said that there's no Node version specified. 00:06:14.420 |
It said that there was no Node version specified, so you pretty much are good to go with – I think I'm on 23.3 or something like that. 00:06:22.200 |
And it said I should be good to go with that. 00:06:24.200 |
So I don't know if you guys have any comments on that. 00:06:27.660 |
And if not, I would just kind of go back through the Discord and see what the complaints were. 00:06:33.820 |
I know the most recent complaint was with Slono trying to get it to bring up the schedule. 00:06:40.160 |
And I'll be honest, like, I'm not sure if, like, the bot was just completely off or he wasn't saying, like, the special keywords. 00:06:45.920 |
But that was the most recent one I could think of in terms of complaints of, like, why – 00:06:55.240 |
And see if what it's doing is hunting for keywords or if it's doing, like, an LLM-as-judge eval, because I would probably do the second: shoot it to a super cheap, quick model and be, like, of the available tools that you have, should you call one right now and why, kind of thing. 00:07:14.040 |
Yeah, like, it shouldn't be – it shouldn't be silent, right? 00:07:17.840 |
Like, maybe it would just say, like, I don't know what you're asking me. 00:07:24.220 |
And potentially adding, like, an indicator that it's typing. 00:07:27.760 |
I don't know if that's easy in Discord or not. 00:07:38.440 |
So a feature would be if the bot doesn't know how to respond, respond with its keywords or something like that. 00:07:50.580 |
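The "don't be silent" idea above can be sketched as a small fallback function. This is an illustration in Python rather than the bot's actual code, and it assumes the classifier returns either a known intent name or nothing; the intent names and descriptions in the usage note are hypothetical.

```python
from typing import Dict, Optional


def fallback_reply(intent: Optional[str], known: Dict[str, str]) -> Optional[str]:
    """Return a help message when the intent is unrecognized, else None.

    `known` maps intent names to one-line descriptions. A None result means
    the normal handler for that intent should run instead.
    """
    if intent in known:
        return None
    lines = ["I'm not sure what you're asking. Here is what I can help with:"]
    lines += [f"- {name}: {description}" for name, description in known.items()]
    return "\n".join(lines)
```

With `known = {"view_schedule": "show upcoming talks", "help": "list commands"}`, an unclassified message gets a short menu back instead of silence.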
So do you think this is a good question for someone that doesn't know the codebase? 00:07:54.720 |
How does the bot determine what to respond to? 00:07:56.500 |
And then we just kind of follow what the agent does? 00:08:01.580 |
I'm sending it anyways, but do you guys have any better initial prompts? 00:08:05.140 |
Yeah, well, so, I mean, I guess it depends on what you're – for, like, quick stuff like this, usually I end up using DeepWiki. 00:08:13.140 |
I don't know if you guys, like, clone it down and do it with Cursor, but presumably they're both the best. 00:08:27.100 |
If you just change the URL on GitHub of the repo to DeepWiki instead of GitHub, then it'll take you to, like, a thing where Devin will, like, index the repo or, quote, unquote, Devin. 00:08:39.700 |
It's not Devin, but basically, you know, you can get a wiki of the repo itself. 00:08:46.460 |
DeepWiki, and leave everything else the same? 00:08:55.720 |
And then you want to drop the commits main part, and then you should be good. 00:09:10.640 |
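The URL swap described above (replace the GitHub host, keep owner and repo, drop trailing parts like commits/main) can be written down as a small helper. This is a sketch that assumes the service lives at deepwiki.com and mirrors GitHub's owner/repo path layout; the function name and the example URL are made up for illustration.

```python
from urllib.parse import urlsplit


def deepwiki_url(github_url: str) -> str:
    """Turn a GitHub repo URL into the matching DeepWiki URL (host assumed)."""
    parts = urlsplit(github_url)
    if parts.netloc.lower() not in ("github.com", "www.github.com"):
        raise ValueError("expected a github.com URL")
    # Keep only owner and repo; drop anything after them, e.g. /commits/main.
    segments = [segment for segment in parts.path.split("/") if segment]
    if len(segments) < 2:
        raise ValueError("URL does not include an owner and a repo")
    owner, repo = segments[:2]
    return f"https://deepwiki.com/{owner}/{repo}"
```

So a link such as `https://github.com/example-owner/example-repo/commits/main` would map to `https://deepwiki.com/example-owner/example-repo`.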
And you can ask questions directly from here? 00:09:21.480 |
There's a thing at the bottom where you can just, like, just ask one-shot questions, or you can have a deep research through the repo. 00:09:33.560 |
Why don't we ask the same question to Cursor and then ask the question to this thing and see kind of what qualitative differences we get? 00:09:42.120 |
In your experience, is it better to click on this deep research thing? 00:09:49.340 |
I think for a short question like this, we probably don't need it. 00:09:52.760 |
If we wanted more, like, a report on, you know, how to modify a feature or something like that, or, you know, if we click through the articles and the answer, like, wasn't there, so we need, like, a whole new article, then I would deep research it. 00:10:08.320 |
But for this, it should be able to just figure it out. 00:10:15.100 |
Okay, so basically, it gave us a bit of code that doesn't look all that useful in Cursor, and then, using Devin and DeepWiki, lib/discord/index.js. 00:10:57.360 |
Only processes messages from the configured Discord guild. 00:11:17.920 |
So if I click that here, oh, okay, it just shows. 00:11:22.460 |
I have another idea of what we could build, which is, like, if you set some kind of debug string, it, like, outputs actually debug information about the steps it went through. 00:11:34.000 |
Right, so you can see the classification thing and stuff like that. 00:11:48.480 |
I don't know if that's appropriate to leave at the end, but a general idea. 00:11:55.500 |
So what are you guys' thoughts on this intent system message so far? 00:12:05.180 |
I'm supposed to be paying attention to the chat, too. 00:12:12.680 |
And then Evan responded and said, I believe the bot is used to coordinate speakers for this weekly meeting. 00:12:17.720 |
Yeah, so, so, essentially, yeah, it just allows us to have a schedule for what we're going to talk about every Friday. 00:12:26.800 |
And I guess I should, we probably should have started there. 00:12:31.100 |
But every week, as K-Ball says, we don't need, like, the most polished presentation, but we would like to talk about interesting things. 00:12:42.100 |
And in an effort to do that, we just create a schedule for who's going to be speaking every Friday. 00:12:47.140 |
And kind of, I guess, a general idea of, like, what they're going to be speaking about. 00:12:50.220 |
So, because this is an AI-focused Discord server, where we talk about AI stuff and we put AI into action, we also have created this bot. 00:13:01.320 |
Or really, David, after, like, much prompting from the community, I think, finally was like, okay, I'll build it. 00:13:06.420 |
And built a bot in the Discord server that we can interact with to schedule out these speakers every Friday. 00:13:13.020 |
So, that's kind of, like, the idea of the AI in action bot, is that every Friday we get together and talk about something interesting. 00:13:19.280 |
And this is how we kind of keep the schedule. 00:13:21.700 |
And actually, just for, to be really thorough, I'm going to copy and paste. 00:13:31.820 |
It should be there in, like, one of the main, most recent messages in the Discord server itself. 00:13:37.220 |
But if, I'm taking forever to find it, but if somebody could grab the link to the, oh, that's because I was here and I changed it. 00:14:04.020 |
There's a flowchart that shows, like, message intent and what to do with that right there. 00:14:10.980 |
So, if that's accurate, that should answer the question we had earlier. 00:14:19.640 |
Okay, so, we have system prompt that detects one of those five. 00:14:25.640 |
What was the issue with the intent detection before? 00:14:29.600 |
Okay, so, currently, what they are doing is not keywords. 00:14:35.800 |
So, was it, I'm curious, or, like, when it didn't identify the intent correctly. 00:14:48.240 |
So, presumably, yeah, I got to scroll back to figure out what that issue was, because it seems pretty straightforward to me of, okay, I'll classify one of these five things. 00:15:07.500 |
And then, if the user hasn't clarified which one they want, then, presumably, I will inform them of the categories that they can do. 00:15:18.920 |
So, the issue was that Slono tagged the bot and said, what's the menu tomorrow? 00:15:33.200 |
So, basically, these, I guess we're trying to get to the bottom. 00:15:36.860 |
For this particular issue, we're trying to get to the bottom of why none of these triggered this view schedule intent, I guess. 00:15:46.620 |
Like, why it wasn't able to determine intent based on these words. 00:15:55.520 |
So, I think it looks like these are not, these were not tripping the respond at all thing. 00:16:02.340 |
So, it has an initial gate to determine whether or not it should respond, if I remember. 00:16:11.800 |
So, if it's yes, then it should have responded. 00:16:14.160 |
So, it looks like the intended behavior was that it should be responding to Manuel there. 00:16:21.760 |
So, I don't know if that message create on the, if you go back to your search, I think, just want to make sure that he's not, okay. 00:16:42.420 |
So, what I'm trying to figure out here is, okay, let's see. 00:16:51.600 |
So, we want to look for the, we want to look for the message parser. 00:17:05.680 |
And then, it'll point you to the code of, how does the bot determine whether or not to respond at all? 00:17:12.220 |
So, early exit filters in events/messageCreate. 00:17:30.560 |
For hygiene purposes, I might split that off, but that's fine. 00:17:39.220 |
If message, so ignore bots, ignore messages outside the configured guild. 00:17:43.740 |
Okay, bot author filter, guild restriction filter, context-specific response logic. 00:17:51.720 |
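The filter chain read out above can be sketched as a single gate function. This is a Python illustration under assumptions, not the repo's handler: the `Message` fields are hypothetical stand-ins for what a Discord library provides, and treating the "context-specific response logic" as an @mention check is a guess. Returning a reason alongside the decision is a design choice that makes silent drops easier to debug.

```python
from dataclasses import dataclass, field
from typing import Optional, Set, Tuple


@dataclass
class Message:
    """Hypothetical, minimal view of an incoming chat message."""
    author_is_bot: bool
    guild_id: Optional[str]
    mentioned_user_ids: Set[str] = field(default_factory=set)


def should_respond(msg: Message, bot_user_id: str, configured_guild_id: str) -> Tuple[bool, str]:
    """Apply early-exit filters in order and report which one decided."""
    if msg.author_is_bot:
        return False, "author is a bot"
    if msg.guild_id != configured_guild_id:
        return False, "message is outside the configured guild"
    # Assumption: the bot only answers when it is explicitly mentioned.
    if bot_user_id not in msg.mentioned_user_ids:
        return False, "bot was not mentioned"
    return True, "passed all filters"
```

Logging the returned reason for every dropped message would show directly whether a question like "what's the menu tomorrow?" was filtered at the gate or misclassified later.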
Now that I'm thinking about it, I'm a bit curious what actually ended up working for, because someone got it to work right after that. 00:18:08.020 |
And then, and then help basically triggered the, like, all the keywords, like, for it to respond with all the keywords that it actually recognizes. 00:18:18.700 |
Oh, but I just asked it the exact same question, and this time it, like, answered. 00:18:25.240 |
So, it might have been, like, a, like, a legitimate thing. 00:18:50.600 |
So, what I'm doing right now, I guess just for showing workflow-type things, is I went back to the repo on GitHub and prepended the URL with sourcegraph.com/github, but unfortunately this repo is not indexed, because what I wanted to do was hunt through for the symbols. 00:19:18.620 |
Because I want to double-check; at least for me, when I was looking at that, I didn't get a clear answer on whether or not it's checking if it got mentioned. 00:19:35.540 |
So that's kind of what I'm trying to track down in the codebase at the moment. 00:19:38.000 |
So Evan asked a good question in the Zoom, which is: where is the bot running? 00:19:56.840 |
Do we have logs? And I just asked that to David, because I'm not sure. 00:19:59.540 |
This might actually be a Yikes question, though. 00:20:01.940 |
Are you familiar with where it's running and, like, whether or not we have logs? 00:20:20.500 |
Do you have a way to run it, Flo? 00:20:30.360 |
It was part of the... actually, before I do that. 00:20:38.340 |
So it did create the contributing.md already; I prompted that. 00:20:45.840 |
While this is running, I would just go into the side chat and just say, like, create the contributing.md or something like that. 00:20:53.660 |
It depends on how much you want to scaffold the structure. 00:20:58.720 |
What I like doing these days is just, like, oh, write a structure for the contributing.md. 00:21:03.600 |
And then I have it, like, fleshed out, and then I say, like, build it. 00:21:07.680 |
Just, GPT-5 is a little bit too terse for Sonnet. 00:21:13.560 |
You probably don't need that. 00:21:15.300 |
So Evan, you're saying there are Docker instructions. 00:21:21.160 |
Are you saying this will help me, like, do... because in the README, he had a few instructions on how to get started. 00:21:27.060 |
Basically just npm install, and then make sure you fill in all of this information. 00:21:31.500 |
One of the things I've never understood, but I'm interested in... or, I'm curious what you mean by there are Docker instructions. 00:21:37.740 |
If you're referring to the Dockerfile... but the one thing I never did understand is, like, how do we have access to, 00:21:48.500 |
like, because I thought that was going through Yikes, but I guess not, for AI in Action to have access to the Latent Space server. 00:22:00.500 |
It would just be that, wherever David has hosted the bot, there's, like, a little bot dashboard, and then you would invite the bot to your server the same way you would invite a person to your server. 00:22:18.760 |
So you would click on a link and then you'll get a little thing that says, do you want to invite the bot, et cetera, et cetera. 00:22:23.180 |
So this is something swyx would have had to, like, do to get access to the Latent Space server, right? 00:22:28.720 |
Or one of the mods would have needed to like invite it, but that's not relevant for what we're looking for. 00:22:35.400 |
So we need the place where the bot is running. 00:22:38.160 |
So if it were in a Docker container somewhere, we could look at the logs for that, but it would be like, is this on Vercel? 00:22:45.180 |
Is it on AWS, is it sitting on a laptop at David's or whatever? Or you can spin it up locally and see if you can play with it. 00:22:54.740 |
Yeah, for that, I was just going to run npm install and see how it goes, but we'll see. 00:23:02.280 |
David's using a VPS, which probably only he has access to. 00:23:06.900 |
I was just saying, if you want to run this without a lot of npm stuff set up locally on Windows, Docker is a good option to just get going faster. 00:23:20.900 |
I forgot what... yeah, I do have Docker on here. 00:23:25.220 |
Yeah, so it should be like... or it'll say what... is there a compose? 00:23:36.720 |
Or yeah, do you just do... oh, there should be Docker instructions in the README. 00:23:59.320 |
You guys will have to help me with the Docker stuff. 00:24:08.200 |
Do you have Docker Desktop installed? 00:24:18.460 |
And then what you want to check is in the options thing. 00:24:23.560 |
If it has a "use WSL backend" option thing. 00:24:34.620 |
So I hit the options, the settings here, and then Docker Engine. 00:24:58.720 |
I guess, like, give it another... or maybe try... yeah. 00:25:37.820 |
Dov I'm gonna say your dog, cause I'm not sure how to pronounce the first name. 00:25:40.260 |
Unless you want to type that out phonetically, yeah. 00:25:42.820 |
Dov says install Podman, replace Docker with Podman, as a suggestion. 00:25:50.260 |
I hate when it does this, because Windows Terminal, or, like, PowerShell in general, is super weird. 00:25:54.460 |
Like, my experience with Cursor, or really any of these VS Code forks, on Mac is so much better. 00:26:00.880 |
Because you can see what the heck is going on inside the terminal. 00:26:03.040 |
It's like little extra terminals that it pops up, but on Windows, 00:26:05.660 |
it's using this... in Windows. 00:26:09.520 |
I think you can, or I would assume you can, set VS Code, or 00:26:18.040 |
Cursor, to use the WSL backend instead of using PowerShell. 00:26:32.360 |
So it looks like it just, maybe, ran it with the full path. 00:26:41.660 |
Docker Desktop is not running or properly connected. 00:26:45.500 |
When you ran that command, did you have Docker Desktop open? 00:26:52.640 |
Docker Desktop needs to be actually open, or, like, running in the system tray or whatever, before 00:27:04.120 |
I'm assuming this is just building whatever, docker build AI in action bot, which is 00:27:11.380 |
this folder, this directory, I'm assuming, and then everything locally in this repo. 00:27:16.220 |
And yeah, so now it's running; it's taking that Dockerfile, it's running all the 00:27:19.620 |
scripts in there and it's building you a little mini machine to run that in basically. 00:27:36.140 |
I guess it would depend on what's in the Dockerfile, but. 00:27:41.660 |
In the meantime, is there a question that you had for David on the logs? 00:27:46.580 |
We could send him timestamps and see if he saw anything around that time. 00:27:53.820 |
What might be good is if... a thing that we could... or yeah, I'm curious what's 00:28:03.780 |
stored in the MongoDB here, but what might be good is to just add another intent 00:28:10.940 |
that's like debug, and then have basically the bot, wherever it is, 00:28:17.900 |
do, like... would be a good place to get system logs from inside the container. 00:28:28.260 |
But yeah, what I would think would be to just, like, print whatever's been 00:28:39.280 |
happening in standard out to the message center, would be my thought. 00:28:54.520 |
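One way to give a debug intent something to print is to keep the most recent log lines in memory inside the bot process. The following is a rough Python sketch of that idea, not the implementation built in this session; the class name is hypothetical, and the 1900-character cap is chosen to stay under Discord's 2000-character message limit.

```python
import logging
from collections import deque


class RecentLogs(logging.Handler):
    """Logging handler that keeps only the last `capacity` formatted lines."""

    def __init__(self, capacity: int = 200):
        super().__init__()
        # A bounded deque discards the oldest line once it is full.
        self.lines = deque(maxlen=capacity)

    def emit(self, record: logging.LogRecord) -> None:
        self.lines.append(self.format(record))

    def tail(self, n: int = 20, limit: int = 1900) -> str:
        """Return the last `n` lines, trimmed to at most `limit` characters."""
        if n <= 0 or limit <= 0:
            return ""
        text = "\n".join(list(self.lines)[-n:])
        return text[-limit:]
```

The handler would be attached once at startup with `logging.getLogger().addHandler(RecentLogs())`, and the debug intent would reply with `handler.tail()`. The buffer only lives as long as the process does, so a restart empties it.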
I'm not familiar enough with Mongo to know what that takes locally. 00:29:02.560 |
Oh yeah, it looks like... let's see if there's, like, a super easy one. 00:29:30.700 |
I guess the contributing.md would have helped in this case. 00:29:44.840 |
No, this is a good fodder for the contributing.md. 00:29:49.240 |
If we don't get anything else out of this hour, because it's been 30 minutes, um, if we don't 00:29:53.760 |
get anything else out of this hour, I guess I can just open a pull request for the contributing.md. 00:29:59.680 |
Or, like, take this transcript, dump it into GPT and say, give me an issue. 00:30:04.860 |
Or actually you might even be able to tell Cursor, just make an issue now. 00:30:09.340 |
I know you could if it was Claude; if it was Claude Code it would, but I'm not sure. 00:30:14.000 |
And what I would do also, Flo, is, like, do the, like, grub prompting. 00:30:19.320 |
It's like, if you take, say, line 15 or 16, just paste it into Cursor and say, like, build it. 00:30:26.000 |
Cause like at the end of the day, that's like the end goal, right? 00:30:32.580 |
It's just like, take, take like the sentence of what you want and then paste it in and like 00:30:40.080 |
I guess, I guess you could just YOLO just the things that we want, the new features. 00:30:54.400 |
We kind of determined that the bot does respond or can parse intent. 00:31:05.220 |
You just closed that tab, right? 00:31:10.180 |
I would, I would use that since that stuff is in the context window. 00:31:14.500 |
Like if you now want to add the feature, oh, if the bot doesn't know how to respond, respond 00:31:18.600 |
with the keywords, you know, that it already has like kind of that research in there. 00:31:25.020 |
And then I would just say like, yeah, please, please make this happen. 00:31:43.060 |
Wait, so are you specifically talking about, cause I just did the, uh, 00:31:53.260 |
Since now we want to have like, oh, well, if it doesn't know how to respond, it should like 00:31:56.680 |
say its keywords, so in a way it already has, like, what it should do. 00:32:02.500 |
And I guess right now this function already works. 00:32:11.500 |
Like it will just tell you like a pretty terse kind of thing. 00:32:15.300 |
Um, but in general, I think it's good also to just prompt stuff where you kind of don't 00:32:20.480 |
really know what you want it to do, like just to see what happens. 00:32:29.540 |
Like, especially with GPT-5, when you give it weird stuff like that, it will give you 00:32:41.700 |
I do like GPT-5 more, but I don't think I have the max, because you were 00:32:45.700 |
saying GPT-5 high works like the old GPT-5 before the. 00:32:56.400 |
That being said, I think I can close most of the rest of this and go back 00:33:04.680 |
So I said, help me set up a Mongo container for this project. 00:33:10.880 |
And it looks like it got stuck on one of the to-dos. 00:33:19.120 |
I run into the issue quite a bit, so I have to... yeah, it gets stuck. 00:33:26.300 |
Yeah, it's almost impossible to use Cursor on Windows. 00:33:30.960 |
Because it's always going to try to use Bash and it can't use Bash on Windows. 00:33:39.140 |
Okay, but just so I understand, it's basically going to create a Docker 00:33:44.000 |
container, set up Docker Compose, update the connection configuration for the 00:33:51.980 |
And definitely being, definitely being extra. 00:34:16.120 |
Let me clean up the failed container and set up MongoDB properly. 00:34:21.440 |
The thing is, the Mac machine I have is just not really great for streaming. 00:34:30.360 |
So I kind of, like, default to my Windows machine for that type of stuff, but it does suck. 00:34:34.860 |
Yeah, it gets stuck here on this type of stuff a lot. 00:34:45.040 |
I wonder if there's, like, an easy setting to just, like, point Cursor at WSL instead. 00:35:17.040 |
Um, I feel like this is not something that would be in the training data or maybe it is. 00:35:27.840 |
That's the kind of stuff I go to chatgpt.com for. 00:35:38.400 |
Speaking of, being that we're 37 minutes in, the one thing I did want to get done as well is, like, okay. 00:36:24.760 |
The default terminal profile, and edit in settings.json, and just paste in that thing that it gave. 00:36:33.580 |
I don't know why it's, like, a mental block for me. 00:36:38.920 |
When you're, like, unnecessarily suspicious about stuff like that. 00:36:50.260 |
So I wonder if you, I wonder if you open a new terminal now, if you'll be good. 00:36:53.560 |
Um, it seems to have finished with the Docker thing. 00:37:20.800 |
So it just ran the container without exposing the port. 00:37:24.340 |
Wait, isn't that going to be problematic later when we need access to the. 00:37:29.700 |
Um, I think you're fine because the only, the only thing that's going to need to talk to 00:37:36.160 |
So if it's on the internal Docker network, it's fine. 00:37:39.420 |
It looks like it wrote you a compose file to handle it. 00:37:49.200 |
So the point of setting, like if I understand this correctly, cause I kind of lost my way 00:37:53.720 |
here, the point of doing this, like creating the MongoDB so we can test the bot locally to 00:38:01.340 |
Yeah, and I would think... so logs should just be coming 00:38:06.980 |
from the standard out of the running process. 00:38:09.480 |
But I'm kind of inclined to go with the manual method of just, like, oh yeah. 00:38:31.200 |
It was here and it looked like it crossed off the to-do by adding something to the index. 00:38:41.080 |
So, yeah, I'd see if it can add the debug thing and then YOLO it and see 00:39:00.560 |
I'm not sure what you're asking for here; the keywords I understand. 00:39:03.060 |
I guess better than, like... it just, like, added emojis basically, kind of, 00:39:14.120 |
And then going back to the to-do and then saying yes, this part, that's the debug string 00:39:31.120 |
I would just say if the user asks for debug logs, give them debug logs. 00:39:42.120 |
I kind of want to switch to GPT-5, but we'll keep it going. 00:39:49.120 |
No question mark. I don't know, I guess Sonnet has been doing okay. 00:40:01.140 |
It's doing a decent job. What you could do is, 00:40:06.480 |
like, have it do it in Sonnet and then redo it in GPT-5, because that way you'll, like, 00:40:14.820 |
The real question is, is it using WSL for new terminals? It is. Okay, all right, cool. 00:40:40.160 |
Okay, it's just added on to the original system message, 00:40:48.100 |
and now it's... okay, this is all new, which is debug. 00:40:59.100 |
I am a bit curious, like... yikes, we're at 43 minutes in. Let me just pull up this 00:41:14.480 |
stuff and do it in the background while it's talking. But I'm curious if there's, like, a... 00:41:18.220 |
like, as opposed to David paying for this, whatever VPS he's paying for, if we could, like, 00:41:26.160 |
a Vultr VPS, if we could set this up on something that, like, we put... and this may be for 00:41:33.060 |
after the call, like, we can maybe chat after the call so we're not taking up too much time, but I'm 00:41:37.160 |
curious if we could, like, put crypto in a wallet and have it pull from that wallet with a hosted VPS, 00:41:45.100 |
so everyone can have access to it and it's not just, you know what I mean, like, have, like, logs 00:41:50.340 |
that are public, and it's not necessarily just sitting on David's VPS and he's paying for it 00:41:54.040 |
every month, but maybe, like, a crowdsourced thing where we could host the bot. I don't know if 00:42:01.060 |
that makes sense, but I think it would be interesting to have, like, everybody have access to it as 00:42:06.240 |
I don't know. It looks like it did okay on the code, 00:42:10.780 |
debug intent. So I guess my main thing is, as we're, like, coming up on the last 15 minutes, how could I run 00:42:22.360 |
this locally to see if the debug thing works? But I guess I'd have to, like, put into my .env 00:42:29.100 |
file, I have to put all this information in. That's where every AI in Action vibe code fails. 00:42:34.040 |
It's, like, running the Discord bot. Yeah, for sure. 00:42:59.140 |
Users can @mention the bot with debug keywords like debug, show me debug info, debugging. 00:43:05.940 |
Well, what you could try for the last couple of minutes is, like, redoing that with GPT-5, 00:43:13.860 |
so we can see the difference in style. Oh, right, so, like, just rewind and then 00:43:21.140 |
reuse the prompt and just set the model to GPT-5? Yeah. Okay, good point. So index... 00:43:27.620 |
Or you could keep that, like, as a git commit or something, or we have it on stream anyway, right, 00:43:34.820 |
to just see the difference in style between the two models, because they couldn't be any more different, 00:43:39.460 |
I think. Yeah. In a case like this, what do you... because, like, it's a fork, right? So, like, I can 00:43:47.860 |
commit. Do you recommend committing, and then we just redo and see how it turns out that way? 00:43:53.700 |
That's what I would do, I think, but I don't know if you'll be able to commit to 00:43:59.140 |
David's repo, though; you might have to commit to your own. Yeah, I had these commands set off 00:44:05.060 |
to the side. It would be git checkout -b... I don't know, what did we just add here? DB, it was a bunch 00:44:16.420 |
of stuff, but, like, debug feature? I don't know, debug message. It's better. 00:44:26.500 |
And then, yes, now I can commit. I think you need to Keep All first, don't you? Say that one more time? I 00:44:38.500 |
think you need to Keep All on those first, don't you, so it actually changes the files? Yeah, the way Cursor works 00:44:44.900 |
is, like, even if you don't hit Keep All, what's in green is actually set, and then what's 00:44:49.620 |
in red comes back if you hit Undo. Interesting. Okay, cool. Yeah, which is so weird. So yeah, but I do believe 00:44:56.820 |
that's how it works. I'll Keep All anyways, though, because that's a good point. And then I guess we just 00:45:04.020 |
kind of YOLO all of it into a commit. I mean, yeah, you know, commits are fine as long as it's not a PR to 00:45:13.540 |
No, we don't want to publish the branch. We just want to go back here. 00:45:23.060 |
uh no lower down did i i missed it help me no it wasn't this one it was this one change it to gpt5 00:45:32.340 |
and then send it again yeah yeah and oh yeah that's good continue river yeah okay and then uh evan 00:45:43.140 |
also said if you docker compose up and then set the em okay yeah so i guess the problem is we don't have the 00:45:48.260 |
environment variables to not only the um latent space discord but um david set up his own personal 00:45:56.420 |
discord where we can do the testing for this thing but i don't have those credentials either off the top 00:46:00.260 |
of my head anyways yeah we'd need a bot token and stuff yeah yeah for sure i have a discord server 00:46:06.580 |
set up so it wouldn't be too difficult but we've got like 10 minutes left so maybe not the best use of 00:46:11.140 |
time um evan also says in the okay cable says it seems very confident it's the story of 2025 llm use 00:46:18.180 |
yeah no no victor says ai is good at writing compose files evan says one consideration for debug on a new 00:46:25.940 |
bot turn the bot can only show what is persisted somewhere memory such mongo um so unless logs are 00:46:32.340 |
persisted in that way debug won't be able to show logs so can you call docker logs from your own from 00:46:38.180 |
inside your own docker container uh is that a real command uh docker logs is a real command yeah um 00:46:47.700 |
so you would need to you need to put the container i know you can do it from the host but i don't know 00:46:54.260 |
if you can do it from inside the container oh okay i see what you're saying i see what you're saying yeah 00:47:00.420 |
because ideally what when the or like in my what i would what i would probably do is uh uh yeah is is 00:47:10.740 |
just run when somebody asks for a debug thing is run docker logs on you know docker logs for 00:47:19.540 |
all the containers and just dump them that would be what i did what i would do 00:47:27.620 |
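Evan's point above, that a debug turn can only show what the bot has kept somewhere, suggests an alternative to calling docker logs: have the bot keep its own recent log lines and dump those. The following is a sketch under stated assumptions, not the bot's actual code. `LogBuffer` and `formatDebugReply` are hypothetical names, and the 1900-character default is an assumed margin under Discord's 2000-character message limit.

```javascript
// Hypothetical in-memory ring buffer for recent log lines.
// Contents are lost when the process restarts; persisting them
// (for example to MongoDB) would be a separate step.
class LogBuffer {
  constructor(capacity = 200) {
    this.capacity = capacity;
    this.lines = [];
  }

  // Record one line with an ISO timestamp, dropping the oldest line when full.
  push(message, now = new Date()) {
    this.lines.push(`${now.toISOString()} ${message}`);
    if (this.lines.length > this.capacity) this.lines.shift();
  }

  // Return a copy of the most recent `count` lines, oldest first.
  tail(count = 20) {
    return this.lines.slice(-count);
  }
}

// Build the text for a debug reply, dropping the oldest lines until the
// result fits within `maxChars`.
function formatDebugReply(buffer, count = 20, maxChars = 1900) {
  const lines = buffer.tail(count);
  while (lines.length > 1 && lines.join("\n").length > maxChars) lines.shift();
  return lines.join("\n").slice(-maxChars);
}
```

With this approach the bot would call `push` wherever it currently logs, and the debug intent would reply with `formatDebugReply(buffer)`, so nothing inside the container needs access to the Docker daemon.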
when i type so this is a learning lesson for me when when using docker when i typed in docker compose 00:47:33.140 |
up it already it automatically called not only the ai in action bot but the mongo ai bot which i'm 00:47:38.820 |
assuming is just a mongo db what how does it know to do that is that all in docker compose uh yeah the 00:47:45.060 |
the bot added to docker compose it added another container you know yeah that's crazy 00:47:55.700 |
yeah okay yeah that's so interesting to me yeah i i need to explore docker i really do but i i'm like 00:48:02.100 |
uh mostly a i'm just gonna run it locally if it messes up my machine it does yolo 00:48:06.260 |
yes docker is very very helpful very powerful would recommend for sure okay so wait let's see the changes 00:48:20.340 |
index.js i'm fairly this similar oh it made changes elsewhere it looks like as well next steps 00:48:31.860 |
so gpt5 did more than four did which is like in that unusual 00:48:46.660 |
what do we think do we do we keep or cancel it's very similar right yeah very similar but 00:48:55.140 |
i'm i'm curious like let's see like so let me let me close all these so i can make this a little bigger 00:49:03.060 |
uh user id um is thread um it has less emojis i think uh i i wonder how much this is influenced by 00:49:14.980 |
all the previous research that we had like you could do yet another experiment right like just start with 00:49:20.980 |
a completely empty completely empty context and say like hey can you just like show me debug stuff when i 00:49:28.020 |
ask for debug and then kind of see what it finds there's also like interesting experiments 00:49:34.900 |
so it looks like it's mostly time stamp stuff and how it handled 00:49:41.940 |
the next steps what is what is telling people to do next after 00:49:49.220 |
oh this was it seems to have been using the sonnet stuff on and then adding on top of it right 00:49:56.740 |
no but i oh well no because i thought i did like yeah i thought you reverted i'm not sure if that's the 00:50:13.460 |
right because it already had like on the left it already has this debug intent which was the sonnet 00:50:17.860 |
stuff i think so maybe because you committed it like decided to not revert 00:50:22.980 |
it's weird i thought cursor did like an auto yeah it like literally just takes you back to whatever the 00:50:32.260 |
status was here but i'm guessing it's the commit i'm if i was cursor what i would probably do to try and like 00:50:38.900 |
save um not have to save as much history is to every time i commit kind of wipe all of the recent 00:50:47.380 |
uh rollbacks that's what i yeah that makes sense if i was building cursor that's how i'd build it 00:50:52.420 |
okay last uh five minutes or so what do you guys like any thoughts closing thoughts that you want 00:50:59.060 |
to close up on i appreciate you guys for having patience with me on the docker side because i know 00:51:02.820 |
you don't have max do you i was gonna say tell it to rewrite it in rust 00:51:10.740 |
yeah the the i i had a window open specifically okay here we go um maybe we can do a bit of 00:51:20.420 |
uh so i'm building out a bot in the ai in action discord for the latent space podcast 00:51:30.820 |
and i would like to anthropomorphize the bot instead of calling it the ai in action 00:51:37.140 |
bot help me brainstorm names make sure the list is long 00:51:46.420 |
and then if i could pull up david's thing i forgot where i put it oh it's here here 00:51:59.860 |
what oh i see david's using uh devin on this yeah so there was if um one of the first things 00:52:08.900 |
i did was look through the commits and there's like devin claude and uh copilot uh via zach 00:52:13.860 |
so i thought i thought that was pretty cool good 00:52:17.140 |
is there any thinking mode 2.5 pro pro has been so terrible the last couple days i think it's the nano 00:52:26.180 |
banana stuff um is that the new retention policy you just accepted yeah yeah keep all my stuff 00:52:39.220 |
i think so you might want to revisit that yeah well yeah they updated it so now they keep it for five 00:52:46.100 |
five years instead of like a couple months yeah listen take take my whole brain i don't i'll be honest i 00:52:53.460 |
don't care um so are we going with clinker or are we i really don't like alex that's terrible codex is 00:53:03.300 |
taking bites is someone interesting i kind of wanted something it's all cringe yeah gptina all right sir 00:53:17.940 |
it was google studio any better someone decibel damn dude decibel uh latent vector manny ember nah 00:53:29.620 |
lex is nice but we'd have to change the profile to a waifu the picture at least 00:53:37.380 |
so i can feel like i'm talking to a woman for once in my life um the concierge 00:53:50.500 |
all right we got like maybe two minutes left any any any 00:53:58.340 |
i like artificial yeah um so i guess we have to consult with david being that this is like 00:54:05.620 |
technically his project but i think i and the only reason why i want to do that is because like 00:54:09.780 |
ai in action is already uh or just already is yeah i actually like that a lot i kind of i do kind 00:54:14.980 |
of like janus and i also kind of like alfred but i'm not i don't know i'm not married to him 00:54:19.940 |
janus stuck out to me too janus stuck out to me too um mostly just because i spend too much time 00:54:25.220 |
on twitter yeah um yes i guess we'll just consult david with this list uh clinker janus already 00:54:34.500 |
anything else that stands out to you guys and like maybe put up a poll or something in discord and see 00:54:40.820 |
where we end up um i think i'm actually going to open a pr with these changes and see if he accepts it 00:54:51.540 |
and uh maybe it makes sense to do contributing as a separate pr than like some of the other stuff 00:54:57.060 |
because i think it's wild that he was doing this in raw js instead of typescript with uh with all 00:55:04.020 |
the ai in here like i'm surprised it hasn't broken everything 00:55:07.140 |
but um i don't know uh cable i think that's pretty much for the most part that's a wrap thank you for 00:55:17.700 |
running uh this was fun appreciate you stepping up uh you know we can always do something like this but 00:55:25.140 |
i do want to encourage everyone here bring a topic bring something you want to do maybe you drive uh next 00:55:32.180 |
time you want to keep hacking on this or bring a project you're hacking on to do it 00:55:36.980 |
uh victor i don't appear to have uh the ability to mute him so let's see yeah same same uh but anyway uh 00:55:54.820 |
bring your bring your projects bring your talks bring your learnings um this is fun uh but yeah i think 00:56:03.860 |
a nice mix is good so uh maybe next week somebody can bring a topic to to talk about we can you can 00:56:09.060 |
sign up with the bot um that we just worked on so we'll see you all in a week peace thanks for coming cheers