AI in Action: Debugging an LLM-Powered Discord Bot

Okay, I think it's working. Yeah. Uh-oh. I think you're muted. Yeah, all right. Sorry. I had another screen in front of my Zoom. I could hear you giving me a thumbs up on that. Yeah, sir. Okay, let me take this one. Whew. Busy week. I got a chance to listen to a bit of the Twitter spaces that you hosted.

I didn't know you were doing like a full-on podcast. Like you had a, like, what was it, the daily engineering thing? Oh, Software Engineering Daily? Yeah, I don't, I mean, that's a, I'm one of a number of folks who do that. But, yeah, I interview people for it. That's pretty cool.

Increasingly, they're sending me AI-related interviews, too, which is fun. Nice. Nice. Nice. Wait, what's your Twitter? I've been missing these. I don't do Twitter generally, but I did do a Twitter space. Like, my Twitter's, like, all locked down because I left when Elon Musk took over, honestly. But I keep it around occasionally, and I'm helping organize a conference, and they needed somebody to emcee a discussion on a Twitter space, so I did that.

Nice. So. Okay, I think I can share. Yeah, Flo, I'm just going to let you drive this thing. So whenever you're ready, go for it. I'll step back. Let me move the Zoom thing, and then I should be able to do share screen this one. I hope. Oh, no, that's the wrong screen, actually.

It's this one. Okay, I'm assuming you guys are seeing Cursor right now. Yep. Cool. Cool. Okay, yes. And I have the – so maybe we should start here. This is the actual link, the URL to the AI in Action bot that David put together for us. I think we've had a couple of commits over time, but the idea that I wanted to – I kind of – I'm bouncing a few ideas around, but I have them written down in the – my to-do list in Cursor.

So hopefully I'll pull up here in a second and get some feedback from you guys. But what I mainly wanted to do is, like, add a couple of – what do they call them? Quality of life features to the AI in Action bot. And as I'm getting into trying to contribute myself, I think it would be nice to also put together a nice contributing doc for other people to contribute.

I think maybe there's, like, slight barriers that make it harder than it should be, and maybe that's why we haven't had anyone else, like, really contributing, even though we all kind of, like, use it from week to week. So those are my two kind of, like, main things is adding a contributing guide, taking feedback from people in the community on what would make this better, or, like, what would make this easier to use or more likely to be used in our channel.

Because it seems like we do have a lot of engagement in the channel, but don't quite take advantage of the bot as much. And then – this is a personal one, but I would like to brainstorm names so we can, like, anthropomorphize the bot. I think instead of just calling it a bot, I think it would be nice to call it a – you know, like, by a name or something.

I don't have anything good off the top of my head, but – I like Clanker. Clanker. That's his words there. Yeah, yeah. So that's basically what I wanted this session to be about, is just those things. So what I did, and so I can take notes on what would make it easier for other people to contribute, I basically just cloned the repo into a contributor folder on my desktop, and added the original repo, which is, like, David Gutman slash AI in Action Bot, as, like, an upstream remote here on Git.

So I think once you do those two things, it should be easier. As I was going around looking to see, because I'm on Windows, it's a little bit clunky to, like, use something like NVM and Node, but I was trying to see, like, what Node version we need, but it doesn't look like it's been specified.

So little things like that I would just like to – and then I also recognized that there was a discrepancy between the .env.example and what the README says should be in the .env. So just, like, little things like that I think could be cleaned up in this hour, and then also just if we, you know, kind of crowdsource ideas for what would make this better, I think that would be cool.
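The .env.example versus README mismatch mentioned here is the kind of thing a small script can surface. What follows is a minimal sketch, not something from the repo: it assumes both files list variables as KEY=value lines, and the variable names in the usage note (DISCORD_TOKEN, MONGODB_URI, GUILD_ID) are hypothetical placeholders rather than the bot's real settings.

```python
import re

# Matches a plain environment variable name such as DISCORD_TOKEN.
_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def env_keys(text):
    """Collect variable names from KEY=value lines, skipping blanks and comments."""
    keys = set()
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, sep, _value = line.partition("=")
        name = name.strip()
        if sep and _NAME.fullmatch(name):
            keys.add(name)
    return keys


def diff_env_keys(example_text, readme_text):
    """Report names that appear in only one of the two documents."""
    example = env_keys(example_text)
    readme = env_keys(readme_text)
    return {
        "only_in_example": sorted(example - readme),
        "only_in_readme": sorted(readme - example),
    }
```

Called with the contents of the two files, it returns the names that appear in only one of them, for example MONGODB_URI listed only in the example file and GUILD_ID listed only in the README. Prose lines in the README are skipped because the text before the equals sign has to be a bare variable name.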

So the one thing that I have done on my fork is pull up jules.google.com. I know everyone's, like, Claude Code crazy, you know, using Codex and Amp, but I thought this was a nice free way to, like, maybe ask a few questions or get something rolling at the beginning of the session, and then maybe we come back in the next 20 minutes or so and see what it was able to come up with.

I'm curious, though, like, what's going on in the minds of the peanut gallery right now? Like, what do you guys think we should add to the – off the top, based on what I've said so far? And if you guys don't have anything, I'll probably just proceed with, like, the two questions I asked, like, right off the bat to the model.

And, like, Gemini Pro did a terrible job, like, it literally stopped a couple of words into the response. So I just switched to Sonnet for this. But I basically said, give me a tour of the code base. And it kind of came up with this, you know, Mermaid chart, and then came up with its own – I don't want to say index, but just, like, architecture overview.

And then I asked about the Node version, and it said that there's no Node version specified. I'll make this slightly larger. It said that there was no Node version specified, so you pretty much are good to go with – I think I'm on 23.3 or something like that. And it said I should be good to go with that.

So I don't know if you guys have any comments on that. And if not, I would just kind of go back through the Discord and see what the complaints were. I know the most recent complaint was with Slono trying to get it to bring up the schedule. And I'll be honest, like, I'm not sure if, like, the bot was just completely off or he wasn't saying, like, the special keywords.

But that was the most recent one I could think of in terms of complaints of, like, why – I think that might be a good place to start. Yeah. And see if it's – if what it's doing is hunting for keywords or if it's doing, like, an LLM-as-judge eval, because I would probably do the second: shoot it to a super cheap, quick model and be, like, of the available tools that you have, should you call one right now, and why, kind of thing?
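The "ask a cheap model whether to call a tool, and why" idea can be sketched roughly as follows. This is an illustration under assumptions, not the bot's actual code: the tool names are hypothetical, and `ask_model` stands in for whatever function sends a prompt to a small, fast model and returns its text reply.

```python
import json

# Hypothetical tool names and descriptions; the real bot's intents may differ.
TOOLS = {
    "view_schedule": "show upcoming talks",
    "sign_up": "book a speaking slot",
    "help": "list what the bot can do",
}


def build_prompt(message, tools):
    """Ask the model to pick at most one tool and explain its choice."""
    tool_lines = "\n".join(f"- {name}: {desc}" for name, desc in tools.items())
    return (
        "You route Discord messages to tools.\n"
        f"Available tools:\n{tool_lines}\n"
        'Reply with JSON only: {"tool": "<tool name or none>", "reason": "<short reason>"}\n'
        f"Message: {message}"
    )


def route(message, ask_model, tools=TOOLS):
    """Return {'tool': ..., 'reason': ...}; anything unexpected becomes 'none'."""
    raw = ask_model(build_prompt(message, tools))
    try:
        decision = json.loads(raw)
    except json.JSONDecodeError:
        return {"tool": "none", "reason": "model reply was not valid JSON"}
    if not isinstance(decision, dict) or decision.get("tool") not in tools:
        return {"tool": "none", "reason": "model did not pick a known tool"}
    return {"tool": decision["tool"], "reason": str(decision.get("reason", ""))}
```

Treating every malformed or unknown answer as "none" keeps the router from calling a tool the model invented, and the returned reason is the kind of detail a debug view could surface later.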

Yeah, like, it shouldn't be – it shouldn't be silent, right? Like, maybe it would just say, like, I don't know what you're asking me. Like, here's my standard keywords. Yeah. And potentially adding, like, an indicator that it's typing. I don't know if that's easy in Discord or not. I guess that depends on how the bot works.
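The "it shouldn't be silent" suggestion amounts to a fallback branch: when no handler matches the detected intent, reply with the list of things the bot understands. A minimal sketch, assuming hypothetical intent names and a plain dictionary of handler functions (neither is taken from the repo):

```python
# Hypothetical intent names and descriptions; the real bot's list may differ.
INTENT_HELP = {
    "view_schedule": "ask what talks are coming up",
    "sign_up": "book a slot to speak",
    "help": "list what the bot understands",
}


def fallback_reply(intents=INTENT_HELP):
    """Build the message sent when the bot cannot tell what the user wants."""
    lines = [f"- {name}: {desc}" for name, desc in intents.items()]
    return (
        "I don't know what you're asking me. Here is what I understand:\n"
        + "\n".join(lines)
    )


def respond(intent, handlers, intents=INTENT_HELP):
    """Run the matching handler, or fall back to the help text instead of silence."""
    handler = handlers.get(intent)
    if handler is None:
        return fallback_reply(intents)
    return handler()
```

With this shape, an unrecognized or missing intent always produces a reply, so a user who asks "what's the menu tomorrow?" at least sees which phrasings the bot can act on.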

So a feature would be if the bot doesn't know how to respond, respond with its keywords or something like that. So do you think this is a good question for someone that doesn't know the code basis? How does the bot determine what to respond to? And then we just kind of follow what the agent does?

Or what do you guys think about that much? I'm sending it anyways, but do you guys have any better initial prompts? Yeah, well, so, I mean, I guess it depends on what you're – for, like, quick stuff like this, usually I end up using DeepWiki. I don't know if you guys, like, clone it down and do it with Cursor, but presumably they're both the best.

But I don't know. Figured I'd – Could you say the name of the tool again? I'm not familiar. It's called DeepWiki. If you just change the URL on GitHub of the repo to DeepWiki instead of GitHub, then it'll take you to, like, a thing where Devin will, like, index the repo or, quote, unquote, Devin.

It's not Devin, but basically, you know, you can get a wiki of the repo itself. Deep.wiki and leave everything else the same? Nope. DeepWiki.com. And then you want to drop the commits main part, and then you should be good. You have a typo in comm. Yeah. Oh, whoops. And you can ask questions directly from here?

Yep. Look at that. There's a thing at the bottom where you can just, like, just ask one-shot questions, or you can have a deep research through the repo. So, actually, maybe a good exercise. Why don't we ask the same question to Cursor and then ask the question to this thing and see kind of what qualitative differences we get?

In your experience, is it better to click on this deep research thing? It depends on what you need. I think for a short question like this, we probably don't need it. If we wanted more, like, a report on, you know, how to modify a feature or something like that, or, like, you know, if we click through the articles and the answer, like, wasn't there, so we need, like, a whole new article, then I would deep research it.

But for this, it should be able to just figure it out. Cool. Yeah, this is awesome. Okay, so basically, it gave us a bit of code that doesn't look all that useful in Cursor, and then, using Devin and DeepWiki: lib/discord/index.js. Bot filter. Oh, from other bots, okay.

Only processes messages from the configured Discord guild. Thread versus mention. LLM-based intent detection. So it uses an LLM to classify user intent. And it looks like that's in index.js, lines 564 to 573. So if I click that here, oh, okay, it just shows. Yeah, this is super interesting. I have another idea of what we could build, which is, like, if you set some kind of debug string, it, like, actually outputs debug information about the steps it went through.

Oh, that's. Right, so you can see the classification thing and stuff like that. Wow. Detail, what steps it. I guess for a response. I don't know if that's appropriate to leave at the end, but a general idea. Okay, that's interesting. So what are you guys' thoughts on this intent system message so far?

Oh, sorry. I'm supposed to be paying attention to the chat, too. My bad, my bad, my bad. So, Colleen says, I'm fairly new. The bot is new to me. Not sure what it does. And then Evan responded and said, I believe the bot is used to coordinate speakers for this weekly meeting.

Yeah. Yeah, so, so, essentially, yeah, it just allows us to have a schedule for what we're going to talk about every Friday. And then you can book in advance. And I guess I should, we probably should have started there. But every week, as K-Ball says, we don't need, like, the most polished presentation, but we would like to talk about interesting things.

And in an effort to do that, we just create a schedule for who's going to be speaking every Friday. And kind of, I guess, a general idea of, like, what they're going to be speaking about. So, because this is an AI-focused Discord server, where we talk about AI stuff and we put AI into action, we also have created this bot.

Or really, David, after, like, much prompting from the community, I think, finally was like, okay, I'll build it. And built a bot in the Discord server that we can interact with to schedule out these speakers every Friday. So, that's kind of, like, the idea of the AI in action bot, is that every Friday we get together and talk about something interesting.

And this is how we kind of keep the schedule. And actually, just for, to be really thorough, I'm going to copy and paste. It should be there in, like, one of the main, most recent messages in the Discord server itself. But if, I'm taking forever to find it, but if somebody could grab the link to the, oh, that's because I was here and I changed it.

Okay. Yeah, it was here. And if we go here, here, and then I can. On the DeepWiki main page, there's a, yes. Okay, got it. Just roll down a little bit. There's a flowchart that shows, like, message intent and what to do with that right there. That workflow, yeah.

So, if that's accurate, that should answer the question we had earlier. Okay, so, we have system prompt that detects one of those five. What was the issue with the intent detection before? Okay, so, currently, what they are doing is not keywords. It is LLM-based eval. So, was it, I'm curious, or, like, when it didn't identify the intent correctly.

Right. So, presumably, yeah, I got to scroll back to figure out what that issue was, because it seems pretty straightforward to me of, okay, I'll classify one of these five things. And then, if the user hasn't clarified which one they want, then, presumably, I will inform them of the categories that they can do.

So, the issue was that Slono tagged the bot and said, what's the menu tomorrow? What's the schedule? What's the next talk? Okay. This was cracking me up. Sorry, I shouldn't be laughing. Okay. And then, he was like, am I stupid? So, basically, these, I guess we're trying to get to the bottom.

For this particular issue, we're trying to get to the bottom of why none of these triggered this view schedule intent, I guess. I don't know. Like, why it wasn't able to determine intent based on these words. Does that sound about right? Yeah, that sounds good. So, I think it looks like these are not, these were not tripping the respond-at-all thing.

So, it has an initial gate to determine whether or not it should respond, if I remember. Bot mentioned at starts, no, yes. So, if it's yes, then it should have responded. So, it looks like the intended behavior was that it should be responding to Manuel there. Yeah. So, I don't know if that message create on the, if you go back to your search, I think, just want to make sure that he's not, okay.

If you scroll up a little bit. Scroll up in the code. Yeah, in the code. Block. So, what I'm trying to figure out here is, okay, let's see. If, let's see. So, we want to look for the, we want to look for the message parser. So, we want to go up farther, I think.

Or, you could just ask Devin. It should be able to figure it out. And then, it'll point you to the code of, how does the bot determine whether or not to respond at all? So, early exit filters in the events message create. For hygiene purposes, I might split that off, but that's fine.

If message, so ignore bots, ignore messages outside the configured guild. Okay, bot author filter, guild restriction filter, context-specific response logic. Now that I'm thinking about it, I'm a bit curious what actually ended up working for, because someone got it to work right after that. Oh, help. Okay, I'm sorry.
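The early-exit filters read out here (ignore bots, ignore other guilds, then decide based on mention or thread context) can be sketched as a single gate function. This is an assumption-laden illustration in Python rather than the bot's Node code: the field names are hypothetical, and the exact thread rule is a guess at what "thread versus mention" means.

```python
from dataclasses import dataclass


@dataclass
class IncomingMessage:
    """Hypothetical, simplified view of a Discord message event."""
    author_is_bot: bool
    guild_id: str
    mentions_bot: bool
    in_bot_thread: bool


def should_respond(msg, configured_guild_id):
    """Return (decision, reason) so the reason can be logged or shown in debug output."""
    if msg.author_is_bot:
        return False, "author is a bot"
    if msg.guild_id != configured_guild_id:
        return False, "message is outside the configured guild"
    if msg.mentions_bot:
        return True, "bot was mentioned"
    if msg.in_bot_thread:
        # Assumption: messages in a thread the bot is already part of need no mention.
        return True, "message is in a bot thread"
    return False, "not mentioned and not in a bot thread"
```

Returning the reason alongside the decision would have made the silent "what's the schedule?" case easier to diagnose, since the log would say which filter dropped the message or confirm that it passed the gate.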

Interesting. And then, and then help basically triggered the, like, all the keywords, like, for it to respond with all the keywords that it actually recognizes. Oh, but I just asked it the exact same question at this time, it, like, answered. So, it might have been, like, a, like, a legitimate thing.

I thought it was stuck as well. I was like, ah, I think maybe, um, Yeah. Okay. So, I guess this is a. So, what I'm doing right now, I guess just for, I don't know, uh, showing workflow type things, um, is I went to, uh, I went back to the repo on GitHub, and I prepended the URL with sourcegraph.com/github, uh, but unfortunately this, uh, repo is not indexed, because what I wanted to do was hunt through for, um, the symbols.

Because I want to, I want to double check and see, at least for me, when I was looking at that, I didn't get a, um, a clear, uh, uh, uh, thing whether or not that it's searching for whether or not it got mentioned or not. So that's kind of what I'm trying to track down on the code base at the moment.

Uh, so Evan asked a good question in the Zoom, which is where is the bot running? Do we have logs, and I just asked that to David, because I'm not sure. I, this might actually be a, a Yikes question though. Do you, are you familiar with, um, where it's running and like whether, whether or not we have logs on.

The bot. Do you schedule on. Do you, do you, do you have a way to run it, Flo? Yeah. It was, uh, part of the, uh, actually before I do that. So it did, it did create the contributing.md already? I prompted that. No, so it's empty. Oh yeah. I would just like, while this is running, I would just go into, uh, into the side chat and just say like, create the contributing.md or something like that.

It depends on how much you want to scaffold the structure. What I, what I like doing these days is just like, oh, write a structure for the contributing.md. And then I have it, like, fleshed out, and then I say like, build it. Just, GPT-5 is a little bit too terse. For Sonnet, you probably don't need, you don't need that.

Um, so Evan, you're saying there are Docker instructions. Are you saying this will help me like do, cause, uh, in the README, he had a few instructions on how to get started. Basically just npm install and then make sure you fill in all of this information.

Uh, one of the things I've never understood, but I'm interested in, or curious what you mean by there are Docker instructions. Like, how does that help us? Um, if, if you, if you're referring to the Dockerfile, uh, but, but the one thing I never did understand is like, how do we have access to, um, the server at all?

Like, cause I thought that was going through Yikes, but I guess not, uh, for the, um, like for, for AI in Action to have access to the Latent Space server. Uh, yeah, it's not, not coming through me. Um, it would just be that like, wherever, um, wherever David has hosted the bot, uh, there's a, there's like, um, there's a little bot dashboard and then you can, um, uh, you would invite the bot to your server the same way you would invite a person to your server.

So you would click on a link and then you'll get a little thing that says, do you want to invite the bot, et cetera, et cetera. Um, so this is something swyx would have had to, like, do to get access to the Latent Space server, right? Yeah. Or one of the mods would have needed to, like, invite it, but that's not relevant for what we're looking for.

What we're looking for right now is logs. So we need the place where the, where the bot is running. So, um, if it were in a Docker container somewhere, we could look at the logs for that, but it would be like, is this on Vercel? Is it on AWS, is it sitting on a laptop on, on David's thing or whatever, or you can spin it up locally and, and see if you can play with it.

Um, but it wouldn't be. Yeah, for that, I was just going to run npm install and see how it goes, but, uh, we'll see. I'm gonna run it now and see how it goes. Um, David's using a VPS, um, which probably only he has access to. I was just saying, if you want to run this without a lot of npm stuff set up locally on Windows, like, Docker is a good option just to get going faster.

If you have Docker set up, but. I think so. Uh. I forgot what, yeah, I do have Docker on here. Um, yeah, so it should be like, uh, or it'll say what, uh, is there a compose. Uh, like a compose command. Oh, in the, in the Dockerfile. Or yeah, do you just do, oh, there should be Docker instructions in the, in the README.

I think it's just a single image. I don't think there's any services. Oh, okay. Yeah. And, and if you, okay. So build a Docker image. So this is what we're looking for, right? Yeah. It's a Docker build and Docker run. You'll be good. Oh, doc. Oh yeah. Docker building.

Okay. I think I can do that. You guys will have to help me with the Docker stuff. I literally do not use Docker at all. Well, yeah, no. Good time to learn. Let's see. Uh, oh God. Oh, okay. Um, uh, do you have Docker desktop installed? Yep. Sure do.

So I'm opening it up. Is it all? Yeah. Open it up. And then what you want to check is in the options thing. Uh, if it has a "use WSL backend" option thing. So I hit the, the options, the settings here and then Docker Engine. Is that what we're looking for?

I'm already seeing it. Uh, so "Use the WSL 2 based engine". Do you have WSL installed? Yeah. Uh, from what I remember. Hmm. Yes. Okay. Um, then yeah, I'm not sure. I guess like give it another, or maybe try, uh, yeah. Good point. Okay. Good point. We're literally in person.

Yeah. True, true. Um, actually. Okay. Does that work? It does work. Install Podman, replace. So we have some comments from, uh, the chat. Evan says, ask AI, and then, yeah, Dov. I'm gonna say your, Dov, cause I'm not sure how to pronounce the first name. Unless you want to type that out phonetically, um, yeah.

Dov says install Podman, replace Docker with Podman, as a suggestion. I hate when it does this, cause Windows Terminal, or like PowerShell in general, is super weird. Like my experience with Cursor, or really any of these, like, VS Code forks on Mac is so much better.

Cause you can see what the heck is going on inside the terminal. It's like little extra terminals that it pops up, but on windows. It's, it's, it's using this in, uh, in, in windows. Yeah. I think you can, or I would assume, or I think you can set the VS code to use or.

Cursor to use WSL backend instead of using PowerShell. I see. I see. Well, it's, it's a going. Oh, okay. It's building now. Nice. AI magic. Yeah. Yeah. Okay. So it looks like it just, uh, maybe, um, ran the, ran it with the full path. That's what it looks like.

Yeah. Error scene. Docker desktop is not running or properly connected. Yeah. What, when you ran that command, did you have Docker desktop open? No, no. Okay. There you go. Yeah. Docker desktop needs to be actually open or like running in the system tray or whatever before your Docker will work.

Ah, see, I did not know that. So now it's transferring context. I'm assuming this is just building whatever, um, Docker build AI in action bot, which is this folder, this directory, I'm assuming, and then everything locally in this repo. And it's, yeah, so now, now it's running, it's taking that Dockerfile, it's running all the scripts in there and it's building you a little mini machine to run that in, basically.

Hmm. May need to separately spin up Mongo. Let's see. Yeah. I guess it would depend on what's in the Dockerfile, but. Uh, in, in the meantime, is there a question that you had for David on the logs? We could send him timestamps and see, uh, if he saw anything around that time.

What might be good is if, um, uh, a thing that we could, uh, or yeah, I'm curious what's stored in the MongoDB here, but what might be good is to just add a, add another intent that's, like, debug, and then, um, uh, have basically the bot, wherever it is.

Do like, uh, would be a good place to get system logs from inside the container. Uh, I'm used to doing that externally. Um, uh, but yeah, what I would think would be to just like print whatever the, whatever's been happening in, in standard out to the, um, to the, uh, uh, to the message center would be my thought.
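The debug-intent idea, printing what has recently gone to standard out back into the channel, could be approximated by keeping the last few log lines in memory. A minimal sketch in Python, with assumptions stated: it uses the standard `logging` module rather than anything from the bot's Node code, the class and function names are made up for illustration, and the 2000-character cap reflects Discord's usual limit for a single message.

```python
import logging
from collections import deque


class RecentLogHandler(logging.Handler):
    """Keep only the most recent formatted log lines in memory."""

    def __init__(self, capacity=50):
        super().__init__()
        self.records = deque(maxlen=capacity)

    def emit(self, record):
        self.records.append(self.format(record))

    def tail(self, n=10):
        """Return up to the last n captured lines, oldest first."""
        if n <= 0:
            return []
        return list(self.records)[-n:]


def debug_reply(handler, n=10, limit=2000):
    """Build the text a debug intent could send back, trimmed to fit one message."""
    lines = handler.tail(n)
    if not lines:
        return "No log lines captured yet."
    return "\n".join(lines)[-limit:]
```

Because the buffer is bounded, the bot can answer a debug request without anyone needing shell access to the host, though what gets logged should be reviewed first so tokens or user data never end up in a public channel.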

MongoDB. I don't know. I'm not familiar enough with Mongo to, to know, um, what that takes locally. Like getting that set up. Um, oh yeah, it looks like, let's see if there's like a super easy one. Like neon for Mongo. It's probably another Docker image. You could just spin them.

Oh, is there another Docker image? I think you can run it locally. I would just like from Docker hub. You mean? Yeah. Yeah. I would just ask a, uh, cursor. Yep. Hello cursor. Give me the Mongo. Give me a Mongo container. So I can. All right. I guess the contributing, contributing.md would have helped in this case.

Okay. Yeah, absolutely. No, this is a good fodder for the contributing.md. Yeah. If we don't get anything else out of this hour, because it's been 30 minutes, um, if we don't get anything else out of this hour, I guess I can just open a pull request for the contributing.md.

So, so other people can contribute. Yeah. Or, or like take this, uh, take this transcript, dump it into GPT and say, give me an issue. And, or actually you might even be able to tell Cursor, just make an issue now. I'm not sure. I know you could, if it was Claude, if it was Claude Code, it would, but I'm not sure.

And what, what I would do also, Flo, is like, do the, do the, do the like grub prompting. It's like, if you take say line 15 or 16, just paste it into Cursor and say like, build it. Cause like at the end of the day, that's like the end goal, right?

It's just like, take, take like the sentence of what you want and then paste it in and like something decent comes out. Yeah. I guess, I guess you could just YOLO just the things that we want, the new features, and see what it does. Nice. Okay. Okay. So I think I'm done with the tour.

We kind of determined that the bot does respond or can parse intent. And it just kind of had a hiccup. Um, the contributor. Yeah. I would, you could, you just close that tab, right? Like the one where we did the investigation. I would, I would use that since that stuff is in the context window.

Like if you now want to add the feature, Oh, if the bot doesn't know how to respond, responds with the keywords, you know, that it already has like kind of that research in there. Yup. Good point. And then I would just say like, yeah, please, please make this happen.

Uh, cursor broke the thing. The history selection. I think it's the third one from the. Wait, so are you specifically talking about, cause I just did the, uh, how the bot determines how to respond. Oh, okay, okay. Got it. Right. Since now we want to have like, oh, well, if it doesn't know how to respond, it should like say it's keywords, so in a way it already has like what it should do.

Yeah. And I guess right now this function already works. It's just, you have to type in help. I think it actually works in general, right? Like it will just tell you like a pretty terse kind of thing. Um, but in general, I think it's good also to just prompt stuff where you kind of don't really know what you want it to do, like just to see what happens.

Right. Got it. The vibes, man. Yeah. Yeah. Yeah. Like, especially with GPT-5, when you give like weird stuff like that, it will give you like a very, like, reasoned answer. And it's not necessarily the answer. It's, like, the reasoning that's useful. I do like GPT-5 more, but I don't think I have the max, uh, cause you were saying GPT-5 high works like the old GPT-5 before the.

What it feels like. Um, okay. That being said, I think I can close most of the rest of this and go back to our Docker thing. So I said, help me set up a Mongo container for this project. Um, and it looks like it got stuck on one of the to-dos.

Oh, right, right. Yeah. I run into the issue kind of a bit, um, so I have to, yeah, it gets stuck. Yeah, that's, it's almost impossible to use Cursor on Windows. Cause it's always going to try to use bash and it can't use bash on Windows. Yeah. Okay, but, but just so I understand, it's basically going to create a Docker container, set up Docker Compose, update the connection configuration for the containerized MongoDB.

Okay. That makes sense. And definitely being, definitely being extra. Yeah. Um, project uses Mongoose and expects. Let me help you set up. Okay. So I ran a command to set that up. Port 27017 is blocked, common on Windows. Let me clean up the failed container and set up MongoDB properly.

The thing that's like, I have a Mac machine is just not really great for streaming. Or sharing my screen. So I, I kind of like default to my windows machine for that type of stuff, but it does suck. Um, yeah, it did like, it's stuck here on this type of stuff a lot.

Yeah. I would want, uh, I wonder if there's like an easy setting to just like point cursor at WSL instead. Um, yeah, yeah, okay. How this might be like, maybe not. How do I, whoops. Um, I feel like this is not something that would be in the training data or maybe it is.

You can add a web search maybe. Yeah. That's the kind of stuff I go to, uh, chatgpt.com for. Yeah. And then paste them back. Speaking of, being that we're 37 minutes in, the, the, the one thing I did want to get done as well is like, okay. Terminal integrated profile.

It's, if I remember correctly, it's, yeah. And then terminal, here, integrated profile. Ooh. Okay. So on Windows exact is what it's, um, doing the wrong thing, but, uh, yeah. Open settings JSON on the top right. Okay. Here we go. The default terminal profile, and edit in settings JSON, and just paste that thing in.

Oh, that would work too. Yeah. I hate doing the JSON thing. I don't, I don't know why it's like a, um, a mental block for me. I have like, what do they call it? When you're, like, unnecessarily suspicious about stuff like that. Okay. So I wonder if you, I wonder if you open a new terminal now, if you'll be good.

Um, it seems to have finished with the Docker thing. Cause I heard that little sound. I think I was here. Docker compose up MongoDB. The port issue persists. Wait, wait, it fixed it. But what the heck did it do? Oh yeah. Okay. So it just ran the container without exposing the port.

Okay. Okay. Wait, isn't that going to be problematic later when we need access to the. Um, I think you're fine because the only, the only thing that's going to need to talk to the Mongo is this, um, bot. So if it's on the internal Docker network, it's fine. It looks like it wrote you a compose file to handle it.

So it should be okay. Cool. Cool. Okay. So the point of setting, like if I understand this correctly, cause I kind of lost my way here, the point of doing this, like creating the MongoDB so we can test the bot locally to see where the logs end up. Um, yeah, and I would think, so I think logs just, or yeah, so logs should just be coming from the standard out of the running process.

Um, but I'm, I'm kind of inclined to go with the, um, the manual method of just like, oh yeah. Did we already YOLO the to-do? Okay. One of them we did, we did do one of them. Um, and, uh, history. Oh, right. It was here and it looked like it crossed off the to-do by adding something to the index.

So this is what it looks like. It did. Yeah. Okay. So, yeah, I would say, I'd see if it can add the debug thing and then YOLO it and see if it actually still works. The debug. Okay. Right. Got it. Got it. Got it. It makes sense. Uh, so we are accepting these changes.

Do these changes look decent? Uh, this is something I'm not really sure. Yeah. I'm not sure what you're asking for here, the keywords I understand. Okay. Okay. That does look good. I guess better than like, just like added, it just like added emojis basically kind of, right? Yeah. Basically.

That's hilarious. That's hilarious. It made it prettier. Okay. So we, we are, we're keeping all that. And then, um, going back to, to do and then saying yes, this part, uh, that's the debug string is kind of a, that might throw it off. I would just say if the user asks for debug logs, give them debug logs.

Got it. Can we, um, cross this. If I user. No question mark. I don't know. I guess sonnet has been doing. Okay. I kind of want to switch to GPT five, but we'll, we'll, we'll keep it going. no question mark i don't know i guess son has been doing okay i i i kind of want to switch to gpt5 but we'll we'll we'll keep it going it's doing it's doing a decent what you could do is like have it do it in sonnet and then redo it in gpt5 because that way you'll you'll like see what's different right for sure the real question is is it using wsl for new terminals it is okay all right cool um oh i don't know if i like that okay just it's added on to the original uh system message and now it's okay this is all new which is um debug i am a bit curious like uh yikes this is a we're at 43 minutes in let me let me just pull up this stuff and do it in the background while it's talking um but i'm curious if there's like a like as opposed to david paying for uh this whatever vps he's paying for if we could like a vulture vulture vps if we could set this up on something that like we put and this may be for after the call like we can maybe chat after the call so we're not taking up too much time but i i'm i'm curious if we could like put uh crypto in a wallet and have it pull from that wallet with a hosted vps so everyone can have access to it and it's not just you know what i mean like have like logs uh that are public and it's not necessarily just sitting on david's vps and he's paying for it every month but but maybe like a crowdsourced thing that we could host the bot i don't know if that if that makes sense but i think it would be interesting to have like everybody have access to it as as opposed to just being on his thing i don't know it looks like it did okay on the code debug intent so i guess my main thing is as we're like coming up on the last 15 minutes how could i run this locally to see if the debug thing works but i guess that i'd have to like put in um into my dot env file 
I have to put all this information in. That's where every AI in Action vibe code fails. It's, like, running the Discord bot. Yeah, for sure. Um, huh. Well, let's see what it said anyway. It seems very confident that users can mention the bot with debug keywords like "debug," "show me debug info," "debugging."

Well, what you could try for the last couple of minutes is, like, redoing that with GPT-5, so we can see the difference in style. Oh, right. So, like, just rewind and then reuse the prompt and just set the model to GPT-5? Yeah. Okay, good point. Or you could keep that as a git commit or something. Or we have it on stream anyway, right? Just so you see the difference in style between the two models, because they couldn't be any more different, I think.

Yeah. In a case like this, because it's a fork, right, so I can commit. Do you recommend committing, and then we just redo and see how it turns out that way? That's what I would do, I think. But I don't know if you'll be able to commit to David's repo, though. You might have to commit to your own. Yeah, I had these commands set off to the side. It would be git checkout -b... I don't know, what did we just add here? It was a bunch of stuff, but, like, debug feature, I don't know. Debug message, it's better. And then, yes, now I can commit.

I think you need to Keep All on those first, don't you? Say it one more time? I think you need to Keep All on those first, don't you, so it actually changes the files? Yeah, the way Cursor works is, even if you don't hit Keep All, what's in green is actually set, and then what's in red comes back if you hit undo. Interesting. Okay, cool. Yeah, which is so weird. But I do believe that's how it works. I'll Keep All anyway, though, because that's a good point. And then I guess we just kind of YOLO all of it into a commit. I mean, yeah, you know, commits are fine. As long as it's not a PR to main, then you're good. Um, no, we don't want to publish the
branch. We just want to go back here. Uh, no, lower down. Did I... I missed it, help me. No, it wasn't this one, it was this one. Change it to GPT-5 and then send it again. Yeah. Oh yeah, that's good. Continue and revert. Yeah, okay.

And then Evan also said, if you docker compose up and then set the env... okay, yeah. So I guess the problem is we don't have the environment variables, not only for the Latent Space Discord, but David set up his own personal Discord where we can do the testing for this thing, and I don't have those credentials either, off the top of my head anyway. Yeah, we'd need a bot token and stuff. Yeah, for sure. I have a Discord server set up, so it wouldn't be too difficult, but we've got, like, 10 minutes left, so maybe not the best use of time.

Cable says, "It seems very confident" is the story of 2025 LLM use. Yeah. Victor says AI is good at writing Compose files. Evan says, one consideration for debug: on a new bot turn, the bot can only show what is persisted somewhere, memory such as Mongo. So unless logs are persisted in that way, debug won't be able to show logs.

So can you call docker logs from inside your own Docker container? Uh, is that a real command? Docker logs is a real command, yeah. So you would need to... I know you can do it from the host, but I don't know if you can do it from inside the container. Oh, okay, I see what you're saying. Yeah, because ideally, what I would probably do, when somebody asks for a debug thing, is just run docker logs for all the containers and just dump them. That's what I would do.

So this is a learning lesson for me when using Docker. When I typed in docker compose up, it automatically called not only the AI in Action bot but the Mongo AI bot, which I'm assuming is just a MongoDB. How does it know
to do that? Is that all in Docker Compose? Uh, yeah, the bot added to Docker Compose, it added another container, you know. Yeah, that's crazy. Okay, that's so interesting to me. I need to explore Docker, I really do, but I'm mostly a "I'm just going to run it locally, and if it messes up my machine, it does" person. YOLO. Yes, Docker is very helpful, very powerful. Would recommend, for sure.

Okay, so, let's see the changes. index.js, fairly similar. Oh, it made changes elsewhere, it looks like, as well. Next steps. So GPT-5 did more than 4 did, which is, like, unusual. What do we think? Do we keep or cancel? It's very similar, right? Yeah, very similar. But I'm curious, let me close all these so I can make this a little bigger. Uh, user ID, is thread... It has fewer emojis, I think.

I wonder how much this is influenced by all the previous research that we had. Like, you could do yet another experiment, right? Just start with a completely empty context and say, "Hey, can you just show me debug stuff when I ask for debug?" and then kind of see what it finds. There are also, like, interesting experiments. So it looks like it's mostly timestamp stuff and how it handled the next steps, what it's telling people to do next.

Oh, it seems to have been using the Sonnet stuff and then adding on top of it, right? No, but... oh, well, no, because I thought I did, like... Yeah, I thought you reverted. I'm not sure if that's the case. Who knows, right? Because on the left it already has this debug intent, which was the Sonnet stuff, I think. So maybe because you committed, it, like, decided to not revert. It's weird. I thought Cursor did, like, an auto... yeah, it literally just takes you back to whatever the status was here. But I'm guessing it's the commit. If I was Cursor, what I would probably do, to not have to save as much history, is every time I commit, kind of wipe all of
the recent rollbacks. That's what I... yeah, that makes sense. If I was building Cursor, that's how I'd build it.

Okay, last five minutes or so. What do you guys... any closing thoughts that you want to close up on? I appreciate you guys for having patience with me on the Docker side, because I know you don't have Macs, do you? I was going to say, tell it to rewrite it in Rust.

Uh, yeah, I had a window open specifically... okay, here we go. Maybe we can do a bit of, uh: "So I'm building out a bot in the AI in Action Discord for the Latent Space podcast, and I would like to anthropomorphize the bot instead of calling it the AI in Action bot. Help me brainstorm names. Make sure the list is long." Okay, so I basically want this. And then if I could pull up David's thing, I forgot where I put it. Oh, it's here.

Oh, I see David's using Devin on this. Yeah, one of the first things I did was look through the commits, and there's, like, Devin, Claude, and Copilot via Zach. So I thought that was pretty cool.

Is there any thinking mode? 2.5 Pro has been so terrible the last couple days. I think it's the Nano Banana stuff. Is that the new retention policy you just accepted? Yeah, keep all my stuff. I think you might want to revisit that. Yeah, well, they updated it, so now they keep it for five years instead of, like, a couple months. Yeah, listen, take my whole brain. I'll be honest, I don't care.

So are we going with Clinker, or are we... I really don't like Alex, that's terrible. Codex is taken. Bytes is somewhat interesting. I kind of wanted something... it's all cringe. Yeah. GPTina. All right, sir. Was Google Studio any better? Decibel. Damn, dude, Decibel. Uh, Latent Vector, Manny, Ember. Nah. Lex is nice, but we'd have to change the profile to a waifu, the picture at least, so I can feel like I'm talking to a woman for once in my life. Um, the Concierge, Artificial... that's funny. All right, we got, like,
maybe two minutes left. I like Artificial. Yeah. So I guess we have to consult with David, being that this is technically his project. And the only reason why I want to do that is because, like, AI in Action is already... or just already is, yeah. I actually like that a lot. I do kind of like Janice, and I also kind of like Alfred, but I'm not married to them. Janice stuck out to me too, mostly just because I spend too much time on Twitter. Yeah.

So I guess we'll just consult David with this list. Clinker, Janice, Alfred. Anything else that stands out to you guys? And, like, maybe put up a poll or something in Discord and see where we end up. I think I'm actually going to open a PR with these changes and see if he accepts it. And maybe it makes sense to do contributing as a separate PR from some of the other stuff, because I think it's wild that he was doing this in raw JS instead of TypeScript. With all the AI in here, I'm surprised it hasn't broken everything. But I don't know.

Uh, Cable, I think that's pretty much, for the most part, a wrap. Thank you for running this. This was fun. Appreciate you stepping up. You know, we can always do something like this, but I do want to encourage everyone here: bring a topic, bring something you want to do. Maybe you drive next time, if you want to keep hacking on this, or bring a project you're hacking on. Uh, Victor, I don't appear to have the ability to mute him, so let's see. Yeah, same. But anyway, bring your projects, bring your talks, bring your learnings. This is fun, but I think a nice mix is good. So maybe next week somebody can bring a topic to talk about. You can sign up with the bot that we just worked on. So we'll see you all in a week. Peace. Thanks for coming. Cheers.
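
For reference, a minimal sketch of the debug idea discussed in the session: check whether a message contains one of the debug keywords, and answer only from logs the bot itself has kept, since (per Evan's point) the bot can only show what is stored somewhere. Everything named here (`DEBUG_KEYWORDS`, `isDebugIntent`, `LogBuffer`, `buildDebugReply`) is hypothetical and not taken from the actual bot's index.js. The buffer is in-memory only, so it is lost on restart; persisting to Mongo would be a separate step.

```javascript
// Hypothetical sketch; these names do not come from the real bot's code.
// Keywords mirror the ones read out in the session.
const DEBUG_KEYWORDS = ["debug", "show me debug info", "debugging"];

// True when the message text contains any debug keyword (case-insensitive).
function isDebugIntent(messageText) {
  const text = messageText.toLowerCase();
  return DEBUG_KEYWORDS.some((keyword) => text.includes(keyword));
}

// Fixed-size in-memory log buffer. Assumption: the bot calls add() wherever
// it would otherwise only write to the console.
class LogBuffer {
  constructor(capacity = 50) {
    this.capacity = capacity;
    this.entries = [];
  }

  // Store a timestamped line, dropping the oldest once over capacity.
  add(line) {
    this.entries.push(`${new Date().toISOString()} ${line}`);
    if (this.entries.length > this.capacity) {
      this.entries.shift();
    }
  }

  // Return up to the n most recent lines, oldest first.
  tail(n = 10) {
    return this.entries.slice(-n);
  }
}

// Build the reply text for a debug request, or null for any other message.
function buildDebugReply(messageText, logs) {
  if (!isDebugIntent(messageText)) {
    return null;
  }
  const lines = logs.tail(5);
  return lines.length > 0 ? lines.join("\n") : "No logs recorded yet.";
}
```

In a Discord message handler, the result of `buildDebugReply(message.content, logs)` would be sent back only when it is not null; wiring that up depends on the bot's actual handler, which is not shown in the session.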
