[Full Workshop] Vibe Coding at Scale: Customizing AI Assistants for Enterprise Environments

00:00:18.800 |
It's a talk — I'm going to give you the expo version, a very short version. 00:00:23.000 |
So this is, whatever, however long this takes. 00:00:27.000 |
We're going to lean fully into the vibes, embrace exponentials. 00:00:33.000 |
It is about focusing on the output and not actually on the code. 00:00:40.000 |
Like, if I don't look at my code as an engineer, what am I? 00:00:44.000 |
And what does embracing exponentials even mean in this case? 00:00:52.000 |
But we're definitely going to forget that code exists. 00:00:55.000 |
And I think at the Anthropic conference two weeks ago 00:00:59.000 |
there was a great chart of how agents run exponentially longer and longer. 00:01:05.000 |
So slowly forgetting that code exists and that you want to review every piece of code, 00:01:10.000 |
but building trust and adding guardrails to the AI is what this talk is about. 00:01:16.000 |
And this Vibe Coding journey starts kind of initially what most people saw, 00:01:22.000 |
just like I build an app in just one day and I put it online and I made money 00:01:26.000 |
and then things happened to the app and things got leaked and things were no longer fun. 00:01:33.000 |
And that's that fun chaos state of Vibe Coding. 00:01:36.000 |
And we're trying to move to professionals then. 00:01:38.000 |
And that initial state is what I'm now terming YOLO vibes. 00:01:42.000 |
It's the unofficial wording, but it's all about creativity and speed. 00:02:05.000 |
Like, how do you bring in maintainability, more readable code — 00:02:10.000 |
like, somebody in the end might actually want to read that code — 00:02:16.000 |
and you want some quality control that what you built is maintainable. 00:02:27.000 |
And if you have done anything on Reddit or on blogs in the past, 00:02:31.000 |
then the past few weeks you've probably seen people sharing their kind of best practices. 00:02:35.000 |
This is how I finally got good value out of AI. 00:02:41.000 |
But they bring you that scale, reliability, and velocity that comes with that. 00:02:47.000 |
While still hopefully giving you some speed and gratification along the way. 00:02:55.000 |
So vibe coding, initially, is where we see this outcome-first approach. As I said, it's all about — 00:03:05.000 |
like, if you're in an editor — and I actually see people framing their experience in AI editors as a low-code mode. 00:03:12.000 |
So they just look at the chat panel and look at whatever comes out of it and that's the outcome first. 00:03:19.000 |
It's all about just staying in the flow and working with the AI. 00:03:27.000 |
So until it maybe no longer works, we might want to undo. 00:03:31.000 |
But otherwise we just keep talking to the AI. 00:03:48.000 |
You want to get a sense for YOLO vibe coding for rapid prototyping. 00:03:53.000 |
For proof of concepts where you just want to get something out. 00:03:56.000 |
That's where I actually have a ton of conversations with larger companies who want to start a conversation like, how can we do vibe coding? 00:04:04.000 |
And for them it's all about getting people who are non-technical to be able to communicate ideas. 00:04:10.000 |
It's about UX people just making a mock-up and being able to bring that to a meeting and being able to communicate what they want to do in the mock-up. 00:04:20.000 |
Like, two weeks ago, we had one hour of vibe coding on stage, live. 00:04:29.000 |
And I used Three.js, which I had tried many years ago. 00:04:35.000 |
But once I got the code running, I could start getting into how the code is structured. 00:04:40.000 |
And I could actually understand technology because I have something working. 00:04:44.000 |
And that's really the power of getting something up and running: it gives you something hands-on to actually try the technology out. 00:04:54.000 |
So I'm not sure how many of you have sat down with somebody who's non-technical and showed them vibe coding — just build your water tracking app, or build something with your kids. 00:05:03.000 |
There's all kind of personal projects you can now finally solve over the weekend thanks to Yolo vibe coding. 00:05:18.000 |
So this is really about voice input, relaxing. 00:05:30.000 |
So in VS Code, for YOLO vibe coding, we're going to start with an empty VS Code. 00:05:41.000 |
I might show off Insiders stuff, but most of what I'm showing is also in stable. 00:05:47.000 |
Insiders is the pre-release version of VS code that ships on a daily basis. 00:05:52.000 |
It's like Firefox Nightly, Chrome Canary, Edge Dev — whatever ships nightly. 00:05:59.000 |
So on your left side, you will have no folder open. 00:06:02.000 |
On your right side, you will have Copilot open. 00:06:21.000 |
So agent mode is probably the default you want to have set, and the default model setting. 00:06:26.000 |
Claude Sonnet 4 is great at front-end stuff. 00:06:33.000 |
And now, what's interesting — so, just one quick tour of how we're going to do vibe coding — 00:06:39.000 |
is that there's actually this interesting setting here. 00:06:46.000 |
And for this first round of vibe coding, I want you to go into this tools picker down here 00:06:56.000 |
Because it will help you scaffold your workspace, but it will lower the vibes if you're just trying to do HTML coolness. 00:07:23.000 |
If you don't see, check that you're in agent mode. 00:07:29.000 |
First step into the panel is switch to agent mode down here. 00:07:37.000 |
And then second step, go into the tools menu, which only appears in agent mode. 00:07:47.000 |
Soon, agent mode is going to be default and this whole where is my tools picker is less of a problem. 00:07:55.000 |
So, and then what you want to uncheck is this new section. 00:08:13.000 |
This is actually, it's very recent and we're actively working on that. 00:08:22.000 |
Let's start with our first vibe coding session using "new", because it's so hard to disable. 00:08:36.000 |
And the way we're going to do this is create a — let's do React plus Vite. 00:08:45.000 |
Use stacks that are kind of popular in front end where the AI doesn't have to reason too much 00:08:51.000 |
So, React and Vite are good ways to run a project. 00:09:16.000 |
What do we tell the AI to make it really beautiful? 00:09:24.000 |
You can infuse it with whatever design sense you have. 00:10:11.000 |
Now it's going to actually ask me to open a folder. 00:10:47.000 |
A new command is optimized for creating projects from scratch. 00:10:51.000 |
Which, if you look at the internet, is how people evaluate AI coding tools. 00:11:06.000 |
But also, it makes for this nice Vibe coding from scratch. 00:11:10.000 |
Because everybody struggles with what is the right stack. 00:11:20.000 |
And this is maybe where we do the first tweak of our settings. 00:11:24.000 |
So, if you go into settings and search for approve. 00:11:42.000 |
In my case, I'm going to actually go over to workspace. 00:11:46.000 |
And that means that setting is only set for this workspace. 00:11:55.000 |
Which means all of these continue buttons won't happen anymore. 00:12:09.000 |
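As a sketch, the workspace setting being toggled here is, around the time of this demo, the experimental auto-approve flag — the exact setting name may differ between Insiders builds:

```jsonc
// .vscode/settings.json — workspace-scoped, as done in the demo
{
  // Experimental (name may change): skips the "Continue" confirmations
  // for tool and terminal invocations in agent mode.
  "chat.tools.autoApprove": true
}
```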
Now we're going to stop worrying about the code. 00:13:38.000 |
Yeah, that's my problem of not being in stable enough. 00:13:46.000 |
Why is that different than what came from the gear? 00:13:53.000 |
There's two ways to get to the same thing, but it's a different menu. 00:13:59.000 |
There's settings from the bottom and settings from the top and they're not the same. 00:14:43.000 |
I know — it keeps telling you to run this in the terminal. 00:15:00.000 |
So all the continue buttons are basically gone. 00:15:07.000 |
And for MCP tools, we actually do have dropdowns to allow, always allow for session and always 00:15:15.000 |
allow for workspace and, or not always allow. 00:15:22.000 |
It still prompts you to continue when it works. 00:15:26.000 |
I think it's that the auto-approve is not applied to the current session. 00:15:37.000 |
And this is where you need to get your coffee and just wait. 00:15:57.000 |
Now, in the new window, I'm going to set auto-approve first. 00:16:10.000 |
And we're actually going to use Material Design. 00:16:34.000 |
And this is, I think, one of the key takeaways. 00:16:37.000 |
Like, trying out different ways to get to the same result is where vibe coding really shines. 00:16:47.000 |
I had really, really quick success of just, like, what are different signup flows that we can create. 00:16:52.000 |
Like, create three different versions of this design to explore what this could look like. 00:17:03.000 |
If you have multiple open now at the same time, you've got to figure out what's running what. 00:17:07.000 |
And now this flow actually over here runs without any confirmation. 00:17:10.000 |
So we set auto-approve in the correct order. 00:17:13.000 |
And it's now just creating the Vite project, installing, installing Fluent. 00:17:23.000 |
And notice that we got the wrong Fluent, because there's a dependency. 00:17:31.000 |
Can you show one more time how to do that auto-approve setting? 00:17:36.000 |
So the way I did it — Command-comma is the quick way. 00:17:47.000 |
Most probably have them on the other lower side. 00:18:19.000 |
And I thought we tweeted about it end of last month. 00:18:37.000 |
The person who owns the AI terminal integration just came back from paternity leave. 00:19:12.000 |
We've been looking at how to allow specific terminal commands. 00:19:18.000 |
But if you think about the scary parts — chaining and running multiple commands in one command — 00:19:18.000 |
it's not as predictable as you would think, nor how you can easily allow-list things. 00:19:25.000 |
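For reference, an allow/deny list can be sketched roughly like this in settings — the setting names and value shape here are assumptions based on the preview feature being described, and may have changed since:

```jsonc
// settings.json — assumed preview settings for agent terminal commands
{
  // Auto-approve only these terminal commands...
  "github.copilot.chat.agent.terminal.allowList": {
    "npm": true,
    "node": true
  },
  // ...and always require confirmation for these, even with auto-approve on.
  "github.copilot.chat.agent.terminal.denyList": {
    "rm": true,
    "curl": true
  }
}
```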
And I got two vibe coding sessions going on here. 00:19:52.000 |
Who's been using copilot instructions before? 00:21:54.000 |
But it's also like actually running these with different models. 00:21:57.000 |
And getting a sense of how good each model is at design. 00:22:01.000 |
And having design sense without me telling it how to do everything. 00:22:05.000 |
And Claude is definitely usually rocking the icons. 00:22:29.000 |
So now you can actually do a new feature we landed. 00:23:26.000 |
And Simple Browser is this in-VS Code browser preview we have. 00:23:39.000 |
To select specific elements that we then attach as a visual reference. 00:24:10.000 |
for me. Okay. Ran into snags. We don't care about that. Let's check the other one. Ooh. 00:24:19.560 |
Number two, fluent design. This is what it came up with. It's a little bit plain. This 00:24:27.880 |
is sad. Okay. At least it has a goal reach. That's nice. And it has recent entries too. 00:24:34.480 |
It made similar assumptions on what we want. So feature-wise, it somehow got to the same conclusion, 00:24:40.920 |
but design-wise, this is definitely more corporate. Yeah. So that's the simplicity of vibe coding 00:24:49.840 |
and using the new tool out of the box. If you are on Insiders, you can disable the new tool — 00:24:56.840 |
then it's easier to do, like, a single-file HTML thing, because the new tool is definitely biased towards 00:25:01.420 |
using npm and installing packages. So it always ends up a little bit more complex. 00:25:14.300 |
Yes. The team I work on — hi, I'm Harald. I work on VS Code. 00:25:16.300 |
Sorry. It would be really cool to understand when you open up for new insiders, what are 00:25:22.300 |
the new things that just changed? Like a quick diff somehow. If someone could show me, like, 00:25:30.740 |
And I use them both. But then I, like, fall behind a couple days back. It's like, what are the new things and why I should be using insiders? 00:25:37.180 |
Yes. So the best way to stay on top of what's new is, so we do actually right now this week, it's testing week, and we're writing our release notes. So the release notes are usually capturing everything that's new. But for insiders, it's hard because it's coming out every day. So it's hard to point. 00:25:54.540 |
That's a great idea. We're going to make an MCP server that summarizes what landed. 00:25:58.540 |
Yeah. Yes. I like that. Let's do it for the next demo. 00:26:05.540 |
Next demo. Okay. So what do we have in our YOLO vibing toolbox? We have the agent, which sometimes is hard to find. Now you're all on the agent, so that's great. It's all about -- I actually didn't show that. I could have shown that. It's -- 00:26:22.180 |
different panel styles. So if I go back here, you can actually move this into the editor, which is nice. So some people like that, more space for your chat. 00:26:34.540 |
You can also -- if I go back into my panel -- Oh, I moved into the dropdown here, and you can move your chats into here. So you can have multiple chats, and they actually have names. So it's easy to go back and forth. You can put them in parallel to your code. So you can use 00:26:52.180 |
all the window management things. Can you run them in separate windows? 00:26:58.540 |
Okay, where did we go? In your window. There you go. 00:27:04.540 |
Now you have a chat in its own window. You can put it on your own monitor. And you can actually pin it to always be on top. 00:27:13.040 |
So now if I run this, and I can close this -- 00:27:20.040 |
let's move that away first. So I can accept this. This we're all going to keep. 00:27:25.040 |
Close it. And then close the other one. And now we have the output, and we can just move our chat across and fully focus on the exponentials that are happening in this window. 00:27:39.040 |
Yeah. So that's one way that you can really manage the space how you want it. The new workspace flow we showed — it's really this optimized, CLI-first flow. 00:27:52.040 |
Okay. Okay. Yeah. And then voice dictation I haven't shown. Who has tried voice dictation in GitHub Copilot already? 00:27:59.040 |
Okay. Magic moment of add a dark mode, please. And maybe give it a cool name that works with a younger audience who needs to drink more. 00:28:18.040 |
Maybe my kids. Make it for kids. So a little more kids friendly. Thank you. Bye. 00:28:24.040 |
Okay. So Command-I is actually the default shortcut. It's a local model, which is great for privacy. And it's really fast and accurate. And there's an option as well, when you use voice input, that it also reads back the text, which is great for accessibility. And yeah, by using just your voice, you can now finally — 00:28:51.040 |
you don't have to put that coffee down; just keep vibing. There's a "Hey Copilot" as well, I think we added at some point, which I haven't used in a while. 00:29:02.040 |
Okay. I said all that. Keyboard shortcuts — there's a keyboard shortcut you want to customize, to actually hold down while you talk and then let go. 00:29:12.040 |
Visual context I showed — attaching. It's great for wireframes. The in-editor preview gets hot reload; it just works. And you can attach elements using that little send button. 00:29:23.040 |
And then auto-accept. I showed you the auto-approve for tools — there's an auto-approve setting — and there's also an auto-accept after delay. And if you don't have that on: I love auto-save. 00:29:35.040 |
It's a great VS Code feature that's already there — after delay or after focus change, it will just save for you. And what I haven't shown — let's see if this works here — is the undo button. 00:29:47.040 |
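The auto-save behavior mentioned above is standard VS Code and is configured like this:

```jsonc
// settings.json — built-in auto-save options
{
  "files.autoSave": "afterDelay",   // or "onFocusChange"
  "files.autoSaveDelay": 1000       // delay in milliseconds
}
```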
Is this still going? What is this still doing? I forgot. Oh, this is adding these particles. Cool. It still worked on that. Good Copilot. 00:30:01.040 |
Okay. Got stunning animations. That's great. That's what I wanted. Is it doing it now? So let's keep -- it does animate. Nice. Particle -- no, look at bubbles. Okay. I don't like it. Undo up here. Those are basically our checkpoints. 00:30:24.040 |
There's a new checkpoint UX coming. But because I have many people saying, tell me, oh, you don't have checkpoints. I can't undo stuff. 00:30:30.040 |
But if you already accepted stuff or if you want to go back to something like this is -- I like these particles, but for V0 -- oh, that's just beautiful. I love it. 00:30:40.040 |
We don't need that. Then you can also bring that back. Just need to see that again. That was really nice. Okay. So for people who don't like particles, we can now undo. 00:30:52.040 |
And it's now back to the original version. So it has stages for each of the work it did. 00:30:58.040 |
And you can easily go back and forth to see the before and after as well. Yeah. 00:31:03.040 |
But in vibe coding, you don't want to look at the code, because we only look at the output. 00:31:09.040 |
Okay. That's the YOLO toolbox. And I think, as I mentioned before, you want to try it out just to get a feel for the AI. 00:31:17.040 |
Like, in my case, I mentioned, I like getting a sense of how good AI is at design. 00:31:21.040 |
Like, can I just give it wide, open tasks to explore a space, and will it come up with something interesting? 00:31:26.040 |
Or do you need to -- how detailed do I need to be? 00:31:29.040 |
When does it make mistakes? If I give it a general task, maybe about Java, where it's not as good — what will it do? 00:31:36.040 |
Next one is known frameworks. We went with Vite, Material Design — things that are kind of off the shelf and haven't changed in a big way in a long time. 00:31:48.040 |
So you want to use something that's popular and has been consistent. 00:31:54.040 |
We showed just attaching a visual element, change this, add some particles. 00:31:59.040 |
It's really about not becoming too attached with whatever you're working on, but being able and willing to throw it out and start from scratch if things go wrong. 00:32:10.040 |
Structured Vibe coding is this middle stage, tries to balance the YOLO, the fun and chaos with a more structured approach. 00:32:22.040 |
And there, it's -- I think it's the biggest impact I see from talking to customers on, like, this is how Vibe coding can work for us. 00:32:33.040 |
This is where we can bring some non-technical people in, give them a good starter template that has a consistent tech stack, that comes with clear instructions for the LLM on how to work with it, and keeps it within actual guardrails. 00:32:48.040 |
And it already brings in some custom tools that carry expert domain knowledge or internal knowledge that you would need to work on that code base. 00:33:01.040 |
It's faster and gives you more consistent results. 00:33:03.040 |
So you don't end up with something, oh, it used material design, but it should have used Fluent or it should have added dark mode and should have been responsive. 00:33:12.040 |
All of that can already be baked into the instructions. 00:33:15.040 |
So I see a lot of companies bring that into their bootstrapping for Greenfield projects. 00:33:20.040 |
So we have something -- and you can oftentimes -- you go into a meeting and you have a product that looks already finished because you Vibe coded it with your go-to stack and uses your internal design system, so it already looks way more polished. 00:33:35.040 |
And the last piece, I think: outside of mainstream workloads is where YOLO, by default, will always bias towards whatever is at the top of the training data. 00:33:45.040 |
With this one, you can then customize it further down to internal stacks, internal workloads, internal deployment infrastructure that makes it work better. 00:33:58.040 |
This is now -- the image, as I explained, it now has wireframes and more charts. 00:34:11.040 |
So what I'm going to do now, I think I'm going to push this -- oh, it's still running. 00:34:30.040 |
Just look at this one, and I'm going to push it to GitHub. 00:34:44.040 |
This is all Vibe, so we're going to make this live. 00:36:10.040 |
It doesn't have a dev container — but it's only Node.js, so that's easy. 00:36:28.040 |
We want to get the agent to do some special magic. 00:36:43.040 |
You can ask the agent to git clone it for you. 00:36:47.040 |
I'm curious if there was a special tool that would be more than that. 00:37:19.040 |
It would have been a code space and a dev container and you just click open in code spaces and things work. 00:37:28.040 |
Come to my next show and then we'll get that fixed. 00:37:45.040 |
Anybody that's been using Codespaces on GitHub? 00:37:57.040 |
between insiders, the regular version, all the plugins, and some of them don't work in... 00:38:05.040 |
If everything just works the same, would it be? 00:38:07.040 |
Yes, it mostly does, right? But mostly, yeah. 00:38:10.040 |
It's 95% there, but it's the 5% where when something doesn't work, you just go back to the other tool. 00:38:18.040 |
So VS Code is offering me to reopen it in a container. Is it a dev container? 00:38:25.040 |
You can try. I'm actually not running it in a container, but if you want to... 00:38:31.040 |
The container is just a Node.js one. And it should work, too. I did add a container, see? 00:38:43.040 |
I can now check. So, have you ever wondered what you did on a project? 00:38:48.040 |
So, this is where I created my container. This is where I just asked GitHub Copilot to update my definition. 00:38:55.040 |
DevContainer. Just look at my code base and update my DevContainer. So, I did a good job here. 00:39:00.040 |
I should have maybe remembered that I did that as well. 00:39:08.040 |
Okay. If we're ready, meanwhile, while you clone, while you npm install, anybody got it working already? 00:39:16.040 |
Still? Okay. Cool. I'll give the tour of what we have. 00:39:20.040 |
So, one is we, again, start with good Copilot instructions. And they live in .github/copilot-instructions.md. 00:39:32.040 |
It's a markdown file that's included with all your agent requests, all your chat requests, all your inline chat requests. 00:39:39.040 |
Just Copilot basically gives a grounding foundation knowledge about your code base. 00:39:46.040 |
And sometimes they feel a bit repetitive — like, I saw some demos where it basically repeats linting rules that you expect the AI to just follow anyway. 00:39:58.040 |
But my go-to is just a one-liner on what your stack is. That's a good starting point: just point it to what frameworks, what versions, and that's one way to keep it on rails with what it uses. 00:40:17.040 |
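A minimal sketch of such a one-liner-style instructions file — the stack named here is just an illustration matching the demo, not the speaker's actual file:

```markdown
<!-- .github/copilot-instructions.md (illustrative example) -->
# Copilot instructions

This is a React 18 + Vite + TypeScript app using Fluent UI components.

- For front-end QA, use the Playwright browser tools.
- Keep components small and responsive; support dark mode.
```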
I was experimenting with figuring out what would be the best way to have that be a standard that gets included — but, like, not let people mess with it. Like, give people some coding structure. 00:40:31.040 |
If you have a whole team of vibe coders, you probably don't want them touching that. 00:40:35.040 |
So, this is in your repo. So, I think it's a good team exercise to iterate on it. Like, it shouldn't be a stale document. 00:40:44.040 |
You can put this in your user settings as well. 00:40:47.040 |
You probably don't want it with each app. You probably want it as a different repo, right? 00:40:52.040 |
A set of a way that you code and then each app codes. I've been thinking about that, too. 00:40:58.040 |
So, I've been trying to convince my peers on a GitHub site this should be an organizational setting that people can set easier on, like, an organizational level. 00:41:06.040 |
And, like, something -- as a team, you can select which ones you want to use. 00:41:14.040 |
Is there always one file or can you opt in for, like, because if you have different languages, different settings, different things you're building? 00:41:26.040 |
So, these become rather monolith and large and unwieldy. 00:41:32.040 |
And now, just to point out here before I go to the new ones, I do also guide which tools to use. 00:41:41.040 |
And I already tell it for a front-end Q&A review. 00:41:44.040 |
Use the browser tools that come from Playwright. 00:41:48.040 |
I have context 7 in here, which has library docs. 00:41:52.040 |
And it keeps using this ID tool to look up IDs. 00:41:56.040 |
But I just gave it these are the IDs you should use. 00:42:00.040 |
So, there's ways you can already guide it to specific tools you want it to apply when needed. 00:42:06.040 |
The rest is just syntax formulating, optimizations, key conventions. 00:42:13.040 |
And then the other format we have is .github/instructions/name.instructions.md. 00:42:20.040 |
And those have this front-matter syntax that's becoming more popular for rules of what files it should apply to. 00:42:29.040 |
So, they start to be scoped with a glob pattern. 00:42:32.040 |
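A scoped instructions file with that front-matter glob looks roughly like this (the file name and rules are illustrative):

```markdown
<!-- .github/instructions/typescript.instructions.md (illustrative) -->
---
applyTo: "**/*.ts,**/*.tsx"
---

Use strict TypeScript: no `any`, prefer named exports,
and keep React components as typed function components.
```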
And then, right now, they're limited to being applied. 00:42:36.040 |
You actually have to have the file in context. 00:42:38.040 |
So, a TypeScript file would only be applied if I actually do have a TypeScript file in here. 00:42:44.040 |
Or I do have one open and then I enable this context and then it would be applied. 00:42:50.040 |
But if I only have this right now, which means this isn't included, it wouldn't actually apply the rule. 00:42:56.040 |
So, we're fixing that and it's going to be more working as expected, probably. 00:43:06.040 |
Like, it didn't include my rule because right now, it really wants to see that file. 00:43:16.040 |
So, they should be also in stable and we're actively working on those. 00:43:19.040 |
So, and then the new, new thing is plans or prompts. 00:43:23.040 |
And then, we have the first kind of reusable tasks. 00:43:27.040 |
For as a team, how do you think about ingraining? 00:43:31.040 |
Like, oh, we now have finally a way to tell GitHub Copilot to write tests. 00:43:36.040 |
And your AI champion in the team handcrafted this perfect prompt which one-shots your test consistently. 00:43:45.040 |
And now, everybody shares it in Slack, copies it around. 00:43:48.040 |
Once you run the right test, you go back to Slack, copy it back. 00:43:53.040 |
You can finally put these prompts into a place where they can just be used by everybody. 00:44:02.040 |
So, I can also go in here and attach instructions. 00:44:15.040 |
And I can now actually run user prompts that are my own, that I create for myself. 00:44:20.040 |
And I can use my plan and spec prompt you see over here on the left. 00:44:29.040 |
So, the ones I have here, these are already custom in the workspace. 00:44:50.040 |
Because insiders, we can now finally have an entry point. 00:44:53.040 |
Because everybody kept asking, how do I create prompts? 00:44:55.040 |
And then I have to tell them which command to find it in. 00:44:58.040 |
So, this is the new prompt configuration file. 00:45:03.040 |
So, as you mentioned, this is one that's interesting for this one. 00:45:08.040 |
If I open this one, this is like defining how I want to write custom instructions. 00:45:11.040 |
So, whenever I'm in a new project that doesn't have custom instructions yet, I run this prompt to bootstrap them for me. 00:45:19.040 |
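A prompt file of that kind might look like this — the body, description, and file name are illustrative, not the speaker's actual prompt:

```markdown
<!-- .github/prompts/bootstrap-instructions.prompt.md (illustrative) -->
---
mode: agent
description: Generate copilot-instructions.md for a new project
---

Inspect this workspace (stack, frameworks, folder layout, conventions)
and write a concise .github/copilot-instructions.md that captures them.
```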
And, yes, there should be a prompt sharing website where you can find these amazing prompts that I create. 00:45:30.040 |
So, each prompt is like a separate, it's like separate from each other? 00:45:36.040 |
So, that's the main difference between prompts and instructions. 00:45:40.040 |
If you work on, for example, if you have one for TypeScript and one for your front-end folder, they do combine. 00:45:47.040 |
Because there's multiple instructions that hopefully don't conflict with each other. 00:45:51.040 |
But they allow you to be attaching multiple instructions. 00:45:58.040 |
Whereas, prompts are basically easy ways to inject something in this prompt field. 00:46:06.040 |
But they're mostly around a task and maybe giving the AI something specific to do. 00:46:13.040 |
Instructions, you wouldn't necessarily give it like what to do, but more how to do it. 00:46:18.040 |
What about if you wanted to, for example, teach it to always do TDD when it's writing code? 00:46:34.040 |
That would be a good way to use custom modes. 00:46:43.040 |
So, this is only insiders because it just landed. 00:46:45.040 |
Sorry — you can't follow along if you're not on Insiders. 00:46:48.040 |
So, custom modes will show up in the dropdown. 00:46:54.040 |
It just went into the menu, created a custom mode. 00:46:59.040 |
So, it got .github/chatmodes, which put it into the repository. 00:47:06.040 |
So, a good pattern if you just want to experiment, put it in your user folder. 00:47:10.040 |
If you want to make everybody's life better in your team and you have high confidence that your mode does that, then you put it into the project. 00:47:25.040 |
And then we're going to ask AI to fill it in. 00:47:37.040 |
We need a prompt that enforces test-driven development for GitHub Copilot. 00:47:49.040 |
So, it should probably first make sure it understands the problem, then write tests first. 00:47:54.040 |
And only after tests are done, maybe get confirmation from the user to then write the implementation and then keep running the tests against implementation. 00:48:15.040 |
I'm not worried because I didn't actually activate my file as context. 00:48:19.040 |
I think it should have a tool to just create more for you if you ask it. 00:48:26.040 |
So, it's going to make that an MCP server next. 00:48:32.040 |
So, we have a test-driven development assistance. 00:48:37.040 |
Test-driven development, assistant, core principles, understand. 00:49:44.040 |
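The generated mode file would be shaped roughly like this — a hedged sketch; the exact tool names in the front matter are assumptions, not what the demo produced:

```markdown
<!-- .github/chatmodes/tdd.chatmode.md (illustrative sketch) -->
---
description: Test-driven development assistant
tools: ['codebase', 'editFiles', 'runTests', 'terminal']
---

Core principles:
1. First make sure you understand the problem; ask if unclear.
2. Write failing tests before any implementation code.
3. Get confirmation from the user on the tests.
4. Implement, then keep running the tests until they pass.
```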
So, we have this project which doesn't do anything. 00:50:04.040 |
Just use mock data because I don't want to wire it up to GitHub. 00:50:07.040 |
So, we want to have maybe some interesting contribution metrics. 00:50:29.040 |
So, now, again, we give it a very broad task. 00:50:40.040 |
That follows all the best practices, I assume, that the AI knows about TDD. 00:50:58.040 |
You can say which tools it's supposed to use. 00:51:13.040 |
Because it's going to ship next week on the 11th. 00:52:21.040 |
instead of me copying and pasting that error into-- 00:52:25.080 |
Are they in the output or in the problems view? 00:52:33.480 |
I previously did something wrong, and I wanted to see, 00:52:42.020 |
In order for-- and it goes, tell me if you need anything else. 00:52:45.500 |
I want you to look at my terminal when there's an error, 00:52:49.340 |
right, I think, right now it's like on the base. 00:52:57.700 |
If it runs the commands itself, it will start looking at the terminal. 00:53:01.980 |
So the easiest way, if you run the deployment and the script 00:53:08.200 |
But otherwise, there's also context, actually. 00:53:19.340 |
There's terminal last command, which includes the output as well, 00:53:24.960 |
Now, if you ask me why they're not in the context, 00:53:48.580 |
There's actually-- those are not the right tools. 00:53:54.580 |
It just basically acted like chat and gave me the code. 00:54:05.200 |
So this is probably a good thing to point out. 00:54:19.820 |
And if you just make a tools entry here, to tools, 00:54:23.820 |
you can now actually click here and say which tools. 00:54:28.820 |
So here we probably want it to look at Perplexity, 00:54:32.820 |
To come up with anything it needs to find on the internet, 00:54:43.480 |
constrained for specific prompt, which always 00:54:54.320 |
And they might solve all the different problems 00:54:57.660 |
But now you can configure it more specifically for domain. 00:55:05.200 |
We have tool groups, tool sets, as we call them. 00:55:17.580 |
Down here in the tool dropdown, configure tool sets, 00:55:24.800 |
But configure tool sets, opens this one here. 00:55:29.180 |
And that works for anything, both built-in and MCP tools. 00:55:43.540 |
So we use tool sets internally, because edit files 00:55:53.160 |
has different searches as well, depending on what you're doing. 00:55:56.200 |
So all of these actually are tool sets in our own back end. 00:56:05.080 |
has the Perplexity tool to ask deep research questions. 00:56:33.600 |
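Tool sets are defined in a small JSONC file (opened via "Configure Tool Sets"); a sketch, with illustrative tool IDs:

```jsonc
// toolsets.jsonc — groups tools under one name you can reference in chat
{
  "research": {
    "tools": ["fetch", "perplexity_ask"],   // illustrative tool IDs
    "description": "Deep research on the web",
    "icon": "search"
  }
}
```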
a whole talk track where I'll be talking about MCP, 00:56:55.120 |
Oh, it found that there's a Node package for it. 00:57:18.220 |
because we asked it to ask, actually, for confirmation. 00:57:32.200 |
who has already MCP servers set up in their VS Code? 00:57:48.960 |
Playwright MCP-- who's been using Playwright MCP? 00:57:56.160 |
So Playwright MCP wraps the Playwright browser testing framework. 00:57:59.080 |
And it allows people to access the browser locally 00:58:05.100 |
get accessibility audits, a whole bunch of utility in there. 00:58:08.780 |
And how to get it for VS Code, there's a JSON blob 00:58:11.760 |
that you can all ignore and just hit Install Server. 00:58:21.020 |
that we use to just wire things up into VS Code. 00:58:23.580 |
You see the same if you go to the extensions marketplace 00:58:35.760 |
But Install Server actually puts this now into my user settings. 00:58:39.960 |
And as you can see, you can have MCP servers, both for yourself. 00:58:44.480 |
And I have the one for GitHub and for GistPad, 00:58:53.760 |
So you can already see how many tools are provided. 00:58:59.000 |
You can see if anything fails, you can get to the output. 00:59:02.900 |
So if there would be configuration, I can show that here. 00:59:07.940 |
There's actually-- GistPad needs a GitHub token. 00:59:18.620 |
So you can use inputs in VS Code configuration, both in the mcp.json 00:59:24.420 |
And inputs, you might have already seen those in tasks.json. 00:59:27.560 |
That's how you configure your tests and your build steps in VS Code. 00:59:35.960 |
So inputs are just an ID, a type, description, a default value. 00:59:41.420 |
And the password true means it's encrypted at rest after you put it in. 00:59:50.960 |
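Putting those pieces together, an mcp.json with an input roughly looks like the sketch below; the server command, package name, and variable IDs are illustrative:

```jsonc
// .vscode/mcp.json (or the user-level MCP configuration)
{
  "inputs": [
    {
      "id": "github-token",
      "type": "promptString",
      "description": "GitHub token for GistPad",
      "password": true            // prompted once, then stored encrypted at rest
    }
  ],
  "servers": {
    "gistpad": {
      "command": "npx",
      "args": ["-y", "gistpad-mcp"],                 // assumed package name
      "env": { "GITHUB_TOKEN": "${input:github-token}" }
    }
  }
}
```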
It would, but this shows basically that it already has a token. 00:59:54.780 |
But if you enter it the first time, and we can actually try that, if I-- 01:00:14.800 |
It's done by-- actually, somebody at GitHub, so gistpad.dev, gistpad.mcp. 01:00:20.300 |
It's mostly-- I'm going to show it off tomorrow, as well, in my talk. 01:00:24.000 |
But it's a fun one that uses Gist as a knowledge base and also for prompts. 01:00:36.040 |
But I think the main ones we usually see is GitHub MCP server, gistpad. 01:00:47.220 |
So if you just want to play around with a really well-done MCP server, 01:00:57.160 |
But it's-- there's a lot more coming here, as well. 01:01:02.320 |
So they're all in active development to figure out what the best way for MCP is. 01:01:06.820 |
And so just a couple of stuff went pretty fast on. 01:01:10.260 |
So this is using an API, so we're using SSE, like, let's say, Python. 01:01:15.260 |
So how can I-- where did the-- you had the MCP JSON. 01:01:22.260 |
Where-- how did you connect to the-- let's say I have a Python, custom Python, 01:01:26.180 |
and then I have a Python server on an SSE port one time for-- 01:01:41.120 |
And in your case, that server is already running. 01:02:03.060 |
So to clarify, you know, mcp.json sits in .vscode right now. 01:02:10.860 |
So hopefully it puts stuff-- either you work on it alone, 01:02:13.900 |
and it's just for you, or everybody is happy to have those MCP servers. 01:02:28.940 |
If you hit @Server, you will find what you're looking for. 01:02:31.340 |
And then from @Server, you can hit down on HTTP. 01:02:37.600 |
So we actually do support both SSE, which is actually deprecated, 01:02:41.440 |
and streamable HTTP, which is the new fangled, easier to scale, 01:02:54.460 |
And we do fall back to it on the client side. 01:02:56.980 |
But it's-- the SSE is really hard on hosting, right? 01:03:01.300 |
Because they have these long running connections. 01:03:06.320 |
So that's where you put in your MCP SSE server. 01:03:12.040 |
And if you want to do it manually, it's really just-- 01:03:18.020 |
So if you pick a name, example, and this would be-- 01:03:37.340 |
And then it would already yell at you that you don't have a URL. 01:03:40.760 |
So this is how-- everything is validated by default in the IDE. 01:03:47.040 |
Once you have a URL, I think I can take this out. 01:03:56.900 |
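For a remote server like the Python/SSE case asked about above, the manual entry is really just a type and a URL; the port and path below are made up:

```jsonc
// .vscode/mcp.json
{
  "servers": {
    "example": {
      "type": "http",                    // streamable HTTP; use "sse" for the older transport
      "url": "http://localhost:8000/mcp" // wherever your server is already listening
    }
  }
}
```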
Is there a page in Dev and then it's leveraging the MCP server? 01:04:05.700 |
So many demos I see, people hit start here as well, 01:04:14.460 |
We actually do cache the tools once we saw them the first time. 01:04:17.840 |
So how MCP works is that on the first initialization 01:04:23.020 |
from the client to the server, it shares its tools back. 01:04:33.420 |
We actually cache them so you don't have to-- 01:04:36.020 |
we don't have to start all the servers proactively 01:04:54.460 |
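Under the hood this handshake is plain JSON-RPC 2.0: the client sends an initialize request, then asks for the tool list, and a client like VS Code can cache that response instead of starting every server up front. A minimal sketch of those two messages (the protocolVersion string and client name are assumptions):

```python
import json

def jsonrpc_request(req_id, method, params=None):
    """Build a JSON-RPC 2.0 request, the wire format MCP uses."""
    msg = {"jsonrpc": "2.0", "id": req_id, "method": method}
    if params is not None:
        msg["params"] = params
    return msg

# First message after connecting: the client introduces itself.
init = jsonrpc_request(1, "initialize", {
    "protocolVersion": "2025-03-26",   # assumed spec revision date
    "capabilities": {},
    "clientInfo": {"name": "example-client", "version": "0.1"},
})

# After initialization, the client asks the server for its tools;
# caching this response is what lets the client list tools without
# restarting the server every time.
list_tools = jsonrpc_request(2, "tools/list")

wire = json.dumps(list_tools)  # what actually goes over the transport
```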
Agent mode-- Ask mode will not run MCPs for you. 01:04:58.600 |
You can go-- because Ask mode is not actually-- 01:05:05.960 |
Ask mode is really this traditional ask-ChatGPT-a- 01:05:08.460 |
question, it will answer based on its training there 01:05:34.860 |
Yes, so we're actually blurring the line a bit now. 01:05:41.700 |
Yeah, but if you actually reference specific tools 01:05:46.160 |
But by default, the way where you want to execute tools 01:05:59.180 |
I'm just going to find a configuration as to, like, 01:06:01.620 |
how does it know that that local model, or the GPT model, 01:06:22.840 |
So I have Gemma 3 through Ollama, which runs locally. 01:06:31.360 |
which is actually a fine-tuned model of DeepSeek R1 01:06:36.960 |
So if you haven't tried it yet, basically go into the model 01:06:40.700 |
And then we can actually custom configure your own API keys 01:06:44.460 |
from Anthropic, Azure, Cerebras, Gemini, Groq, 01:06:53.220 |
sad how many models I can actually run on this. 01:06:56.080 |
But eventually, it's going to be small, powerful models 01:07:05.460 |
And at least there's like 3.5 because I don't know what it is. 01:07:08.660 |
It might be because of your Anthropic tier, right? 01:07:18.620 |
Oh, the other one-- yeah, you might be in agent mode. 01:07:23.200 |
So that's an ongoing improvement we're doing. 01:07:24.820 |
That's why it's not-- it's right now a preview feature only, 01:07:27.360 |
because we're still having to correctly wire up which model 01:07:33.100 |
So there's some-- every provider has different indicators 01:07:38.320 |
And that's one of the matching things we're doing right now. 01:07:40.920 |
So you might not see it because it's not on our list yet. 01:07:43.900 |
Is it verbose enough to say what tool is it calling an MCP? 01:08:06.000 |
If you want to be faster, you can do command down 01:08:14.220 |
But once you start using them, this is command up and down, 01:08:28.480 |
if I want to be very explicit and I know which tools I want, 01:08:33.140 |
Or I can mention the specific tools that are in my list. 01:08:50.920 |
Let's actually use the research one because we created it. 01:09:01.540 |
And now what happens now is this one has now-- 01:09:20.960 |
So you see, I already actually approved this before. 01:09:23.440 |
So you see, A, that it runs the server and you actually 01:09:31.480 |
If it would have not auto-approved this, because auto-approve is still 01:09:34.600 |
on from our previous session, you can actually go in here 01:09:38.180 |
and edit what it's sending, which now doesn't make sense 01:09:44.640 |
And then it writes up what it found in this case. 01:09:47.460 |
Does that underscore-ask have anything to do with what it's doing? 01:09:51.780 |
And that's just the odd name for the Perplexity tool. 01:10:09.060 |
It actually did a follow up query as well and explained it. 01:10:17.960 |
So I wrote a spec for the community dashboard. 01:10:26.260 |
using a little query I have here for the spec. 01:10:28.860 |
So that's one way you can quickly get things done. 01:10:35.220 |
And just to point out this one, it's pointing it to a spec. 01:10:42.000 |
So if you point it to specific files, we do actually 01:10:45.260 |
So if you get them wrong, I think they're underlined. 01:10:54.620 |
And then you just ask it to write on the spec. 01:11:05.900 |
That's probably more tools we're going to add here. 01:11:19.960 |
So the MCP itself doesn't run anything except when you 01:11:32.240 |
But if you use sampling, actually, I guess I have to explain sampling. 01:11:36.540 |
So sampling is a way for MCP to reach back out from the server to the client to use the LLM 01:11:45.220 |
And you can often think the best use cases are to summarize. 01:11:49.080 |
Use cases are if you want to reduce the amount of tokens you send back to the client to explain 01:11:58.920 |
But overall, there's not enough integration of sampling. 01:12:04.440 |
But so we're the first ones to get it out there because we already have the LLM exposed. 01:12:09.920 |
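As a sketch of what sampling looks like on the wire: it's a JSON-RPC request going the other way, from server to client, using the sampling/createMessage method. The summarization prompt and token limit here are illustrative:

```python
def sampling_request(req_id, text, max_tokens=200):
    """Sketch of an MCP server->client sampling call: the server asks
    the client's LLM to summarize a large tool result so fewer tokens
    go back over the wire."""
    return {
        "jsonrpc": "2.0",
        "id": req_id,
        "method": "sampling/createMessage",
        "params": {
            "messages": [{
                "role": "user",
                "content": {"type": "text",
                            "text": f"Summarize briefly:\n{text}"},
            }],
            "maxTokens": max_tokens,  # cap on the client-side completion
        },
    }

req = sampling_request(7, "...a very long tool output...")
```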
So, like, it is more, like, some of these tools just want better sort of a level. 01:12:20.940 |
Is there a way, is there a non-deterministic way? 01:12:37.440 |
And you can prompt your way to do it, but it's still highly non-deterministic. 01:12:48.440 |
So what I recommend is, A, in your modes, boil down the tools to what you actually need. 01:12:59.340 |
So this prompt could have tools for, like, what it should actually do. 01:13:05.340 |
And then I can configure what I actually want to have here. 01:13:07.340 |
Like, this should be only doing perplexity because it needs to do research. 01:13:12.340 |
With custom modes, that would be so much better. 01:13:22.340 |
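A custom mode like that lives in a .chatmode.md file whose front matter lists exactly the tools the mode may use; the file name, description, and tool ID below are assumptions:

```markdown
---
description: Research mode that only does web research
tools: ['perplexity_ask']
---
You are a research assistant. Answer questions by querying
Perplexity; do not edit files or run terminal commands.
```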
And then the other one is you can actually mention specific tools. 01:13:27.340 |
And then you can actually point it to specific tools. 01:13:29.340 |
So you're not doing, like, the look up things on GitHub and you try to find the right verbiage 01:13:36.340 |
you can just actually mention the tool it should use, for example, resolve library ID. 01:13:44.340 |
And then it will be handed to the AI of, like, these are the tools the user wants to use. 01:13:50.340 |
But you use select, you see, like, do you, in general, basically select that tool? 01:13:57.340 |
No end user is going to say, here-- they don't care about tools. 01:14:07.340 |
Tool calling is inherently always, even in this case, we're telling the AI you should use it, 01:14:19.340 |
So my timer's down to zero, maybe just go back. 01:14:41.340 |
We showed dynamic instructions, which only apply to parts of the tool set. 01:14:46.340 |
We showed custom tools, a little Playwright, deep research. 01:14:53.340 |
Like, actually, one of my favorites just pointed to an existing repo and say, read this repo if you have questions. 01:15:00.340 |
When I work on MCP server, just tell it, look in the TypeScript SDK server for model context protocol if you have questions. 01:15:06.340 |
Because we have cross-repo search, it just works. 01:15:10.340 |
The agent actually has access to problems and tasks. 01:15:13.340 |
So if you have tasks set up and you have linting set up, things will just work. 01:15:17.340 |
So make sure those are set up in your template. 01:15:25.340 |
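For example, a minimal tasks.json that gives the agent a lint task to run and a problem matcher to read errors from; the npm script name is an assumption:

```jsonc
// .vscode/tasks.json
{
  "version": "2.0.0",
  "tasks": [
    {
      "label": "lint",
      "type": "npm",
      "script": "lint",                   // assumes a "lint" script in package.json
      "problemMatcher": "$eslint-stylish" // surfaces lint errors in Problems, which the agent can read
    }
  ]
}
```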
If it asks you questions, you can always type something in and keep steering it into the right direction. 01:15:30.340 |
And you can trust read-only and specific tools. 01:15:39.340 |
Instructions, keep refining them as it makes mistakes. 01:15:45.340 |
But any time you have a working state, just make sure you commit it so AI can continue making mistakes and be creative. 01:15:52.340 |
And then last one, there's a clear pause button in the lower end. 01:15:56.340 |
So if AI goes off and you're like, what is it doing? 01:16:03.340 |
I showed a bunch of this for spec-driven development already. 01:16:07.340 |
But it's really about having a spec, having a blank and done plan and doing more custom prompts and tools, which I showed. 01:16:17.340 |
There's more MCPs for database access and logging and project tracking like the GitHub MCP. 01:16:24.340 |
And there's also access to actually tests and do debugging within the agent as well. 01:16:29.340 |
So if you ask it to test-driven development like we did, it will actually start running the tests if they're set up in VS Code correctly. 01:16:35.340 |
And then we talked briefly about models as well. 01:16:37.340 |
So if you want to use O3 for any of the cool stuff, the deeper thinking, you can do that as well. 01:16:42.340 |
Spec-driven is really about focusing on the spec. 01:16:47.340 |
And I think a great way to do that is just create the spec from all the conversations you had about the spec. 01:16:56.340 |
So one way, if you have a transcript from a meeting about the project you want to do, just feed that in and make sure you call out what the final decision is. 01:17:04.340 |
It's a great way to have meetings, but it's also a great way to not write the spec yourself in the end. 01:17:10.340 |
Are there any tools to determine whether the spec is good or not? 01:17:15.340 |
Like on the opposite side of that, like a big spec for a project requirements document. 01:17:25.340 |
What he does is he has it generate the spec and then has it critique the spec. 01:17:32.340 |
And say what are things that it's missing and how could it be better and stuff like that. 01:17:36.340 |
And he basically argues with the LLM about the spec until he gets to the state that he wants. 01:17:43.340 |
So if I focus on one run prompt or prompt is... 01:17:48.340 |
One critique idea I like is just: ask me three questions about my idea. 01:17:57.340 |
Like what would you ask somebody for feedback and have it critically analyze your stuff. 01:18:04.340 |
So these prompts, like those are basically the next level of prompt crafting where you don't just ask it to code, 01:18:09.340 |
but pull it in as a thought partner, as a design partner, as somebody who can poke holes in your... 01:18:29.340 |
We don't have a plan mode built in, but we also know Ask, Edit, Agent will not be there forever. 01:18:36.340 |
Roo has way more modes that you can customize. 01:18:39.340 |
So I think we want to allow developers to create their own. 01:18:43.340 |
Because I even see very few demos of people in Cline using plan. 01:18:54.340 |
In Vibe coding, you even do planning and then writing the implementation plan. 01:19:03.340 |
So you would spend way more time on an initial just what and how we're doing things and then you let it implement. 01:19:10.340 |
So that would be even plan, write the implementation plan or write the spec, write the plan and then implement. 01:19:17.340 |
So you would even have three modes if you do it correctly. 01:19:28.340 |
Like at what point can you just give it a task and it runs with it. 01:19:31.340 |
At what point do you want to give it a task and write a spec first? 01:19:45.340 |
And use modes and prompts and instructions to ingrain that. 01:19:59.340 |
You can end up like Emacs where it has major modes and you can have multiple minor modes. 01:20:04.340 |
That's how you end up with prompts and custom modes right now. 01:20:09.340 |
Lastly, I think there is a sweet spot for how you define your code base for AI. 01:20:14.340 |
So you want to have well-structured, self-explaining code.