
[Full Workshop] Vibe Coding at Scale: Customizing AI Assistants for Enterprise Environments


Transcript

Okay, thank you. We have Vibe Coding at Scale. It's a talk; I'm going to give you the workshop in a very short version without all the hands-on stuff, so this will take however long it takes. It's all about vibe coding. We're going to give fully into the vibes, embrace exponentials, and forget that code exists.

It is about focusing on the output and not actually on the code. And that's where people disagree: if I don't look at my code as an engineer, what am I? And what does embracing exponentials even mean in this case? So we're going to get into that. But we're definitely going to forget that code exists.

And at the Anthropic conference two weeks ago, there was a great chart of how, exponentially, agents run longer and longer and generate more and more code. So slowly forgetting that code exists and that you want to review every piece of code, and instead building trust and adding guardrails for the AI, is what this talk is about.

And this vibe coding journey starts with what most people saw initially: I built an app in just one day, and I put it online, and I made money. And then things happened to the app, and things got leaked, and things were no longer fun. That's the fun-chaos state of vibe coding.

And we're trying to move to something more professional. That initial state is what I'm now terming YOLO vibes. It's unofficial wording, but it's all about creativity and speed. And it's a good place to be, because there is instant gratification. It's about getting things up. It's about learning. It's not about shipping products.

And so we need to get past that. The second stage is structured vibes. It's all about balance and sustainability: how do you bring in maintainability, more readable code, things somebody might actually want to read in the end, where you'll have a handover to somebody else. And you want some quality control, so that what you built is maintainable and not just a throwaway project.

And lastly, we get to spectrum vibes. If you have done anything on Reddit or on blogs in the past few weeks, you've probably seen people sharing their best practices: this is how I finally got good value out of AI. Those best practices are emerging.

And they bring you the scale, reliability, and velocity that comes with that, while still hopefully giving you some speed and gratification along the way, but reducing the chaos. Well, maybe keeping the fun. So vibe coding, initially, where we see this output-first approach: as I said, it's all about not even wanting to look at the code.

If you're in an editor... I actually see people framing their experience in AI editors as a low-code mode. They just look at the chat panel and at whatever comes out of it, and that's output-first. It's all about natural language. It's all about just staying in the flow and working with the AI.

And it's about auto-accepting changes. So until it maybe no longer works and we want to undo, we just keep talking to the AI. No, you did this wrong. Try again. Fix it or go to jail is a very popular one. So you can try all kinds of things.

But stay in natural language. Don't be too specific. There is a use case for this, though. You want to get a sense for YOLO vibe coding for rapid prototyping, for proof of concepts where you just want to get something out. That's where I actually have a ton of conversations with larger companies who want to start a conversation like, how can we do vibe coding?

And for them it's all about getting people who are non-technical to be able to communicate ideas. It's about UX people just making a mock-up and being able to bring that to a meeting and communicate what they want to do with the mock-up. It's all about learning.

Like, two weeks ago we had one hour of vibe coding on stage, live, and people built games. And I used Three.js, which I had tried many years ago but haven't used in a while. But once I got the code running, I could start getting into how the code is structured.

How does it make shapes? And I could actually understand the technology, because I have something working. That's really the power of getting something up and running: it gives you something hands-on to actually try the technology out. And of course personal projects. I'm not sure how many of you have sat down with somebody non-technical and shown them vibe coding, and just built a water tracking app, or built something with your kids.

There are all kinds of personal projects you can now finally solve over the weekend thanks to YOLO vibe coding. So let's do some YOLO vibe coding. This image is AI-generated, so the images do reflect the vibe. This is really about voice input, relaxing. I guess coffee is in there as well.

But let's get it going. Okay. So in VS Code, for YOLO vibe coding, we're going to start with an empty VS Code. So hit Command + Shift + N. Sorry? Are we supposed to use Insiders? I might show off Insiders stuff, but most of what I'm showing is also in stable.

If you have Insiders, use Insiders. Insiders is the pre-release version of VS Code that ships on a daily basis. It's like Firefox Nightly or Chrome Canary; it ships nightly. So on your left side, you will have no folder open. On your right side, you will have Copilot open. Everybody got this?

Raise your hand. Cool. Awesome. So who has used agent mode? Just checking, so I don't have to explain everything. Okay. Cool. So agent mode is probably the default you want to have set, and for the default model setting, Claude Sonnet 4 is great at front-end stuff. For me personally, it's my favorite.

Is that big enough? Yeah. And now, what's interesting, just one quick tour of how we're going to do vibe coding: there's actually this interesting tool here. Is this zooming? Yeah. It's 'new workspace' in VS Code. And for this first round of vibe coding, I want you to go into this tools picker down here and actually disable 'scaffold new workspace'.

Because it will help you scaffold your workspace, but it will lower the vibes if you're just trying to do HTML coolness. Oh, yes, the tools picker down here; that's the little one. How do you get into this menu? There's a tools picker. Okay. Uh-oh. Yes. It might be.

It was in a different spot before. If you don't see it, check that you're in agent mode; otherwise, you don't have tools. Oh, okay. Yeah. So the first step in the panel is to switch to agent mode down here. It might be on Ask by default. Yes, it is. Cool. And then the second step: go into the tools menu, which only appears in agent mode.

Soon, agent mode is going to be the default, and this whole 'where is my tools picker' will be less of a problem. Okay. So, what you want to uncheck is this 'new' section. Do you see nothing? You see those two items and nothing else?

That's not enough; I see more than you. You might be... yes, I think that's fine. Let me just... shoot. Okay. Good answer. Yes. This is actually very recent, and we're actively working on that. So, keep it going. All right.

Okay. So, let's actually use it. Let's start our first vibe coding session; we're using the new tool because it's so hard to disable. And the way we're going to do this is create a... let's do React + Vite. And that's the first lesson for vibe coding: use stacks that are popular in front end, where the AI doesn't have to reason too much or make wild guesses.

So React and Vite are good ways to run a project. A website for... what do we do? Hydration tracking. Water. Hydration. A water consumption app. Simple, with a big accessible UI. Following... what do we tell the AI to make it really beautiful? We tell it... I like to tell it Apple design principles.

You can infuse it with whatever design sense you have. Mine is just: make it look pretty. Which always helps. So I'm adding that extra bit. So, right: water tracking, hydration. And that's what we see. We don't give it any constraints. We don't tell it how many buttons. Is it mobile friendly?

Is it not mobile friendly? What CSS framework should it use? It might actually not do React right. But it might. Oh, yeah, let's do that with Material Design. With Material. Material Design. So, yeah, now we've got the stack, we've got a design direction, and we've told it to make it pretty.

Okay. And then hit run. So we're now in a no-folder state. So what happens first: it's going to tell us to open an empty folder. Everybody got that in their flow? Okay. I'm going to hit continue. Now it's going to actually ask me to open a folder.

We're going to make a new folder, vibing-at-ai-engineering. Put that anywhere you put your stuff, where you put your code. And now it's actually opening the new folder, and it's going to continue to set up. And to explain a bit what's happening here: this is using the new command. The new command is optimized for creating projects from scratch. Which, if you look at the internet and how people evaluate AI coding tools,

is what every second person does. Can it make me a water tracking app? Can it make me a movie database app? So we optimized for this flow. But it also makes for this nice vibe coding from scratch, because everybody struggles with what the right stack is and how to get started.

And this is what this is here for. We got the latest versions. So we can now review the commands. And this is maybe where we do the first tweak of our settings. If you go into settings and search for 'approve', you will find auto-approve. And that's the first rule of vibe coding.

As we said, we don't want to look at code; we just want to have the AI do stuff. So what you can do, in my case: I'm going to actually switch over from user to workspace. That means the setting is only set for this workspace,

which is the safest way to use this setting. So use with caution. I'm going to auto-approve, which means all of these continue buttons won't happen anymore, and we're just going to get results. So check that box, close it again, and we can hit continue here. This is using Vite.
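For reference, here's roughly what that looks like in the workspace settings file. The setting name is the experimental one I know of and may have changed, so treat this as a sketch:

```jsonc
// .vscode/settings.json -- workspace-scoped, so the blanket approval
// only applies to this throwaway vibe coding project
{
  // Auto-approve tool invocations (terminal commands, file edits) in agent mode.
  // Use with caution: the agent will run commands without asking.
  "chat.tools.autoApprove": true
}
```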

This is fine. Now we're going to stop worrying about the code. But I can still read the plan: install Material UI dependencies, create hydration tracking, Apple-inspired design, and the project structure. So it's running commands. Wait, are there files already in here? Okay. Cool. What did it create? Oh, where does the music come from?

Because I already did something. Cool. Okay. Yes. Keep going. In this case, it tried to create something, and then it ran another terminal command. Keep going. That's fine. Is it doing stuff for you? Do you see things popping up? I can't find the auto-approve. Oh, the auto-approve? It doesn't look like it's in the...

It's not Insiders. I'm not using Insiders. I guess I should be. Oh, wait. Did we? I thought it was in the last version. I have Insiders, I just haven't switched back. Okay. Let me check in VS Code stable. Yeah, that's my problem of not being in stable enough.

I shouldn't run workshops. Okay. Wait, why is that different from what you found? So you found it in... Yeah. Thank you so much. There are two ways to get to the same thing, but it's a different menu. Anyway: the settings. There are settings from the bottom and settings from the top, and they're not the same.

Okay. I'm happy to figure that out later. The settings up here... I used the gear setting at the bottom. Yeah, it didn't work. But anyway: the settings. You got it. Okay. Where's the other one? Like the one in that... oh,

in the other setting. Yeah. I have customized my UI too much as well. Cool. Question: is there a way for this to run... I know it keeps telling you to run this in the main terminal, but, like Cursor, it gives you those approval prompts... It actually had to run.

Yes. That's why, I think, I should have set auto-approve before, so it's active for the chat session. But this is the setting I just showed, the auto-approve. So all the continue buttons are basically gone, and it just auto-approves. And for MCP tools, we actually do have dropdowns: always allow for session, always allow for workspace, or not always allow.

So there are some more fine-grained things we're rolling out to more tools. But, yeah, it still prompts you to continue when it works. Yeah, I think the auto-approve is not applied to the current session. I showed it too late; I should have done it the other way around.

Okay. We've got some Material Design coming in. And this is where you get your coffee and just wait. Okay. That's a good idea. Let's do that. Okay. Open a new window. Okay. What prompt did we use? Create a... right. So, okay. Now in the new window, I'm going to set auto-approve first.

Auto-approve... auto-approve is already on. That's good. This is another window now, and we're going to do the same thing. But this time we're not going to use Material Design. I guess Ant is still out there, right? Ant? Or Fluent. Fluent, that's the Microsoft one. Let's see what that looks like.

Again, it's going to prompt me for a folder: vibing-at-aie-2. And this is, I think, one of the key takeaways: trying out different ways to get to the same result is where vibe coding really shines. Just continue, just trying. I had really, really quick success with things like: what are different signup flows that we can create?

Like, create three different versions of this design to explore what it could look like. Okay. It installed; it updated the index and the run script. And that's where it gets confusing: if you have multiple of these open at the same time, you've got to figure out what's running what.

And now this flow over here actually runs without any confirmation, since we set auto-approve in the correct order. It's now just creating the Vite app and installing Fluent. And notice that we got the wrong Fluent package, because there's a dependency; now it's fixing those. Can you show one more time how to do that auto-approve setting?

I'm sorry, yes. So the way I did it: Command + Comma is the quick way. I don't use menus; I should use menus. So you go up here. I have my settings up here; most of you probably have them on the lower side. And you go in here and go into Settings.

And this is what it should look like. Do your settings look like this? And then look for 'approve'. Sorry, I can't type; auto-approve. Do you see it? It might be Insiders. It's all a blur; we're shipping on a monthly basis, and I thought we tweeted about it at the end of last month.

So we found it. Yeah. The good thing: the person who owns the AI terminal integration just came back from paternity leave, so we're back in the game. And we've definitely been looking at how to allow specific terminal commands; that's how most tools do it. But if you think about the scary part of chaining and running multiple commands in one command line, terminals are not as predictable as you would think, and you can't easily allowlist things.

So we're mostly thinking about how to do it right. Okay. And I've got two vibe coding sessions going on here. Let me hide this one; this is for tomorrow. This is where the vibing is happening. So you see, it creates an App.tsx

and an App.css. It also creates Copilot instructions. Who's been using Copilot instructions before? Yeah. So that's one way. This out-of-the-box experience just does things for you. It comes with instructions baked in, which is nice. It understands which stack you're using; that's mostly what it's about. It already captured my design principles

that I eloquently put in: please make it look like Apple. And now there's actually: a clean, minimal, intuitive interface, consistent UI, and everything else. It broke down the technical stack, even things I haven't mentioned, like CSS and responsive design. So it calls out some of the assumptions that the AI will fill in

if you give it a high-level task. Okay. This one is still working on the index CSS, and the other one is working on the HTML file. Yeah, it will just do it for you, usually, as it creates the project. I haven't typed anything else so far,

we're just vibing. Here, it created one for a page. Let me see if it created both. I didn't see it. Okay, there we go. Intro. Our first app is done: Hydration Tracker. Stay healthy, stay hydrated. Today's progress. There's a quick add, 500 milliliters. And it went with

metric units. Isn't that beautiful? Just how I wanted it. Maybe it got my accent, I don't know. You can do plus and minus. It's interesting how mine looks different. Yeah, I know. Does everybody's look as nice as mine? Do you already see yours? Okay. Yours is better? Yeah.

That's a very wide-open vibe coding workshop. I've run these a few times, and this is probably one of the nicer ones. But it's also about actually running these with different models and getting a sense of how good each model is at design, at having design sense, without me telling it how to do everything.

And Claude is definitely usually rocking the icons. It got the colors really nicely. So that's been great. This is a really nice app. So now, next on... because we're visual, we haven't even checked the code. We haven't read the CSS. We haven't looked at the TSX. Like, is it

doing functional programming? How does it handle the state? I don't care; it works. So now you can actually use a new feature we landed: you can work visually. So I can now say, this header up here, I don't know what it's called,

like whatever progress indicator: let's make this more animated, adding particles, maybe. That's good. So, MUI Paper. Is this Material Design? This is Material Design, yeah. Did you copy the component name? I didn't. I hit start down here, if you have the browser preview open. Yeah.

Did it fail to open? Cool. So this points out two features. One is that in this flow, at some point, it basically started the task: it did npm run dev, and then as the next step it opened the simple browser. And the simple browser is this VS Code in-editor browser preview we have.

And what we're injecting here, what just went away, is a little toggle you can now use to select specific elements that we then attach as a visual reference, and as CSS, into the current chat. So if I scroll back down here, I see what's being attached.

I can actually click it. Everything you see under a message is in the context. The element screenshot somehow didn't make it through, but this one made it in, and that's basically the CSS description and HTML of the element we attached. So I didn't have to describe the element

or where it sits; it just did it for me. Okay, it ran into a snag. We don't care about that. Let's check the other one. Ooh, number two, Fluent design. This is what it came up with. It's a little bit plain. This is sad. Okay, at least it has a goal to reach.

That's nice. And it has recent entries too. It made similar assumptions about what we want. So feature-wise, it somehow got to the same conclusion, but design-wise, this is definitely more corporate. Yeah. So that's the simplicity of vibe coding, using the new tool out of the box. If you're on Insiders and can disable the new tool, it's easier to do a single-file HTML thing, because the new tool is definitely biased towards using npm and installing packages.

So it always ends up a little bit more complex. Do you work with the Insiders team at all? Yes, it's the team I work on. Hi, I'm Harald, I work on VS Code. Sorry. It would be really cool to understand, when you open up a new Insiders, what are the new things that just changed?

Like a quick diff somehow. If someone could show me... I have both of them on the machine, and I use them both, but then I fall behind a couple of days. It's like, what are the new things, and why should I be using Insiders? Yes. So the best way to stay on top of what's new: right now, this week, it's testing week, and we're writing our release notes.

So the release notes usually capture everything that's new. But for Insiders, it's hard because it's coming out every day, so it's hard to point at. We should make an AI to summarize it. That's a great idea. We're going to make an MCP server that summarizes what landed. Yeah.

Yes. I like that. Let's do it for the next demo. Okay. So what do we have in our YOLO vibing toolbox? We have the agent, which sometimes is hard to find. Now you're all on the agent, so that's great.

It's all about... I actually didn't show that; I could have. There are different panel styles. So if I go back here, you can actually move this into the editor, which is nice. Some people like that: more space for your chat. You can also, if I go back into my panel... oh, I moved it into the dropdown here, and you can move your chats into here.

So you can have multiple chats, and they actually have names, so it's easy to go back and forth. You can put them in parallel to your code. So you can use all the window management things. Can you run them in separate windows? Yes. Wait, how did you know? Okay, where did we go?

Into a new window. There you go. Now you have a chat in its own window. You can put it on your own monitor. And you can actually pin it to stay on top. So now, if I run this, I can close this... let's move that away first.

So I can accept this. This we're all going to keep. Close it, and then close the other one. And now we have the output, and we can just move our chat across and fully focus on the exponentials that are happening in this window. Yeah. So that's one way you can really manage the space how you want it.

The new workspace flow we showed, so it's really this optimized, CLI-first flow. Good question. Okay. Yeah. And then voice dictation I haven't shown. Who has tried voice dictation in GitHub Copilot already? Okay. Magic moment of: add a dark mode, please. And maybe give it a cool name that works with a younger audience who needs to drink more.

Maybe my kids. Make it for kids. So, a little more kid-friendly. Thank you. Bye. Okay. So Command-I is actually the default shortcut. It uses a local model, which is great for privacy, and it's really fast and accurate. And there's an option as well, when you use voice input, to have it also read back the text, which is great for accessibility.

And yeah, by using just your voice, you don't have to put that coffee down; you can just keep vibing. There's a 'Hey Copilot' wake word as well, which we shipped at some point, but I haven't used it in a while. Okay. I said all that. Keyboard shortcuts: there's a keyboard shortcut

you want to customize so you hold it down while you talk, then let go. Visual context I showed: attaching. It's great for wireframes. The in-editor preview gets hot reload; it just works. And you can attach elements using that little send button. And then auto-accept: I showed you the auto-approve setting.

There's also an auto-accept-after-delay setting for edits. And if you don't have that on: I love auto-save. It's a great VS Code feature that saves after a delay or after focus change; it will just save for you. And what I haven't shown, let's see if this works here, is the undo button.

Is this still going? What is this still doing? I forgot. Oh, this is adding those particles. Cool, it was still working on that. Good Copilot. Okay, got stunning animations. That's great. That's what I wanted. Is it doing it now? So let's keep... it does animate. Nice. Particles... no, look at the bubbles.

Okay, I don't like it. Undo, up here. Those are basically our checkpoints. There's a new checkpoint UX coming, because I have many people telling me: you don't have checkpoints, I can't undo stuff. But if you already accepted changes, or if you want to go back to something... like, I like these particles, but for V0... oh, that's just beautiful.

I love it. We don't need that. But then you can also bring it back. I just needed to see that again; that was really nice. Okay. So for people who don't like particles, we can now undo, and it's back to the original version. So it has stages for each piece of work it did,

and you can easily go back and forth to see the before and after as well. But in vibe coding, you don't want to look at the code; we only look at the output. Okay. That's the YOLO toolbox. And as I mentioned before, you want to try it out just to get a feel for the AI.

In my case, as I mentioned, I like getting a sense of how good the AI is at design. Can I just give it wide tasks to explore a space, and will it come up with something interesting? Or how detailed do I need to be?

When does it make mistakes? If I give it a general task, maybe about Java, where it's not as good, what will it do? The next one is known frameworks. We went with Vite and Material Design, things that are off the shelf and haven't changed much in a long time at a large scale.

So you want to use something that's popular and has been consistent. And lastly, we use it as a whiteboard. We showed just attaching a visual element: change this, add some particles. It's really about not becoming too attached to whatever you're working on, but being able and willing to throw it out and start from scratch if things go wrong.

Structured vibe coding, this middle stage, tries to balance YOLO, the fun and chaos, with a more structured approach. And that's, I think, the biggest impact I see from talking to customers: this is how vibe coding can work for us. This is where we can bring some non-technical people in, give them a good starter template that has a consistent tech stack, comes with clear instructions for the LLM on how to work with it, and keeps it within actual guardrails.

And it already brings in some custom tools that carry expert domain knowledge or internal knowledge that you need to work on the code base. So that's really YOLO on guardrails. It's faster and gives you more consistent results, so you don't end up with something that used Material Design when it should have used Fluent, or that should have had dark mode and been responsive.

All of that can already be baked into the instructions. So I see a lot of companies bringing that into their bootstrapping for greenfield projects. You go into a meeting and you have a product that already looks finished, because you vibe coded it with your go-to stack and it uses your internal design system, so it already looks way more polished.

And the last piece, outside of mainstream workloads: YOLO, by default, will always bias towards whatever is at the top of the training data. With this, you can customize it further towards internal stacks, internal workloads, internal deployment infrastructure, to make it work better. So let's do structured vibe coding.

This is now... the image, as I explained: it now has wireframes and more charts. That's what makes it more structured. There we go, it's open. So what I'm going to do now, I think, is push this... oh, it's still running. Let's see if this runs.

I did create this vibe coding project before, so I do have another one that I can share. Just look at this one; I'm going to push it to GitHub. It's going through. That should be fine. Cool. Front-end vibes. Perfect. This is all vibes, so we're going to make this live.

Let's commit. And then... yeah. Oh, yeah, this one: who has been using this commit button here? Copilot will write your commit message. Done. Looks good. And sync changes. And I'll share the repository. The repository might still have an old name. Let me see. Yeah, probably.

Let me just check where it sits, because I forgot. Oh, it's perfect: SleepVibes. This is one of my vibe exercises. Okay. What's the name for the browser window in VS Code? Which one? It just opens sometimes. Oh, yes: the simple browser. So we're going to run Simple Browser: Show. That's it. Okay. The repo we're pulling down is this one: SleepVibes. It doesn't have a dev container, but it only needs Node.js, so it shouldn't be too gnarly. We want to get the agent to do some special magic.

You can pull this, or should we... Should I try it? It's a good clone. You can ask the agent to git clone it for you. Yes. I'm curious if there's a special tool that does more than that. It's under the user... oh, sorry:

digitarald. That's me. Yes. Okay. And then npm install on it. And then... yeah. Had I prepared better, there would have been a Codespace and a dev container, and you'd just click Open in Codespaces and things would work. Come to my next show and then we'll get that fixed.

Yeah. Let me just open the Codespace to see. Now I'm curious if it just works. Anybody been using Codespaces on GitHub? Not many. Okay. Occasionally. Yeah... between Insiders, the regular version, all the plugins, and some of them don't work out of the box, yeah.

If everything just worked the same, would it be...? Yes, it mostly does, right? It's 95% there, but it's the 5% where, when something doesn't work, you just go back to the other tool. Yeah. So VS Code is offering to reopen this in a container.

Is it a dev container? You can try. I'm actually not running it in a container, but if you want to... That's all right. The container is just a Node.js one, and it should work too. I did add a container, see? You just did one. I vibe coded my container, too.

I can now check. So, have you ever wondered what you did on a project? This is where I created my container: I just asked GitHub Copilot to update my dev container definition. Just look at my code base and update my devcontainer. And it did a good job here.
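For reference, a dev container for a plain Node.js project can be as small as this; the image tag and post-create step here are assumptions, not the exact file Copilot generated:

```jsonc
// .devcontainer/devcontainer.json -- minimal Node.js setup (a sketch)
{
  "name": "SleepVibes",
  "image": "mcr.microsoft.com/devcontainers/javascript-node:20",
  // install dependencies once the container is created
  "postCreateCommand": "npm install"
}
```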

I should have maybe remembered that I did that. Okay. Meanwhile, while you clone, while you npm install: anybody got it working already? Still going? Okay. Cool. I'll give the tour of what we have. So, first: we, again, start with good Copilot instructions. And they live in .github/copilot-instructions.md.

It's a markdown file that's included with all your agent requests, all your chat requests, all your inline chat requests. It basically gives Copilot grounding, foundational knowledge about your code base. And sometimes they feel a bit repetitive; I saw some demos where it basically repeats linting rules that you'd expect the AI to just follow anyway.

But my go-to is just a one-liner on what your stack is. That's a good starting point. Just point it to what frameworks, what versions, and that's one way to keep it on rails with what it uses.
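As a sketch, assuming the React + Vite + Material UI stack from earlier, such a file might look like this; the specific rules are illustrative, not from the actual repo:

```markdown
<!-- .github/copilot-instructions.md -- sent along with every chat and agent request -->
# Project instructions

- Stack: React 18 + Vite, TypeScript, Material UI (MUI).
- Design: Apple-inspired; clean, minimal, accessible, responsive.
- Prefer small functional components; no class components.
```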

Okay, question? I was experimenting with figuring out what would be the best way to have that be a standard that gets included, but not let people mess with it; give people some coding structure. Yeah, if you have a whole team of vibe coders, you probably don't want them touching that. So: this is in your repo, so I think it's a good team exercise to iterate on it.

It shouldn't be a stale document. Yeah. You can put this in your user settings as well. You probably don't want it with each app; you'd probably want it in a different repo, right? You could do that, yeah: a shared way that you code, and then each app's own conventions.

I've been thinking about that, too. Yes. So I've been trying to convince my peers on the GitHub side that this should be an organizational setting that people can set on an organizational level, and, as a team, you can select which ones you want to use.

So we're working on that discovery and sharing. Is there always one file, or can you opt in to more, because you have different languages, different settings, different things you're building? Yes, great question. So, we have this one now, and these instructions become rather monolithic, large, and unwieldy.

And now, just to point out here before I go to the new ones: I do also guide which tools to use. I have my first MCPs in here, and I already tell it, for a front-end Q&A review, to use the browser tools that come from Playwright. For research, I use Perplexity.

I have Context7 in here, which has library docs. It kept using this ID tool to look up IDs, so I just gave it the IDs it should use: don't use the other tool. So there are ways you can already guide it to the specific tools you want it to apply when needed.

The rest is just syntax, formatting, optimizations, key conventions. Yeah. And then the other format we have is .github/instructions/name.instructions.md. Those have this front-matter syntax, which is becoming more popular, for rules about which files they should apply to. So they can be scoped with a glob pattern. And right now, they're limited in how they're applied:

you actually have to have the file in context. So a TypeScript rule would only be applied if I actually have a TypeScript file in here, or if I have one open and enable this context; then it would be applied. But if I only have this right now, which means the file isn't included, it wouldn't actually apply the rule.

So we're fixing that, and it's going to work more as expected, probably. But that's where it is right now; that's the biggest question I get, like, it didn't include my rule, because right now it really wants to see that file. Yeah. So those are new. They shipped, I think, in the last version.
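A sketch of what a scoped instructions file looks like, saved as something like .github/instructions/typescript.instructions.md; the glob and the rule text here are made-up examples:

```markdown
---
applyTo: "**/*.ts,**/*.tsx"
---
Use strict TypeScript: avoid `any`, prefer type inference, and colocate
component styles with the component they belong to.
```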

So, they should be also in stable and we're actively working on those. So, and then the new, new thing is plans or prompts. And then, we have the first kind of reusable tasks. For as a team, how do you think about ingraining? Like, oh, we now have finally a way to tell GitHub Copilot to write tests.

And your AI champion on the team handcrafted this perfect prompt, which one-shots your tests consistently. And now everybody shares it in Slack and copies it around; every time you want to run the test prompt, you go back to Slack and copy it out again. That's what you want to use prompt files for: you can finally put these prompts into a place where they can just be used by everybody.

And how can they be used? So, I showed these can actually be attached. I can also go in here and attach instructions; you can do it manually, too. So that's one way. But I can also now go in here, into the chat window, and hit slash.

And I can now actually run user prompts that are my own, that I created for myself. And I can use my plan-and-spec prompt, which you see over here on the left. Can you make those custom? Yes, these are custom ones. The ones I have here are already custom, in the workspace.

And then the other ones I don't have, I'm not showing. So, I think we do... wait, there's a new menu. Can you give it a ridiculous name? Yes, you can name them whatever you want. So, let's make one here. This one actually just landed yesterday, because on Insiders we now finally have an entry point.

Because everybody kept asking, how do I create prompts? And then I'd have to tell them which command to find it under. So this is the new prompt configuration flow. And I have some already here. As you mentioned, this is one that's interesting: if I open it, this one defines how I want to write custom instructions.

So whenever I'm in a new project that doesn't have custom instructions yet, I run this prompt to bootstrap them for me. And, yes, there should be a prompt-sharing website where you can find these amazing prompts that I create. And next week, we're going to... So, is each prompt separate from the others?

Yes. That's the main difference from instructions. With instructions, you can have multiple; if you have one for TypeScript and one for your front-end folder, for example, they do combine, because there are multiple instructions that hopefully don't conflict with each other. So they allow you to attach multiple instructions,

and they're really more about code. Whereas prompts are basically easy ways to inject something into this prompt field, and they stay in the conversation. They're mostly around a task, maybe giving the AI something specific to do. With instructions, you wouldn't necessarily tell it what to do, but more how to do it.
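To make the shape concrete, here's a sketch of a reusable prompt file, say .github/prompts/write-tests.prompt.md. The front-matter fields follow what's shown in the talk, but the tool names are illustrative assumptions:

```markdown
---
mode: agent
description: Write unit tests for the selected code
tools: ['codebase', 'runCommands']   # assumed built-in tool names
---
Write unit tests for the selected code. Follow the project's existing
test conventions, cover edge cases, and run the tests until they pass.
```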

What about if you wanted to, for example, teach it to always do TDD when it's writing code? Yeah, TDD, good point. That would be a good use for custom modes. So if we go in here... custom... sorry. This is Insiders-only because it just landed; sorry if you can't follow along because you're not on Insiders.

So custom modes will show up in this dropdown. So it just went into the menu and created a custom mode, and now I can pick where it shows up. It suggested .github/chatmodes, which puts it into the repository, or I can just keep it for myself.

So, a good pattern: if you just want to experiment, put it in your user folder. If you want to make everybody's life better on your team, and you have high confidence that your mode does that, then you put it into the project. And then we're going to name this one TDD.

And then we're going to ask the AI to fill it in. Right? Prompt expert: we need a prompt that enforces test-driven development for GitHub Copilot. It should probably first make sure it understands the problem, then write tests first, and only after the tests are done, maybe get confirmation from the user, then write the implementation, and then keep running the tests against the implementation.
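The generated mode file ends up as something like .github/chatmodes/tdd.chatmode.md. A sketch of the shape, with assumed built-in tool names:

```markdown
---
description: Enforce test-driven development (red, green, refactor)
tools: ['codebase', 'editFiles', 'runCommands']   # assumed tool names
---
You are a TDD assistant. First make sure you understand the problem.
Then write failing tests and run them (red). Ask the user for
confirmation before writing any implementation (green), and keep
re-running the tests against the implementation. Refactor only once
the tests pass.
```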

Cool. Thanks. That's important; I didn't say it. I'm not worried, because I didn't actually activate my file as context. Let's see if it... I think it should have a tool to just create modes for you if you ask it. So, we're going to make that an MCP server next.

Okay. Oh, it got it. Okay, wonderful. So we have a test-driven development assistant. This is my code... we don't want to read the code. Test-driven development assistant, core principles: understand. Red: write failing tests first. Beautiful. Green... wow, it does follow it. Refactor: improve code quality. Strict rules: no implementation before the tests. Beautiful.

No implementation of the tests. Beautiful. Beautiful. So, this is our new TDD mode. Okay. We've done it for. We've done it for. Looks pretty good. It has emojis. Example. Should we try it out? Yeah. Okay. Yeah. Okay. Yeah. Yeah. Okay. Yeah. Did it say there was a framework? Did it say there was a framework?

Did you scroll back up? There was something that said... I thought it said framework. It's framework-independent, it looks like. Is it? I thought there was something. Oh, just... yeah, I think it does assume some things. Okay, so it has it right there: it does make some stuff up.

Yeah, it wouldn't need to do that. I can take this out; I already have my own. So. Cool. Test-driven development. Wonderful. Okay, let's do it. So, we have this project, which doesn't do anything. If you just run it, npm run dev... I think I already did this before.

So this is just a basic, plain landing page. So let's do the... what feature do we want? I want a dashboard for GitHub issues. Just use mock data, because I don't want to wire it up to GitHub. So we want maybe some interesting contribution metrics. And...

but first, actually, let's make a plan. Meanwhile, while I type this in... let's stop it. TDD: a dashboard for GitHub issues; don't wire it up. So now, again, we give it a very broad task, but we can put it into TDD mode, which is our new, amazing

test-driven development mode, which follows all the best practices, I assume, that the AI knows about TDD. And let's see. So... wait, that's it? You created a new mode there? Yeah. We previously went into 'configure chat modes' and created a new mode. This mode is now

enforcing the technique. And in a mode, you can say which tools it's supposed to use. So, is the mode... that's kind of confusing... is the mode in agent mode? It's agent mode, but based on a markdown file. So, it's a custom agent mode, yes. When is that going to be live?

It's in Insiders now, and it's going to ship next week, on the 11th. Insiders ships daily

but it only releases monthly. And we're one week late, because there was a short week, so we adjusted our schedule. Yes. So, for most of you, this menu will just have these entries; it's in Insiders only. If you look for modes in the command palette...

So, command palette: you can also click up here, Show and Run Commands, and then 'modes'. And that's the place. Yeah. Are they in the output or in the problems view? It's in the output, in the console, right. Instead of me copying and pasting that error in... yeah. Like this one: I previously did something wrong, and I wanted to see... the error code is in the terminal. And it goes, tell me if you need anything else.

Yeah. I want it to look at my terminal when there's an error; right now, I think, it's on request. Is there a mode where it just constantly looks at the terminal? If it runs the commands itself, it will start looking at the terminal. So the easiest way is to run the deployment and the script through Copilot itself.

But otherwise, there's also context, actually. If you look here, we have the-- That's my terminal . It is, actually. There's terminal last command, which includes the output as well, in terms of selection. Now, if you ask me why they're not in the context, like, I couldn't tell you right now.

Yeah. But that's working. I think I did this thing wrong, though. TDD, let's just see. Oh, it did tools... it configured... it made up tools. So it did. That's the part it made up; those are not the right tools. This is why it didn't do anything. It just basically acted like chat and gave me the code, because all the tools it tried...

It just basically acted like chat and gave me the code. Because all the tools it tried were-- It didn't have any right access. So let's try this again. So this is probably a good thing to point out. So in now, prompts as well. Let me open the plan prompt.

So this one can actually now set tools. And if you just make a tools entry here, you can now click here and say which tools. In this case, this is a planning prompt, so mostly you want it to look at Perplexity, to come up with anything it needs to find on the internet; I can select that.

So that's how you can now have tools constrained for a specific prompt, which always helps with quality. Because if you have many tools, and as you install more MCP servers you always get this tooling explosion, they might solve all the different problems you're having throughout the day,

but now you can configure them more specifically per domain. And, also Insiders-only, we have tool groups, or tool sets, as we call them. So, tool sets... how did I get here? Down here in the tool dropdown: Configure Tool Sets, and Add More Tools. That one, I think, sends you to add a server.

But Configure Tool Sets opens this one here. And that works for anything, both built-in and MCP. Actually, a lot of the tools you see here... we cleaned this list up; if you use Insiders, you see it. These are actually tool sets already. We use tool sets internally, because 'edit files' has multiple ways to edit files.

We give the AI a few ways. Code-base search has grep, has file search, has different searches as well, depending on what you're looking for. So all of these are actually tool sets in our own back end, and we now expose this as something you can create yourself. My research tool set, for example, has the Perplexity tool to ask deep research questions, and it also has fetch.
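A sketch of what such a tool sets file can look like; the file shape follows what the Configure Tool Sets command opens here, and the tool names are assumptions based on the ones used later in the demo:

```jsonc
// toolsets.jsonc -- user-level tool sets (opened via "Configure Tool Sets")
{
  "research": {
    // one MCP tool (Perplexity) plus the built-in fetch tool
    "tools": ["perplexity_ask", "fetch"],
    "description": "Deep research questions and web lookups",
    "icon": "search"
  }
}
```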

No, I didn't... no, we can show... Can you talk about MCP? Yes, I can. It wouldn't be a talk without MCP. Also, there's a whole talk track tomorrow where I'll be talking more about MCP, if I can finish my slides. Okay, there we go.

So let's talk about MCP. This is doing something; let's see. Understanding the requirement: it created mock data. Red phase, writing tests: it wrote tests. Oh, it found that there's no package library. That's sad. And it created the test utility, and then it tried to run the tests.

And then it asked to proceed. So that's cool. So it did the first stage of that mode. I don't need to go full TDD here, but that's modes. TDD will now ask, because we told it to ask for confirmation; that's why it's pausing. It wrote tests,

and they're all red, so that's good. Okay, we're going to accept it. And let's go into MCP. So, MCP servers: who already has MCP servers set up in their VS Code? One way to get MCP servers is editing JSON, and there are a few other ways. But let me show you another way.

Playwright MCP: who's been using Playwright MCP? It's probably one of the coolest ones. So Playwright MCP wraps the Playwright browser testing framework and lets the agent access the browser locally: take screenshots, run websites, get accessibility audits, a whole bunch of utility in there. And to get it into VS Code, there's a JSON blob that you can all ignore; just hit Install Server.

So Install Server is just a VS Code protocol link that we use to wire things up into VS Code. You see the same thing if you go to the extensions marketplace for VS Code: you can hit Install Extension; that powers the same process. So now I can... it shows a confirmation.

We need to move this down. But Install Server actually puts this into my user settings now. And as you can see, you can have MCP servers for yourself as well. I have one for GitHub and one for GistPad, which is a cool one; I can recommend it.
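If you'd rather keep it in the workspace, the same server goes into .vscode/mcp.json; this is roughly what the Install Server button writes, sketched from memory:

```jsonc
// .vscode/mcp.json -- workspace-level MCP configuration (a sketch)
{
  "servers": {
    "playwright": {
      // local stdio server launched via npx
      "command": "npx",
      "args": ["@playwright/mcp@latest"]
    }
  }
}
```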

And then, yeah, Playwright now. So you can already see how many tools are provided, and if anything fails, you can get to the output. And if there's configuration, I can show that here: GistPad actually needs a GitHub token. So it's a local MCP server.

And, have you ever seen this one? Yay, no tokens in my configuration. You can use inputs in VS Code configuration, both in the mcp.json and in your settings. You might have already seen inputs in tasks.json; that's how you configure your tasks and your build steps in VS Code.

It's the same system. They're defined up here. So inputs are just an ID, a type, a description, a default value. And 'password: true' means it's encrypted at rest after you put it in, so it doesn't ask me anymore. That's going to prompt you, then? It would, but this shows basically that it already has a token.
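Putting it together, a sketch of a server entry that takes a token through an input; the input id and package name here are hypothetical stand-ins:

```jsonc
{
  "inputs": [
    {
      "type": "promptString",
      "id": "gistpad-token",              // hypothetical id
      "description": "GitHub token for GistPad",
      "password": true                    // stored encrypted at rest
    }
  ],
  "servers": {
    "gistpad": {
      "command": "npx",
      "args": ["-y", "gistpad-mcp"],      // package name is an assumption
      "env": { "GITHUB_TOKEN": "${input:gistpad-token}" }
    }
  }
}
```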

But if you enter it the first time... and we can actually try that, if I... Where is it actually kept? In VS Code's key storage. So on Mac, it actually uses the keychain. Oh, okay. Yeah. What other MCP servers do you use? Yeah, so I can show: GistPad is fun.

It's done by somebody at GitHub, actually: gistpad.dev, gistpad-mcp. I'm going to show it off tomorrow as well, in my talk. But it's a fun one that uses Gists as a knowledge base, and also for prompts. So we like this one. It adopts a ton of recent MCP stuff.

But I think the main ones we usually see are the GitHub MCP server and GistPad. Who is it from? Lostintangent. Okay, yeah. So if you just want to play around with a really well-done MCP server, that's one. Not saying the GitHub one isn't as good. But there's a lot more coming there as well.

So they're all in active development, figuring out what the best way to do MCP is. And a couple of things went pretty fast there. So, this is using an API; let's say we're using SSE, say from Python. Yeah. So how can I... where did the... you had the mcp.json.

Yeah. Okay. How did you connect to it? Let's say I have a custom Python server on an SSE port. Yeah, exactly what I have. I understand how you just did that, but how do I hook it up?

To SSE? Yeah. Okay. And in your case, that server is already running. So, how to hook it up. Okay, cool, let's do that. So SSE works the same way, basically. Just finding my... let me max this out. So go back to my mcp.json.

And down here... Can you talk about whether this is just for the workspace? Yes. So, to clarify: mcp.json sits in .vscode right now, and it's per workspace, so it's shared with everybody. So hopefully either you work on it alone and it's just for you, or everybody is happy to have those MCP servers.

Oh, that's awesome. Yeah. So that was per workspace. Okay. And then, what now? If you hit Add Server, you'll find what you're looking for, and from Add Server you can arrow down to HTTP. So we actually support both SSE, which is actually deprecated, and streamable HTTP, which is the newfangled one: easier to scale, better for your cloud.

SSE is deprecated? It's no longer in the spec, yes. And we do fall back to it on the client side. But SSE is really hard on hosting, right? Because you have these long-running connections. It's a long poll. Long poll, yeah. So that's where you put in your MCP SSE server.

And if you want to do it manually, you get a nice autocomplete, too. So you pick a name, say 'example', and the type would be... not stdio; actually, it's HTTP. Yeah, you'd already use HTTP. And then it would already yell at you that you don't have a URL, so I'm going to put in a URL.
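So a remote entry boils down to a type and a URL; here's a sketch with a hypothetical local Python server:

```jsonc
{
  "servers": {
    "example": {
      "type": "http",                      // streamable HTTP; SSE is the legacy fallback
      "url": "http://localhost:8000/mcp"   // hypothetical local server endpoint
    }
  }
}
```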

So I'm going to put a URL. So this is how-- everything is by default as the IDO. Once you have a URL, I think I can take this out. Yeah. So it would be just that entry. So are you just starting your chat in ASC? Is there a page in Dev and then it's leveraging the MCP server?

Yes. So to get... yes. In many demos I see, people hit start here as well, just to see that it's working; it's nice to make sure the configuration works. We actually cache the tools once we've seen them the first time. The way MCP works is that on the first initialization from the client, the server shares its tools back.

And that's what you see here: the one tool. So if we did this naively, you would never know the tools unless you started the server. We cache them so we don't have to start all the servers proactively, once you open Copilot, just to get the tools.

Okay, just to make sure: that plus symbol on top of the chat... it would now be an MCP server. Yes. Yeah. So we have to be in agent mode, right? Yes, actually, you want to be in agent mode. Ask mode will not run MCPs for you.

Because Ask mode doesn't actually... there's no function calling in there, right? Ask mode is really the traditional ask-ChatGPT-a-question flow; it answers based on its training data or its context. Okay, very good, right? Yes. So there is also Ask mode, where you don't have tools.

You would see it. But actually, does this still work? Let's try that quickly. I think we still do this. Yes, so we're actually blurring the line a bit now. So if you do... that's not working. Okay. Yeah, but if you actually reference specific tools in Ask mode, it will invoke them for you.

But by default, the place where you want to execute tools is agent mode. Oh, these models come up. Which model are you using? I'm just going to find a configuration... like, how does it know, for the Ollama model or the GPT model, what the structure of it is?

They're all coming through GitHub Copilot. So are they all using your paid plan? You can add your own models. Has anybody tried managing models? So I have Gemma 3 through Ollama, which runs locally, and through OpenRouter I have Perplexity's R1, which is actually a fine-tuned model of DeepSeek R1 from Perplexity.

So if you haven't tried it yet: basically, go into the model picker and hit Manage Models. And then you can actually configure your own API keys from Anthropic, Azure, Cerebras, Gemini, Groq, all of these. Ollama is the local one. So if you have a beefy M4 Pro... I'm still sad about how few models I can actually run on this.

But eventually, it's going to be small, powerful models that make sense. Is there a reason, when I do that, that I don't see Claude 4 show up? I only see, like, 3.5, and I don't know why that is. It might be because of your Anthropic tier, right?

That's... or is it... I have Sonnet 4. You have Sonnet 4? Oh, the other one... yeah, you might be in agent mode. So we do actually filter the models down. That's an ongoing improvement we're making; that's why it's a preview feature only right now, because we still have to correctly wire up which models allow tool calling.

There are some... every provider has different indicators of how tool calling works, and that's one of the matching things we're doing right now. So you might not see it because it's not on our list yet. Is it verbose enough to say what tool it's calling on an MCP? Yes.

Yeah, you will see that. So let me do just one example here. So, A, you can disable them. If you want to be faster, you can do command-down and just go through. So Context7 is the one I want to keep; Playwright I can disable right now.

But once you start using them... this is command-up and command-down, the power-user way of navigating those. So these are all the installed MCP servers. And once you start, you can actually... if I want to be very explicit and I know which tools I want, I can use my tool sets,

or I can mention the specific tools that are in my list. But I can also now just go in and say... what do we want to do here? Research GitHub metrics. Let's actually use the research tool set, because we created it. It sounds better when I use 'research' here; use it in a sentence, for productivity.

And what happens now: this one is in agent mode, and we have the research tool set, so it will either use Perplexity or fetch. And one of my Perplexity keys is actually outdated, because it failed before. Let's see. Okay. So you see, I actually already approved this before.

So you see, A, that it runs the server, and you can actually click the server to see where it comes from. If it had not auto-approved this (auto-approve is still on from our previous session), you could actually go in here and edit what it's sending, which now doesn't make sense because it's already sent.

And then it writes up what it found, in this case. Does that underscore-ask have anything to do with Ask mode? No, that's just the odd name for the Perplexity tool. It happens to coincide with it? Yes-- really, the verb should come first. So it's their naming.

So it actually did a follow-up query as well and explained it. Now, I could put this into a spec as well. Actually, I did this before: I wrote a spec for a community dashboard. I did the research using Perplexity and then asked it to write a spec from it, using a little prompt I have here for the spec.

So that's one way you can quickly get things done. And just to point out: this one is pointing to a spec file. These references are actually resolved by the AI, and if you point it to specific files, we do validate those as well. So if you get them wrong, I think they're underlined.

You can also click them, so you get all the markdown goodies. And then you just ask it to work on the spec: do nothing else, use Perplexity to look up stuff, don't lose details, keep updating the spec. So that's one way to work on specs. There are probably more tools we're going to add here.
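As a rough sketch, a reusable prompt file for that workflow could look like this-- the file path, tool name, and spec location are hypothetical, made up for illustration:

```markdown
---
mode: 'agent'
tools: ['perplexity_ask']
description: 'Research a topic and keep the spec updated'
---
Write the spec in docs/spec.md. Do nothing else.
Use Perplexity to look up anything you are unsure about.
Don't lose details. Keep updating the spec as new findings come in.
```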

Yeah. So that's MCP. Any other MCP questions? Does the server itself call an LLM? No-- the MCP server itself doesn't run anything, except when you support sampling, which we do on Insiders. Sneak preview for tomorrow. But if you use sampling-- actually, I guess I have to explain sampling. Sampling is a way for MCP to reach back out from the server to the client, to use the LLM on the client.

The best use cases are to summarize: if you want to reduce the number of tokens you send back to the client, or to have the model explain something. So there are a few approaches. But overall, there's not much integration of sampling out there yet. We're among the first to ship it, because we already have the LLM exposed.
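On the server side, a minimal sketch of what sampling can look like with the MCP TypeScript SDK-- the `summarize_page` tool and the summarization use case are hypothetical, and the client has to support sampling for `createMessage` to succeed:

```typescript
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { z } from "zod";

const server = new McpServer({ name: "research", version: "1.0.0" });

// Hypothetical tool: fetch a page, then use sampling to have the
// *client's* LLM condense it, so we send a short summary back to the
// chat instead of the raw page (far fewer tokens).
server.tool("summarize_page", { url: z.string().url() }, async ({ url }) => {
  const page = await fetch(url).then((r) => r.text());

  // sampling/createMessage: the server reaches back out to the client
  // and asks the client-side LLM for a completion.
  const result = await server.server.createMessage({
    messages: [
      {
        role: "user",
        content: {
          type: "text",
          text: `Summarize the key points of this page:\n\n${page.slice(0, 8000)}`,
        },
      },
    ],
    maxTokens: 400,
  });

  const summary =
    result.content.type === "text" ? result.content.text : "(no text returned)";
  return { content: [{ type: "text", text: summary }] };
});

await server.connect(new StdioServerTransport());
```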

So that's cool. Now, some of these tools-- is there a deterministic way? To pick the model? No, to pick the darn tool. You clearly have multiple tools, and something is deciding which tool you need.

Yeah. You can prompt your way to it, but it's still highly non-deterministic. Right. So what I recommend is, A, in your modes, boil down the tools to what you actually need-- reduce the tools manually. Or pin them deterministically in your prompt: the prompt itself can declare the tools for what it should actually do.

Right? That would be one way. And then I can configure what I actually want to have here. Like, this one should only be doing Perplexity, because it needs to do research-- and that's all it should do. With custom modes, that would be so much better.

Yeah, custom modes. Right, then you pick your pockets of tools. Yes, so custom modes are one way. And the other one is you can actually mention specific tools. If you go in here, into Add Context, you can point it to specific tools.

So instead of saying "look up things on GitHub" and trying to find the right verbiage so it picks the right tool, you can just mention the tool it should use-- for example, resolve-library-id. You can just add these here, and then it's handed to the AI as: these are the tools the user wants to use.
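A hypothetical custom mode along those lines-- VS Code picks up `*.chatmode.md` files, and the description, tool list, and body here are made up for illustration:

```markdown
---
description: 'Research only: look things up, never edit files'
tools: ['perplexity_ask', 'fetch']
---
You are in research mode. Use Perplexity or fetch to gather information
and report your findings in chat. Do not edit files or run other tools.
```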

But when you select-- in general, do you basically have to select that tool? It's not like an end-user application. Right, no end user is going to do that; they don't care about tools. The end user is just working with the chatbot.

Yes. Tool calling is inherently probabilistic-- even in this case, we're telling the AI it should use the tool, but it might not use it. So, my timer's down to zero; let me go back to the slides to wrap it up. There's a bit more.

Vibe coding. So we showed workspace instructions. We showed dynamic instructions, which only apply to parts of the codebase. We showed custom tools-- a little Playwright, deep research. I haven't shown using web docs; actually, one of my favorites is to just point it to an existing repo and say, read this repo if you have questions.

MCP works great for that. When I work on an MCP server, I just tell it: look in the TypeScript SDK for the Model Context Protocol if you have questions. Because we have cross-repo search, it just works. The agent also has access to problems and tasks, so if you have tasks set up and linting set up, things will just work.

So make sure those are set up in your template. Generating commits I showed. And then fine-grained review: you can pause at any time; if it asks you questions, you can always type something in and keep steering it in the right direction. And you can auto-approve read-only tools and specific tools you trust.

I also showed you editing. So, yeah: instructions-- keep refining them as it makes mistakes. One of the key ones is commit often. I didn't show commits now, but any time you have a working state, make sure you commit it, so the AI can continue making mistakes and being creative.

And the last one: there's a clear pause button at the lower end. So if the AI goes off and you're like, what is it doing-- is it doing the right thing?-- just pause and review. That's possible as well. I showed a bunch of this for spec-driven development already.

But it's really about having a spec, having a plan with clear done criteria, and doing more custom prompts and tools, which I showed. I showed user prompts. There are more MCPs for database access, logging, and project tracking, like the GitHub MCP. And there's also the ability to actually run tests and do debugging within the agent as well.

So if you ask it to do test-driven development, like we did, it will actually start running the tests if they're set up in VS Code correctly. And then we talked briefly about models as well: if you want to use o3 for the cool stuff, the deeper thinking, you can do that as well.

Spec-driven is really about focusing on the spec. And I think a great way to do that is just create the spec from all the conversations you had about the spec. So one way, if you have a transcript from a meeting about the project you want to do, just feed that in and make sure you call out what the final decision is.

It's a great way to have meetings, but it's also a great way to not write the spec yourself in the end. Are there any tools to determine whether the spec is good or not? Like, on the opposite side of that-- a way to vet the spec, or a project requirements document?

Right. One thing I would do is ask the AI. What he does is have it generate the spec and then have it critique the spec-- say what things are missing and how it could be better, and stuff like that. He basically argues with the LLM about the spec until it gets to the state that he wants.

Yes, arguing with AI is one great way. One critique prompt I like is to just say: ask me three questions about my idea. Have the AI go into thinking mode-- what would you ask somebody for feedback?-- and have it critically analyze your stuff.
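As a tiny example, that critique idea as a hypothetical prompt file:

```markdown
---
mode: 'ask'
description: 'Poke holes in an idea before speccing it'
---
Act as a critical design partner. Before anything else, ask me three
questions about my idea. Then list what is missing, what could go
wrong, and how the idea could be better. Do not write any code.
```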

So these prompts are basically the next level of prompt crafting, where you don't just ask it to code, but pull it in as a thought partner, as a design partner, as somebody who can poke holes in your idea. Is there a plan mode, like Cline has?

Yeah-- Cline. You can now create one. We don't have a plan mode built in, but we also know Ask/Edit/Agent will not be there forever. It's a series of evolutions. Roo has way more modes that you can customize, and I think we want to allow developers to create their own.

Because I see very few demos of people in Cline actually using Plan-- they just give it a thing and then it runs. Right. I mean, in vibe coding you would even do planning and then write the implementation plan. So you would spend way more time on an initial "what and how are we doing this," and then you let it implement.

So that would even be: write the spec, write the implementation plan, and then implement. You would have three modes if you do it correctly. And that's already the last one. So, takeaways: you've got to experiment; you've got to figure out what works for you.

Like, at what point can you just give it a task and it runs with it? At what point do you want to give it a task, have it write a spec first, and then implement? Keep giving it feedback and iterate-- never just accept a bad answer. And then really work on your process.

What works best for you? What works best for your team? Use modes and prompts and instructions to ingrain that. There are some bonus slides you can screenshot-- I'll show them all. One more. You can end up like Emacs, where it has major modes and you can have multiple minor modes.

Minor modes, yes-- that's how you end up with prompts and custom modes right now. So I've got to clean this up too. And then, lastly, I think there is a sweet spot for how you define your codebase for AI: you want well-structured, self-explaining code.

You want the instructions set up, you want examples in your instructions, and you want to keep the instructions updated. So that's it. That was my unplanned workshop-- it was kind of planned; I learned about it on Monday. Thank you for coming.