This is Vibe Coding at Scale. It's a talk, and I'm going to give a very short version of it here at the expo, without all the hands-on stuff. So this will take however long it takes. It's all about vibe coding: we're going to give fully in to the vibes, embrace exponentials, and forget that code exists.
It's about focusing on the output and not actually on the code. And that's where people disagree. Like, if I don't look at my code as an engineer, what am I? And what does embracing exponentials even mean in this case? So we're going to get into that. But we're definitely going to forget that code exists.
And at the Anthropic conference two weeks ago, there was a great chart of how agents exponentially run longer and longer and generate more and more code. So slowly forgetting that code exists -- and that you want to review every piece of code -- while building trust and adding guardrails for the AI is what this talk is about.
And this vibe coding journey starts with what most people saw initially: I built an app in just one day, I put it online, I made money, and then things happened to the app, things got leaked, and things were no longer fun. And that's that fun-chaos state of vibe coding.
And we're trying to move from that to something professional. That initial state is what I'm now terming YOLO Vibes. It's unofficial wording, but it's all about creativity and speed. And it's a good place to be: there's instant gratification, it's about getting things up, it's about learning. It's not about shipping products.
And so we need to get there. The second stage is Structured Vibes. It's all about balance and sustainability: how do you bring in maintainability and more readable code -- things somebody might actually want to read in the end, when you hand the project over to somebody else -- plus some quality control, so that what you built is maintainable and not just a throwaway project.
And lastly, we get to Spectrum Vibes. If you've been on Reddit or on blogs in the past few weeks, you've probably seen people sharing their best practices: this is how I finally got good value out of AI. Those best practices are emerging, and they bring you scale, reliability, and velocity, while still hopefully giving you some speed and gratification along the way.
But it's about reducing that chaos while maybe keeping the fun. So vibe coding, where we see this outcome-first mindset, is, as I said, all about not wanting to look at the code. If you're in an editor -- and I actually see people framing their experience in AI editors as a low-code mode --
you just look at the chat panel and at whatever comes out of it, and that's outcome-first. It's all about natural language, it's all about just staying in the flow and working with the AI, and it's about accepting all changes. Until it maybe no longer works and we want to undo.
But otherwise, we just keep talking to the AI: I know you did this wrong, try again. 'Fix it or go to jail' is a very popular one. So you can try all kinds of things, but stay in natural language. Don't be too specific. There is a use case for that, though.
You want to get a sense for YOLO vibe coding: for rapid prototyping, for proofs of concept where you just want to get something out. That's why I actually have a ton of conversations with larger companies who want to start with: how can we do vibe coding? And for them, it's all about getting people who are non-technical to be able to communicate ideas.
It's about UX people just making a mockup, being able to bring that to a meeting, and being able to communicate what they want to do with the mockup. It's all about learning. Like, two weeks ago at Build, we did one hour of vibe coding live on stage.
And people built games. I used Three.js, which I had tried many years ago but haven't used in a while. But once I got the code running, I could start getting into how it's structured, how it makes shapes. And I could actually understand the technology, because I had something working.
And that's really the power of getting something up and running: it gives you something hands-on to actually try the technology out. And of course, personal projects. I'm not sure how many of you have sat down with somebody non-technical and showed them vibe coding, and just built their water tracking app, or built something with your kids.
There are all kinds of personal projects you can now finally solve over a weekend, thanks to that. So let's do some YOLO vibe coding. The images do reflect the vibe: this one is really about voice input and relaxing. I guess coffee is in there as well.
But let's get it going. Okay. So in VS Code, for YOLO vibe coding, we're going to start with an empty VS Code. So hit Command-Shift and -- sorry? I might show off Insiders stuff, but all of what I'm showing is also in Stable. If you have Insiders, use Insiders.
Insiders is the pre-release version of VS Code that ships on a daily basis, the same way Firefox Nightly or Chrome Canary ship nightly. So on your left side, you will have no folder open. On your right side, you will have Copilot open. Everybody got this? Raise your hand. Cool. Awesome.
So, who has used agent mode? Just checking, so I don't have to explain everything. Okay, cool. So agent mode is probably the default you want to have set, along with the default model setting. Claude Sonnet 4 is great at front-end stuff, so personally it's my favorite. And now, just one quick tour of how we're going to do vibe coding: there's actually this interesting setting here.
Is this zooming? Yeah. It's 'new workspace' in VS Code. And for this first round of vibe coding, I want you to go into this tools picker down here and actually disable 'scaffold new workspace'. Because it will help you scaffold your workspace, but it will lower the vibes if you're just trying to do HTML coolness.
Oh yes, the tools picker down here -- that's the little one. How do you get into this menu? There's a tools picker. Okay. Uh-oh. Yes, it might be -- it was in a different spot before. If you don't see it, check that you're in agent mode. Otherwise, you don't have tools.
Yeah. Okay. First step in the panel is to switch to agent mode down here; it might be on Ask by default. Yeah, it is. Cool. And then second step, go into the tools menu, which only appears in agent mode. Soon, agent mode is going to be the default, and this whole 'where is my tools picker' dance will be less of a problem.
Okay. So then what you want to uncheck is this 'new' section. -- You see nothing? You see those two items and nothing else? That's not enough. You might see more -- yes, I think that's fine.
Let me just -- shoot. Okay. Good answer. Yes, this is actually very recent and we're actively working on it. So keep it checked, it's fine. So let's actually use it. Let's start our first vibe coding session using it, since it's so hard to disable. And the way we're going to do it: create a -- let's do React + Vite.
And that's the first lesson for vibe coding: use stacks that are popular in front end, where the AI doesn't have to reason too much or make wild guesses. So React and Vite are good ways to run a project. A website for -- what do we do? Hydration tracking.
Water hydration. A water consumption app. Simple and accessible UI following -- what do we tell the AI to make it really beautiful? I like to tell it 'Apple design principles'. You can infuse it with whatever design sense you have; mine is just 'make it look pretty'.
Which somehow always helps as an extra. So, right: water tracking, hydration. And that's what I see. We don't give it any constraints. We don't tell it how many buttons, whether it's mobile-friendly or not, what CSS framework to use. I might actually not do React + Vite, but might -- oh, yeah.
Let's do that with Material Design. So now we've got the stack, we've got a design direction, and we told it to make it pretty. Okay, hit run. So we're now in a no-folder state, so the first thing that happens is it's going to help us open an empty folder.
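Assembled from the dictation above, the prompt came out roughly like this (paraphrased from the demo, not the verbatim text):

```text
Create a water consumption tracking app with React and Vite.
Simple and accessible UI, following Apple design principles,
using Material Design. Make it look pretty.
```

Notice how little is pinned down: no component list, no layout, no responsiveness requirements. That's deliberate at the YOLO stage.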
Everybody got that? You in there, Flo? Okay. It's going to hit continue. Now it's going to actually ask me to open a folder. I'm going to make a new folder: vibing-at-ai-engineering. Put that anywhere you put your stuff, where you put your code. And now it's actually opening the new folder, and it's going to continue to set up.
And to explain a bit what's happening here: this is using the new command. The new command is optimized for creating projects from scratch, which, if you look at how the internet evaluates AI coding tools, is what every second person does. Can it make me a water tracking app?
Can it make me a movie database app? So we optimized for this flow. But it also makes for this nice vibe coding from scratch, because everybody struggles with what the right stack is and how to get started. And that's what this is here for. We got the latest.
So we can now review the commands. And this is maybe where we do the first tweak of our settings. If you go into settings and search for 'approve', you will find auto-approve. And that's the first rule of vibe coding: as we said, we don't want to look at code.
And we just want to have the AI do stuff. So what you can do -- in my case, I'm actually going to switch from user to workspace. That means the setting is only set for this workspace, which is the safest way to use this setting. So use it with caution.
I'm going to turn on auto-approve, which means all of these continue buttons won't appear anymore, and we'll just get results. So check that box, close it again, and we can hit continue here. This is using Vite. This is fine. Now we stop worrying about the code. But I can still read the plan.
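As a workspace-scoped setting, that toggle ends up in the project's settings file as something like the following (the exact setting ID may differ between VS Code versions and builds, so treat this as a sketch):

```jsonc
// .vscode/settings.json -- workspace-scoped, so the blanket
// approval only applies to this one project
{
  // Skip the per-step "Continue" confirmations in agent mode
  "chat.tools.autoApprove": true
}
```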
So: install Material UI dependencies, create the hydration tracker with the Apple-inspired design, and the project structure. So it's running commands. Why are there already files in here? Okay, cool. What did it create? Oh -- where does the music come from? I guess I already did something. Cool. Okay, we're on. Yes, keep going.
In this case, it tried to create something, and then it ran another terminal command. Keep going, that's fine. Okay. Is it doing stuff for you? Do you see things popping up? -- I can't find the auto-approve. -- Oh, the auto-approve? Oh, wait. Did we? I thought it was in the last version.
I haven't decided yet; I just haven't switched back to it. Okay. Let me check in VS Code Stable. Yeah, that's my problem of not being in Stable enough. I shouldn't run workshops. Okay, there you go. -- Wait, why is it different from what you showed? -- So you found it in -- yeah, yeah, yeah.
Thank you so much. There are two ways to get to the same thing, but it's a different menu. Anyway, yeah. The settings? There are settings at the bottom and settings at the top, and they're not the same. Wow. Okay, I'm happy to go through it with you later.
The settings up here. The one, the one. -- I used the gear at the bottom. It didn't work. -- Anyway, the setting at the bottom versus the top bar. Okay, where's the other one? Like the one you had. Oh, and the other setting, yeah.
Yeah, I have customized my UI too much as well. Cool. Question? -- Yes, that's why I think I should have set auto-approve earlier. It's active for the chat session. But this is the setting I just showed, the auto-approve. So all the continue buttons are basically gone, and it just auto-approves.
And for MCP tools, we actually do have drop-downs: always allow for session, always allow for workspace, or don't always allow. So there are some more fine-grained options, and we're rolling them out to more tools. But for -- yeah? -- It still prompts you to continue? -- Yeah, I think that's the auto-approve
not being applied to the current session. I showed it too late; I should have done it the other way around. Okay, we've got some Material Design coming in. And this is where you need to get your coffee and just wait. That's a good idea, let's do that. Okay, open your window.
Okay, what prompt did we use? Create a -- right. So, okay, now in the new window I'm going to set auto-approve first. Auto-approve is already on, that's good. This is another window now. And we're going to do the same thing, but this time not with Material Design. I guess Ant is still out there, right?
Ant? Or Fluent -- that's the Microsoft one. Let's see what that looks like. Again, it's going to prompt me for a folder: vibing-at-aie-2. And this is, I think, one of the key takeaways: trying out different ways to get to the same result is where vibe coding really shines.
Hit continue. It's just trying. I've had really quick success with things like: what are different sign-up flows we could create? Create three different versions of this design to explore what it could look like. Okay, it installed, it updated the index and the main script. And that's where it gets confusing:
if you have multiple of these open at the same time, you've got to figure out what's running where. And now this flow over here actually runs without any confirmation, because we set auto-approve in the correct order. It's now just creating the Vite app, installing, installing Fluent. And notice that we got the wrong Fluent package, because there's a dependency.
Now it's fixing those. -- Yes? -- So Command-Comma is the quick way I did it; I don't use menus. I should use menus. So you go up here -- I have my settings up here; most people probably have them on the lower side -- and you go in here, into Settings.
And this is what it should look like. Do your settings look like this? And then you search for 'approve'. Sorry, I can't type 'auto-approve'. Do you see it? It might be Insiders-only; it's all a blur. We're shipping on a monthly basis, and I thought we tweeted about it at the end of last month.
So, we found it. Okay, I'm just going to run it. -- Yeah. I'm not trying to compare files; I really want to use this. I want to stop. The challenge is that auto-approve -- like, I'm just going to do this. 'Would you like me to run this?' And then you say yes.
And it doesn't work for you, right? So I'm looking for that in-between, and right now it's quite binary: is it going to run it, or is it not? -- Yeah. The good thing is, the person who owns the AI terminal integration just came back from paternity leave.
So we're back in the game. But we've definitely been looking at how to allow specific terminal commands; that's how most tools do it. But think about the scary parts of chaining and running multiple commands in one command: terminals are not as predictable as you would think when it comes to easily allow-listing things.
So we're mostly thinking about how to do it right. Okay. And I've got two vibe coding sessions going on here. I'm going to hide this one; this is for tomorrow. The vibing is happening here. So you see it creates an App.tsx, it creates an App.css.
It also creates Copilot instructions. Who's been using Copilot instructions before? Yeah. So that's one way this out-of-the-box experience just does things for you: it comes with instructions baked in, which is nice. It understands which stack you're using -- that's mostly it. So it already captured the design principles that I eloquently put in as 'please make it look like Apple'.
And now there's a clean, minimal, intuitive interface, consistent UI, and everything else. It broke down the technical stack, even things I hadn't mentioned, like CSS and responsive design. So it calls out some of the assumptions the AI will fill in if you give it a high-level task.
Okay, this one is still working on the index CSS, and the other one is working on the HTML file. Yeah, it would just do it for you; it would usually do it as it creates the project. I haven't typed anything else so far, and we're just vibing. There should be a proper folder structure.
So here it created one -- let me see if it created one in both. Okay, there we go. Our first app is done. Hydration tracker: stay healthy, stay hydrated. Today's progress. We've got a quick add of 500 milliliters. And it went with metric units. Isn't that beautiful? Just how I wanted it.
Maybe it got my accent, I don't know. You can do plus, minus. -- It's interesting how mine is different. -- Yeah, I know. So, does everybody's look as nice as mine? Do you already see yours? Okay. Yours is better? Yeah. That's a very wide-open vibe coding workshop. I've run these a few times, and this is probably one of the nicer ones.
But it's also about actually running these with different models and getting a sense of how good each model is at design, whether it has design sense without me telling it how to do everything. And Claude is definitely usually rocking the icons; it got the colors really nicely. So that's been great.
This is a really nice app. And notice: because we're visual, we haven't even checked the code. We haven't read the CSS, we haven't looked at the TSX. Is it even using functional components? How does it handle the state? I don't care. It works.
So next, you can actually use a new feature we landed: you can now work visually. I can now say: this header up here -- I don't know what it's called, whatever, the progress indicator -- let's make this more animated, adding particles, maybe? That's good. So, MUI Paper.
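For the curious, the state handling in an app like this is usually nothing exotic. Here's a hypothetical sketch of the kind of logic the generated TSX probably contains (illustrative names, not the actual generated code):

```javascript
// Hypothetical sketch of a vibe-coded hydration tracker's state
// logic -- not the actual generated code.

// Append one drink entry; returns a new array, React-style.
function addEntry(entries, amountMl, timestamp = Date.now()) {
  return [...entries, { amountMl, timestamp }];
}

// Total intake so far, in milliliters.
function totalIntake(entries) {
  return entries.reduce((sum, e) => sum + e.amountMl, 0);
}

// Progress toward a daily goal, capped at 100 for the progress ring.
function progressPercent(entries, goalMl = 2000) {
  return Math.min(100, Math.round((totalIntake(entries) / goalMl) * 100));
}
```

A component would keep `entries` in a `useState` hook and feed `progressPercent` into the circular progress indicator. But again: in YOLO mode, you never look.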
So this is all -- is this Material Design? It is Material Design, yeah. -- Did you copy the component? -- I didn't; I hit start down here. If you have the browser preview open, did it open for you? Cool. So this is to point out two features. One is that in this flow, at some point, it basically started the task:
it did npm run dev. And then as the next step, it opened Simple Browser. And Simple Browser is this in-VS-Code browser preview we have. And what we're injecting here -- what just went away -- is a little toggle you can now use to select specific elements, which then get attached to the current chat as a visual reference plus their CSS.
So if I scroll back down here, I see what's being attached; I can actually click it. So everything you see under a message is in the context. The element screenshot somehow didn't make it through, but this one made it in. And that's basically the CSS description and the HTML of the element we attached.
So I didn't have to describe the element or where it sits; it just did it for me. Okay, we ran into snags -- we don't care about that. Let's check the other one. Number two: Fluent design. This is what it came up with. It's a little bit plain. This is sad. Okay.
At least it has a little streak feature, that's nice. And it has recent entries, too. It made similar assumptions about what we want, so feature-wise it somehow got to the same conclusions. But design-wise, this is definitely more corporate. Yeah. So that's the simplicity of vibe coding, using the new tool out of the box.
If you're on Insiders, you can disable the new tool, and then it's easier to do a single-file HTML thing -- because the new tool is definitely biased towards using npm and installing packages, so it always ends up a little more complex. -- Do you work on Insiders? -- Yes, it's basically the team I work on.
So, hi, I'm Harald. I work on VS Code. -- It would be really cool to understand, when you open up a fresh Insiders, what the new things are that just changed. Like a quick tip somehow. Someone could show me -- I have them both on my machine. -- Right.
-- And I use them both, but then I fall a couple of days behind. It's like: what are the new things, and why should we use Insiders? -- Yes. So the best way to stay on top of what's new: actually, right now, this week is testing week, and we're writing our release notes.
So the release notes usually capture everything that's new. But for Insiders it's hard, because it's coming out every day, so it's hard to point to one thing. -- That's a great idea: we're going to make an MCP server that summarizes what landed. Yeah. Yes. I like that. I like that.
Let's do it for the next demo. Okay. So, what do we have in our YOLO vibing toolbox? We have the agent, which is sometimes hard to find -- but now you're all on the agent, so that's great. There's also something I didn't show, but could have:
different panel styles. So if I go back here, you can actually move this into the editor, which is nice; some people like that, as it gives more space for your chat. You can also -- if I go back into my panel -- I moved it into the drop-down here, and you can move your chats into here.
So you can have multiple chats, and they actually have names, so it's easy to go back and forth. You can put them in parallel to your code, so you can use all the window management things. -- Can you run them in separate windows? -- Yes. Wait, how did you know?
All right, where do we go? Move into new window. There you go: now you have a chat in its own window. You can put it on its own monitor, and you can actually pin it to stay on top. So now, if I run this, I can close this.
Let's move that away first. So I can accept this -- this we're all going to keep -- close it, and then close the other one. And now we have the output, and we can just move our chat across and fully focus on the exponentials that are happening in this window. Yeah.
So that's one way you can really manage the space how you want it. The new workspace flow we showed, which is really optimized for starting from scratch. Good question. Okay. And then, voice dictation I haven't shown. Who has tried voice dictation in GitHub Copilot already? Okay. Magic moment: add dark mode, please.
And maybe give it a cool name that works with a younger audience who needs to drink more. Or maybe my kids -- make it for kids, a little more kid-friendly. So, thank you. Okay. Command-I is actually the default shortcut. It's a local model, which is great for privacy, and it's really fast.
It's accurate. And there's an option as well, when you use voice input, to have it read the text back, which is great for accessibility. And yeah, by using just your voice, you can now finally put that coffee down and just keep vibing. There's a 'Hey Copilot' as well, which I think we did at some point, but I haven't used in a while.
Okay, I said all that. Keyboard shortcuts: there are keyboard shortcuts, and you want to customize them so you hold down while you talk and then let go. Visual context: I showed attaching; it's great for wireframes. The in-editor preview gets hot reload -- it just works -- and you can attach elements using that little send button.
And then auto-accept. I showed the auto-approve setting; there's also an auto-accept-after-delay setting. And if you don't have that on: I love autosave. It's a great VS Code feature that already works basically the same way -- after a delay or after focus change, it will just save for you.
And what I haven't shown -- let's see if it works here -- is the undo button. Is this still going? What is this undoing? I forgot. Oh, this is adding the particles. Cool, it still worked on that. It's got to compile. Okay: got stunning animations. That's great, that's what I wanted.
Is it doing it now? So, let's keep it. It does animate. Nice. Particles. No -- look at the bubbles. Okay, I don't like it. Undo, up here. Those are basically our checkpoints. There's a new checkpoint UX coming, because I have many people telling me: oh, you don't have checkpoints, I can't undo stuff.
But if you already accepted stuff, or if you want to go back to something -- like, I like these particles, but for V0... oh, that's just beautiful. I love it. We don't need that. Then you can also bring that back. I just need to see that again. That's really nice.
So for people who don't like particles, we can now undo, and it's back to the original version. So it has stages for each piece of work it did, and you can easily go back and forth to see the before and after as well. Yeah. But in vibe coding, you don't want to look at the code, because we only look at the output.
Okay, that's the YOLO toolbox. And as I mentioned before, you want to try it out just to get a feel for the AI. In my case, I mentioned I like getting a sense of how good AI is at design: can I just give it wide tasks to explore a space and have it come up with something interesting, or how detailed do I need to be?
When does it make mistakes? If I give it a general task, maybe about Java, where it's not as good, what will it do? Next one is known frameworks. We went with Vite and Material Design, things that are off the shelf and haven't changed much in a long time at any large scale -- you want to use something that's popular and has been consistent.
And lastly, use it as a whiteboard. We showed just attaching a visual element: change this, add some particles. It's really about not becoming too attached to whatever you're working on, but being able and willing to throw it out and start from scratch if things go wrong. Structured vibe coding, the middle stage, tries to balance the YOLO fun and chaos with a more structured approach.
And that, I think, is where I see the biggest impact from talking to customers: this is how vibe coding can work for us. This is where we can bring somebody non-technical in and give them a good starter template that has a consistent tech stack, that comes with clear instructions for the LLM on how to work on it and keeps it inside actual guardrails, and that already brings in custom tools with the expert domain knowledge or internal knowledge you'd need to work on that code base.
And that's really YOLO on guardrails. It's faster and gives you more consistent results, so you don't end up with: oh, it used Material Design but it should have used Fluent, or it should have added dark mode and been responsive. All of that can already be baked into the instructions.
So I see a lot of companies bring that into their bootstrapping for greenfield projects. You can oftentimes go into a meeting with a product that already looks finished, because you vibe coded it with your go-to stack and it uses your internal design system, so it looks way more polished.
And the last piece: outside of mainstream workloads, YOLO by default will always bias towards whatever is at the top of the training data. With this stage, you can customize it further for internal stacks, internal workloads, internal deployment infrastructure, so it works better. So let's do structured vibe coding.
This is now the image -- as I explained, it now has wireframes and more charts, so that's what makes it more structured. There we go, it's open. So what I'm going to do now, I think, is push this up. It's still running; let's see if this runs. I did create this by vibe coding.
So I do have another one that I can share. Just look at this one, and I'm going to push it to GitHub. It's going to be fine. Cool. Frontend vibes. Perfect. This is all vibes, so we're going to make this live. This is the commit. And then, yeah. Oh, yeah.
It's missing one. Who has been using this commit button here? So Copilot will write your commit message. Done, this looks good. And now, sync changes, and I'll share the repository. This repository might still have an old name -- let me just check where it sits, because I forgot where it sits.
Oh, it's perfect: sleep-vibes. That's one of my vibe exercises. Okay. -- What's the name of the browser window? -- For the browser window? The one that just opens sometimes? Oh, yes: the Simple Browser. So we're going to run 'Simple Browser: Show'.
That's it: sleep-vibes. Okay, and then npm install on it. And then, yeah. Had I prepared better, it would have been a Codespace with a dev container, and you'd just click 'Open in Codespaces' and things would work. Come to my next show, and we'll have that fixed.
Let me just open the Codespace to see -- now I'm curious if it just works. Anybody here been using Codespaces on GitHub? Not many. Okay. Occasionally. -- The complexity lies in all the different versions of everything: between Insiders and the regular version, all the plugins, and some of them don't work in it.
-- Out of the box, everything just works the same. -- Yeah, it mostly does, right? But mostly. -- It's 95 percent there. -- Yeah, but it's the five percent: when something doesn't work, you just go back to the other tool. -- So, a lot of plugins are offering to reopen this in a container.
-- You could try. I'm actually not running it in a container, but if you want to, the container is just a Node.js one, and it should work too. I did add a container -- see, I vibe coded my container, too. Right, I can now check. So if you ever wonder what you did on a project:
this is where I created my container. This is where I just asked GitHub Copilot to update my dev container: just look at my code base and update my dev container. So it did a good job here. I should remember that I did that as well. Okay, if you're ready.
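For reference, a minimal dev container for a Node.js project like this is just a few lines. Something along these lines (an illustrative sketch, not the exact file from the demo):

```jsonc
// .devcontainer/devcontainer.json -- a minimal Node.js setup,
// roughly what "update my dev container" might produce for a
// Vite project (illustrative, not the file from the demo)
{
  "name": "sleep-vibes",
  "image": "mcr.microsoft.com/devcontainers/javascript-node:20",
  "postCreateCommand": "npm install"
}
```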
Meanwhile, while you clone and npm install: anybody got it working already? Okay, cool, I'll give the tour of what we have. So, one: we again start with good Copilot instructions, and they live in .github/copilot-instructions.md. It's a markdown file that's included with all your agent requests, all your chat requests, all your inline chat requests -- basically everything Copilot.
It's the grounding, foundational knowledge about your code base. And sometimes these files feel a bit repetitive -- I saw some demos that basically repeat linting rules you'd expect the AI to just follow anyway. But my go-to is just a one-liner on what your stack is.
That's a good starting point: just point it at what frameworks, what versions. And that's one way to keep it on rails with what it uses. I was experimenting with what the best way would be to have that be a standard that gets included -- but without letting people mess with it.
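To make the stack-one-liner idea concrete, such a file might look like this (a hypothetical example for the hydration app, not the actual generated file):

```markdown
<!-- .github/copilot-instructions.md -->
# Project instructions

- Stack: React + Vite + TypeScript, Material UI components.
- Keep the UI simple, accessible, and responsive; Apple-style minimal design.
- For front-end QA reviews, use the Playwright browser tools.
```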
Like, give people some coding structure -- if you have a whole team of vibe coders, you probably don't want them touching it. So, this is in your repo, and I think it's a good team exercise to iterate on it; it shouldn't be a stale document. You can put this in your user settings as well.
-- You probably don't want it with each app; you probably want it as a separate repo, right? A set of the way you code, and then each app inherits it. -- I've been thinking about that, too. Yes, I've been trying to convince my peers on the GitHub side that this should be an organizational setting, something people can set more easily at an organizational level.
And, like, something where, as a team, you can select which ones you want to use. So we're working on that discovery and sharing. Yeah. So, we have this one now. These are the new instructions -- because those copilot-instructions files become rather monolithic, large, and unwieldy. And just to point out here, before I get to the new ones: I also guide which tools to use.
I have my first MCPs in here, and I already tell it: for a front-end QA review, use the browser tools that come from Playwright. For research, I use Perplexity. I have Context7 in here, which has library docs, and it keeps using its ID tool to look up IDs.
But I just gave it: these are the IDs you should use; don't use the other tool. So there are ways you can already guide it to the specific tools you want it to apply when needed. The rest is just syntax from there: some syntax formatting, optimizations, key conventions.
And then, the other format we have is `.github/instructions/name.instructions.md`. Those use this front-matter syntax that's becoming popular for rules, declaring which files they apply to — so they're scoped with a glob pattern. Right now, though, they're limited in how they're applied: you actually have to have a matching file in context.
So, a TypeScript rule would only be applied if I actually have a TypeScript file in context — if I have one open and enable this context, then it applies. But if I only have this right now, which means no TypeScript file is included, it wouldn't actually apply the rule.
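As a sketch of the scoped format just described — the front-matter key is VS Code's, but the glob and the rules themselves are illustrative, not from the demo:

```markdown
<!-- .github/instructions/typescript.instructions.md -->
---
applyTo: "**/*.ts,**/*.tsx"
---
Use strict TypeScript: avoid `any`; prefer `unknown` plus narrowing.
Co-locate component tests next to their source files.
```

The `applyTo` glob is what scopes the rule to TypeScript files, subject to the in-context limitation mentioned above.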
We're fixing that so it works more as people expect. But that's how it is right now — and that's the biggest question I get: "it didn't include my rule," because right now it really wants to see that file. So, those are new. They shipped, I think, in the last version, so they should also be in stable, and we're actively working on them.
And then, the new, new thing is prompt files. These are the first kind of reusable tasks — as a team, how do you think about ingraining, say, a way to tell GitHub Copilot to write tests? Your AI champion on the team handcrafted this perfect prompt which one-shots your tests consistently.
And now, everybody shares it in Slack and copies it around. Every time you want to write tests, you go back to Slack and copy it out again. That's what prompt files are for: you can finally put these prompts into a place where everybody can just use them. And how can they be used?
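A hypothetical prompt file of that kind might look like the sketch below — the front-matter keys follow VS Code's prompt-file format, but the wording and file name are invented:

```markdown
<!-- .github/prompts/write-tests.prompt.md -->
---
mode: "agent"
description: "Generate unit tests for the current file"
---
Write unit tests for ${file}. Cover edge cases and error paths.
Follow the project's existing test framework and naming conventions.
Do not modify the implementation.
```

Checked into the repo, this replaces the copy-from-Slack workflow: anyone on the team can invoke it by name.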
So, these can actually be attached: I can go in here and attach instructions manually, too — that's one way. But I can also be in the chat window, hit slash, and run user prompts that I created for myself.
And I can use my plan-and-spec prompt you see over here on the left. "Can you make those custom?" Yeah, these are custom ones. The ones I have here are already custom in the workspace; the other ones I don't have, I'm not showing.
Wait, there's a new menu. This one actually just landed yesterday in Insiders: we now finally have an entry point for creating prompts, because everybody kept asking, "How do I create prompts?" and I had to tell them which command to find it in.
So, this is the new prompt configuration file, and I have some already here. This one is interesting: it defines how I want to write custom instructions. Whenever I'm in a new project that doesn't have custom instructions yet, I run this prompt to bootstrap them for me.
And, yes, there should be a prompt-sharing website where you can find these amazing prompts that I create. "So, each prompt is like a separate file?" Yes. And that's the main difference from instructions. With instructions, you can have multiple — if you have one for TypeScript and one for your front-end folder, they combine, because multiple instructions hopefully don't conflict with each other.
So instructions allow attaching multiple at once, and they're really more about the code. Prompts, by contrast, are easy ways to inject something into the prompt field, and they stay in the conversation. They're mostly about a task — giving the AI something specific to do. With instructions, you wouldn't necessarily tell it what to do, but rather how to do it.
"What about if you wanted, for example, to always do TDD when it's writing code?" Yeah, TDD — good point. That would be a good use for custom modes. So, if we go in here — sorry, this is Insiders-only because it just landed, so you can't follow along if you're not on Insiders.
Custom modes will show up in the drop-down. It just went into the menu, created a custom mode, and now I can pick where it lives: `.github/chatmodes`, which puts it into the repository, or my user settings if I just want to keep it for myself.
A good pattern: if you just want to experiment, put it in your user folder. If you want to make everybody's life better on your team and you have high confidence that your mode does that, put it into the project. And we're going to name this one TDD.
And then we're going to ask AI to fill it in, right? Act as an expert prompt engineer. I'm just typing: "Hi there, AI. We need a prompt that enforces test-driven development for GitHub Copilot. It should probably first make sure it understands the problem, then write tests first, and only after the tests are done, maybe get confirmation from the user, then write the implementation, and then keep running the tests against the implementation."
Cool. Thanks. I'm not worried, because I didn't actually activate my file as context. Let's see — I think it should have a tool to just create modes for you if you ask it. They're going to make that an MCP server next.
Okay. Oh, I've got it. Wonderful. So, we have a test-driven development assistant — this is my mode, the one we created. Test-driven development: the system, core principles. Understand. Red: write failing tests first. Beautiful. Green. Wow, it does follow. Refactor: improve code quality. Strict rules: implementation only after the tests.
Beautiful. So, this is our new TDD mode. Looks pretty good — it has emojis, an example. Should we try it out? Yeah. Okay. "Didn't it say something about a framework? Can you scroll back up?"
"Does it use a framework?" It's framework-independent, it looks like. Oh — okay, so it has it right there. It does make some stuff up. It wouldn't need to do that; I can take this out.
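For readers following along, the generated TDD chat mode might roughly look like the sketch below. The tool identifiers are illustrative only — the demo later shows the AI inventing wrong tool names, so in practice you'd pick them from the tool picker:

```markdown
<!-- .github/chatmodes/tdd.chatmode.md -->
---
description: "Enforce red/green/refactor test-driven development"
tools: ['codebase', 'editFiles', 'runTests']
---
You are a TDD assistant. For every feature request:
1. Restate the requirement and confirm you understand it.
2. RED: write failing tests first and run them to show they fail.
3. Pause and get the user's confirmation on the tests.
4. GREEN: write the minimal implementation that makes the tests pass.
5. REFACTOR: improve code quality while keeping tests green.
Never write implementation code before its tests exist.
```

The numbered steps are what turn a broad task into the red/green/refactor loop described in the transcript.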
Cool. Test-drive the design. Wonderful. Okay, let's do it. So, we have this project, which doesn't do anything yet. If you run `npm run dev` — I think I already did this before — it's just a basic, plain landing page. So, let's pick a feature.
What feature do we want? I want a dashboard for GitHub metrics. Just use mock data, because I don't want to wire it up to GitHub. So, maybe some interesting contribution metrics. But first, actually, let's make a plan. Meanwhile, while I type this in, let's stop it.
TDD. So: dashboard for GitHub, don't wire it up. Again, we give it a very broad task, but we can now put it into TDD mode — our new amazing test-driven development mode that follows all the best practices the AI knows about TDD, I assume.
Let's see. "Wait — you created a new mode there?" Yeah. Previously, we went into Configure Chat Modes and created a new mode. This mode is now enforcing the technique. And in a mode, you can also say which tools it's supposed to use.
"So the mode is kind of like agent mode, but based on another markdown file?" Yes — it's a custom agent mode. "When is that going to be live?" It's in Insiders now. It's going to ship next week, on the 11th.
Insiders ships daily, but it only releases to stable monthly. And we're one week late, because there was a short week, so we adjusted our schedule. So, for most of you, this menu will just have these entries; it's in Insiders only, if you look for modes in the command palette.
From the command palette — you can also click up here, Show and Run Commands — search for "modes," and that's the place. "Are they in the output or in the problems view?" If it runs the commands itself, it will start looking at the terminals. The easiest way is to run the deployment and the scripts through Copilot itself.
But otherwise, there's also context: if you look here, there's Terminal Last Command, which includes the output, and Terminal Selection. Now, if you ask me why they're not in Add Context — I couldn't tell you right now.
But that's working. I think I did this thing wrong, though. TDD — let's just see. Oh, the tools. It made up tools. That's a part it hallucinated: those are not the right tool names. That's why it didn't do anything and just basically acted like chat and gave me the code — none of the tools it tried to call were ones it actually had.
So let's try this again. This is probably a good thing to point out: prompts can now set tools as well. Let me open the plan prompt. If you add a tools entry here, you can click and pick which tools. In this case, this is a planning prompt.
So mostly you probably want it to use Perplexity for anything it needs to find on the internet — I can select that. That's how you can constrain the tools for a specific prompt, which always helps with quality. As you install more MCP servers, you get this tool explosion; they might solve all the different problems you have throughout the day, but now you can configure them more specifically per domain. And, Insiders-only as well:
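A hedged sketch of a planning prompt constrained to research tools — the exact tool names depend on which MCP servers you have installed, so treat these as placeholders:

```markdown
<!-- .github/prompts/plan.prompt.md -->
---
mode: "agent"
description: "Research a topic and produce a plan, no code"
tools: ['perplexity_ask', 'fetch']
---
Research the task below before proposing anything. Use Perplexity for
whatever you need to look up on the internet. Output a step-by-step
implementation plan. Do not write any code yet.
```

Constraining the tool list per prompt is what keeps the agent from wandering into the rest of your installed MCP tools.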
We have tool groups — tool sets, as we call them. How did I get here? Down here in the tool drop-down: Configure Tool Sets and Add More Tools. Add More Tools, I think, sends you to add a server, but Configure Tool Sets opens this one here.
And that works for anything, both built-in and MCP tools. Actually, a lot of the tools you see here — we cleaned this list up; if you use Insiders, you'll see it — are already tool sets. We use tool sets internally, because Edit Files has multiple ways to edit files.
We give the AI a few ways. Codebase search has grep, has file search, has different searches depending on what you're looking for. So all of these are actually tool sets in our own backend, and we now expose that as something you can create yourself. My research tool set, for example, has the Perplexity tool to ask deep research questions.
And it also has fetch. "Can you talk about MCP?" Yes, I can — it wouldn't be a talk without MCP. There's also a whole talk track where I'll be talking more about MCP, if I can finish my slides. Okay.
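The tool sets file being described is, at the time of this talk, Insiders-only, and its shape is roughly the following JSONC sketch — the tool names, icon, and file name here are assumptions, not copied from the demo:

```jsonc
// research.toolsets.jsonc (user profile or workspace)
{
  "research": {
    "tools": ["perplexity_ask", "fetch"],
    "description": "Deep research: web search plus page fetching",
    "icon": "search"
  }
}
```

Once defined, the set can be referenced as a single unit from the tool picker instead of toggling each tool individually.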
There we go. So let's talk about MCP. Meanwhile, this is doing something — let's see. Understanding the requirement: it created mock data. Red phase, writing tests: it wrote tests. Oh, it found that there's no test library in the package. That's sad. So it created a test utility, and then it tried to run the tests.
And then it asked to proceed. So that's cool — it did the first stage of that mode. I don't need to go too deep, but that's modes. TDD will now ask for confirmation, because we asked it to — that's why it's pausing.
It wrote tests, and they're all red, so that's good. Okay, accepted. And let's go into MCP. MCP servers — who already has MCP servers set up in their VS Code? Good. Okay. One way to get MCP servers is editing JSON, and there are a few other ways.
But let me show you another way: Playwright MCP. Who's been using Playwright MCP? It's probably one of the coolest ones. Playwright is a browser testing framework, and Playwright MCP lets the agent access a browser locally — take screenshots, run websites, get accessibility audits, a whole bunch of utility in there.
And how do you get it into VS Code? There's a JSON blob that you can ignore — just hit Install Server. Install Server uses a VS Code protocol handler that wires things up into VS Code. You see the same thing if you go to the Extensions Marketplace for VS Code.
You can hit Install Extension; that's powered by the same mechanism. So now I can — let me not hit Show Confirmation, and move this down. Install Server actually puts this into my user settings. As you can see, you can have MCP servers just for yourself: I have ones for GitHub and for GistPad, which is a cool one.
I can recommend that one. And now Playwright as well. You can already see how many tools each provides; if anything fails, you can get to the output; and if there's configuration, I can show that here. GistPad actually needs a GitHub token — it's a local MCP server.
Have you seen this one? No tokens in my configuration: you can use inputs in VS Code configuration, both in mcp.json and in your settings. You might have already seen inputs in tasks.json — that's how you configure your tests and build steps in VS Code.
It's the same system. They're defined up here: inputs are just an ID, a type, a description, and a default value. And `password: true` means it's encrypted at rest after you enter it. So it doesn't ask anymore — this shows that it already has a token.
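Putting those pieces together, an mcp.json using an input for the token might look roughly like this — the server command, package name, and environment variable name are assumptions for illustration, not copied from the demo:

```jsonc
// .vscode/mcp.json
{
  "inputs": [
    {
      "id": "github-token",
      "type": "promptString",
      "description": "GitHub PAT for GistPad",
      "password": true
    }
  ],
  "servers": {
    "gistpad": {
      "command": "npx",
      "args": ["-y", "gistpad-mcp"],
      "env": { "GITHUB_TOKEN": "${input:github-token}" }
    }
  }
}
```

The `${input:github-token}` reference is what keeps the secret out of the checked-in file: VS Code prompts once and stores the value encrypted.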
But it would prompt you the first time you use it. So — GistPad is fun. That's gistpad.dev, and the MCP server is GistPad MCP, done by lostintangent. I'm going to show it off tomorrow as well in my talk. It's a fun one that uses gists as a knowledge base, and also for prompts. So, we like this one — it adopts a ton of recent MCP stuff. If you just want to play around with a really well-done MCP server, that's one.
Not saying the GitHub one isn't as good — there's a lot more coming there as well; they're all in active development, figuring out the best way to do MCP. "Can you connect to an SSE server?" Yeah. Okay. And in your case, that server's already running.
So, how to hook that up — okay, cool, let's do that. SSE works the same way, basically. Let me just find my — let me maximize this — go back to my mcp.json. And down here — yes, to clarify, mcp.json sits in `.vscode` right now, and it's per workspace.
And that's shared across everybody. So hopefully either you work on the repo alone and it's just for you, or everybody is happy to have those MCP servers. Okay. And if you hit Add Server, you will find what you're looking for.
From Add Server, you can pick HTTP. We actually support both SSE — which is deprecated — and streamable HTTP, which is the newfangled one: easier to scale, better for your cloud. SSE is no longer in the spec, yes, but we do fall back to it on the client side.
But SSE is really hard on hosting, right, because you have these long-running connections. Yeah. So, that's where you put in your HTTP or SSE server. And if you want to do it manually, you get a nice autocomplete, too.
So, pick a name — example — and the type would be, not stdio, actually HTTP. And it would already yell at you that you don't have a URL, so I'm going to put in a URL.
Everything defaults to stdio; once you have a URL and the HTTP type, I think I can take the rest out. So it would be just that entry. In many demos I see, people hit Start here as well, just to see that it's working.
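So the remote entry sketched on screen would be roughly the following — the server name and URL are placeholders:

```jsonc
// .vscode/mcp.json
{
  "servers": {
    "example": {
      "type": "http",
      "url": "https://example.com/mcp"
    }
  }
}
```

No command or args needed for a remote server; the `type` and `url` are the whole entry.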
It's a nice check that just makes sure it's working. We actually cache the tools once we've seen them the first time. How MCP works is that on the first initialization, the server shares its tools back to the client — that's what you see here, the one tool.
So normally, you would never know the tools unless you start the server. We cache them so we don't have to start all the servers proactively just to get the tools every time you open Copilot. Wait, I just want to make sure —
"So, that plus symbol above the chat — is that where you'd use it?" Yes. "You don't need to get out of agent mode, right?" No — actually, you want to be in agent mode. Ask mode will not run MCPs for you, because ask mode has no function calling inherently, right?
Ask mode is really the traditional ask-ChatGPT-a-question mode that answers based on its training or its context. So in ask mode, you don't have tools — you would see it. But I can actually — does this still work? Let's try that quickly.
I think we do still support this. So, we're actually blurring the line a bit now — hmm, it's not working. Okay. But if you explicitly reference specific tools in ask mode, it will invoke them for you. By default, though, the place where you want to execute tools is agent mode.
So, the models are all coming through GitHub Copilot, all using your paid plan — but you can add your own models. Has anybody tried managing models? I have Gemma 3 through Ollama, which runs locally, and I have OpenRouter's Perplexity R1, which is actually a fine-tune of DeepSeek R1 from Perplexity.
So, if you haven't tried it yet: go into the model picker and hit Manage Models. There you can configure your own API keys for Anthropic, Azure, Cerebras, Gemini, Groq, all of these. Ollama is the local one. So, if you have a beefy M4 Pro — I'm still sad about how few models I can actually run on this.
But eventually, it's going to be small, powerful models. That makes sense. "It might be because of your Anthropic tier, right? I have Sonnet 4." You have Sonnet 4? Oh, the other one — yeah, you might be in agent mode.
So, we do actually filter the models down — that's an ongoing improvement. That's why it's a preview feature only right now: we still have to correctly wire up which models allow tool calls. Every provider has different indicators of how tool calling works, and that's one of the matching things we're doing right now.
And that's one of the matching things we're doing right now. So, you might not see it because it's not on our list. Yeah? Yes. Yeah, you will see that. So, if I do just -- one example here. So, just -- so, A, you can disable them. If you want to be faster, you can do command down and just go through -- So, context seven is the one I want to keep, play right now.
I can disable right now. But once you start using them, if you want to be faster, you can do command down and just go through -- So, context seven is the one I want to keep. Play right now. But once you start using them, this is command up and down, the power user way of navigating those.
So, these are all the built-in and MCP servers. And once you start, if I want to be very explicit and I know which tools I want, I can use my tool sets or mention specific tools from my list. But I can also just go in and say: what do we want to do here?
Research GitHub metrics. Let's actually use the research tool set, because we created it — sounds better when I use "research" here. What happens now is this runs in agent mode with the research tool set, so it will use either Perplexity or fetch.
One of my Perplexity keys is actually outdated — it failed before. Let's see. Okay. So, you see, A, that it runs the server, and you can click the server to see where it comes from. It auto-approved this because auto-approve is still on from our previous session.
Normally you can go in here and edit what it's sending — which doesn't make sense now because it's already sent. And then it writes up what it found. "Does that underscore in the name have anything to do with the code it's in?" No, that's just the odd name of the Perplexity tool.
It just happens to coincide. The verb should come first — so, that's their naming. So now it ran two calls: it actually did a follow-up query as well and explained it. And now I could put this into a spec.
I actually did this before: I wrote a spec for a community dashboard. I did the research using Perplexity and then asked it to write a spec from that, using a little prompt I have here for specs. That's one way you can quickly get things done.
And just to point out: this one references a spec file. These file references are actually resolved for the AI. If you point it at your files, we validate those as well — if you get them wrong, I think they're underlined — and you can click them.
So, you get all the markdown goodies. And then you just ask it to work on the spec: do nothing else, use Perplexity to look up stuff, don't lose details, keep updating the spec. That's one way to work on specs, and there are probably more tools we're going to add here.
So, that's MCP. Any other MCP questions? "Does anything run per MCP server?" MCP itself doesn't run anything, except when you support sampling — which we do, on Insiders. Sneak preview for tomorrow. I guess I have to explain sampling: sampling is a way for MCP to reach back out from the server to the client, to use the LLM on the client.
The best use cases are summarization — when you want to reduce the number of tokens you send back to the client, or have the model explain something. There are a few others, but overall there's not enough integration of sampling yet. So we're among the first to get it out there, because we already have the LLM exposed.
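For the curious, a sampling request in MCP is a JSON-RPC call from the server back to the client, roughly shaped like this per the MCP specification (the message text here is invented):

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "sampling/createMessage",
  "params": {
    "messages": [
      {
        "role": "user",
        "content": {
          "type": "text",
          "text": "Summarize this tool output in two sentences: ..."
        }
      }
    ],
    "maxTokens": 200
  }
}
```

The client runs the request through its own LLM (typically after user approval) and returns the completion to the server — which is how a server can summarize without ever holding model credentials itself.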
So, that's cool. "So the AI picks the model, kind of? And it picks the right tool — you clearly have multiple — while it's deciding what tool it needs?" Yeah. You can prompt your way to it, but it's still up to the model.
Right. So, what I recommend: A, in your modes, boil down the tools to what you actually need — reduce the tools manually, deterministically, already in your prompt. This prompt could declare tools for what it should actually do. That would be one way.
Then I can configure what I actually want here — this should only use Perplexity, because it only needs to do research. "With a custom mode?" Yeah, a custom mode — you pick your pockets of tools.
So, a custom mode is one way. The other is mentioning specific tools: go into Add Context and point it at specific tools. So instead of saying "look up things on GitHub" and hunting for the right verbiage so it picks the right tool, there's a more direct way.
You can just mention the tool it should use — for example, resolve-library-id. You add those here, and they get handed to the AI as "these are the tools the user wants to use." But it's still up to the model: tool calling is inherently nondeterministic. Even in this case, where we're telling the AI it should use a tool, it might not use it.
So my timer's down to zero — maybe I'll just go back to the slides to wrap up. So, vibe coding. We showed workspace instructions. We showed scoped instructions, which only apply to parts of the code base. We showed custom tools, a little Playwright, deep research.
I haven't shown using web docs. Actually, one of my favorites: just point it to an existing repo and say, read this repo if you have questions. That works great for MCP — when I work on an MCP server, I just tell it to look in the TypeScript SDK for the Model Context Protocol if it has questions.
Because we have cross-repo search, it just works. The agent also has access to problems and tasks — so if you have tasks and linting set up, things will just work. Make sure those are set up in your template. Generating commits I showed.
And then fine-grained review: you can pause at any time; if it asks you questions, you can always type something in and keep steering it in the right direction; and you can trust read-only operations and specific tools. And I showed you editing, too. Instructions: keep refining them as the AI makes mistakes.
One of the key ones is: commit often. I didn't show commits now, but anytime you have a working state, make sure you commit it, so the AI can continue making mistakes and being creative. And lastly, there's a clear pause button at the lower end — when the AI goes off and you're wondering, what is it doing?
Is it doing the right thing? Just pause and review — that's possible as well. I showed a bunch of this for spec-driven development already: it's really about having a spec, having a plan, and using more custom prompts and tools, which I showed.
I showed reusable prompts. There are more MCPs for database access, logging, and project tracking, like the GitHub MCP. The agent can also run tests and do debugging — if you ask it to do test-driven development, like we did, it will actually start running the tests if they're set up correctly in VS Code.
And then we talked briefly about models — if you want to use o3 for the deeper-thinking stuff, you can do that as well. Spec-driven is really about focusing on the spec. And I think a great way to do that is to create the spec from all the conversations you've had about it.
So, if you have a transcript from a meeting about the project, just feed that in and make sure you call out what the final decisions were. It's a great way to make meetings useful — and a great way to not have to write the spec yourself in the end.
"Are there any tools to determine whether the spec is good or not — like, on the opposite side, something that grades a spec for a project?" I would ask AI.
"Someone I know had it critique the spec — say what it's missing and how it could be better — and then basically argued with the LLM about the spec until it got to the state he wanted." Yes, arguing with AI is one great way.
One prompt I like is a critique prompt: just "ask me three questions about my idea," right? Have the AI go into thinking mode — what would you ask somebody for feedback? — and have it critically analyze your stuff.
These prompts are basically the next level of prompt crafting: you don't just ask it to code, you pull it in as a thought partner, a design partner, somebody who can poke holes in your idea along the way. "What about a plan mode?" You can now create one. We don't have a plan mode built in, but we also know Ask/Edit/Agent will not be there forever — it's a series of evolutions. Roo has way more modes that you can customize, so I think we want to allow developers to create their own.
Because I even see very few demos of people in Cline using Plan — they just give it a thing and then it runs, right. In vibe coding, you'd even do planning and then write the implementation plan: you'd spend way more time up front on the what and how, and then you let it implement.
So that would even be: write the spec, write the implementation plan, then implement — you'd have three modes if you do it properly. And that's already the last one. Takeaways: you've got to experiment and figure out what works for you.
At what point can you just give it a task and it runs with it? At what point do you want to have it write a spec first and then implement? Keep giving it feedback and iterate — never just accept a bad answer. And really work on your process.
What works best for you? What works best for your team? Use modes, prompts, and instructions to ingrain that. There are some bonus slides — you can screenshot them. "You can end up like Emacs, where it has major modes and you can have multiple minor modes."
Minor modes, yes — that's where you end up with prompts and custom modes right now, so you've got to clean those up too. And lastly, I think there's a sweet spot for how you prepare your code base for AI: you want well-structured, self-explanatory code.
You want to have the instructions set up, with examples in them, and you want to keep the instructions updated. So that's it. Okay. That was my unplanned workshop. Thank you for coming.