Thank you so much for coming. My name is Gabriela de Queiroz. I work at Microsoft, as you can tell. I'm director of AI, working with startups and doing events like this, outreach, talking to founders. And then I have Ash, my colleague. Hi, I'm a senior advisor with Microsoft for startups.
So pretty much spending my day talking to startups, helping them build their AI tech stack, primarily on Azure. I'm Pamela. I'm a Python cloud advocate. And I spend most of my time working on open source repositories, like the ones we'll be using today. And then also doing live streams and conferences and all that sort of fun stuff like this.
Awesome. So we'll have a lot of hands-on. But before that, we are going to set the stage, share a few things, and we have the instructions so everybody can follow. So today we are going to talk about the AI templates. They're a way for you to run your AI application in minutes.
And the agenda is more or less like this. We are going to talk about Microsoft for startups, something that we call Founders Hub, and then the partnerships, the AI templates, and then we go to the hands-on workshop. How many of you have heard of Founders Hub? One, two. Okay.
Okay. Well, we have something called Founders Hub, where we offer you a bunch of things. Not only credits, but a bunch of other things that I'm going to talk about. And the sign-up process is very easy. It takes you like less than five minutes. It doesn't matter where you are in the stage, if you only have an idea, or if you already have a startup and you are incorporated.
It doesn't matter if you have funding or no funding at all. And there are several cool things about this platform, the Founders Hub platform or product. You have the benefits piece, but you also have one of my favorite pieces, which is the guidance. So you have one-on-one calls with experts to ask anything from technical questions to go-to-market to how do I go about my strategy, anything that you can think of that you can leverage.
And then there is something new, and Ash, please chime in if I'm missing something, but there is something new that we added to the platform, which is called Build with AI. And that's where we are going to be focusing. It is an open source piece. It's all on GitHub, but you can access it through the Founders Hub, which again is the Microsoft for Startups platform that everybody can sign up for and join.
One of the cool things is like you get a lot of credits, right? So like everybody likes free credits, especially when you are trying things out. And we offer up to $150,000 in Azure credits that you can use across Azure services, including, which not a lot of people know, Azure OpenAI.
So we have the same OpenAI APIs on Azure with the whole security and compliance capabilities of Azure. Plus a lot of other benefits like GitHub Enterprise products, Microsoft 365, LinkedIn Premium and more. And then you can use Azure, for example, Azure AI Studio. You can use models from OpenAI, as I mentioned, Llama and others.
So one of the main slides, if you are looking for credits, guidance and so on, this is like the place where you take a picture and you get the URL and you apply in minutes. Do you want to talk about a little bit about more about the cloud credits?
Yeah, I think one of the most important things that I personally like about the entire Microsoft for startups program, it's not quantitative. It's not just that we're throwing out a bunch of credits at you and you're like, build it yourself or like, you know, figure it out yourself. There is a lot of cross collaboration that's happening.
So for example, like we spoke about GPT-4, GPT-3.5 Turbo, etc. Right? Our product teams work very closely with startups. So if there is any private preview happening, we are working personally with startups to have them on board, for them to try out these products, give us feedback that, hey, this is one of the feature capabilities I would like to have.
And that gets passed on to the product team for them to work on. There is my team, which is AI advisors, which is working with startups one on one. So you're getting expert advice. And as a startup, I think getting advisors, getting timely advice, getting timely resolutions is really, really important.
And those are the qualitative benefits that I personally feel are really important when you're working with Microsoft for Startups. It's not just AI advice; you can get a bunch of different experts around you. There is a tool within Founders Hub which helps you match with different experts. You have a question about infrastructure, you have a question about Kubernetes, you have a question about how do I think about my go-to-market?
How do I price my subscription model? There could be a ton of different things that a startup is going to be doing, and you don't have enough resources to help you with that. And that's where you get experts from Microsoft, who are working in a variety of different products to help you navigate these challenges.
So think about the qualitative benefits and the support that you're getting, apart from just the credits. So that's, that's something I would like to add. Yeah, and then we go with you throughout all the stages. If you only have an idea, or if you are building, if you are in the scale phase as well.
So it's for any startup at any stage. As I mentioned, we are helping with all the cutting edge AI tools and helping you streamline AI development. And then we have something very special as well, which goes beyond the Founders Hub, which is the partnership that Ash is going to touch on a little bit.
So one of the things I was talking about, right, Founders Hub is a platform that becomes like an intake for all the startups. This becomes your like, if any of you are like familiar with like YC's platform or like portal, right? Similar to that, like we have Founders Hub as a portal, which has like end-to-end everything.
It keeps track of how many credits you've consumed, what products you've been using, what benefits you've gotten. It's not just the Azure credits; there's a ton of third-party credits that you get, and productivity tools that you get for free. And as a startup, you need all of these tools to help you build that ecosystem.
So that's one of the most important things. Now, as a startup, you're getting started at Founders Hub. That's where you get, like, $150,000 in credits. As you grow in your journey, if you're getting involved with one of the Microsoft strategic VC partners, which includes M12, Y Combinator, Neo, Alt Capital, Alchemist, to name a few, then you get a bunch of extra credits.
So outside of the $150,000, you get extra credit, which brings the total to $250,000 in credits. And then you get startup development managers, who are like dedicated cloud solutions architects, to work with you. Pegasus is another elite-level partnership program that we have with startups, which helps you go to market, which helps get you onboarded on Azure Marketplace, if you need help to amplify what you're doing.
So you're coming and telling us that, hey, Microsoft, help us reach your like, you know, millions of viewers or like millions of people who follow you on your LinkedIn page or on your blog, and help us amplify our startups. So we do blog pieces with you, we do video interviews with you, we do YouTube interviews with you, which helps you amplify that product.
Outside of that, we're also getting startups involved in conferences. So Microsoft takes up a lot of speaking slots at conferences, and we get the startups that we're working with to come and talk about their products on stage. Yeah, we have one, for example: Nixtla is coming to talk about their product, and they got integrated into Azure, so you can run TimeGEN, the model that they have.
So they are coming to talk. Yeah, so exactly. Like, so it's a lot of like, amplification that you're getting from Microsoft's perspective as well. Yeah, I think we briefly covered about some of the pain points. I'll give you like a quick TLDR, right? As a startup, if I'm building a startup, I really do not have a lot of time to like spend two weeks on getting a support ticket cleared off, right?
Like, I need quick solutions. I need a really, really fast experimentation pace. I want to try out different things, different tools, different models, very quickly, and then decide for myself what's the best fit for me. And one of the pain points that we've heard from startups is that, hey, it's really hard to go through tons of documentation and figure out how to get started; even the first quick start to run a particular application takes us weeks altogether to get our first end-to-end application running.
That's the problem that we wanted to target with AI templates. And one of the major, major disclaimers: these are not 101-level examples. These are really, really complex examples. One of the ones that we're going to be showing is RAG with AI Search. So these are really complex examples that you can run within minutes.
And that helps startups get started very quickly and run these applications, customize it as they want. Because it's open source, any of the issues that you're facing, just put it on GitHub and then it'll get resolved. We have an entire Cloud Advocate team working on these templates. If you have any requests that, hey, help us with this particular other template, we're here to help you on that.
All right. So, now it's the fun part. So, now we are going to be doing like hands-on. We have the setup instructions in this URL. You can go over there or scan the QR code and we are going to go through all the setup instructions. Actually, Pamela is going to go through.
Okay. Well, let's make sure everybody has this URL because you are going to actually want to have that doc open. So, it's aka.ms/aie-workshop or you can scan the QR code. And it should open up a document that looks like the little screenshot there that says, you know, quick start with AI templates with setup instructions.
And if you have any trouble, any issues, there are three of us. So you can raise your hands and then we'll come and help you. Does anyone not have the doc URL yet? Still working? Okay. We need to just have a whiteboard. Remember whiteboards? Whiteboards were great. Yeah, this is the whiteboard.
Do you think they'll mind if we just -- did you bring spray paint? I'll get some shots. Okay, great. Thanks. All right. We good? All right. Okay. Just holler at us. Because we're -- I'm going to step through it first. So, you still have time. But we want to make sure you all have that doc.
Okay. All right. I just remembered how PowerPoint works. Okay. All right. So, okay. So, if we look at that -- those instructions. The first thing is that we -- you do need a GitHub account. So, if for whatever reason you don't have a GitHub account yet, this is a good time to get it.
You should be able to get it for free. And if you do have a GitHub account, just make sure you are logged into your GitHub account. And what we provided today for this workshop is two different things that are going to help you deploy these templates. So, one is an Azure Pass.
So, this is really cool. Because normally things cost money. But we're giving you an Azure Pass, which will create an Azure subscription for you, which will give you up to $50 worth of credits. And it will expire in seven days. So you can keep working with it for the next seven days.
And it'll certainly give you enough to get started during this workshop. So, that's really cool. So, none of you should have to worry about any of this costing you money. We don't want -- we don't want that to happen. And then the other thing that we've got you is, normally, when you're using Azure Open AI, you have to sign up to get permission to use it.
You actually have to go fill in a form and get approved for it and get your account opened up for it. So, since it takes time to fill out that form -- and the reason we do this is for responsible AI reasons. We want to make sure people are using Azure OpenAI for good reasons, which is good.
I like that about Microsoft. But it does take time. So, we have set up this Azure Open AI proxy that all of you will be able to use during this workshop. So, you're going to use Azure Pass so that you can freely deploy these things. And then, Azure Open AI proxy so that you don't have to worry about getting permission to use it.
So, that's what a lot of this setup is. So, the first step is to get that pass set up. So, I'll demonstrate from here. Let me go to my Firefox. Okay. So, when you go to this Azure check-in link, it's going to ask you to log in with GitHub.
Oh, let me do it in my -- I've got three browsers open right now. Here we go. Let's do it in Chrome. Yeah? The initial link that takes you to the OneDrive, it says the request is blocked. Request is blocked? Uh-oh. Was anybody else able to open the doc?
Yeah? The Wi-Fi is wrong. Okay. Yeah. The Wi-Fi was -- oh, you mean that sometimes, if the Wi-Fi is bad, it misinterprets it. It could also just be that the URL was slightly wrong. I think that's what happens with aka.ms links. Okay. All right. Cool. So, then that has a link to this check-in website.
So, on the check-in website, we see that it has two options. Either create a GitHub account, if you don't have -- if you don't have one, or just log in with GitHub. So, I'm going to click log in with GitHub. And here we go. I was already logged in in this browser.
And then you'll see this Azure pass. So, this is just a promo code that you're going to use for the next stage. And you can copy to clipboard. And there's this button here that says get on board with Azure. So, when you click on that, then you get brought to this screen here.
And it says what Microsoft account I'm currently signed in as. So, you know, so at this point, you have to figure out what Microsoft account you want to use. So, you could make up a new one. You can just, like, make a new Outlook address. I made one this morning.
It's no big deal. You could use a Gmail account. I don't recommend using your work account if you do have a work account, just because work accounts tend to have a lot of restrictions on them. And I just think things will not work out. So, I recommend some sort of personal account.
Either create a brand new Microsoft Outlook account today or just use your Gmail, whichever of those you want to do. So, you do need to be logged into some Microsoft account. So, it says, okay, I'm going to be logging in with my Gmail. I confirm. I enter the promo code here.
And then I have to do this horrible -- I don't, like, what is -- oh, okay, the captcha. DBW, okay. Okay. And then in this case, I got an error. And that's because I did already redeem my pass for this account. But you should get a success if you haven't redeemed your pass for the account yet.
And then that should give you, set you up with this new subscription in your Azure portal. So, you can see here, now on my pamela.fox@gmail.com, I've got actually two subscriptions. Because I'm actually a paying user of Azure. So, this is my paid Azure subscription. You can see I'm spending $40 this month.
And then here's my sponsorship, which I, you know, won't be paying for. And if you're doing a brand new account, you're only going to see this sponsorship here. So that's the expected flow for getting set up with this Azure pass. You don't have to do it right now.
You could wait until we, you know, break into hands-on time, because then it's probably easier for us to walk around in case there are any issues. And just to be careful, make sure that in the next step, wherever you're using the subscription ID, you use it for the Azure sponsorship one, and not the other one, or you're going to get billed on your own credit card.
Yeah. Yeah. So, if you're worried about that, just make a whole new account. That's what I have in my, let's see, that's Firefox. So, in this account, in my other browser, this is an account I made this morning with just a new Outlook account.
PamelaFox, AIFair@outlook.com. It's pretty good. All right. And that one only has, in this case, I only have the sponsorship. Yeah, question. I see. Okay. So, how many of you have sons with Minecraft? All right. Uh, probably best, as I was saying, that is why I used a whole new browser, not my normal browser.
Um, so it did it already? It already granted you the pass? I just went with, you know, the email address it came up with. Okay. And then it takes me to my account, but there's nothing here that says anything about Azure. And when I put in Azure, um, portal.
What do you do if you don't have the number? If you don't have? Uh, I don't think it makes you enter it. It shouldn't make you -- oh, to get a whole new Outlook account. Uh, but once I log into my Office, it asks me for the new pass information, and then if I enter my normal number, it says it's not a valid number.
Oh, okay. I didn't remember that step. Maybe we can just give him our number. Okay. So if anyone else is having trouble, let me know. I can help. Yeah. Do we want to just get through this stage now? Or how do we want to do it? Sure. Like, do you want to like, see what other issues everyone has?
Yeah. Thank you. Thank you. Here we go. So, um, how many folks are working on getting the Azure pass redeemed right now? Are people working on that now? Okay. And how many of you have actually successfully gotten it? Yeah? Okay. Good. Good. Okay. Cool. Okay. So we'll just get through that.
So, what -- should I move on at some point? Or should we wait? Okay. All right. So we'll spend five minutes getting through this step and answering any questions. So go ahead if you haven't yet, you know, try to get your Azure pass redeemed. And we'll just make sure we have enough time to get that redeemed for everyone.
You can pretty much use any account, like your university account, personal account. Just not work account. Just not work account. Would the university account still be okay? Because university tenants sometimes have restrictions too, don't they? Can try. I've used my university account, it works fine. Okay. So with these ones?
Yeah. With this one. Okay. With Azure one, yeah. Okay. No, but with these templates. Okay. It might not work. Um, do you all remember if we need to add the address and all of that? I don't remember when the Azure pass. You have to. Okay. So you have to.
Thank you. Wait. So how did he get past the number stage? He did. I just asked him for a 204 number. Oh, okay. Starting 646. Okay. The next step is the proxy. Okay. We'll wait until Gabriela is done with questions. Yep. Uh, raise your hands if you've got that pass in.
We're just trying to see. Uh, okay. Activated it on Azure. Okay. Okay. Yeah. No. Oh, question. Okay. Clicking on Word. You already went to the next step? Thank you. Thank you. Yeah, so to confirm that you have the Azure thing working, you can go to portal.azure.com and then it's going to look super empty.
You'll see nothing under resources, but if you click on subscriptions, then that's where you should see Azure Pass sponsorship. So your subscription is basically like, it's kind of like a billing account sort of thing. So that's how you know you've got it is if you see that under your portal under subscriptions.
All right, so let's try, now let's look at the proxy. Okay, so for the proxy, that's another URL which is linked from the docs. So I'll go ahead and open that. I think I've already logged in on this one. Okay, so what you're going to see is this page that looks like this and it has this login with GitHub.
So then I log in with GitHub. And now I'm logged in. You can see it says welcome to my GitHub username at the top. And when I scroll down, I can see an API key and an endpoint. So here's the API key and here is the endpoint. So this is what we're going to be using in order to interact with OpenAI, Azure OpenAI models.
And we'll just have to specify this key and this endpoint when we're using a template. So you're basically going to just keep this open so that you can continually copy and paste these two fields here. I'll just create a small note where you have your subscription ID, your endpoint, and your key copy-pasted.
Yeah. And to be clear, normally, I don't recommend using API keys. And I've got all these videos and blog posts about how you should never use API keys. Because when you're actually using Azure OpenAI, you can do keyless authentication using this thing called managed identity. And we have a talk coming up next week about that.
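For reference, here is a minimal sketch of what that keyless pattern looks like with the Python SDK; the endpoint, the API version, and the assumption that your identity already has a role on the Azure OpenAI resource are placeholders, not anything from this workshop's setup.

```python
# Sketch of keyless auth (no API key). Assumes your identity has the
# "Cognitive Services OpenAI User" role on the Azure OpenAI resource.
from azure.identity import DefaultAzureCredential, get_bearer_token_provider
from openai import AzureOpenAI

token_provider = get_bearer_token_provider(
    DefaultAzureCredential(), "https://cognitiveservices.azure.com/.default"
)
client = AzureOpenAI(
    azure_endpoint="https://<your-resource>.openai.azure.com",  # placeholder
    azure_ad_token_provider=token_provider,
    api_version="2024-02-15-preview",  # assumed; use whatever your resource supports
)
```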
But in order to use this proxy and, you know, take care of the permission issue, we are temporarily sinning and using keys. Okay. So that's, that's a proxy. So you just have to log in and then you should get the info for the proxy. Okay. All right. So now I'm going to step through one of the actual templates.
So, you know, all the templates are open source repos, but we've put together instructions specific to this workshop in, in this, uh, read me here, these three read me's and really specific to using that proxy. So we're going to ramp up in terms of complexity. So we'll start off one that's, uh, really simple.
Like this is one where you, you know, it's, uh, just to show you how things are working and, and get things going. And then we'll move on to two different rag applications, uh, that are more sophisticated and ending with our most sophisticated one that has been deployed like a hundred thousand times at this point by Azure developers.
So it's a very, very popular one. Um, so let me start with, you know, this first one. So we go to the read me and the first step is to open the project using GitHub code spaces. Uh, has anyone here used GitHub code spaces? Okay. A few people. Okay.
So GitHub code spaces is very cool. Any GitHub repository you go to, you can open them up in a code space. So you can like start hacking on that repository immediately. Uh, and then we can also customize like the environment for that. So what it's going to open is actually a VS code in the browser that has that project loaded in.
So, one very cool thing about GitHub Codespaces: you know when you share something with someone and they say, well, it was working on my machine, but it's not working on yours? That problem that we all have with setting up local environments and all of that -- if you use Codespaces, you don't have that problem, because I will have the same environment as Ash, as Pamela.
So I will not have the problem of: it's not working on my computer, but it's working on yours, right? So it's one of the pain points that Codespaces came to solve, and all of us use Codespaces on a daily basis, because setting up your local environment on your computer can be very, very painful.
Um, so, so yes. Does everybody have access to their Azure subscription by now? Anybody who's not? Okay. If you're not, you can just, okay. Can we just like try to solve that and then we go through? Yeah. Okay, cool. Azure? The proxy? Well, that's a proxy. Yes. So the question is, how do you get to the repo?
So there is a link. This one? Yeah, this link over here. It takes you to the GitHub repo with all the, everything that you need. It's here. With the code and everything. No, I can't edit. Yeah, there's like a blank page. Skip the blank page and then go to the not blank page.
You can blame me. I'll help you. Control-F "hands on". So the other question was, how do you know if you have the subscription? You go to portal.azure.com and then you find Subscriptions. Yeah, and then you click on Subscriptions. So now I'm going to step through the instructions just for the quick start one, and then we'll really set everyone loose, so that we can walk around.
So this one is, yeah, just a simple chat application. So we're going to start off with running this local, well, I'm going to call it local. I'm inside GitHub code spaces, which is VS code in the browser in this containerized environment, but I'm going to start a local server inside GitHub code spaces.
So I like to start with local development first when I can, so I can like make sure that things are working. And then once I know it's working locally, then I can deploy it to Azure. So we're going to start with a local server inside the code space. So looking at the instructions here, so I've got the project open.
The first step is to make a .env file. So we have a sample here. So I'm just going to copy and paste the stuff from the sample. And you know, the one negative about using Codespaces with conference Wi-Fi is that it is an online environment. So if you want, you are also welcome to try these projects out locally.
This is just how we can guarantee less issues with developer environment setup. Okay, so you can see in this environment file, we need to specify the endpoint and the key and the deployment. So for the endpoint, we're going to go to the proxy page and grab that endpoint URL and put that here.
And so it looks like this: https, polite-ground, blah, blah, blah, blah, slash api slash v1. That's what your endpoint should look like for all of these. So literally, the OpenAI SDK is going to send requests to this endpoint and, you know, get responses from it. The next step is the key.
So we go here. And we're going to copy and paste that key. There's my key. Love it. And then the deployment name. So this is the name of our deployment of our GPT model. Have any of you used OpenAI.com? Okay, so on OpenAI.com, you just use things by their model name; you just say, oh, I want to use GPT-4o.
I want to use GPT-3.5 Turbo. On Azure OpenAI, you actually make deployments of models, where you say, okay, I want a new deployment of GPT-3.5 Turbo, and then you give that a name. So we've actually named the deployment the same as the model here, to make it either more or less confusing.
But that is one big difference between OpenAI.com and Azure OpenAI: Azure OpenAI has this notion of deployments. But everyone's going to put in gpt-35-turbo here. So there we go. Okay. So I've got a .env. And the next step is to run the server. All right.
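As a rough sketch of what the app then does with those three values (the environment variable names and API version here are assumptions, not necessarily the repo's exact ones), the .env gets loaded and fed into the OpenAI client, with the deployment name used where openai.com would expect a model name:

```python
# Sketch only: variable names are assumptions, not necessarily the template's.
import os
from dotenv import load_dotenv          # pip install python-dotenv
from openai import AsyncAzureOpenAI     # pip install openai

load_dotenv()  # reads .env into the process environment

client = AsyncAzureOpenAI(
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],  # proxy URL ending in /api/v1
    api_key=os.environ["AZURE_OPENAI_KEY"],              # key from the proxy page
    api_version="2024-02-15-preview",                    # assumed version
)

# On Azure, "model" is the *deployment* name you configured, e.g. gpt-35-turbo.
DEPLOYMENT = os.environ.get("AZURE_OPENAI_DEPLOYMENT", "gpt-35-turbo")
```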
So this is running the Python backend. We can open this up. And so, when I click on this URL inside the code space, it'll actually open up a very different URL. So this is the port running inside the code space. So it's got this really funky URL.
So then I can say like, write a haiku about AI engineer world fair. Okay. I always do all my testing. I say to write a haiku because otherwise LLMs get really, you know, verbose and you're just sitting there waiting forever. So there we go. That is working. That's getting back responses.
So you, you know, send any message that you can, haiku about San Francisco. And there we go. There we go. I actually just watched a video yesterday about how the golden gate bridge was built. It's fascinating. Okay. So there we go. Now it is actually running. So this is running the app that's inside the source folder.
So if you want to explore it, you can. This is a Quart application. Um, has anyone actually heard of Quart? It's not well known. Okay. Who's heard of Flask? All right. Quart is just the asynchronous version of Flask. Like, literally, probably Flask will become Quart at some point, or vice versa.
So Quart is just Flask, but async. Uh, and we always want to use an async framework when we're building applications that make calls to LLMs, because we want better concurrency. So you'll see for all our samples, they're all either using Quart or FastAPI for the Python backends, because those are the ways that we can have async.
So you'll see async. If you haven't, you know, worked with async a lot in Python, you'll just see asyncs and awaits all over the place, because that's how we build with async backends. But this is the actual code that's happening here: we're streaming in the response from the OpenAI client, we're getting back the responses, and we're streaming it back to the front end using something called JSON lines, or newline-delimited JSON.
It's just a way of streaming one line at a time. So let me do a longer one just so I can show you how the streaming works. And streaming is also another general practice: if you're making a user-facing application that's making a call to an LLM, you really want to stream that response in, ideally, because then it's going to appear faster to the user. As soon as you get that first token, you can start streaming it in.
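Here is a minimal sketch of that Quart-plus-streaming pattern: an async route that streams the model's tokens back as newline-delimited JSON. The route path, environment variable names, and payload shape are assumptions for illustration, not the template's exact code.

```python
# Minimal sketch (assumed names/shapes): async Quart backend that streams
# chat completions back to the browser as newline-delimited JSON (NDJSON).
import json
import os
from openai import AsyncAzureOpenAI
from quart import Quart, request

app = Quart(__name__)
client = AsyncAzureOpenAI(
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
    api_key=os.environ["AZURE_OPENAI_KEY"],
    api_version="2024-02-15-preview",  # assumed
)

@app.post("/chat/stream")
async def chat_stream():
    body = await request.get_json()

    async def generate():
        # stream=True yields chunks as the model produces tokens
        stream = await client.chat.completions.create(
            model=os.environ.get("AZURE_OPENAI_DEPLOYMENT", "gpt-35-turbo"),
            messages=[{"role": "user", "content": body["message"]}],
            stream=True,
        )
        async for chunk in stream:
            if chunk.choices and chunk.choices[0].delta.content:
                # one JSON object per line, so the frontend can parse incrementally
                yield json.dumps({"content": chunk.choices[0].delta.content}) + "\n"

    return generate(), 200, {"Content-Type": "application/x-ndjson"}
```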
So we are actually streaming in a token at a time. So like write, uh, a long essay about San Francisco. Okay. So we should see this actually stream in here. Uh, once it, so it still takes some amount of time to get that first token, but then once you get that first token in, there we go.
That was so fast. I wonder if the proxy is actually making it be a bit different. Let me see if I can see it in the stream what happened. Um, I should see it in the response. Fascinating. Um, normally I can see the stream tokens in the response here.
I can try it with another one of ours, though. Okay. All right. So we'll have to trust that. But, um, but yeah, there you go. So this is just our getting started experience. So this is the local server, right? So we are running the local server, but we are hitting Azure, a cloud resource, and that's because it's hard to have a local GPT-3.5.
Now, if you want, you can actually use these samples with Ollama. I don't know if any of you use Ollama, but Ollama is a really great way to run small language models. And so I add support for Ollama to all my samples when possible. So if you want, locally, you can actually run against these small local models, like Phi-3 or Llama 2 or whatever. It's just not going to be the same thing, because they're different models, but that is an option for local development.
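As an illustration of that local option: Ollama exposes an OpenAI-compatible endpoint, so roughly speaking you can point the same SDK at it. The port and model name below are the usual defaults, but treat the whole thing as a hedged sketch.

```python
# Sketch: same OpenAI SDK, pointed at a local Ollama server instead of Azure.
# Assumes `ollama pull phi3` has been run and Ollama is on its default port.
from openai import OpenAI

local_client = OpenAI(
    base_url="http://localhost:11434/v1",  # Ollama's OpenAI-compatible endpoint
    api_key="nokeyneeded",                 # Ollama ignores the key, but the SDK requires one
)
response = local_client.chat.completions.create(
    model="phi3",  # or llama2, llama3, etc., whatever you've pulled
    messages=[{"role": "user", "content": "Write a haiku about local LLMs."}],
)
print(response.choices[0].message.content)
```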
All right. So that's all set up. The next step is to actually deploy this to Azure. So that's when we are, you know, going to be using that Azure account that you set up. So the first thing we have to do is log in. So we're going to do azd auth login.
And we're going to use the device code flow when we're inside a code space. So that's going to have us copy and paste some, uh, some code here. So I press enter and it opens up this new tab and I'm actually going to open this up in, uh, another browser.
We'll just pick a random one. Okay, here we go. And then I'm going to grab the device code here and then I put it into here, sign in. Okay. So I guess I'm using my Gmail here. All right. So I am signed in. Okay. So now I've logged in.
So you want to make sure you log in with the account that you just set up for that Azure pass. Then we're going to create a new, uh, AZD environment. So this is kind of like a new deployment environment. So a lot of times when I'm developing these, these samples, I've got like 20 different deployment environments where I'm trying out different configurations and stuff.
So I'll make a new one here for this one in the chat. Quick start. So you just give it, give it a little name. And then the next step is that we need to set some environment, some AZD environment variables. These are like our deployment variables. That's going to tell the, um, the infrastructure how to provision everything.
So we're going to tell it to not make Azure OpenAI, because we're using the proxy. We're going to tell it the name of our deployment. So I can just go ahead and copy and paste those two things. So I'll just paste them here. Okay. And then I need to tell it the key.
So this is the same key that we did earlier, but now this is going to be used by the actual deployment flow. So I go and find, oh, I deleted way too much. Come back. Okay. So just delete that part and grab the key. All right. So now I've set the key for deployment and then I'm going to set the endpoint for deployment.
And here we go. Where's that proxy at? There we go. All right. Okay. So now I've set all these AZD environment variables. These are going to be used when we are configuring the infrastructure. And so then we run AZD up. So what this is actually doing is that we're using this, um, infrastructure as code.
Does anybody here use Terraform or Bicep or ARM? Okay. Yeah. So Terraform is probably the more well-known one. So at Azure, we have our own version. Um, originally it was ARM, and it was JSON. Now we have Bicep, which is like a better version of ARM -- arm, stronger. Uh, but these are all our Bicep files.
You can also write Terraform if you want to do that. I just know Bicep more than Terraform. So we are using Bicep, which is infrastructure as code, and that Bicep describes how everything is going to be made. So how are we going to make the Azure OpenAI? How are we going to make the Log Analytics, the Container Apps, the roles, all that stuff.
All right. So I need to select a subscription to use. So I'm going to use the sponsorship subscription. Um, for many of you, you might just have one subscription. And then a location: this is going to be where, like, our container app is going to go. So I'll just pick a random location there.
Okay. And now it is packaging everything up. Uh, so it is, yeah, it's actually building Docker image. So this one gets deployed to Azure container apps. That's like an Azure option for running containerized applications. We also have like Azure app service, Azure functions, Azure Kubernetes. Uh, but for this one and the second one, we're using container apps as a, you know, a nice place if you're using Docker.
How many of you like Docker? I shouldn't say like, how many of you use Docker? Okay. It's a similar, okay. You know, I don't want to presume. Um, so if you do like Docker, you know, Dockerized, um, environments, you know, this is using a Dockerfile. Uh, you can see the Dockerfile here.
Uh, you know, installing requirements, running the server, all that sort of stuff here. Uh, so then it's provisioning the resources. So it's going to do that whole step. And then I've already got one, uh, you know, pre, uh, already deployed here. So we, it, once it's deployed, we'll have a container apps URL.
And this is a URL that you can, you know, tweet, share publicly, whatever. Try not to use up all, I mean, I guess use up all your credits, whatever. Uh, hi, LLM, what's up? I don't know. I never know what to say. Um, I don't have feelings. Thanks. Uh, so now it is deployed there.
And then, if I want, I can look at my portal and actually see what it made. So I can go here and look at maybe Container Apps and, uh, that's actually a different one. So let me look for, let me see. I know, I think I just have to remember which account I used. I've got three different portals going right now.
Here we go. Uh, so we'll do, uh, the quick start. Maybe this one. There we go. So once it's all deployed, you can go into your portal and actually find the resource group that was made, and then you can see what was made underneath it. So here we have a container app, a container registry.
This is everything we need in order to make a containerized app. So that's the flow for that one. The flow is similar for the other ones, but the other ones are a lot more sophisticated. I'm saying, like, they're ones that people are actually using in production.
So I'll just talk about RAG. Um, RAG stands for retrieval augmented generation. This is our solution for the fact that LLMs like to make stuff up. I mean, that's kind of the way they work. They're just word prediction machines. And so if you get them to predict something that they don't know, then they'll go ahead and predict something, right?
So how do we get LLMs to give us reliable output for a particular domain? We can use retrieval augmented generation. And so how this works is that we get in a user question. We use that to search some sort of database, whether it's a, you know, a search engine, a vector database, whatever you want.
We search it, we get back results, and then we send both the original user question and the search results to the LLM and say, hey, now please answer the user question based off the search results. And then you'll get a really good answer, as long as you have a very good search engine.
So you really want to pick a really good retrieval mechanism, a really good search engine, at that step. Because if you get good results, then you'll get a great answer from the LLM. Because LLMs are incredibly good at synthesizing information, summarizing based on what they see. So they just need to have this, you know, really good search step.
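In code, the whole pattern at its simplest really is just retrieve-then-prompt. A hedged sketch, where `search` is a stand-in for whatever retriever you pick (AI Search, Postgres, a vector database), not a real library call:

```python
# Minimal RAG sketch: `search` is a stand-in for your retriever
# (AI Search, Postgres, a vector DB, ...), not a real library call.
def rag_answer(client, deployment: str, question: str, search) -> str:
    results = search(question, top=5)          # your retrieval step
    sources = "\n".join(f"[{r['id']}]: {r['content']}" for r in results)
    response = client.chat.completions.create(
        model=deployment,
        messages=[
            {"role": "system", "content": "Answer ONLY using the provided sources. "
                                          "Cite the source id for each fact."},
            {"role": "user", "content": f"Question: {question}\n\nSources:\n{sources}"},
        ],
    )
    return response.choices[0].message.content
```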
So we have two different RAG options that you can try deploying today and over the next seven days. So the first one is RAG on Postgres. And so this is, if you like already had a database, imagine you've got like a retail website, you've got a bunch of products that you're selling, and you wanted your customers to be able to ask questions about those products, right?
So you can, you can just search off those table rows for, you know, what the user is asking about, get back the matching table rows, and then you pass those table rows to the LLM and say, hey, answer this user question about the table rows. I'll show you what that actually looks like when deployed here.
So here I've got, you know, a product table for this outdoor, outdoor shoe company. And so the user puts in, puts in a question here, and we get back all these results, and where they say, you know, this is just info from the rows. And we can look at the thought process here.
And so we can see that actually, actually, it's even, I did the fancy one. Okay, I'll use the simple flow first, so I can show the simple flow. And, and then we'll move on to the advanced, advanced RAG. Okay, all right, so here, now if we look at the thought process here, we get the search query, we use that to query the database.
So these are the rows we get back from the database. And we do both a vector search and a text search. Now, I'm sure you've heard lots of things about vector databases. They're great, but you need vector search and text search, and you need to combine those results together.
If you use vector search alone, you will not get good results. I've done, like, hundreds of evaluations of this sort of thing. You need to have a hybrid search, which is going to do both a vector search and a text search. So for Postgres, we can use pgvector for vector search, and then we can use the built-in full-text search for text search, and then we can combine them together.
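For a rough idea of what that hybrid query can look like on Postgres, here is a sketch that ranks by pgvector cosine distance and by full-text rank, then fuses the two with Reciprocal Rank Fusion; the table and column names are made up, and the actual template's SQL is more involved.

```python
# Hybrid search sketch for Postgres (made-up table/column names).
# Execute with psycopg after registering the pgvector adapter for query_vector.
HYBRID_SQL = """
WITH vector_hits AS (
    SELECT id, RANK() OVER (ORDER BY embedding <=> %(query_vector)s) AS rank
    FROM products
    ORDER BY embedding <=> %(query_vector)s
    LIMIT 20
),
text_hits AS (
    SELECT id, RANK() OVER (
        ORDER BY ts_rank_cd(to_tsvector('english', description),
                            plainto_tsquery('english', %(query_text)s)) DESC
    ) AS rank
    FROM products
    WHERE to_tsvector('english', description) @@ plainto_tsquery('english', %(query_text)s)
    ORDER BY rank
    LIMIT 20
)
SELECT COALESCE(v.id, t.id) AS id,
       -- Reciprocal Rank Fusion: 1/(k + rank) from each ranking, summed
       COALESCE(1.0 / (60 + v.rank), 0) + COALESCE(1.0 / (60 + t.rank), 0) AS score
FROM vector_hits v
FULL OUTER JOIN text_hits t ON v.id = t.id
ORDER BY score DESC
LIMIT 5;
"""
```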
So we get back results, and then we, you know, send to the model. We say, "Hey, your job is to answer questions based off of sources. Here's the user question, and here's the sources." So at its simplest, this is what RAG is. This is the actual call that we make to the model, is please answer according to these sources.
Here's the question. Here's the sources. We get back the response. Now, we can get a little fancier with that, so we go to the advanced flow here, and I say it's fancy, but I think it's actually what most people are doing at this point for their RAG at the least.
So in this flow here, the first thing we do is we take the user's question and we rewrite it into a better query, because user questions aren't really optimized for searching databases or searching search engines. So we first ask an LLM, like, "Hey, here's a user query. Make this into a better query." So this is what we can call the query cleanup phase or the query rewriting phase, and it's a really useful first stage to have in a RAG application.
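A minimal sketch of that query-rewriting step; the prompt wording here is a paraphrase, not the template's actual prompt:

```python
# Query rewriting sketch: turn a chatty user question (plus chat history)
# into a compact search query. Prompt text is illustrative, not the template's.
async def rewrite_query(client, deployment: str, chat_history: list[dict], question: str) -> str:
    response = await client.chat.completions.create(
        model=deployment,
        messages=[
            {"role": "system", "content":
                "Rewrite the user's latest question as a short keyword search query "
                "for a product database. Return only the query text."},
            *chat_history,
            {"role": "user", "content": question},
        ],
        temperature=0.0,
    )
    return response.choices[0].message.content.strip()
```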
And so then we get back, you know, search results, and then, well, we get back the query, right? So in this case, it actually ended up giving the same query in this example, and then we get back the results, and then we send it. But this gets particularly helpful when we have multi-turn conversations.
Like, if I type in, let's see if it's going to perform for me today, more options, right? More options on its own is a terrible query to send to a search engine, right? What is more options? More options about what? So I'm hoping that my query rewriting phase is going to clean this up.
Now, yeah, see, I should test this before I do it. All right, so I can demonstrate that more in our other example, because I have done that demo more. This one's -- repo's a little fresher. I made it like a month ago. Okay. So that's what we're going to need for it.
Now, another thing we can do in the query rewriting phase, though, is that we can actually use OpenAI function calling in order to get the model to generate SQL filters for us. So that's what we've done here is that the user asked, I want climbing gear cheaper than $30.
Well, we can make that into a SQL filter. So we actually ask the LLM. We use OpenAI function calling and say, hey, can you tell us if we should do a price filter here? And so it comes back and says, yeah, you should do a price filter.
It should be "less than", and 30. And then we can use that to construct a SQL filter. So that's another really cool thing about having that first query rewriting phase: you can start doing more sophisticated things and having it actually come up with more structured queries, and not just do a full-text search.
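Here is a hedged sketch of how that filter extraction can work with OpenAI tool calling; the tool name and schema are illustrative, not lifted from the repo:

```python
# Sketch: use tool calling to pull a structured price filter out of the question.
# The tool name and schema are made up for illustration.
import json

SEARCH_TOOL = {
    "type": "function",
    "function": {
        "name": "search_products",
        "description": "Search the product catalog",
        "parameters": {
            "type": "object",
            "properties": {
                "search_query": {"type": "string"},
                "price_filter": {
                    "type": "object",
                    "properties": {
                        "comparison": {"type": "string", "enum": ["<", "<=", ">", ">="]},
                        "value": {"type": "number"},
                    },
                },
            },
            "required": ["search_query"],
        },
    },
}

def extract_query_and_filter(client, deployment: str, question: str) -> dict:
    response = client.chat.completions.create(
        model=deployment,
        messages=[{"role": "user", "content": question}],  # e.g. "climbing gear cheaper than $30"
        tools=[SEARCH_TOOL],
        tool_choice={"type": "function", "function": {"name": "search_products"}},
    )
    args = json.loads(response.choices[0].message.tool_calls[0].function.arguments)
    # e.g. {"search_query": "climbing gear", "price_filter": {"comparison": "<", "value": 30}}
    return args
```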
Okay. So that's RAG on Postgres. And so there we saw the flow of that. The other one that we have -- and this is the one that's super popular that's been deployed thousands of times -- and this is what many people think of when they think of RAG -- is being able to do RAG on unstructured documents.
So you've got PDFs and docs and Excels and HTML or whatever, right? You've got all these documents and you want to be able to ask questions about it. And people are really excited about being able to finally ask questions about PDFs because then we don't have to open PDFs because nobody wants to open a PDF, right?
So we can do -- we can do RAG on documents. And so that's what this demo does here. And so let me go ahead and show the deployed version of that one, right? So I asked, what does a product manager do? I get back citations. I click on a citation.
That will load in the particular page number that it got it from, right? So this is the PDF and the page number where we got it from. And this is, you know, a question that's specific to this particular employee handbook. And we could do this RAG with anything. And so here you can see the, you know, all the citations it found.
And for the thought process here, it's similar, right? We have a query rewriting phase, we have the search results phase, and we've got the prompt to generate the answer. So a lot of this is really similar. The big difference is that here we have to have a data ingestion phase because we need to figure out a way to take these, like, you know, 50 page long PDFs or something, and store them in, you know, in a searchable way.
So we have a data ingestion pipeline that will take a PDF; we crack it using Azure Document Intelligence, which is very good at extracting text from documents. Then we chunk it; we do token-based chunking, so we try to come up with chunks that are about 500 tokens large. And then we vectorize those chunks with, you know, the OpenAI embedding models, and then we store them in Azure AI Search.
So that's the data ingestion phase. That's why this is the most complicated architecture: because of that data ingestion phase there, right? So for data ingestion, we're using Document Intelligence, Azure Storage to store the documents, Azure OpenAI to embed, and then Azure AI Search to store the chunks.
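A simplified sketch of the chunk-and-embed part of that pipeline, using token-based chunking with tiktoken and then the embeddings API; the chunk size and embedding deployment name are assumptions, and the real template also handles overlap, tables, and page mapping.

```python
# Ingestion sketch: split extracted text into ~500-token chunks, then embed them.
# Real templates also handle overlap, page numbers, and table-aware splitting.
import tiktoken

def chunk_text(text: str, max_tokens: int = 500) -> list[str]:
    encoder = tiktoken.get_encoding("cl100k_base")  # encoding used by recent OpenAI models
    tokens = encoder.encode(text)
    return [
        encoder.decode(tokens[i : i + max_tokens])
        for i in range(0, len(tokens), max_tokens)
    ]

def embed_chunks(client, chunks: list[str]) -> list[list[float]]:
    # "text-embedding-ada-002" is a typical deployment name here; yours may differ.
    response = client.embeddings.create(model="text-embedding-ada-002", input=chunks)
    return [item.embedding for item in response.data]
```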
But then it's really cool, and you can do it with all sorts of things. So I've got one that has my blog in it so I can ask questions about myself. Let me get that one open. Let's see my blog. Here we go. And there we go. Good sleep strategies.
I was just telling them how bad I am at sleep, but I have researched it a lot because I'm so bad at it. These are all from my blogs. Okay. And that's from, this is from parsing in like an HTML site. Yeah. So there you go. So those are, those are our, there's my blog.
Those are, those are the RAG ones. And so with RAG with the Azure AI Search, that's one where you could immediately get start, like get started with putting your own documents into it and seeing what it's like to be able to chat off them. What? Oh, yeah. Good question.
Can you put them somewhere? Yeah. We'll put in the same doc, the same word doc. We can put the slides over there. Okay. And I'm, I'm done. Okay. Cool. I know like, let's go for the questions. And then I know that some people were a little bit behind. I want to make sure that you have something running.
Of course, everything that Pamela showed you, you're going to be able to do in your own time. Like, we ran this over and over and over and over again. We tried different things. We customized the HTML of the chat. We also changed the system message. Like, for example, one of the things that you can do is, instead of "you are an AI assistant," I can say, "you only know about Nintendo."
"For anything else, just say, whoa," right? So you can do things like that. So there is a lot of customization that you can do on these applications. Uh, as she was showing you, you can use your own data. She was showing it with the data from her blog, right?
Uh, but I want to make sure that everybody's more or less on the same page, or if you have any questions like, whoa, Pamela, what did you just do? I have no idea what RAG is. Or any questions, so we can help you get up to speed. Uh, yeah, great question.
So what, um, formats can it handle? Uh, so Azure Document Intelligence can handle quite a few. So we have the, let me find the document about it, um, data ingestion. Okay. So these are all the ones supported by Document Intelligence, right? PDF, HTML, docx, pptx, xlsx, and images are all supported by Document Intelligence.
And then we built our own parsers for text. What is what, images? Right. So if you put, yeah, it's a good question, you see images. So if you send images to Document Intelligence, it'll OCR them, basically. It'll extract the text. Now, the different thing is that you might want to do, um, a GPT vision approach, which is a whole different thing, which is where you're actually sending an image to GPT-4o and asking a question about it.
Now this repo does actually optionally support that. So if that's something you're interested in, you can try that out. That's, that's actually different from ingesting an image, um, slightly different process. But that's, that's also an option. So it just depends what are your images and what are you hoping to get out of them.
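For reference, a sketch of what that GPT vision style call looks like: sending an image alongside a question through the chat completions API. The deployment name and file path are placeholders.

```python
# Sketch: ask a vision-capable deployment (e.g. gpt-4o) a question about an image.
# Deployment name and file path are placeholders.
import base64

def ask_about_image(client, deployment: str, image_path: str, question: str) -> str:
    with open(image_path, "rb") as f:
        image_b64 = base64.b64encode(f.read()).decode()
    response = client.chat.completions.create(
        model=deployment,
        messages=[{
            "role": "user",
            "content": [
                {"type": "text", "text": question},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/png;base64,{image_b64}"}},
            ],
        }],
    )
    return response.choices[0].message.content
```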
Yeah. What about, like, PDFs with images inside that have text that's, like, context to what's going on? So Document Intelligence will extract as much as it possibly can. To me, that's sometimes too much, but I did have an incident where I did another one based off the Python Playwright documentation, and the Python Playwright docs have some images of the Node Playwright.
And so it actually extracted the JavaScript out of those images. And then that messed up my whole RAG, uh, because it did extract it. So Document Intelligence will generally try to extract as much as it can. And so if there is text in the images, it'll just bring it out as text.
Yeah. Yeah. So you could do, you can add additional fields, um, where you just mark them as searchable, and then they would get searched as part of the full text search. Uh, so I think we even start off with three different searchable fields, but you can, yeah, you can add fields, you mark them as searchable, then they'll get searched as part of the full text search.
In terms of the vector, only what you vectorize will be searched. So if you do really want something to be searched in the vector search, then you'd want to do like what's called a content stuffing or content expansion, which is where you take everything and you stuff it into the same field and then you vectorize it.
Um, which is totally something you can try if you think it's going to be useful for vector search. But I have to really warn about vectors: you just want to be careful with your vectors, because sometimes we put too much faith in vectors.
Like, I'll show you the blog post I did last week where I ran the stats. Um, so vector search is not enough, right? But look at the stats down here. Um, the other thing you should do is evaluate.
Okay. So if I did, this is a text only search, it got a groundedness rating of 4.87, which is fairly high. And, um, I only was able to get 0.02 improvement by moving to text plus vector. So you'll hear a lot about vector, but, but please remember to use hybrid and to use good hybrid.
So if you just blindly combine vector and text and kind of just use a basic algorithm, which is the reciprocal rank fusion algorithm, then you'll actually get pretty poor results, because of those vector results. Because you have to remember, when you do a search across vectors, you will always get results.
Cause it's going to give you the most similar, even if it's really far apart. Right? So that's the danger of vector search is that you're always going to get results. And those results might be noisy. They might be distraction. And if you distract an LLM, it is very, very distractible.
So, um, yeah, so the best, like with Azure AI Search, this is the AI Search one: the best results are if you do hybrid with their semantic ranker model. And that's an additional machine learning model that actually reranks results according to the original user query. And so that's the only way, in my experience, that I can use vectors and actually get back results better than a full-text search.
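A hedged sketch of that hybrid-plus-semantic-ranker query with the azure-search-documents Python SDK; the index name, field names, and semantic configuration name are assumptions about your index, not values from the template.

```python
# Sketch: hybrid (text + vector) query with semantic reranking in Azure AI Search.
# Index/field/config names are made up; adapt to your index schema.
from azure.core.credentials import AzureKeyCredential
from azure.search.documents import SearchClient
from azure.search.documents.models import VectorizedQuery

search_client = SearchClient(
    endpoint="https://<your-search>.search.windows.net",  # placeholder
    index_name="documents",
    credential=AzureKeyCredential("<search-key>"),
)

def hybrid_search(query_text: str, query_vector: list[float], top: int = 5):
    results = search_client.search(
        search_text=query_text,                       # full-text half of the hybrid query
        vector_queries=[VectorizedQuery(
            vector=query_vector, k_nearest_neighbors=50, fields="embedding")],
        query_type="semantic",                        # enable the semantic ranker
        semantic_configuration_name="default",        # assumed config name
        top=top,
    )
    return [{"id": r["id"], "content": r["content"]} for r in results]
```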
So just to plead: evaluate, and be careful with your vectors. Any other question? Or do you want us to repeat one of the apps? We can go over it slowly, making sure that you are getting every little step, because some people were having issues with the key and weren't able to run it.
Uh, it's good, but we can also walk, like walk around, right? Yeah. Yeah. Yeah. Uh, like, okay. I have a question then for you. Uh, were any, like, were you able to get at least one thing running? Okay. Anybody didn't get anything running? Like have no idea what we are doing?
Okay. So we are going to help you. Um, you, you didn't get anything running? Okay. And before that, I know that some people will probably walk around, um, but I really, really want something from you all. So, uh, one of the things that I want you all and you can save for later because this is very important for us is like, we love feedback.
So we give workshops and talks all the time and we want to improve, right? But don't say like, oh, I wish the wifi was better. Sure. Yeah. But this is out of control, right? So if you can take a picture, I would really appreciate any feedback that you have for us to improve or to make this.
This is like very dense, this workshop, like there is a lot, uh, we try to compress. So you have like a lot of like materials that you can go and work by yourself. Um, yes. So I have one of the questions for people who are like either working at a startup or like want to build a startup or are currently a founder of a startup would love to know what are some of the use cases that you're working on.
And does it like closely relate to any of the AI templates that we showed today? If not, like tell us more because this is not the only templates. This is just like three of the examples that we showed in the, for the workshop, there's tons of more AI templates available on the GitHub repository.
But for now, like would love to learn more about like, you know, some of the use cases that you're working on or you're interested in building and maybe like, you know, we can, we can answer some specific questions. So it's, is somebody interested in sharing a use case that they have?
But yeah, like really appreciate if you have any questions. Yeah. Any like examples. Uh, so one thing that it's, it's a good practice now that I'm seeing is when you open code spaces, you are paying for it. So make sure that you either pause it or you delete it.
Otherwise your GitHub account -- uh, yeah, you have 60 hours for free every month, I guess. But make sure that, if you go to github.com slash codespaces, you can see everything that is running, because I saw someone had like three or four or five. So, so pay attention to that.
You can set it up; you can say, in my case, if I'm not using it for more than 30 minutes, it just shuts down automatically.
Uh, but it's something that sometimes you forget. Um, Yeah. So make sure that let's see how many Pamela has. Pamela has a bunch of them running as you can see. Yeah. So yeah, quite a few people are actually using this in production, which, um, at first we were a little surprised by because originally it was a sample, but now we've really hardened it.
Uh, so people are using it for public-facing stuff, like, um, government websites are using it, because governments have lots of PDFs and documents and it's hard to sort through their stuff. Right. So making it easier for citizens to interact with government data is one use case. The other big one is internal HR stuff or internal meeting transcripts.
Um, like being able to look through all the transcripts and ask, like, you know, what did the CEO say then? Um, internal sales training manuals. Like, there's so many people using it for lots of things. Yeah. So in terms of being able to automatically update the index -- right now you just saw manual ingestion.
Um, but you could use instead integrated vectorization, which is an AI search, uh, option where you set up an indexer. So you would point it at like a blob storage, your, uh, blob storage and say, Hey, every time this updates, every five minutes, make sure you refresh the index.
And then it would, uh, refresh it. Uh, so that would be one option or you could use like an Azure function with a trigger. And that's, uh, there's another, uh, there's another repo that does that. Uh, it's the chat with your data solution accelerator repo. And this one sets up an Azure function that has a trigger.
So, um, you've got a few different options for how you could keep it updated. Uh, people just figure out what works out for them. Cool. Thank you so much. Yeah. Thank you so much. Yeah. Thank you.
Thank you. We'll see you next time.