Build, Evaluate and Deploy a RAG-Based Retail Copilot with Azure AI: Cedric Vidal and David Smith

00:00:00.000 |
This is the workshop on developing a production level RAG workflow so you're 00:00:20.240 |
in the right place if you want to learn how to build the backend for a chat 00:00:24.840 |
application that works off of OpenAI and builds its answers based on 00:00:29.400 |
information that we draw from databases and vector databases we'll see all about 00:00:34.320 |
that in this presentation today my name is David Smith I'm a principal AI 00:00:40.680 |
advocate at Microsoft I've been with Microsoft for about eight years now after 00:00:46.200 |
my startup was acquired back in the big data space and I've been at Microsoft 00:00:49.680 |
ever since my background is in data science i also did a lot of work as a 00:00:55.500 |
statistician and these days I'm a specialist in AI engineering and I have 00:00:59.760 |
with me today two other members from Microsoft that are also specialists in 00:01:04.560 |
AI engineering first I'd like to introduce Cedric Vidal who's on my team in AI 00:01:08.820 |
advocacy would you mind introducing yourself Cedric hello everyone so like 00:01:13.920 |
David said I'm Cedric Vidal I'm a principal AI advocate at Microsoft and I 00:01:19.780 |
have a background in AI self-driving cars software design architectures and 00:01:27.820 |
everything in between I've been working in the space for 20 years and today I'm gonna 00:01:33.640 |
help David with the workshop welcome everyone thank you Cedric and we've also got 00:01:40.180 |
Miguel Martinez who's come all the way from Houston he's a technical specialist at 00:01:44.860 |
Microsoft Miguel tell the crowd a little bit about yourself absolutely hello 00:01:49.240 |
everyone welcome today my name is Miguel Martinez I am a senior technical 00:01:53.860 |
specialist for data and AI at Microsoft so a lot of our clients you know this can be 00:01:59.440 |
startups businesses they hear about open AI and chat GPT and all of those things so 00:02:05.500 |
they think about well how can I actually use it for my business how can I use those 00:02:10.540 |
tools to drive business value and that's where me and my team come in and we 00:02:15.140 |
help all of our clients develop some new solutions to drive that value all right 00:02:20.440 |
well Cedric and Miguel will be here for the next two hours helping you as you go 00:02:24.400 |
through this workshop so they'll be wandering around to help you guys out 00:02:27.640 |
and if you need any help during the workshop raise your hand and one of the 00:02:31.120 |
three of us will come up and help you so thanks guys all right let's jump right in if 00:02:36.020 |
you would like to get started you can use that URL on your screen right there just pop 00:02:41.540 |
up a browser on your laptop I'll give you more information about what's going on when 00:02:46.120 |
you get there but you should be able to follow along and get started if you like now to participate 00:02:53.840 |
in this workshop it is going to be hands-on so you will need to have your own laptop and 00:02:57.880 |
I can see looks like everybody does have their own laptop that's great this is not something 00:03:02.060 |
you'll be able to do on your phone or on a tablet because there's lots and lots of work that we'll be 00:03:07.740 |
going through as we work through a github repository and that is the second thing that 00:03:12.540 |
you'll need to have to participate in this workshop is a github account if you don't yet have a github 00:03:19.360 |
account please go ahead now to github.com/signup and create yourself a brand new account we'll be using the 00:03:27.060 |
github code spaces feature to provide our development environment but the free github 00:03:33.060 |
code spaces is more than sufficient for the work that we'll be doing here today we are going to be 00:03:38.820 |
building an application in the Azure cloud but you do not need to have an Azure account for this 00:03:45.780 |
workshop we're going to provide you with a login to an Azure account and we will have already set up all the 00:03:52.020 |
resources that you need to work with this application and that specifically is things like 00:03:58.020 |
Azure AI search the vector database cosmos DB the database which we're going to be using for our 00:04:03.940 |
customer information connections to open AI which we're going to be using for our LLM and various other 00:04:10.020 |
resources and tools that we'll be using in Azure you can do all of this on your own if you do have your 00:04:16.980 |
own Azure accounts and are willing to spend a few dollars in credits to run the resources over a few 00:04:22.420 |
hours by the end of this workshop you will have all of the 00:04:30.020 |
information code data and everything you need to recreate everything that i show you here today one of the 00:04:35.860 |
first things that we will do in fact is to fork a repository into your own github 00:04:40.980 |
account and everything you need will be right there all right but if you would like to run 00:04:47.380 |
through this at home you will need an Azure account and you can create one at the link you see on the screen 00:04:52.420 |
right there so before we jump in and i start giving you some demos about how this works let me orient 00:04:58.420 |
you a little bit that link that i gave you on the last slide will have launched you into a virtual machine 00:05:06.420 |
it is a windows virtual machine but we're not going to be using windows at all in fact the only thing 00:05:12.260 |
we're going to be using this virtual machine for is the instructions for the workshop which you can see 00:05:17.460 |
on the right hand side of the screen there in front of you you'll be going through that page after page 00:05:22.100 |
this workshop instruction page has some nice tools you can use for example wherever you see the green 00:05:29.940 |
text you can click on the green text and it will paste that directly into your browser that'll save 00:05:36.660 |
you some time when you're entering in the passwords and urls and so forth i will 00:05:42.660 |
mention that we are in a virtual machine environment it looks like the wi-fi is pretty good here but if you 00:05:48.420 |
find that the virtual machine is slow or there's lots of lagginess because of the wi-fi or if you just 00:05:54.020 |
prefer to use your own browser on your own laptop you can totally do that you can just open up a browser 00:05:59.780 |
and follow all the same set of instructions the only difference is you'll have to open up that virtual 00:06:03.940 |
machine every now and again to look at those instructions and you'll have to manually cut and 00:06:07.860 |
paste the green parts from the guide into your own browser so it's your choice about which direction you go 00:06:14.580 |
when we actually come to running shell commands and things like that the dev environment that we're going 00:06:26.820 |
to have you use is github code spaces you could totally run all this on your local desktop 00:06:32.740 |
but then you've got to make sure you've got vs code installed you've got to have all the right python 00:06:37.060 |
libraries installed etc etc so to make things easy here we're just asking everybody to go straight into 00:06:42.820 |
github code spaces where everything is set up and you can do exactly the same process for yourself at home 00:06:49.060 |
if you're not familiar with github or not familiar with github code spaces just pop your hand up when 00:06:54.340 |
we get started we'll come and chat to you and talk to you about how that works and what's going on 00:06:58.660 |
there but if you have used github code spaces before it should be pretty straightforward right there 00:07:07.620 |
all right so this is what we're going to do today we are going to build this app right here or at least 00:07:14.980 |
we're going to build the back end to this app and have you interact with that back end through a little 00:07:20.500 |
ui that you'll be working through if you do want to build the front end to this as well we do provide 00:07:25.460 |
the code for the front end and all the data that's also linked from the github repository so you'll have 00:07:30.180 |
access to that but i'm going to give you a little demo about this website just to set the scene 00:08:36.340 |
so let me go over to my browser here i think i have the website open so 00:08:45.140 |
the idea here is that we are engineers working for a retail company you know something like rei that sells 00:07:53.860 |
camping equipment and backpacks and trail hiking shoes things like that they already have a website 00:07:59.460 |
which you can scroll through and you can see the products that they have available it's pretty 00:08:04.340 |
limited selection just to keep things simple and we can click through to see product information for 00:08:09.380 |
each of the products that this company sells so for these trailblaze hiking pants we've got 00:08:15.300 |
lots of details about the features there are some reviews faqs there's a return policy 00:08:20.580 |
some cautions technical specifications user guide care maintenance lots and lots of information 00:08:26.580 |
available to the customer about all the products that are available at this store in addition this 00:08:34.180 |
storefront has a customer login feature in this particular example the customer sarah lee is already 00:08:40.820 |
logged into this system what we're going to be building here is a chat bot that operates on this website 00:08:50.740 |
you'll be able to access that click on the chat bot button and ask the question for example 00:08:55.540 |
what can you do and this is going to connect to the system that we're going to build 00:09:03.300 |
to get an answer to that question this is a pretty simple question it's coming straight from the llm and 00:09:08.580 |
its context to tell it that as an ai agent i can provide you with information about our products to help 00:09:13.460 |
you with your purchases so on and so forth now we can also ask questions about specific products for 00:09:21.540 |
example i'm going to pull up my paste buffer here let's try one here we go what is a good tent that goes 00:09:31.620 |
with the trail walker shoes okay so in this case it's actually consulting all the product 00:09:44.420 |
information that's available on that website to formulate that answer with 00:09:49.140 |
the llm and it comes up with for your upcoming trip to andalusia i recommend pairing your trail walker 00:09:54.180 |
hiking shoes with the trail master x4 tent we're also going to provide the llm with information about the 00:10:01.620 |
customer themselves their name their status in the loyalty program where they live and their order history 00:10:08.180 |
so it's able to answer questions like this what have i already purchased from you so in this case the llm is 00:10:15.940 |
able to consult the customer's purchase history and give back the answer to say that sarah lee is a valued 00:10:21.620 |
customer who's purchased the cozy night sleeping bag the trailblazer hiking tent and so on and so forth 00:10:27.620 |
so this is the system that you'll be building on the back end so that that chat bot can use an llm like open ai 00:10:36.420 |
to answer those kinds of questions based on the customer's purchase history and the products available 00:10:44.260 |
from this particular retail store any questions on that so far very straightforward okay great 00:10:52.420 |
all right we are going to be building this on azure and using various azure resources which we will provide 00:11:01.220 |
to you to do that we're going to be using the azure ai studio platform to manage the large language models 00:11:08.820 |
and the various resources that we're working with we're going to be using azure ai services which are 00:11:14.820 |
going to provide our various tools to us but in particular the open ai models that the chat bot is 00:11:20.740 |
going to use to generate its responses we'll be building an azure ai project which is going to manage all the 00:11:27.780 |
flows that we have to to have the chat bot gather its information and generate its responses 00:11:33.540 |
the product database is going to be stored in azure ai search and azure ai search is a vector database 00:11:42.100 |
or at least it has a vector database feature which we're going to be using to match the customer's 00:11:47.700 |
questions to the nearest or the most relevant products that are related to those customer questions and using that to provide 00:11:55.540 |
context to the chat bot and lastly the customer information is going to be stored in a database 00:12:02.180 |
a regular rows and columns type of database a relational database from which we will extract the customer's information 00:12:16.020 |
okay so let's have a look at the architecture we're building for the back end in a little bit more detail 00:12:23.460 |
the customer types a question into the chat bot on the website and that gets extracted out and sent to an 00:12:32.180 |
endpoint just as a simple string along with the customer's chat history as well 00:12:37.140 |
that question gets sent to azure open ai where it will get embedded into a vector format if you're 00:12:44.340 |
not familiar with embedding it's basically the idea of converting a piece of text piece of string into a 00:12:50.260 |
point in multi-dimensional space in such a way that the other points in multi-dimensional space we have 00:12:56.900 |
already defined by embedding the product information pages the ones that are closest to the question 00:13:03.540 |
are the products that are most relevant to that and we'll use that to extract out the most relevant products 00:13:08.260 |
and push them into the context of the large language model at the same time we'll also extract out 00:13:14.660 |
information about the customer and their purchase history from cosmos db 00:13:20.180 |
so we do the embedding do the search in azure search to get the product information we extract out the 00:13:25.540 |
customer information with cosmos db we feed all that information into a prompt so we create a large 00:13:32.340 |
prompt with information about the problem to solve the customer's information information about our products 00:13:37.220 |
information about the relevant products from the azure ai vector search and information about the 00:13:42.980 |
customer's purchase history and then generate a response based on the customer's question any questions so far 00:13:49.860 |
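As a rough sketch of what that retrieval step looks like in code, assuming the openai and azure-search-documents packages; the endpoint placeholders, index name ("contoso-products"), and vector field ("contentVector") are illustrative, not the workshop's exact values:

```python
# Hedged sketch: embed the customer's question, then ask Azure AI Search for
# the nearest product documents. Index/field names here are illustrative.
from openai import AzureOpenAI
from azure.core.credentials import AzureKeyCredential
from azure.search.documents import SearchClient
from azure.search.documents.models import VectorizedQuery

aoai = AzureOpenAI(azure_endpoint="https://<aoai>.openai.azure.com",
                   api_key="<key>", api_version="2024-02-01")
search = SearchClient(endpoint="https://<search>.search.windows.net",
                      index_name="contoso-products",
                      credential=AzureKeyCredential("<key>"))

question = "What is a good tent that goes with the TrailWalker shoes?"
vector = aoai.embeddings.create(model="text-embedding-ada-002",
                                input=question).data[0].embedding

# The 3 products whose embeddings sit closest to the question in vector space.
results = search.search(search_text=None,
                        vector_queries=[VectorizedQuery(
                            vector=vector, k_nearest_neighbors=3,
                            fields="contentVector")])
context = [doc["content"] for doc in results]  # later spliced into the prompt
```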
all right we'll get right into it just shortly 00:13:54.500 |
i'll come into all this a little bit later on but we're going to be using a tool within azure ai studio 00:14:01.540 |
called prompt flow i'll come back into the details a little bit later on but what we're going to be doing 00:14:06.820 |
is taking that same retrieval augmented generation process that i just outlined to you 00:14:14.980 |
the steps are to take the question retrieve related data which in this case is our product data 00:14:20.260 |
augment the prompt with that information that's come from the knowledge base generate a response with the 00:14:25.700 |
large language model and then send the results back to the user we'll be creating a version of that 00:14:32.580 |
retrieval augmented generation flow using prompt flow which is a development tool that's in azure 00:14:38.580 |
ai studio to streamline this process of putting all that data together we take the inputs we embed it 00:14:44.340 |
we retrieve information from ai search we look up customer information from the database put all that 00:14:49.620 |
together with the customer prompt to generate the response and that's what the customer sees in the 00:14:54.420 |
website so that's what you will be doing so with that let me go back to the start here 00:15:01.380 |
has anybody not yet been able to log in to that website is anybody having trouble if so cedric 00:15:09.540 |
and miguel will help you out the password will come up in just a second all right so let me 00:15:16.180 |
actually show you that okay so i've gone to that website and you should have come to a page just like 00:15:24.420 |
this we can go ahead and launch the virtual machine 00:15:27.540 |
so on the right hand side of the virtual machine is the instructions the bottom right hand corner is 00:15:44.500 |
the next button that'll take you to the next page of the instructions and that is the page where you'll find 00:15:51.460 |
the password for the windows machine which is curiously password and from then you should be 00:15:59.620 |
able to log into the windows virtual machine right there 00:16:09.940 |
the next step after that and i can actually go to the next step here 00:16:13.220 |
is to open up the browser i'm doing this directly within the virtual machine 00:16:21.780 |
we're going to browse to a particular github repository that we provided to you for this workshop you can 00:16:29.940 |
type it into the browser in the virtual machine or you can cut and paste it into your own browser as well 00:16:34.260 |
okay you'll need to log in to github at this point just a slight warning is that if you use single sign-on 00:16:50.420 |
this happens to us at microsoft we use single sign-on to access github that does not work through the 00:16:55.380 |
virtual machine so i'm going to show you actually doing that directly from the browser 00:17:04.260 |
and we're going to do code spaces it looks like i've actually already started a code space here 00:17:12.660 |
the instructions will tell you to launch a code space oh i forgot one very important step 00:17:18.740 |
before you launch any code spaces we're going to switch to a different branch 00:17:21.380 |
rather than doing the main branch which is the one that you'll use if you want to do this at home 00:17:27.060 |
we've got a special branch here for this lab it's called ms build dash lab 322 00:17:32.180 |
the only difference is this version of the lab skips all the deployment instructions because we've done them for you 00:17:41.460 |
and then from that branch of the repository we're going to launch a code space on that specific 00:17:49.700 |
branch i'll show you that in just a moment i'll launch one that i already have here 00:17:55.860 |
this is what happens once code space is set up it takes a couple of minutes for code spaces to warm up 00:18:05.060 |
and what we have here in case you've never seen this before is basically an instance of visual studio 00:18:16.740 |
code running in the github cloud directly in your browser so it's the same user interface as visual 00:18:24.740 |
studio code but it's just running in your browser if you're experienced using code spaces you probably 00:18:30.020 |
know that you can also run this directly from within visual studio code in your desktop and connect to 00:18:34.740 |
that instance if you've got questions about that happy to show you 00:18:37.700 |
one of the things that we are going to be doing with code spaces and it looks like i've already 00:18:45.220 |
closed the terminal is logging in to your azure account 00:18:53.540 |
at the terminal and the instructions to that if you've been following along you might be ahead of me 00:18:59.460 |
is to log in to the portal and actually do this directly within the local browser in this case 00:19:18.740 |
okay this is the username we provided to you for this temporary azure account 00:19:23.460 |
it'll only exist um for the duration of this workshop and everything will be deleted once we're finished 00:19:28.900 |
one thing you'll probably find is because we're in a brand new windows virtual machine it thinks 00:19:40.340 |
you're a brand new user to windows it thinks you're a brand new user to azure and keeps on popping up all these hints 00:19:47.220 |
if you're familiar with azure you can just dismiss them or if you'd like to do an introduction to azure 00:19:51.300 |
you can do it here or you could do it at home and the trick to get back to the starting place is to 00:19:57.380 |
go to the home button there and then i won't go through these steps on the screen for you but you'll be 00:20:02.580 |
following through the steps on here to have a look at the resource groups we have available also the 00:20:08.020 |
resource group we provided to you in azure and you'll be able to inspect all the resources that we've already 00:20:12.340 |
deployed for you that you'll be working with lastly i'll just show you this last part 00:20:22.580 |
is when we log into visual studio code let me show an example of copying 00:20:28.740 |
sorry i don't normally do it this way there we go 00:20:42.660 |
where's my copy button i'm not using my usual laptop here 00:20:46.020 |
all right so what i will do this might be a neat trick is just paste it directly up here 00:20:57.540 |
and copy that back into my other visual studio window 00:21:09.140 |
all right it's usually easier than that just to show you that process of how you're going to connect 00:21:20.180 |
your visual studio code instance in github code spaces with your azure account is with this az login 00:21:26.660 |
after a moment it will give me an eight digit code which i'll copy 00:21:31.860 |
and i'll go to this device login page to actually log into azure 00:21:38.180 |
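Once that device-code login has completed, code running in the same Codespace can piggyback on the CLI session; a minimal sketch using the azure-identity package:

```python
# Sketch: reuse the `az login` session instead of handling credentials by hand.
from azure.identity import AzureCliCredential

credential = AzureCliCredential()  # picks up the existing az CLI login
token = credential.get_token("https://management.azure.com/.default")
print("authenticated:", token.token[:12], "...")  # we can now call Azure APIs
```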
and now all the commands that i run within the visual studio code command line terminal 00:21:47.380 |
will run against the azure account that we've set up for you so that's kind of the main things to 00:21:51.620 |
watch out for as you get started just continue to work through the instructions as you go 00:21:56.020 |
cedric miguel and i'll be wandering around to help you and put your hand up if you get stuck and 00:22:00.180 |
in about 20 minutes or so i'll jump in and give you a bit more context about the prompt flow that we're 00:22:04.340 |
going to be building 00:23:08.820 |
so if you're already logged into github it'll just continue to use that login 00:23:46.820 |
it looks like you're logged in yeah yeah yeah as long as you're logged in you're good 00:23:54.820 |
yeah and when you forked it did you um did you uncheck this copy of the main branch only 00:24:00.820 |
uh oh i haven't forked it yet okay oh good yep 00:24:04.820 |
oh wait the this is what i got to fork then the contoso chat 00:24:08.820 |
yep yep so click the fork button first of all yeah which is up here yeah 00:24:20.820 |
and the nice thing is now whatever github account you're using you'll have this in that account 00:24:34.820 |
i mean the setup's awesome i love that github and that's everything 00:24:46.820 |
just telling you that you just skipped ahead yeah 00:25:16.820 |
so click on the code spaces tab up there yeah 00:25:24.820 |
so that's available because of my like private account 00:25:33.820 |
it's a nice consistent environment and what we can do on our end 00:25:36.820 |
which we do for all of our samples is we make sure that when you launch code spaces 00:25:40.820 |
you get a command line that has everything installed that you need to go 00:25:53.820 |
so it's gonna take probably 60 seconds or so to do things 00:25:56.820 |
so what we actually recommend is you go to the next step and then come back to it later 00:26:05.820 |
uh... so i mean it's like you gave me an azure account 00:26:20.820 |
yeah cause some of them take a little while to provision so it's it's for these workshop environments it's easy for people to be able to jump straight in 00:26:39.820 |
and the next step we can actually log into ai studio 00:26:58.820 |
you will have to wait for something you should be 00:28:26.880 |
Yep, that one is code spaces, the one before. 00:28:33.440 |
This tab right here where it says preview reading, that's code spaces right now. 00:28:37.260 |
So it's showing you the contents of the repository with a command line connected to that file 00:28:45.200 |
I'm so used to using, sorry, oh I don't do that, I'm sorry, I'm so used to just using VS code, 00:28:52.900 |
I don't even, sorry, so it doesn't seem like you are inside the, yeah, I'm not. 00:29:04.840 |
I don't know, I don't know, I don't know what happened, it used to have all the iPhones and 00:29:14.780 |
it used to have all the, it looked, it had showed I explore and everything else, right? 00:29:26.720 |
I think it just stopped but, yeah, I did it earlier so I think it just wiped itself or something. 00:29:46.540 |
I had it actually, hmm, like it, hmm, Apple, but I really logged into this like ages ago 00:30:07.480 |
Because, you know, the tab that you opened outside or no, that's okay. 00:30:15.480 |
Yeah, so that's why I'm thinking that I already have it. 00:30:24.480 |
Yeah, I guess you need to do it again. I'm sorry. I said you should be able to just click on it. It should do it for you. You don't have to click on it. 00:30:43.480 |
Oh, cool. Well, no. The reason I'm saying is because I had this load. I had everything else load. The only thing I stopped at was after I did this. 00:31:11.480 |
No, I understand that. I'm just saying your stuff is buggy because it shouldn't fail across windows. 00:31:29.480 |
Say that again? Oh, no, no. I'm just catching up to you guys. Yeah. 00:31:39.480 |
I'm just logging into GitHub right now. Yeah, that's where we are. There we go. 00:31:45.480 |
Just to avoid confusion, you might want to close the other tabs. Otherwise, it's confusing. 00:31:51.480 |
No, that's why I'm confusing it. Like, these are other testing. 00:31:57.480 |
The issue. The issue I'm having is you're saying this thing should be . 00:32:15.480 |
Yeah, and this will, I already have that. So, after that, what? 00:32:43.480 |
Okay, I got it. That's what I was trying to do. 00:32:45.480 |
Got it. So, copy and then join me. Yeah. So, that's why I was saying that . 00:32:57.480 |
Yes. No, you could just open the one you had already created. Here. Yes. Yes. 00:33:25.480 |
It takes about 60 seconds sometimes for the code to pop up. 00:33:37.480 |
And you do need to hit enter in the terminal afterwards as well to confirm. 00:33:49.480 |
Basically redo what you've done, but do it inside the VM instead of your macOS browser. 00:34:33.480 |
That's right. You're connected and you can close that tab now, you won't need that one again. 00:35:02.880 |
So, right now it's going to create the attribute table and the data. 00:35:46.880 |
Yeah, but you already have the codespace opened here. 00:35:51.880 |
Let me show you where you'll get to once you get to about step seven. 00:35:56.880 |
I've gone through the steps of logging into GitHub, cloning the repository, launching code spaces, 00:36:01.880 |
logging into the Azure portal and AI studio from the browser, and also logging into the 00:36:07.880 |
Azure from the visual studio code command line. 00:36:12.880 |
So, then we can run some commands, and I've run this post-provision command here. 00:36:17.880 |
You're welcome to have a look at what that script does. 00:36:23.880 |
What we have done is, if you go to the home page of the Azure portal, you can also do this from the 00:36:30.880 |
Azure CLI if you prefer. 00:36:31.880 |
You can have a look at the list of resources we've deployed into this single resource group for you, called Contoso Chat RG. 00:37:04.880 |
And you can see that there's 11 resources that we have launched for you. 00:37:09.880 |
What that script did was to populate some data into Cosmos DB and to Azure AI Search. 00:37:16.880 |
Let's have a look first at Cosmos DB, which is the Azure Cosmos DB account. 00:37:23.880 |
And here in the portal, I can go in and have a look at the data explorer. 00:37:31.880 |
Okay, there's a video to watch if you're interested. 00:37:40.880 |
Okay, and what you can see here is now within our Contoso Outdoor Cosmos database, we have a customers database. 00:37:48.880 |
And you can actually drill in there if you're familiar with using databases and have a look at the tables. 00:37:53.880 |
There are about 12 customers in the table, along with their purchase history. 00:38:00.880 |
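For reference, here is roughly what a customer lookup looks like with the azure-cosmos package; the database/container names ("contoso-outdoor", "customers") and field names are illustrative:

```python
# Sketch of the customer lookup: a parameterized query so only the requested
# customer's record (and order history) comes back.
from azure.cosmos import CosmosClient

client = CosmosClient(url="https://<account>.documents.azure.com",
                      credential="<key>")
container = (client.get_database_client("contoso-outdoor")
                   .get_container_client("customers"))

items = container.query_items(
    query="SELECT * FROM c WHERE c.id = @id",
    parameters=[{"name": "@id", "value": "1"}],
    enable_cross_partition_query=True)
for customer in items:
    print(customer["firstName"], customer.get("orders", []))
```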
Likewise, if I go back to the resource group and have a look at the Azure AI Search resource, which is called Search Service. 00:38:12.880 |
You can go to the indexes in the sidebar under Search Management. 00:38:20.880 |
By the way, if you can't see this sidebar, it happens if you're using a very small laptop screen or if you've got a large font. 00:38:27.880 |
It might be hidden behind this hamburger menu here in the corner. 00:38:46.880 |
I don't think I went through all the steps to actually pre-provision things there. 00:38:49.880 |
So we'll go back and have a look at that again. 00:38:52.880 |
I just want to give you a preview of where you're coming to. 00:38:54.880 |
Again, always any questions, just pop your hands up. 00:39:24.880 |
Let's see if I can figure what's going on here. 00:40:03.880 |
So I was just looking at the prompt flow example on step, sort of end of step seven, where we're 00:40:08.880 |
looking at the graph of the pre-existing prompt flow that's been created. 00:40:13.880 |
One of the questions that I had is, can the graph also be cyclical? 00:40:19.880 |
And the answer is no, because I don't think there's any support for any kind of looping like that. 00:40:31.880 |
Is there a reason why you'd want it to be cyclical? 00:40:38.880 |
Well, within a single component of that prompt flow, you can run anything you want. 00:40:46.880 |
So if you need to do that kind of interactivity within one of those nodes, you can do that. 00:41:02.880 |
I realized the step that I skipped, because I was getting ahead, was I didn't do the bit 00:43:25.880 |
This time, there's a lot of power in the DAG description that we can build. 00:43:32.880 |
One thing that we'd like to explore, that we're currently exploring with a couple of our clients 00:43:36.880 |
is how to hit these kinds of DAGs from enterprise ChatGPT, to use the Actions framework 00:43:43.880 |
and then go to a middleware layer where a DAG resides. 00:43:46.880 |
What I'd like to understand is, what are our options to expose this as an endpoint to external sources other than our web app? 00:43:56.880 |
That's actually going to be about step 11 or 12. 00:43:58.880 |
We're actually going to deploy it as an endpoint and then connect it to the website. 00:46:06.880 |
I completely skipped step six, which is an important step. 00:46:10.880 |
You can confirm if you've done step six by having a look in your codespace. 00:46:15.880 |
Once you've done that, you should have a file called .env, 00:46:19.880 |
and that is where we've set up all of the endpoints and keys 00:46:23.880 |
that you will need to access the resources that we provided for you. 00:46:27.880 |
And you'll also have a config.json file, which does similar kinds of things for AI Studio. 00:46:36.880 |
And then once that is all set up, I should be able to go back and run this pre-provision script. 00:48:15.880 |
I think it's because I ran the script already once. 00:48:50.880 |
So one of the things we did in that pre-provision script 00:48:54.880 |
was to take each of the markdown files, which are in the repository. 00:48:57.880 |
There's one markdown file per product that the company sells. 00:49:01.880 |
And then we just scripted indexing that into the AI Search database, 00:49:05.880 |
which essentially converts that entire markdown file into one point in the embedding space. 00:49:16.880 |
Actually, in this example, we don't chunk it, just for simplicity. 00:49:19.880 |
If you actually do it through AI Studio's search-on-your-own-data feature, it will chunk the documents for you. 00:49:28.880 |
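Conceptually, the provisioning script's indexing step looks something like the sketch below: one embedding per whole markdown file, uploaded as a single search document (again, names and paths are illustrative):

```python
# Sketch: embed each product markdown file whole (no chunking) and upload it
# to the search index as one document.
import pathlib
from openai import AzureOpenAI
from azure.core.credentials import AzureKeyCredential
from azure.search.documents import SearchClient

aoai = AzureOpenAI(azure_endpoint="https://<aoai>.openai.azure.com",
                   api_key="<key>", api_version="2024-02-01")
search = SearchClient(endpoint="https://<search>.search.windows.net",
                      index_name="contoso-products",
                      credential=AzureKeyCredential("<key>"))

docs = []
for i, md in enumerate(sorted(pathlib.Path("data/products").glob("*.md"))):
    text = md.read_text()
    emb = aoai.embeddings.create(model="text-embedding-ada-002",
                                 input=text).data[0].embedding
    docs.append({"id": str(i), "content": text, "contentVector": emb})

search.upload_documents(documents=docs)  # one vector per product file
```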
While we're getting going, how do PromptFlow, Autogen, and Semantic Kernel relate? 00:49:38.880 |
Are they all competing projects at Microsoft, or are they...? 00:49:42.880 |
Not so much competing, and they have slightly different purposes. 00:49:48.880 |
Let's start with Semantic kernel and Autogen. 00:49:56.880 |
Semantic kernel we kind of view as the enterprise product. 00:49:59.880 |
that's the one that is designed for use in enterprise settings, has strict versioning, 00:50:05.880 |
you know, API changes, all those kinds of stuff. 00:50:08.880 |
But like Langchain and other orchestrators, you can use it to connect different tools together 00:50:16.880 |
Autogen serves a similar kind of purpose, but it comes out of Microsoft Research. 00:50:22.880 |
It's a little bit more flexible, based on a slightly different paradigm. 00:50:25.880 |
But that's not the one that we recommend for enterprise applications today. 00:50:32.880 |
PromptFlow is directly within the AI Studio product, and it's purely for orchestrating 00:50:38.880 |
within an endpoint that you deploy through the AI Studio product. 00:50:47.880 |
The whole purpose is to create one endpoint that goes through, in this case, a RAG process. 00:50:51.880 |
But it's designed to be more flexible than just for RAG. 00:50:54.880 |
You could use semantic kernel to manage multiple PromptFlows as endpoints. 00:51:02.880 |
And also, you can embed PromptFlow as a library. 00:51:08.880 |
Deploying it as an endpoint is one possibility. 00:51:11.880 |
It can be used to write integration tests, evaluation tests, embed in your application. 00:51:19.880 |
Like, if you have a Python native application and you want to embed a DAG in it, it's also possible. 00:51:26.880 |
Is there anything like PromptFlow that sits on top of semantic kernel to manage multiple flows? 00:51:50.880 |
And honestly, when you work with PromptFlow for any length of time, you'll be working with the underlying files anyway. 00:52:00.880 |
It's great for these workshops, because I can point to things and show you how these pieces fit together. 00:52:04.880 |
And it's great for debugging, because you can actually see how the data is flowing through it. 00:52:08.880 |
In terms of an editing environment, that's not really what it's for. 00:52:17.880 |
In addition to the graphical YAML-based DAG, now you can also use PromptFlow in a more programmatic way, 00:52:28.880 |
where you write the PromptFlow configuration in Python instead of graphically. 00:52:39.880 |
To be honest, I'm not sure how generally available it is. 00:52:54.880 |
But ultimately, this is the representation of the PromptFlow. 00:52:59.880 |
It's a YAML file, which just defines each node with a bunch of tags associated with its 00:53:04.880 |
inputs and outputs, and how it connects to the various endpoints, and the types of nodes that are available. 00:53:12.880 |
If you're running that as a Python library, at that point it's fairly similar to using LangChain. 00:53:25.880 |
So there's also a command line, which is what we're using in this workshop, to run that PromptFlow 00:53:28.880 |
with a given set of inputs to generate the outputs. 00:53:32.880 |
Which provides a more LangChain-like experience. 00:53:47.880 |
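For instance, assuming the promptflow package, driving a flow from Python rather than the CLI looks roughly like this (the flow path and input names are illustrative):

```python
# Sketch: run a prompt flow locally from Python with a given set of inputs.
from promptflow import PFClient

pf = PFClient()
result = pf.test(
    flow="contoso-chat",  # directory containing flow.dag.yaml
    inputs={"customer_id": "1",
            "question": "What tent goes with the TrailWalker shoes?",
            "chat_history": []})
print(result)
```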
Autogen could, in theory, do similar stuff, but it's more of a labs thing, not ready for production. 00:53:58.880 |
Like David was saying, the teams at Microsoft that work on the two projects are very different. 00:54:11.880 |
Semantic Kernel is very product-oriented, so they follow a very strict software release lifecycle, 00:54:19.880 |
whereas the other team, the Autogen team, is really research. 00:54:26.880 |
So they try the latest cutting-edge AI things. 00:54:35.880 |
So if you use Autogen, you're literally on the bleeding edge, and things might break. 00:54:47.880 |
It might work for your application, but you need to know what you're getting into. 00:54:54.880 |
We had used it for a hackathon, and we were able to put together an agentic-type flow really 00:55:01.000 |
quickly with it, so we liked it, but it's good to know that we need to be a little careful 00:55:10.540 |
So semantic-kernel is more for the orchestration, and this would be closer to the app layer, 00:55:21.540 |
where I'm building a back-end, where I'd be using semantic-kernel to orchestrate the LLM, 00:55:42.540 |
Prompt flow is more on, I want to build a DAG, manage its lifecycle, potentially evaluation. 00:55:50.540 |
And like I was saying, you can also embed Prompt flow as a framework, as a library. 00:56:09.260 |
But that's the use case we'll be using in this one, as an endpoint. 00:56:15.260 |
And to make things even more confusing, sorry, with semantic-kernel you can also build agentic 00:56:36.260 |
applications. Simple ones, but you can build agentic applications with it. 00:56:53.260 |
Personally, I like to use it that way because, when you're in development, 00:56:58.260 |
it makes things easier: to write tests, to write snippets of code that you can reuse. 00:57:09.260 |
So it's more like development convenience rather than anything else. 00:57:18.260 |
If you wanted to build, I don't know, a rich Python desktop app where you happen to 00:57:25.260 |
want to orchestrate LLMs, that would also be a relevant use case. 00:57:51.260 |
Prompt flow you'll find useful when you're managing all your resources within Azure AI Studio. 00:57:57.260 |
But within Azure AI Studio, you can launch-- you can create endpoints for open AI models, 00:58:04.260 |
Mistral models, LLAMA models, anything from the model catalog. 00:58:07.260 |
You can create connections into other Azure resources like Cosmos DB and AI Search. 00:58:14.260 |
You can hook it up to our evaluation framework. 00:58:18.260 |
There are all the features in this for managing the entire lifecycle of the endpoint itself and all the 00:58:25.260 |
resources that are required to make that endpoint work. 00:58:27.260 |
That's the situation where you'll mainly be using Prompt Flow is in basically managing 00:58:33.260 |
that connection between all those resources with the goal of creating an endpoint. 00:58:38.260 |
You'll be, in our example here, calling it from a web app, as a regular REST API endpoint. 00:58:46.260 |
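From the web app's point of view that is just one HTTP call; a sketch with an illustrative URL, key, and payload shape:

```python
# Sketch: calling the deployed flow endpoint like any other REST API.
import requests

resp = requests.post(
    "https://<endpoint>.<region>.inference.ml.azure.com/score",
    headers={"Authorization": "Bearer <endpoint-key>",
             "Content-Type": "application/json"},
    json={"customer_id": "1",
          "question": "What did I order last time?",
          "chat_history": []})
print(resp.json())  # the generated answer goes back to the chat UI
```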
But likewise, if you wanted to call that same endpoint from within semantic kernel or any other orchestrator, you could. 00:58:56.260 |
And if I may add, what I like also about Prompt Flow is that when you go to AI Studio, 00:59:01.260 |
because it's very, very well integrated in AI Studio, so you have the visual aspect of it. 00:59:06.260 |
But if you go to the playground of AI Studio, you can configure a RAG application and also 00:59:14.260 |
build a code interpreter application and export it to Prompt Flow. 00:59:21.260 |
So instead of having to come up with a whole design of a Prompt Flow-based application, 00:59:27.260 |
you can just use the playground, configure it as you want, and export it and have a ready-to-use 00:59:33.260 |
Prompt Flow application that you can re-use and embed in your project. 00:59:38.260 |
So you also have a workflow that goes from the UI to the code that way. 00:59:46.260 |
If you'd like to play around with that side of it, it's not part of the workshop. 00:59:58.260 |
But feel free, because you have an instance of Azure AI Studio running already in your virtual machine. 01:00:04.260 |
You go through the steps of selecting the one project we have here, which is called Contoso Chat SF AI project. 01:00:12.260 |
One of the things you might want to play around is the playground. 01:00:15.260 |
We've provided you with GPT-4 and GPT-3.5 Turbo endpoints, I believe. 01:00:22.260 |
And this is the place where you can interface through a playground to test out the endpoints you've created. 01:00:28.260 |
Same place as when you get into the Prompt Flow thing a little bit later on, you can then test out the connections 01:00:32.260 |
between those endpoints and the databases and everything else you've built your RAG application around. 01:00:38.260 |
From there you have the Prompt Flow button where you can export to Prompt Flow. 01:00:58.260 |
Yeah, so Azure AI Studio is mainly for the purpose of creating an endpoint that works against your models and data. 01:01:27.260 |
Copilot Studio, on the other hand, is about building applications, not endpoints. 01:01:31.260 |
Either building a complete application like a chatbot application or any kind of user interface 01:01:37.260 |
that has an LLM element or integrating those types of applications into applications like teams. 01:01:43.260 |
So Copilot Studio is for building entire apps. 01:01:56.260 |
And as you might have guessed, Copilot Studio was built with AI Studio. 01:02:01.260 |
In terms of that, we're getting into a bit of a big question; we're exploring that. 01:02:11.260 |
Would you see it that we build an application in Copilot Studio and then call this endpoint from it? 01:02:21.260 |
If you want that level of customization, yes, though you don't need to with Copilot Studio. 01:02:25.260 |
It's designed in such a way that you can build complete apps without having to create your own endpoints. 01:02:30.260 |
But if that's the position you're in for a particular use case, then yes, absolutely. 01:02:35.260 |
You can call any endpoint from Copilot Studio, including ones created with AI Studio. 01:02:40.260 |
I don't believe, I'm not so familiar with that product, but I don't believe they have evaluation tooling built in. 01:02:49.260 |
Yeah, you would evaluate the LLM through its endpoint here in AI Studio. 01:02:57.260 |
We'll come to that in probably about 20 minutes or so. 01:02:59.260 |
But here is the very simplest prompt flow I just generated from the chat playground. 01:03:15.260 |
All it does is take an input, pass it through to the LLM called chat there, and then generate the output. 01:03:27.260 |
Whereas if you use an application like ChatGPT, there's a whole lot more going on with your 01:03:33.260 |
prompt and the context and everything else, in a more RAG style, than just passing it directly to the model. 01:03:39.260 |
I'll keep on going to catch up to where you all are so I can demo the good bits when we get there. 01:03:53.260 |
That's where I would get the documentation to make it a little bit clearer. 01:04:00.260 |
So, you're just missing that in the document that makes it clear. 01:04:16.260 |
And then, you're going to query the database. 01:05:32.300 |
The question was, are we able to use another vector database besides Cosmos DB? 01:05:44.900 |
When we get the prompt flow up, I'll show you that. 01:05:47.380 |
But if we go into the prompt flow node, which I think is the retrieve product information 01:05:53.480 |
node, you can see it's set up directly as a connection to AI search. 01:05:58.240 |
But you can set it up as a connection to other vector databases as well, or even just an 01:06:04.300 |
What do you think is the biggest differentiator between AWS and Microsoft? 01:06:09.500 |
What do I think is the biggest differentiator between AWS and Microsoft? 01:06:12.660 |
Like why should I go with using Microsoft for startups versus AWS for startups? 01:06:18.080 |
That's a different question, because I was going to go into the main differentiator between the two clouds. 01:06:25.040 |
You know, Azure is a platform that's used by big companies like Microsoft and so forth and 01:06:30.180 |
is designed with a lot of features in that around authentication, security, scaling, monitoring 01:06:36.120 |
that big companies that run production apps really need. 01:06:39.140 |
That actually makes it a little more difficult for startups, honestly. 01:06:43.100 |
Because the first thing you do when you start working with Azure is start dealing with things 01:06:46.560 |
like resource groups and security, you know more about this than I do. 01:06:53.240 |
That's why we set up Microsoft for startups, is to help startups get into the process of understanding 01:06:59.200 |
the different kind of process of working with resources in Azure, which is quite different 01:07:03.820 |
from AWS in the sense that you can just spin up a single VM in AWS and you're done. 01:07:10.080 |
Microsoft, when you spin up a VM, there's actually six different resources that get created 01:07:14.020 |
because it's there to support the enterprise use case as opposed to just, I want a single VM. 01:07:19.180 |
And I will say Microsoft for startups, they have a pretty cool program, the Microsoft for Startups Founders Hub. 01:07:27.240 |
They can give you anywhere from $1,000 up to $100,000 in credits for Azure. 01:07:33.920 |
So, you know, level one, they give you this many credits. 01:07:37.480 |
So, you really get a lot of credits for developing your first startup application. 01:07:43.860 |
They have, like, different mentors that you can get paired up with. 01:07:47.120 |
They have, like, different sessions that you can attend to learn, like, how do I use this 01:07:52.280 |
So, they give you the credits and a lot of the guidance to make that happen. 01:07:56.420 |
And the whole Microsoft for startups team is here at the booth in salon nine. 01:08:02.480 |
So, chat to them about the startups program and they can get you in at the right level. 01:08:06.620 |
Yeah, I had to start over because I ran that script before I did all the configuration. 01:11:50.660 |
Yeah, the way PromptFlow is set up is it actually manages the chat history for you as it comes through the flow. 01:11:56.520 |
But if you want to restore it beyond that interaction, then yeah, you can store it in the database. 01:12:00.880 |
There might be the documentation that we missed. 01:12:06.660 |
It's going to go in and put that into the actual call going to the LLM. 01:12:13.120 |
Maybe you might ask about like products or like a specific item. 01:12:17.900 |
And then it's going to put those as a part of the workflow, right? 01:12:21.840 |
Actually it's AOAI, and by default it is correctly set. 01:12:34.480 |
But a few of them weren't, and what's confusing is that in addition to the Azure OpenAI (AOAI) connection, you do have a default. 01:12:48.900 |
Yeah, so when you switch to default, actually you break things. 01:12:53.620 |
All right, when folks get to that, I'll have you show them. 01:12:57.540 |
For those of you who haven't done step eight yet, the custom connection for Cosmos DB, I'm going to go through that now. 01:13:11.760 |
Go to the connected resources and view them all, and then create a new connection. 01:13:28.980 |
So this is one of the things that the Prompt Flow does for you, is it has standardized connections to all these tools. 01:13:40.500 |
But if you want to connect to any other service, you can use a custom connection, and that in fact is what we're going to be doing here. 01:13:46.200 |
Cedric looks like we have a question over the back right there. 01:13:50.760 |
I'm going to add the key value pairs for our connection to Cosmos DB. 01:14:02.100 |
I'm going to grab that value from our .env file. 01:14:33.920 |
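In other words, the values pasted into the custom connection's key/value fields come straight out of .env; a small sketch assuming the python-dotenv package and illustrative variable names:

```python
# Sketch: read the Cosmos DB endpoint and key out of the .env file.
import os
from dotenv import load_dotenv  # pip install python-dotenv

load_dotenv()  # reads .env from the current directory
cosmos_endpoint = os.environ["COSMOS_ENDPOINT"]  # illustrative variable names
cosmos_key = os.environ["COSMOS_KEY"]
# These are the two values entered as key/value pairs in the custom connection.
```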
So if you have a million vectors and you want to perform a nearest neighbor search against a query vector, what are some of the strategies to prune that? 01:14:43.980 |
And then how does that impact recall as well? 01:14:46.480 |
Yeah, well, first of all, that's the entire reason why vector databases exist, is to exactly do that search quickly. 01:14:53.260 |
They're indexed in such a way that they can do that nearest neighbor search at scale and at speed. 01:15:00.960 |
It's not very difficult to do it yourself, to do an embedding for a bunch of documents or a bunch of chunks of documents, 01:15:08.000 |
and then do a nearest neighbor algorithm to find what is closest to the embedding for the customer's question. 01:15:13.760 |
But the advantage of doing it within a vector database is you can do that quickly and at scale. 01:15:17.060 |
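The do-it-yourself version mentioned above is just brute-force cosine similarity; a minimal numpy sketch:

```python
# Sketch: brute-force nearest-neighbor search over document embeddings.
import numpy as np

def top_k(query_vec, doc_vecs, k=3):
    q = query_vec / np.linalg.norm(query_vec)
    d = doc_vecs / np.linalg.norm(doc_vecs, axis=1, keepdims=True)
    sims = d @ q                   # cosine similarity to every document
    return np.argsort(-sims)[:k]   # indices of the k closest documents

docs = np.random.rand(1000, 1536)  # stand-ins for real embeddings
query = np.random.rand(1536)
print(top_k(query, docs))          # fine at this scale; a vector database
                                   # exists to do this fast at millions
```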
And what was the second part of your question? 01:15:18.740 |
I guess there's different neighbor algorithms, right? 01:15:23.360 |
And then you have to, I think there's parameters that you have to tune. 01:15:26.920 |
That affects recall as well, depending on how far you want the tree search to occur. 01:15:32.200 |
And so what are some strategies to balancing that and getting the highest recall instead of just doing brute force? 01:15:39.420 |
Because brute force, I'm assuming, takes a while at scale. 01:15:43.500 |
Interestingly, and this might not be the answer you expect, is that, at least in our experience working with real-world applications, 01:15:50.640 |
vector search by itself actually isn't enough, despite playing around with the algorithms and choosing the parameters so you can expand out the search and not miss documents. 01:16:00.440 |
What we've actually found is actually a combination of keyword search and vector search together actually outperforms either. 01:16:08.300 |
And that's a feature that's built straight into Azure AI search. 01:16:10.840 |
I think by default these days, it actually defaults to a hybrid search. 01:16:15.360 |
In this particular example, just to keep things simple, we're just doing a vector search. 01:16:19.060 |
But for practical applications, we actually recommend a combo of keyword and vector search. 01:16:25.880 |
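Continuing the earlier retrieval sketch, a hybrid query in Azure AI Search is the same call with the keyword half filled in (search_text) alongside the vector half:

```python
# Sketch: hybrid search = keyword relevance + vector similarity in one query.
from azure.search.documents.models import VectorizedQuery

results = search.search(
    search_text="tent trailwalker",                  # keyword half
    vector_queries=[VectorizedQuery(vector=vector,   # vector half
                                    k_nearest_neighbors=3,
                                    fields="contentVector")])
```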
And I'm assuming that metadata that you can search, like, for example, keywords can be updated for a product, in this case, at any point in time for any vector, right? 01:16:36.100 |
And then, is there, like, the last question, is there an upper limit on the dimension size of the embeddings? 01:16:42.280 |
I'm sure there is an upper limit somewhere, but we haven't come across it for AI search, at least. 01:16:50.300 |
Yeah, I don't think there's one sort of in there by design. 01:16:52.920 |
What about using Entra ID with this, specifically external Entra ID, if you're trying to develop this for an externally facing application, multi-tenanted, and everybody having their own little space, shall we say? 01:17:38.600 |
Is there any sort of frameworks or things like this that are set up to show how to sort of set that up across all levels? 01:17:51.200 |
Because you're going across systems: from Copilot Studio into this, into AI Search, into Cosmos DB. 01:18:00.100 |
Is there any sort of, is there anything written down anywhere? 01:18:03.660 |
Okay, that's an area I'm not an expert in, but Cedric or Miguel might be, either of you, too. 01:18:08.400 |
And using external Entra IDs, I know it's only come out a couple of months ago, but being able to create a scope across all of those systems, 01:18:19.740 |
so that the scope that you can see is for that external Entra ID, is there anything written down to show how to do that across each of the individual systems in the right way, 01:18:31.020 |
or is there anybody working on that, do you know? 01:18:33.360 |
When you say Entra ID, so from an authentication standpoint? 01:18:41.460 |
Yeah, you're talking about the Entra authentication mechanism? 01:18:44.480 |
Yeah, Entra, yeah, specifically the external ID, so using external IDs, so social logins and things like that. 01:18:51.060 |
So, if you're facing this to the outside world, enabling external users, external IDs to actually get scope across the whole thing. 01:18:59.240 |
Yeah, no, it's, to be honest, it's not a domain where I've spent much time yet, so I'm not going to be able to talk much about it. 01:19:06.940 |
You wouldn't have external users log into AI Studio; that's for the developers and for the IT managers. 01:19:13.960 |
But what you are exposing to the app, which the end users are then accessing, is those endpoints. 01:19:20.680 |
And you can manage those endpoints either by tokens or by managed identity. 01:19:24.700 |
So, whatever way that you want the app to talk to the endpoint based on that identity controls, obviously, what the endpoint is then able to do. 01:19:38.940 |
And then, at the moment, if I'm reading this correctly, the retrieve-documentation-from-AI-search step is retrieving the vector, and then it's handing that off to the actual document itself. 01:19:56.260 |
So, it's only the vector that's getting passed, not the document itself? 01:19:58.740 |
No, well, the vector is then used to retrieve the document, in this case a markdown file. 01:20:03.700 |
And then the markdown file actually gets inserted into the prompt so that the LLM can see it, so the OpenAI can see it, and form its answers based on that information. 01:20:16.060 |
And I'll show you how that gets put together in a sec. 01:20:17.900 |
There is one file there that has the prompt, and then it has the variable for the document, so that variable simply gets replaced by the text that it grabs from the search. 01:20:45.820 |
So, just on, to continue building on the entry ID piece, because one thing that we're looking into is, for example, if we want to do granular access control, and make sure that we don't pass to our prompt flow, you know, the ability for it to search any database and retrieve any data from a user, just to make sure that a nefarious actor might not be able to get data from other users by something, through something like prompt injection. 01:21:13.140 |
I noticed that in the authentication type for prompt flow, we're able to use token-based, so Azure AD tokens. 01:21:21.380 |
Let's say that we do have a user log into our app, pass through that token to call the prompt flow endpoint. 01:21:30.120 |
Can we then use that token authentication of the user to call subsequent components of our Azure stack? 01:21:39.960 |
So, to pass through that user's delegated token, and just make sure that we're retrieving data for that user? 01:21:57.400 |
So, the only thing is that I don't know much about Entra. 01:22:01.080 |
So, any Entra-specific things, I don't know much about. 01:22:05.140 |
I joined Microsoft not so long ago. But when it comes to normal security, you could use any kind of token, such as a JWT token, like you would for any kind of REST API. 01:22:24.100 |
You can pass the JWT token to the prompt flow endpoint, and use it inside the prompt flow definition to pass on to whatever internal service you would like to. 01:22:38.760 |
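As an illustration of that idea (not the workshop's setup): a flow node could decode the caller's JWT and use its subject claim, never anything from the prompt text, to scope the data it fetches. A real service must verify the token signature against the identity provider's keys; it is skipped here only to keep the sketch short:

```python
# Illustrative only: derive the customer id from the caller's bearer token.
import jwt  # pip install PyJWT

def customer_id_from_token(bearer_token: str) -> str:
    # Signature verification disabled for brevity; NEVER do this in production.
    claims = jwt.decode(bearer_token, options={"verify_signature": False})
    return claims["sub"]  # identity comes from the token, not the prompt

# The parameterized Cosmos query shown earlier would then use this id, so a
# prompt-injected "show me other users' orders" has nothing to work with.
```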
So, I think a very concrete example of that would be, let's say we have the Cosmos DB endpoint. 01:22:44.900 |
And I want to ensure that I can only access the specifics user data in that Cosmos DB. 01:22:52.840 |
So, I want to actually use, like, row-based access control, where that user is only allowed to see certain rows in that. 01:23:00.540 |
I can actually show you something to do with that right here, now that I understand. 01:23:04.640 |
So, what we have here is the simple prompt flow for this particular application, where I do the input. 01:23:11.520 |
There's the embedding to retrieve the product documentation. 01:23:16.220 |
What's relevant here to your question is, let me make this a bit bigger so we can see. 01:23:22.600 |
And the customer lookup, which is using information that is provided by the customer through them having logged into the website. 01:23:36.160 |
So, for example, when I go to this website, do I have it open still? 01:23:53.920 |
This link is given in one of the last steps of the lab instructions. 01:24:04.680 |
So, in this case, when the LLM gets the prompt, it already knows who you are. 01:24:13.020 |
And it gives to the LLM only David's information. 01:24:16.360 |
It would be different if the LLM had to ask, you know, what's your name? 01:24:24.240 |
But in this case, all of that is set up so that when the LLM gets there, it already has your name and it's authenticated. 01:24:30.240 |
And it has only your information. 01:24:33.440 |
And I think that's just about what you were going to show. 01:24:39.960 |
So, when we're at this chat application, we can see Sarah Lee is logged in already. 01:24:45.720 |
So, as we go through the prompt flow, one of the inputs there is the customer ID. 01:24:50.980 |
And that's come from this app through the token that's being provided to the endpoint. 01:24:56.100 |
Actually, no, it's a parameter to the endpoint in this particular case. 01:25:00.940 |
But when we ask the question, what did I order last time? 01:25:05.480 |
What's important to understand there is that there's nothing in this app that is searching any database. 01:25:16.380 |
All it's doing is passing the user ID and that question, what did I order last time, into that whole prompt flow. 01:25:25.000 |
And then as part of that prompt flow process, which runs under a privileged account, it then queries the database with that user ID to get back her list of product purchases. 01:25:37.760 |
Then the LLM is operating on that information with that question to generate that bit of text that you see. 01:25:43.660 |
And that bit of text is the only thing that actually goes back to the app. 01:25:46.720 |
So, the app doesn't have any direct access to databases at all. 01:25:53.760 |
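As a minimal sketch of the app side of that contract: the app sends only the question and the customer ID, and gets back only text. The endpoint URL, field names, and response shape are assumptions, not the deployed flow's actual schema.

```python
import requests

# Hypothetical scoring endpoint; the deployed prompt flow endpoint
# in the workshop will have a different URL and payload schema.
ENDPOINT = "https://contoso-chat.example.azurecontainerapps.io/score"

def ask(question: str, customer_id: str) -> str:
    payload = {"question": question, "customer_id": customer_id}
    resp = requests.post(ENDPOINT, json=payload, timeout=30)
    resp.raise_for_status()
    # Only the generated text comes back; the app itself never
    # touches the search index or the customer database.
    return resp.json()["answer"]

print(ask("What did I order last time?", "7"))
```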
And I think you hit the nail on the head when you're saying that prompt flow has privileged access to the database. 01:25:58.400 |
What I'm trying to avoid is for prompt flow to have privileged access. 01:26:02.100 |
What I want to do is to inherit the access of the calling user through, for example, his AD token. 01:26:12.140 |
But I think the way that would work is through the features of the database, where you pass that authentication information along with your query, and the database itself prevents access you wouldn't otherwise have. 01:26:20.480 |
Would that accomplish what you're trying to do? 01:26:23.440 |
I'm just wondering whether or not that's already something that you're looking into with, for example, the prompt flow connection to Cosmos DB. 01:26:33.900 |
Because that's the whole purpose for this thing existing in the first place. 01:26:37.640 |
This is what Copilot Chat was built on, for example. 01:26:41.020 |
And sort of that's all based on enterprise logins and things like that. 01:26:50.860 |
Like, if you want to, like, have variable loops in the control DAG, is that possible? 01:27:02.540 |
Yeah, the question on that was: can you have variable control points in the flow? 01:27:05.720 |
If you run this within Visual Studio Code, you can set breakpoints in the Python code that runs within each of the nodes. 01:27:13.380 |
I guess, like, depending on the results, like, of one node, for example. 01:27:19.340 |
You might want to route to, like, different nodes. 01:27:22.660 |
Because, within each of those nodes, what's actually being run, let's actually take a look at that. 01:27:27.640 |
If I go over to the prompt flow itself, let's have a look, for example, at the LLM response node, I think. 01:27:40.960 |
Let's have a look at the content of the prompt node. 01:27:44.800 |
So, what's actually happening at this point is, this is actually just the prompt that gets assembled. 01:27:54.740 |
But you can see, like, it's got this, like, metaprogramming language, you know, for item and documentation and so forth. 01:28:00.040 |
So, what that's doing through at that point is looping through all of the matched products that are related to the user query, extracting them out from Azure AI search as vectors, then extracting out the markdown files that relate to those vector indices, putting that directly into this prompt. 01:28:19.040 |
So, when I ask the question of the app, you know, what's a good pair of shoes, that's not the only bit of text that is going to OpenAI at that point. 01:28:39.340 |
What is, in fact, going to OpenAI is a whole bunch of text defined by this customer prompt here, including telling OpenAI, you're an AI agent for the Contoso outdoor product retailer, you should always reference factual statements, the following documentation should be used in the response, and this is where the individual relevant products are inserted into the prompt. 01:29:04.040 |
And, to our question earlier on, for this particular customer, this is where their previous orders are inserted into the prompt. 01:29:10.100 |
And then, finally, the question, you know, what's a good pair of shoes, is sent to OpenAI. 01:29:15.720 |
So, it has all that context from that RAG process to formulate a meaningful response based on that particular customer, their purchase history, the question, and the products that are related to their question. 01:29:28.780 |
Yeah, let me give you another example, a better example of that kind of thing; in this case, this particular node is just running Python code. 01:29:40.900 |
So, you can put conditionals into that Python code, for example, based on the inputs to do different kinds of things, anything you like, in fact. 01:29:48.360 |
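For instance, a node's function might branch on its inputs like this; a minimal sketch assuming the `@tool` decorator from the promptflow package of that era, with purely illustrative helper functions.

```python
from promptflow import tool  # prompt flow's node decorator

def format_orders(orders: list) -> str:
    return "\n".join(f"- {order}" for order in orders)

def format_docs(docs: list) -> str:
    return "\n\n".join(docs)

@tool
def route_context(question: str, orders: list, docs: list) -> str:
    # Every node in the DAG still executes, but inside a node you
    # can branch on the inputs to decide what this node contributes.
    if "order" in question.lower() and orders:
        return format_orders(orders)
    return format_docs(docs)
```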
Maybe, I think I know the specifics that you have in mind. 01:29:57.460 |
What David showed, he showed two things: in the template nodes, he showed looping logic and conditionals, but those are looping and conditional string-rendering constructs. 01:30:14.780 |
And in the Python code, you can have any Python, like you can have conditions, loops, whatever, but to your point, all the nodes in the DAG are going to be executed. 01:30:27.700 |
You cannot have conditional node execution, but what you can have is inside a node, in the Python code, you can conditionally execute something. 01:30:41.540 |
But all the nodes are systematically going to be executed. 01:30:47.280 |
It is not a business process orchestration system. 01:30:52.120 |
It is really tailored towards building LLM applications. 01:31:03.840 |
Yeah, it looked like in that Python, though, that you had, that's where you were doing the customer lookup. 01:31:16.620 |
I mean, I see the line connecting it, and then I see the Jinja template for the prompt, and the Jinja template was iterating over customers, and that, you know, for, or sorry, iterating over the orders. 01:31:29.000 |
So, how does that, how does that tie together? 01:31:38.380 |
Yeah, there was, like, the customer lookup Python that we were just looking at. 01:31:44.740 |
The one on the right, yeah, the customer lookup. 01:31:47.660 |
You have one node which queries the database, fetches all the information from the database, stores it into a variable, into the context, and then the Jinja template uses the previously set collection of results for the rendering. 01:32:09.340 |
So, when you say it stores it into it, is that where the response, the orders on line 13, is that doing it? 01:32:17.760 |
Okay, and then if we, if we click on the next one down, the customer prompt, and we go to that loop again, there it is, well, but, oh, okay, customer.orders, so that's, that's how it ties, then, eh? 01:32:31.140 |
Using input and output bindings on each node. 01:32:33.980 |
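Concretely, the tie-in works something like this sketch: the lookup node's return value is bound as an input to the prompt node, and the Jinja loop renders it. The field names are illustrative, not the exact schema in the workshop repo.

```python
from jinja2 import Template

# Hypothetical shape of the customer-lookup node's output.
customer = {
    "firstName": "Sarah",
    "orders": [
        {"name": "TrailMaster X4 Tent", "price": 250},
        {"name": "CozyNights Sleeping Bag", "price": 100},
    ],
}

# The downstream prompt node binds that output to a template input,
# and the Jinja loop renders it into the prompt text.
template = Template("""\
Previous orders for {{ customer.firstName }}:
{% for order in customer.orders %}
- {{ order.name }} (${{ order.price }})
{% endfor %}""")
print(template.render(customer=customer))
```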
Yeah, and the arrows that are coming into the top of the representation in this graph, those are the inputs, and the arrow coming out of the back is the outputs, and there could be multiple of those. 01:33:44.200 |
It's called a visual editor, but it's really more of a visual reader, and that is absolutely true. 01:33:49.540 |
I want to highlight a little subtlety, too, when you get to step 10, when you first run your prompt flow in Visual Studio Code, you're going to be clicking on the run button once you've viewed the prompt flow itself in the Visual Studio Code environment. 01:34:05.020 |
You can see the commands it's running; it's just running a little Python command to launch the flow defined in the YAML file, but what I want to emphasize here is that, in reality, everything here is running locally, and in fact, in the usual developer environment, it will be running directly on your laptop or a shared machine. 01:34:22.360 |
In this case, in this case, it's running on the GitHub code space environment, and the whole idea behind this is you have a very fast responsive place to try out different prompts, to make sure your connections are working, perhaps testing different types of LLMs, replacing them in the LLM steps, so you can actually figure out what are the bits of the puzzle that go together to give you a good experience for the endpoint that you're trying to create, just in a local environment. 01:34:51.220 |
Now, I'll say local because, of course, the database is still in the cloud, and the OpenAI endpoint is still in the cloud, but all the orchestration is happening directly on your local machine. 01:35:00.460 |
Our next step after this is going to be then publish that prompt flow into Azure, inside its own container app, as it turns out, and then that's going to be a hosted cloud version of that same prompt flow, which is going to support the production use of that endpoint in your application. 01:35:17.960 |
Yeah, a side effect of what David just said is that, because it's building a Docker container, you can actually customize the environment and add packages. 01:35:43.820 |
So, earlier we were talking about the differences between Semantic Kernel and prompt flow; one of the nice things with prompt flow is that it's very interesting for web developers, because they don't have to care about creating an environment, deploying a Docker environment, or scaling it; the whole scaling is done automatically by the platform. 01:36:10.040 |
So, you just need to add packages, so you can combine an LLM with some packages for some specific processing, and the whole deployment is done automatically, so you can focus on the UI and the user experience. 01:36:26.300 |
So, all right, and then let me get to the evaluation. 01:36:38.080 |
You just need the purchase history for that question. 01:38:03.900 |
So, in that specific DAG, we will systematically query both the vector database and the customer database. 01:38:19.100 |
So, yes, when it comes to answering the question of, can you repeat the question? 01:38:37.580 |
Yeah, it was "what else did I purchase?" Then, because we will query the order history from the relational database, the LLM is going to pay more attention to that part of the context than to the product documentation side of things. 01:39:00.060 |
But really, we are relying on the ability of the LLM to pay attention to what matters. 01:39:21.020 |
Is there a form of relational query to filter that? 01:39:27.500 |
Because that specific prompt flow DAG just returns, I believe, the last ten orders from the history. 01:39:48.460 |
And that's what we put in the context, because that RAG application is for workshops and demos. 01:39:48.460 |
What you're talking about is to do something else, which is text to SQL, where you take a query in natural language, you transform it into a SQL query that you execute against a database where you have a filter where date is one month ago, or whatever. 01:40:10.940 |
So that's a similar use case, but a different implementation. 01:40:18.140 |
And that's also an area to be wary of, too, because that's an area where prompt injection could come into play. 01:40:25.100 |
If you're forming a SQL query on the basis of user input, you've got to recognize that there might be malicious input in that process, which might generate harmful SQL. 01:40:34.300 |
There's still an intermediate step, it's not directly pasting the string into a SQL query, but there still is an opportunity there for bad actors to control what happens at that SQL generation step. 01:40:45.500 |
And I believe we have a template for that, I believe Pamela has created a template called RAG on PostgreSQL, which is now in the same Azure samples GitHub account. 01:41:05.500 |
It takes a natural language query, transforms it into a SQL query, and executes it on a PostgreSQL database, but you could do the same with Cosmos DB. 01:41:18.700 |
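As a hedged sketch of what that text-to-SQL step could look like, with the defensive check mentioned above; the prompt, schema, and deployment name are assumptions, and a real system would also run the query under a read-only database role.

```python
import re
from openai import AzureOpenAI

client = AzureOpenAI()  # endpoint, key, API version from environment variables

def question_to_sql(question: str) -> str:
    # Ask the model to translate natural language into a single
    # read-only query against a known (hypothetical) schema.
    resp = client.chat.completions.create(
        model="gpt-4",  # must match your Azure OpenAI deployment name
        messages=[{"role": "user", "content":
            "Write one SQL SELECT over orders(customer_id, name, order_date) "
            "answering: " + question}],
    )
    return resp.choices[0].message.content.strip()

def guard_generated_sql(sql: str) -> str:
    # One defensive layer against prompt injection at the SQL
    # generation step: allow only a single SELECT statement and
    # reject multi-statement payloads.
    if not re.match(r"^\s*SELECT\b", sql, re.IGNORECASE) or ";" in sql.rstrip("; \n"):
        raise ValueError("Generated SQL rejected")
    return sql
```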
So that actually leads me into another topic, which I wanted to get to before we run out of time here today, which is about evaluation. 01:41:32.020 |
Any time that you put any kind of an LLM-based application into production, where users are going to be providing input to it, you need to evaluate how it behaves. 01:41:40.020 |
And in this context of a chatbot, the kind of questions you want to ask are, did my chatbot give a relevant answer to my user's question? 01:41:52.020 |
Was the chatbot's answer grounded in the information that is available in my databases that is part of my RAG flow? 01:42:08.020 |
And the other metrics that are in that list, which I'm trying to remember right now, I'll get back to in a minute. 01:42:15.020 |
But when you get to step number 13, we're going to take you through a Python notebook, which shows you a process for answering these questions manually, essentially. 01:42:31.020 |
And then I'm going to show you how that's built into the Azure AI Studio platform itself. 01:42:36.020 |
But think of debugging in just regular apps, and the tests that we write for applications. 01:42:46.020 |
Like, did the application return a positive value when it should be a positive value? 01:42:53.020 |
Very easy thing to test for in programming style. 01:42:56.020 |
It's a much more difficult test to answer the question: is the answer generated by my chatbot relevant? 01:43:08.020 |
And the answer is, you get an LLM to answer that question. 01:43:13.020 |
Now, this particular chatbot application we have running on GPT-3.5 Turbo. 01:43:28.020 |
I haven't played around with GPT-4o a lot myself. 01:43:30.020 |
But I imagine it will probably take the place of GPT-3.5 Turbo in a lot of these applications pretty soon. 01:43:37.020 |
Next time we run this workshop, we're going to switch it over to using GPT-4o. 01:43:44.020 |
There are very large, very powerful LLMs that have reasoning capabilities in some sense. 01:43:53.020 |
Now, you wouldn't want to use GPT-4 in a production application like this. 01:43:58.020 |
Because every time the user types in a chat, not only are they going to have to wait quite a long time for a response, 01:44:03.020 |
but it's going to cost you a lot of money on the endpoint. 01:44:05.020 |
In this RAG architecture, GPT-3.5 works great. 01:44:08.020 |
As long as you give it the context it needs to answer that question. 01:44:12.020 |
But for this testing paradigm, for asking whether the answer, "TrailMaster jackets are good," is relevant to the question, "what jackets should I buy?" 01:44:26.020 |
That is the kind of question a powerful LLM like GPT-4 can answer quite readily. 01:44:31.020 |
So think about how you might automate that process. 01:44:37.020 |
Given this question, and this context, the stuff that we put into the RAG, and this answer, ask GPT-4 on a scale of 0 to 5, how relevant is this answer? 01:44:53.020 |
How grounded is this answer in the data that I've also provided here? 01:45:01.020 |
And these are all things that GPT-4 can do quite readily. 01:45:04.020 |
And we can use the scores that GPT-4 provides in this case as a ranking of how well GPT-3.5 is doing in our endpoint at generating its answers based on the RAG process. 01:45:17.020 |
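Here is a minimal sketch of that LLM-as-judge idea, assuming the Azure OpenAI Python client; the judging prompt is illustrative, and the real workshop notebook links to its own evaluation prompts.

```python
from openai import AzureOpenAI

client = AzureOpenAI()  # endpoint, key, API version from environment variables

JUDGE_PROMPT = """On a scale of 0 to 5, how grounded is the answer in the
provided context? Reply with a single integer.

Context: {context}
Question: {question}
Answer: {answer}"""

def groundedness(question: str, context: str, answer: str) -> int:
    # "gpt-4" must match a deployment name in your Azure OpenAI resource.
    resp = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": JUDGE_PROMPT.format(
            context=context, question=question, answer=answer)}],
    )
    return int(resp.choices[0].message.content.strip())
```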
In this notebook, at the top of it, you can put in a question. 01:45:23.020 |
I just ran it on, can you tell me about your jackets? 01:45:27.020 |
You can even see the prompts that it's sending to GPT-4 to answer these questions. 01:45:31.020 |
And you can see the actual answers that came back are in the next node up here. 01:45:37.020 |
Hey, Sarah Lee, let me tell you about our jackets. 01:45:39.020 |
We have two awesome options that will go well with your previous purchase. 01:45:48.020 |
This is the context that the RAG process was provided with, which it used to generate that answer. 01:45:54.020 |
And then with that information, we can ask those questions we just asked. 01:45:58.020 |
Was that answer about jackets grounded in Contoso's product database? 01:46:05.020 |
Sorry, a rank of five on a scale of zero to five. 01:46:09.020 |
And likewise, we can ask questions about coherence, fluency, groundedness, and relevance. 01:46:17.020 |
This particular question is doing really well. 01:46:19.020 |
You probably also want to test out your LLM on some adversarial types of inputs. 01:46:26.020 |
For example, you might ask the question, you know, I want to buy a toothbrush. 01:46:38.020 |
Nothing is going to come up in the database when we do the RAG search. 01:46:42.020 |
Well, actually, something will come back, because we always get back some responses that are somewhat related. 01:46:48.020 |
But let's see how our LLM actually does here. 01:46:51.020 |
When I run this notebook, it's going to run through those scripts. 01:46:55.020 |
It's going to pass that question to our RAG flow, generate the response with GPT-3.5, 01:47:02.020 |
and then ask GPT-4 to rank it on those four metrics using the prompts that are linked to in this script. 01:47:08.020 |
And when we come back to it, we can see the answer it came back with was, hi, Sarah. 01:47:13.020 |
Since you're into outdoor adventures, I recommend the Fresh Breeze -- where's my scroll bar? 01:47:20.020 |
Fresh Breeze Travel something, something like that. 01:47:27.020 |
Contoso does not sell a Fresh Breeze Travel Toothbrush. 01:47:31.020 |
GPT-3.5 just made that up out of whole cloth. 01:47:37.020 |
And we have to test to see whether or not our models are doing these kinds of things for the types of use cases that we anticipate. 01:47:43.020 |
And we can detect that particular test is not going well. 01:47:50.020 |
It really wasn't grounded in our data because there was nothing about toothbrushes in our context data that we provided through RAG. 01:47:57.020 |
And similarly, coherence -- well, it was in nice English. 01:48:00.020 |
So I've got a score of four for coherence, a score of five for fluency, but one for groundedness and one for relevance. 01:48:07.020 |
And so now you can think about automating this process. 01:48:10.020 |
You can think about what are the types of questions that we want our application to do well at. 01:48:16.020 |
What are the types of questions that we might want to, say, not give any responses to at all and score accordingly. 01:48:22.020 |
And I won't go through all the details of this, but when we get into AI Studio, there's a whole section here on evaluation. 01:48:36.020 |
And this is the process where you can actually load into it a bunch of tests, which in this case are not Python code or C# code or whatever. 01:48:45.020 |
It's responses -- so questions, responses, and context. 01:48:49.020 |
And then automate the process of evaluating how your end point, how your RAG process does on all those questions. 01:48:57.020 |
So that next time, when you add new products, or next time, when you decide to upgrade from GPT-3.5 to GPT-4o, you've got a series of tests ready to go to evaluate how well your application does in the face of those changes. 01:49:14.020 |
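A minimal sketch of what automating that could look like, assuming a JSONL test file where each line holds a question, the RAG context, and the generated answer, plus judge functions like the groundedness() sketch above; the file format and field names are assumptions.

```python
import json

def evaluate_dataset(path: str, metrics: dict) -> dict:
    # Each metric is an LLM-judge function taking
    # (question, context, answer) and returning a 0-5 score.
    rows = [json.loads(line) for line in open(path)]
    totals = {name: 0 for name in metrics}
    for row in rows:
        for name, judge in metrics.items():
            totals[name] += judge(row["question"], row["context"], row["answer"])
    # Average score per metric across the whole test set.
    return {name: total / len(rows) for name, total in totals.items()}

# e.g. evaluate_dataset("testset.jsonl", {"groundedness": groundedness})
```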
So for -- after you've evaluated the model and you sort of understand the performance of it, what typically are your next steps and what actions do you take to drive improvement on the measures that you see there? 01:49:33.020 |
But it's -- this is essentially the LLM Ops process, which is essentially the same as DevOps, but with a fancier name that gets you lots of funding. 01:49:49.020 |
So exactly the same idea as when we build applications. 01:49:56.020 |
We do some exploration, testing it against our data. 01:49:59.020 |
We build our basic prompt flow in the LLM case. 01:50:02.020 |
And we develop our first version of that flow. 01:50:05.020 |
And then we actually run it against sample data. 01:50:12.020 |
If the evaluations don't give the scores that we're looking for before we put it to production, the next step then is to modify our prompts. 01:50:19.020 |
You saw that Jinja template with a bunch of prompts around do this, don't do that. 01:50:23.020 |
You would modify those until you get the behavior that you're looking for. 01:50:30.020 |
Maybe you present the data differently to the RAG process. 01:50:33.020 |
Then once you get satisfied in that process, you would keep on testing that against perhaps a live user cohort or bring in some testers. 01:50:43.020 |
And again, go through that same evaluation process until you're satisfied. 01:50:47.020 |
And then finally you'll be ready to actually deploy that to production. 01:50:52.020 |
You would actually monitor, live, probably a sample of actual user questions and responses, and have, you know, real-time charts. 01:51:01.020 |
Not real-time charts actually, probably daily charts of how your model is doing on scores like groundedness for the types of questions being asked. 01:51:08.020 |
And that might be detecting, you know, maybe things are drifting because your product set has changed and there are trigger words in your products that are making the GPT model do strange things. 01:51:19.020 |
Maybe you've got some adversaries that are coming in to try and hack into your system. 01:51:22.020 |
That might come up in some of your monitoring scores. 01:51:25.020 |
And then you go back through that iteration process to go back and build and augment the model for its next deployment. 01:51:31.020 |
Is that the kind of question you're looking for? 01:51:39.020 |
You know, right now when we look at the input from the user, you put text. 01:51:44.020 |
You know, you put what you have purchased or something like that. 01:51:48.020 |
Can this be improved to take, like, you know, a graphic or a PDF file? 01:52:02.020 |
A PDF file, let's say; I want to input my PO rather than type something. 01:52:09.020 |
Azure Search, for example, can index PDF files and then you can do a search to find the PDF file that's most relevant to that user's question. 01:52:16.020 |
You can then extract down from that PDF file context, which is put into the prompt, which is then used to generate the response. 01:52:22.020 |
And then you can put references back to those source files if it's a trusted user kind of a situation, so they can come back to see them. 01:52:32.020 |
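As a minimal sketch of that retrieval step, assuming the PDFs have already been cracked and chunked into an Azure AI Search index; the index name and field names ("content", "sourcefile") are assumptions.

```python
from azure.core.credentials import AzureKeyCredential
from azure.search.documents import SearchClient

# Hypothetical index of chunked PDF content; building the index
# (e.g. with an indexer that cracks the PDFs) is a separate step.
client = SearchClient(
    endpoint="https://<your-search>.search.windows.net",
    index_name="pdf-docs",
    credential=AzureKeyCredential("<key>"),
)

def pdf_context(question: str, top: int = 3) -> str:
    results = client.search(search_text=question, top=top)
    # Each hit's extracted text goes into the prompt, keeping the
    # source file so a trusted user can be pointed back to it.
    return "\n\n".join(f"[{r['sourcefile']}] {r['content']}" for r in results)
```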
But still, you know, from the, you know, prompt, you can only input text, right? 01:52:40.020 |
This particular example, everything is converted into text. 01:52:43.020 |
But today, we have what we call multimodal models. 01:52:46.020 |
GPT-4o, for example: as the prompt, you can input not just text, but also images, even audio. 01:52:55.020 |
But you could set up that RAG application to insert into the prompt the images or the audio or whatever it is you want the LLM to be able to reference. 01:53:04.020 |
That's still relatively new, because GPT-4o doesn't have all of its multimodal capabilities out yet, but the principle exists. 01:53:18.020 |
And Florence version 2 was actually released earlier this week, which is a model which allows you to do image to text. 01:53:26.020 |
So you can analyze an image, generate text out of it, and then take the text and give it to GPT 3.5 or something else. 01:53:39.020 |
It's one of the things I'm talking about, yes. 01:53:50.020 |
Because that's one kind of primary request from our team. 01:53:58.020 |
And right now, I have built, you know, this customized prompt window. 01:54:06.020 |
And now they want to say, okay, I want to use a PDF file or even an image file. 01:54:12.020 |
So, well, like we said, for image files, when we make GPT-4o's multimodality 01:54:25.020 |
capabilities available on Azure, then you will be able to use it directly. 01:54:28.020 |
For now, you can use another image to text model, such as Florence 2 or something else. 01:54:34.020 |
For PDFs, it really depends exactly what your use case is. 01:54:51.020 |
So if it's transient, then an alternative approach, instead of taking the PDF and storing it into 01:54:57.020 |
the vector database and indexing it long term, is this: 01:55:07.020 |
you upload the PDF, you chunk it, and then... 01:55:11.020 |
The algorithms for embedding the chunks, you can actually run them in Python, in memory. 01:55:19.020 |
You don't have to do it like, you know, in a long term vector database. 01:55:23.020 |
So you can do the chunking and the embedding in memory. 01:55:26.020 |
And actually, the vector similarity search functions that find what's relevant, you can execute 01:55:34.020 |
those functions in memory too, to provide your users with a transient RAG experience where they 01:55:42.020 |
upload the PDF, and you query the PDF just for the sake of the current conversation. 01:55:50.020 |
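Here is a minimal sketch of that transient, in-memory approach, assuming the Azure OpenAI Python client and an embedding deployment name; nothing is persisted to a vector database.

```python
import numpy as np
from openai import AzureOpenAI

client = AzureOpenAI()  # endpoint, key, API version from environment variables

def embed(texts: list) -> np.ndarray:
    # "text-embedding-ada-002" must match your embedding deployment name.
    resp = client.embeddings.create(model="text-embedding-ada-002", input=texts)
    return np.array([d.embedding for d in resp.data])

def top_chunks(question: str, chunks: list, k: int = 3) -> list:
    # Embed the freshly uploaded PDF's chunks in memory, score them
    # by cosine similarity against the question, and keep only the
    # best k for the prompt; discard everything when the chat ends.
    doc_vecs = embed(chunks)
    q_vec = embed([question])[0]
    sims = doc_vecs @ q_vec / (
        np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(q_vec))
    return [chunks[i] for i in np.argsort(sims)[::-1][:k]]
```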
I think I need to take this offline with you. 01:55:55.020 |
I'll explain a little bit, you know, in PDF and then you can... 01:56:00.020 |
And you can do it in memory, or in a PostgreSQL database, or a Cosmos DB; that works too. 01:56:04.020 |
And delete the data once you are done with it. 01:56:09.020 |
And the last thing I wanted to say regarding that, because that's a good question, is 01:56:13.020 |
that right now, today, you can go to Azure OpenAI, you can deploy 01:56:19.020 |
a GPT-4 model, and they have, like, a chat section where you can chat with it. 01:56:30.020 |
There you can enter a picture and you can do things with it. 01:56:37.020 |
And it will write code according to the picture. 01:56:40.020 |
And this is more related to our prompt window. 01:56:45.020 |
And so far, you know, what I have, you know, developed can only take text. 01:56:55.020 |
And not to cut you off, because we love these detailed questions, but I've been told that 01:56:58.020 |
I'm going to get cut off up here in just a minute. 01:57:00.020 |
And before I do that, I just want to let you all know that we'll be here for a few minutes 01:57:05.020 |
for in-person questions, but also come to the Microsoft booth in Salon 9. 01:57:10.020 |
Lots of people there you can ask exactly these kinds of questions of, so please go ahead. 01:57:15.020 |
Cedric is also giving a talk tomorrow about multimodal models at 12:30 PM in Salon 10. 01:57:26.020 |
If you'd like to do this at home, the repository is already in your GitHub account. 01:57:30.020 |
And if you happen to miss that step, there's a QR code where you can get to it there as well.