Agentic Workflows on Vertex AI: Rukma Sen

I'm Rukma. I work at Google Cloud on our Vertex AI product, and towards the end of the talk, for those of you who don't know what that is, I will discuss it just a little bit more. But where I want to start today is with agents.

So this slide, you're like, understatement much, right? You're like, yeah, yeah, that's why we're here at this conference, because generative AI is transforming how we interact with technology. And if any of you are wondering, hey, is the rest of this person's talk filled with such groundbreaking insights? Maybe, maybe not. Stick around and find out. I kid, I kid.

The interesting thing about this statement that I want to think about is: what is the interface of that interaction? Where do all of us, whether we're developers, employees, parents, students, interface with AI? I would posit that for the vast majority of the use cases that we actually want to accomplish, that interface of interaction with generative AI is going to be an agent of some kind.

So the power of generative AI, and I'm sure I don't need to belabor this point to you guys, is immense, but it can be kind of intimidating and is inaccessible to many people, perhaps many people who are not in the room right now with us. But we can think about these personas, people we want to help, people we want to build for, right? And that's where I think agents come in and where they're really powerful. They're the bridge between the models and everyday users, so they help you go from speaking model language to speaking natural language.

Funny joke? No? I'm very sad. You guys, give me a laugh. No? All right, all right, I'll try. We'll make it there.

What I think is actually really cool, though, is that all of us in this room speak both languages. So we're going to be the ones developing these agents, right? We're going to be designing how they interact with people, what kinds of limits and frameworks
we're putting around them to make sure that, you know, we're being ethical, we're being helpful, we're being humane, we're being safe. And that, I think, is kind of magical.

Think back to the days before the internet existed, right? What was the human interaction interface with technology? It was machines. It was things like appliances in the home. And then the internet came about, and the whole way human beings and technology interact completely changed. We're all looking at our screens. We use gestures like swiping and zooming and scrolling. Think about how cool it could be if you were the one building the next interface, the next boundary of interaction between human beings and technology. I'm wearing a little necklace that says "wizard in training" because I think this is actually kind of magical. That one got a better reaction. Okay, we like wizards in this room.

Given that, though, to quote my favorite spider person: with great power comes great responsibility. We all know who to attribute this to: Uncle Ben, in every version of every Spider-Man ever. I promised Spider-Man to someone in this room. I did say there was Spider-Man coming up in my talk, and I'm hoping I delivered on that promise. He's here, he's here.

But the point Spider-Man is making, I think, is actually serious and something we should be thinking about. With the power to really shape how people are interacting with AI does come responsibility. We must ensure that these interactions are, like I said, safe, humane, and helpful.

When you think about what this responsibility is, I would say there are several sources, but some I would just highlight for everybody to think about are, first, ethical considerations. What are our moral obligations to protect users who are using these technologies that have really great, in some ways unlimited, powers? How can we build guardrails that protect people, keep them safe, prevent the spread of misinformation, and make it really clear when, let's say, an agent is
producing something that's generated versus when it's producing something that should be taken as a true fact?

We should also think a good bit, I think, about safety, cyber security, and data privacy. Where are we storing the data that we reason over with these models? How are we thinking about making sure that we're safeguarding people's privacy? With the rise of things like wearables, and a lot of what I like to think about as unobtrusive compute, where it's just out there in the world, these become, I think, even more important things to think about.

So great, I talked a lot about agents and how we should think about making them, but let's talk really quickly about what an agent is. Now, real talk: the reason this talk was supposed to be on open models is that we did have a last-minute schedule shift. And fully true story: before I knew I was going to deliver this talk, one week ago, I was registering for this, and they asked, hey, what is it you really want to learn? I said, what's an agent, really? So I'm actually really curious about this, but this talk is not going to focus on the philosophies and ontologies of agents. If you want to chat with me about it, please drop by the Google Cloud booth. I would be happy to discuss this with you. Can we appreciate that I got a Spider-Man reference and "ontology" in the same talk? Very proud of myself.

Okay, so given that, we're just going to move forward with a working definition. And what is a working definition? This is probably the broadest, most overarching definition you can think about. For our purposes, an AI agent is simply a system that's designed to achieve specific goals by interacting with its environment.

So let's break that down into its key components. At the heart of every AI agent is a powerful model. Often this is based on large language models, right? This is the model that's responsible for reasoning over what the goals of this agent are,
determining what the next best plan of action is, and then guiding its behavior. Think about it as your agent's brain, or executive center, if you will.

Then let's think tools. An AI agent doesn't just think, it also acts, and I think this is actually a key piece of the definition, where you can separate it from something whose primary function is just thinking or reasoning or generating. With an AI agent, you do want to have an action included. So this is where tools come in. If the model was the brain, tools are your AI agent's hands. This is where you get to interact. You can do things like fetch data from the internet, or more complex actions, calling external APIs to do things like, say, book flights, process payments, etc.

And then orchestration is the glue that holds everything together. It maintains memory and state, which is really important. It keeps track of the goals. And in this analogy of brain and hands, orchestration is really the nervous system, tying it all together. So these three components work together, allowing the AI agent to function autonomously and accomplish tasks.

That being said, I really quickly want to say that there are different types of AI agents, and some of these you could say have existed for a very long time, way before generative AI really boomed in the marketplace. So there are deterministic agents, generative agents, and obviously hybrid agents.

Deterministic agents are basically following a fixed set of rules or algorithms to make decisions. So given a specific input, that type of agent is always, consistently, going to return the same output. I'm sure you can tell this is quite different from when you're, say, prompting a generative agent. A very simple example of this could be a calculator. When you give it the input of two plus two, it will always return four, unless something's deeply wrong and you're in a mirror dimension. Let's hope not.

Generative agents, on the other hand, are
more creative. They work best in use cases where you want to be creative, where you want to combine rules in ways that they haven't been combined before, and they are capable of a much wider range of diverse outputs based on the input they receive. A simple example of a generative agent is a chatbot designed to answer customer questions, a customer service chatbot. When asked about a product, it will generate hopefully helpful and informative answers based on whatever data source it has about your company's products, etc.

And hybrid agents combine the strengths of the two. An example of this could be a financial advisor that uses deterministic agents to analyze the market and predict the right places to invest, but then uses a generative agent to actually communicate this, or go out and talk about this strategy to customers.

Okay, so this is, I think, where things get really interesting. Given the different types of agents, you can actually architect them quite differently across the spectrum. Going from single-agent to multi-agent architecture, I think, increases the sophistication and complexity that your agent is capable of very, very quickly.

So just to very quickly go over the single-agent one, this is hopefully not new to most people. This is where a single model is just responsible for everything: reasoning, planning, acting. It's a super straightforward architecture to implement. You just provide it with instructions and a set of tools to achieve a goal, right?

So what is the problem here? Great, tell it what to do, it's going to do it, it's going to return the output. Well, have you ever tried a prompt like "count how many instances of the letter a are in the word banana"? And the model will say four. And then you say, hey, can you check that? And then it will say two. And then you say, hey, can you check that? And then it'll say one. So in cases where you're trying to deploy a
production-ready app, something like this can really be a problem.

So now we get to a much more powerful way to design agents, which is multi-agent architecture. Just like complex human systems, let's say a company you work at, have people specialized in different roles working together to achieve a common goal, that's what multi-agent architecture does.

An example of this is a customer service system. Let's say there are three levels of agent. At level one, you have a dispatcher agent. The job of this agent is simply to triage everything that comes in, assess the customer's issue, and determine where to route it. So it triages. The second-level agents are subject matter experts. These agents are trained in specific subject matters, maybe specific product lines or specific regions, if that's how your company functions, and when they are assigned a case by that first agent, they have the expertise to respond. And then finally, as a level three check, you also have a supervisor agent that quality checks the work against a predefined data set. That agent has the ability to go in and solve some issues.

For example, fun story: I created a multi-agent architecture once, and the supervisor agent was supposed to return the sentence "this is not good enough, please try again" if it wasn't happy with the output. And it just kept doing that. It did not like anything my first agent did until I went back and recreated the whole thing.

Okay, so as agents are becoming more and more common across industries, we're largely seeing development in four types, and I just wanted to show you really quickly what a set of use cases for agents could look like. With customer agents, I already talked through, for example, what it would look like for a customer support agent, but there are also things like e-commerce, being able to support B2B, supporting travel if you are a travel vendor, for example. There are also internal-facing employee agents: HR, things like enrollment, benefits
questions, those things. Sales, of course, as I'm sure you can see, would be a great opportunity, as would payable and supply chain. So those are thinking about who the agent is targeted to. And then knowledge agents are agents specialized in terms of what exactly their subject matter of expertise is. So you could have an agent that's specifically very good at answering legal questions, for example.

And then finally, we are also seeing, through multimodal use cases, a huge uptick in voice agents, especially in scenarios like, say, a fast food drive-through. I'm sure you can imagine where a voice agent would come in here. You go in, you make that order using your voice, and the agent basically transcribes that and sends it through to the ordering system so that the person at the delivery window can go ahead and serve you.

Okay, so we're more than halfway through this talk, so a quick moment to recap. We looked at why we should care about agent design. Then we peeked under the hood really quickly to talk about what the components of agents are. Then we thought through architecture a little bit and looked at what the top use cases are.

So just before wrapping up, the last thing I'm going to do, and you can see my shirt, is talk about tooling, and specifically Google Cloud's developer platform, Vertex AI. Vertex AI offers essentially a full life cycle AI development platform. So whatever it is you want to do, whether it's things I didn't talk about today, like calling models and fine-tuning them, or it is stuff like I talked about today, such as building agents, we offer you a spectrum of ways to enable that, from super low code, even no code in some cases, all the way up to very high customization, high-code methods.

Vertex offers you access to 150-plus models: obviously all of our first-party Google Cloud models, but we also have all of Anthropic's models on there, Llama 2 and Llama 3, as well as a whole bunch
of open source models. We try to make it easy to prototype, so you can get APIs for all of this and start experimenting, start building, without having to go through a whole bunch of setup. We also want to make it very simple to deploy and have peace of mind that your security and all those enterprise concerns I was talking about earlier, things like data privacy, etc., are taken care of. So we back all of this with Google Cloud-level enterprise readiness, security, and things like compute orchestration, so you're not ending up paying too much for something if you don't have to.

I wanted to quickly flash Model Garden for you, since this is the piece of Vertex AI I did not cover in today's talk. Model Garden is where you can go in, pick your model, and fine-tune it. We have a couple of model eval workflows that you can run to try to match the model to your specific use case as well. And then finally, Agent Builder, as I said, goes all the way from no-code to full-code ways to build those cool, exciting agents that I was just telling you about.

The last thought I want to leave you with is this: we're building for builders. Vertex AI is designed with developers first in mind, and all the choices we make as we build this, from training and quick start resources all the way through to deployment, are for you. So we love feedback. Please stop by our booth. Tell us if you've used the product, what you love, what you hate. We would love to learn from all of you.

With that, I will ask you to please do me a giant favor and take a quick survey to tell us how we did, and Mayveveen, my colleague in the green skirt there, will give you a cute Vertex AI branded water bottle if you show her you completed the survey. That's it. Thank you, guys.
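
Editor's sketch: the three-level customer service system described in the talk (dispatcher, subject matter experts, supervisor) can be illustrated with a few lines of Python. This is a minimal sketch under stated assumptions, not anything shown in the talk and not a Vertex AI API. Every name here (`dispatcher`, `billing_specialist`, `shipping_specialist`, `supervisor`, `Orchestrator`) is hypothetical, and the "agents" are plain functions standing in for model calls. The `max_attempts` cap is one possible answer to the supervisor-that-never-approves story.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical stand-ins for model-backed agents. In a real system each of
# these would call a model; plain functions keep the control flow visible.

def billing_specialist(ticket: str, attempt: int) -> str:
    # Pretend the first draft is too thin and the retry is more complete.
    if attempt == 0:
        return "Refund issued."
    return "Refund issued. It should appear on your statement in 5 business days."

def shipping_specialist(ticket: str, attempt: int) -> str:
    return "Your order has shipped and tracking details were emailed to you."

SPECIALISTS: Dict[str, Callable[[str, int], str]] = {
    "billing": billing_specialist,
    "shipping": shipping_specialist,
}

def dispatcher(ticket: str) -> str:
    """Level 1: triage the ticket and choose a specialist by keyword."""
    text = ticket.lower()
    if "refund" in text or "charge" in text:
        return "billing"
    return "shipping"

def supervisor(answer: str, min_words: int = 8) -> bool:
    """Level 3: a toy quality check (assumed rule: answers need enough detail)."""
    return len(answer.split()) >= min_words

@dataclass
class Orchestrator:
    """Holds state (the history) and ties the three levels together."""
    max_attempts: int = 3
    history: List[str] = field(default_factory=list)

    def handle(self, ticket: str) -> str:
        route = dispatcher(ticket)
        self.history.append(f"routed to {route}")
        specialist = SPECIALISTS[route]
        for attempt in range(self.max_attempts):
            answer = specialist(ticket, attempt)
            if supervisor(answer):
                self.history.append(f"approved on attempt {attempt + 1}")
                return answer
            self.history.append("This is not good enough, please try again.")
        # Cap the retries so a strict supervisor cannot loop forever.
        self.history.append("escalated to a human")
        return "Escalated to a human agent."
```

Calling `Orchestrator().handle("I want a refund for a double charge")` routes the ticket to the billing stand-in, has the supervisor reject the first draft, and returns the second. The orchestrator's `history` list plays the role of the memory and state that the talk assigns to orchestration.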
