Personality Driven Development: Exploring the Frontier of Agents with Attitude

00:00:00.000 |
All right, everybody. My name is Ben. I'm going to talk about anthropomorphized agents. I'm calling 00:00:20.880 |
it personality-driven development, which is kind of a cute name. And what we do at Perpetual 00:00:25.280 |
is we build AI agents. We call them virtual teammates or AI employees. And we have really 00:00:31.920 |
leaned into the idea of giving them forms and form factors. So, to level-set on terminology: 00:00:38.000 |
anthropomorphization, which I had to practice saying a whole lot. I like A18N, which I don't 00:00:42.720 |
know if it will catch on. That is when you give human traits to non-human entities, right? So 00:00:47.440 |
Yogi Bear or Lightning McQueen or Nemo, right? These are all anthropomorphized creatures. 00:00:54.240 |
Fun fact, zoomorphization is when you give animal traits to non-animal entities. Not really relevant to this talk, 00:01:01.200 |
but I thought it was kind of cool. Aslan is a good example of this, right? Aslan's a lion from Narnia. 00:01:06.160 |
So it was a non-animal entity that got animal characteristics. 00:01:11.840 |
You can tell your boss on Monday this is what you learned at the conference. 00:01:15.920 |
So these are not new ideas, right? We've seen this in software for decades, right? Clippy. 00:01:20.320 |
You know, Her, the movie. I actually put this on the slide before, you know, the recent 00:01:24.240 |
OpenAI fiasco. But anthropomorphizing, giving technology a human form, very, very common, right? 00:01:31.840 |
And we're seeing this obviously a lot more with agents, but again, not new ideas. 00:01:37.040 |
Right? So Mailchimp, you know, for 20 years they've had a monkey as the form of the thing that, like, sends email for you, right? 00:01:44.800 |
So again, these are not new. But I would ask, right? So we have Siri, which is clearly not 00:01:51.040 |
anthropomorphized, right? There's just no persona here. However, I'm going to make you raise your hands. 00:01:56.720 |
How many think Siri -- or you could do Alexa, too -- is female? Anyone think it's male? Got one. Anyone 00:02:06.240 |
think it's non-binary or think this is just a stupid question? Right? It's very weird, right? And this is 00:02:11.360 |
sort of a lot of, like, my learnings and my observations: we're all people and we really like to ascribe 00:02:17.120 |
human characteristics to things, right? So, well, suddenly this AI became female because it had a voice. 00:02:24.800 |
And that's strange, right? So, you know, GPT-4o. Is this female? Male? What about when you're talking 00:02:35.040 |
to it? Suddenly it's like, oh, well, now it has a voice. And so suddenly it has an ascribed gender. 00:02:40.640 |
And these are just weird concepts, right? And they're normal, nothing profound here. But it's interesting 00:02:45.520 |
when you're thinking about product development and software development, what your customers or what 00:02:50.080 |
an audience is going to perceive on the other side. The foundational models, interestingly, 00:02:56.160 |
have really moved away from any of these concepts. And you can guess some of the reasons, some of them 00:03:01.360 |
we'll talk about. But these are like the most modern representations of the foundational models, 00:03:07.520 |
at least the ones that have consumer experiences, right? You have Copilot, you have OpenAI, you have 00:03:12.320 |
Meta AI, Gemini. This is how they're being represented, right? They clearly all share a single designer for some reason. 00:03:19.840 |
Actually, I want to go back to the female thing for a second. I'll keep it here. When we got a Google Home a 00:03:27.440 |
couple years ago, right? The little screen you put in your kitchen that shows your photos. And I was super 00:03:31.600 |
excited. And I was showing my wife. And I said, hey, look, we can set a timer and it can play music and all these 00:03:37.840 |
things. And she looked at me with just this, like, look in her eyes. She's like, you could not have a 00:03:43.280 |
woman in the kitchen that you boss around. She's like, and she was dead serious. She was like, that is 00:03:47.680 |
unacceptable. You can't just tell a woman what to do. And I was like, it's not a woman, right? And she's 00:03:54.080 |
like, I don't care. And she was very, very serious. And I was like, wow, this is remarkable how like deeply 00:03:59.600 |
ingrained this is. So I changed the voice to, like, an Australian man. And, like, now we're cool. And 00:04:03.440 |
like, everyone's happy. But like, true story, right? So I think it's also helpful to think about the 00:04:10.080 |
contrapositive, the opposite. Like, what is it like to have an AI that doesn't have a form, right? So 00:04:15.440 |
this is just a great example, right? The AI that's inside Google Photos is just mind-blowing, 00:04:20.480 |
right? Just, like, search for "dog" and get all the pictures of my beagle. Lovely. But there's no form here. 00:04:26.320 |
You don't think about Google Photos as having like, personality, right? It just, it just is. 00:04:30.640 |
And the algorithm is, like, under the covers. Oh, another fun fact, as long as I'm talking about my 00:04:35.200 |
family. So I showed this to my 11-year-old, Zeke. And he was like, oh, my God, Google Photos has ChatGPT 00:04:42.480 |
inside. So if you wonder how, you know, Gemini's branding is going with the youth, it is not. 00:04:47.680 |
Not good at all. So at Perpetual, we've really leaned into this idea of 00:04:55.600 |
giving agents forms and personalities. And we really took it a couple steps further. And we tell our 00:05:00.080 |
customers, you can give your agents their own forms. And so here's Tech Lead, right? Kind of a cyborg 00:05:07.680 |
android type of, you know, persona. And, you know, it's a member of the team: writes code, does code reviews, 00:05:14.000 |
things like that. But really leaned in to say, listen, they can have a personality, they can have a form 00:05:19.280 |
factor, they can have preferences. And then things get real weird because our recruiter is like 00:05:24.880 |
an artichoke. And it's like, oh, it's kind of like technology that's 00:05:28.960 |
zoomorphized into a, or whatever you do with a vegetable that now has human characteristics. 00:05:34.000 |
And it's all very bizarre. But on the other hand, it's like, oh, it actually kind of makes sense. 00:05:37.520 |
You're like, I mean, it makes no sense. But it's also like, I understand this, like, it's a recruiter 00:05:41.760 |
who has this form. And we have teams of agents. And these are hamsters. And they run the business. 00:05:46.960 |
And we have the general manager and the graphic designer. And they represent their own roles, 00:05:54.240 |
right? And on one hand, it's very amusing. So the question would be like, why are we doing this? 00:05:58.320 |
And, well, I'll get to why we're doing it in a second. Let me talk about the expectations, 00:06:01.680 |
because this was surprising. So customer expectations, as soon as you put a form onto something, 00:06:08.720 |
like, get real. And they get real very quickly. So assumption number one is that you can chat with 00:06:14.960 |
it. And there's nothing about workflow that implies chat. At the end of the day, like, our agents, all we're talking about 00:06:19.760 |
is workflow. If we're being real, right? It's just like, it's just smart workflow. That's what we're all 00:06:23.200 |
doing. There's no reason that you should be able to chat with it. But all of a sudden, oh, it has a face. 00:06:28.400 |
I must be able to chat with it. Oh, do I talk with it in Slack or in Teams? Like, wait, why do you think 00:06:33.280 |
that should even be a thing? But like, 100% it is. Personality. Everyone assumes it's going to have 00:06:39.600 |
a personality, right? And generally, the baseline is, like, "you are a helpful assistant," right? So they assume, 00:06:43.920 |
oh, it's going to be friendly, a helpful assistant. And we've very, very quickly adopted that mental 00:06:49.680 |
model. By "we," I mean customers, like, have adopted this mental model of just, like, oh, a helpful 00:06:53.920 |
assistant. That's great. You know, we let customers, you know, make them snarky, make them funny. 00:06:58.800 |
But like, at the end of the day, there's an assumption. And again, that was also weird. Like, 00:07:02.480 |
why does software have personality? And suddenly with a face, it just needs it. 00:07:05.280 |
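To make that "helpful assistant" baseline concrete, here's a minimal sketch of how a persona layer typically gets injected -- it's just text wrapped around the system prompt you'd send to any chat-style model. The Persona class and the name "Artie" are illustrative assumptions, not Perpetual's actual implementation.

```python
from dataclasses import dataclass


@dataclass
class Persona:
    """A hypothetical persona wrapper; the real product's internals may differ."""
    name: str                            # e.g. "Tech Lead"
    role: str                            # what the agent actually does
    tone: str = "friendly and helpful"   # the baseline customers expect by default

    def system_prompt(self) -> str:
        # Layer the persona on top of the near-universal "helpful assistant" baseline.
        return (
            f"You are {self.name}, an AI {self.role} on the team. "
            f"You are a helpful assistant. Keep your tone {self.tone}. "
            "Stay in character when users chat with you in Slack or Teams."
        )


# "Artie" the recruiter is a made-up example name.
recruiter = Persona(name="Artie", role="recruiter", tone="snarky but funny")
print(recruiter.system_prompt())
```

The underlying workflow doesn't change at all; only this wrapper does, which is exactly why the chat and personality expectations feel both arbitrary and unavoidable.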
Users have no patience with these things going wrong, right? We all know this stuff goes off the 00:07:11.440 |
rails. It gets things wrong. But like, all of software fails all the time, right? But you never hear 00:07:15.120 |
people being like, fuck you, Google Sheets, right? But like, they curse those hamsters. You better 00:07:20.480 |
believe it, right? It's just like, it's weird. You suddenly have this thing that you can get mad at. 00:07:24.480 |
And like, you can -- this is all just psychology. It's user psychology that is really, really innate. 00:07:29.920 |
And thinking about this from the perspective of product development, of, you know, software 00:07:36.560 |
development, it's like, well, is there any reason that we actually do it? Like, it seems like there's, 00:07:41.280 |
like -- I mean, I already named some downsides, and I've actually listed a lot more at the end. But so, 00:07:45.040 |
why do we -- why do we even bother? First off, these are really easy concepts to understand, right? 00:07:53.200 |
When I tell someone, oh, you have an AI software engineer. Okay, like, you instantly know what we're 00:07:59.280 |
talking about. Oh, it's an AI recruiter. Oh, okay, well, it'll probably, like, read resumes and it'll probably 00:08:04.240 |
coordinate interviews. And that's without any additional words. And we found that to just be an incredibly 00:08:08.720 |
powerful -- what's the word? Like jargon. Not even jargon. It's just, like, a terse way to describe 00:08:15.120 |
the things that we're all doing without getting into really complicated discussions about React 00:08:20.560 |
frameworks, right? It's just great. Branding. This is -- I would say it's -- I don't know if branding is 00:08:27.200 |
quite the right word, but having a handle, like something to describe what it is, is also really, 00:08:32.400 |
really powerful, right? I think most people think of, like -- I should just talk about AI for a second. 00:08:36.960 |
They call it, like, the algorithm, right? People, like, know the word algorithm. And everyone thinks 00:08:40.800 |
an algorithm is just, like, why my news feed is not in order, right? Like, oh, it's the algorithm, 00:08:45.760 |
right? It's just this concept that is just very hard to grasp, very hard to grok. But giving something 00:08:51.520 |
a name, or a name that's something we are familiar with, is really, really powerful, because now you can 00:08:56.320 |
talk about it. And I was really struck by, I don't know, for the Android folks, Google Now used to -- all 00:09:03.840 |
the phones are going to buzz -- or, I guess, all the phones that haven't been updated in three years 00:09:06.960 |
are going to buzz -- Google Now was essentially what Google Assistant became. And it did all the same 00:09:10.880 |
stuff. It set your timers and your reminders. And you would talk to Google Now. But, like, what was 00:09:15.920 |
it? It was, like -- it was just, like, this weird conceptual thing. But all of a sudden, it became Google 00:09:21.280 |
Assistant. And you're, like, oh, again, it's a thing that, like, works on my behalf. And it, like, 00:09:25.440 |
you have this, like, almost corporeal understanding of, like, what it is. And I'd actually be curious 00:09:31.280 |
if Gemini makes this better or worse. But, again, Google Now did the same thing. But just that one 00:09:36.400 |
nuanced difference made it really easy to understand. Price anchoring. So this is an interesting one. I 00:09:43.600 |
don't know that this is, like, well-tested in the field yet. All of this is so new. But if what we're 00:09:48.800 |
talking about, again, is just, like, workflow and all of our agents are just doing, like, workflow, 00:09:52.560 |
like, what is the mindset for what workflow should cost, right? We look at comps and it's, like, I don't 00:09:57.040 |
know, 50 bucks a month. I mean, I'm, again, making up the numbers. But if you 00:10:01.600 |
think about it as a percentage -- or if you're price-anchored on a junior employee, wow, it changes the 00:10:08.240 |
conversation, right? So you talk to an executive and it's, like, yeah, 1/20th of the cost, 1/100th of the cost, 00:10:14.080 |
right? Suddenly it changes the nature of that pricing conversation. Again, I don't know that this is fully 00:10:18.480 |
battle-tested or if it will withstand the test of time, right? But, like, right now, it's fantastic. 00:10:29.920 |
That's an awesome picture. So the other reason that this is a really helpful construct is it is a way to 00:10:44.480 |
decompose problems, right? So, thinking about it with, you know, an engineering hat on for a moment, 00:10:48.880 |
right? What is engineering? It is abstractions about the real world. It is getting the right 00:10:53.280 |
levels of abstractions and it's about decomposing problems. Like, that's just all we do all day long, 00:10:57.440 |
right? And in a sense, this is, like, an arbitrary way to break down a problem. On the other hand, 00:11:03.040 |
it's a really useful way to break down a problem, which is what we've learned, right? So, specialized agents -- we've 00:11:07.520 |
heard a bunch of talks today about specialized agents versus generalized ones and how 00:11:11.200 |
specialized ones perform better, right? It's like, oh, we have a finite set of tools. We have a small 00:11:17.360 |
number of inputs. Like, there's just less chance for LLMs that are trying to interpret or do tool calling 00:11:22.000 |
to get things wrong. And so it just happens to be a really convenient way to break down problems and also 00:11:28.080 |
to scale because we can keep subdividing, like, agent problems into more and more specializations. 00:11:34.560 |
And so it's just very natural. Again, arbitrary, but, like, even for me, just like, oh, this is very 00:11:39.040 |
helpful. I understand that my AI engineer writes code and I understand that my AI copywriter, you know, 00:11:43.920 |
writes compelling copy and, like, great. I can get my head around that very, very easily. 00:11:47.840 |
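As a rough sketch of that decomposition idea -- not Perpetual's actual API, just an assumed shape -- each specialized agent gets a deliberately small, finite tool list, so the model has fewer ways to pick the wrong tool, and scaling is just subdividing roles further:

```python
# Illustrative tool lists; the point is that each list is small and finite.
RECRUITER_TOOLS = ["read_resume", "score_candidate", "schedule_interview"]
COPYWRITER_TOOLS = ["draft_copy", "check_brand_voice"]


def make_agent(role: str, tools: list[str], guidance: str = "") -> dict:
    """Bundle a role, a finite tool list, and prompt-level guidance (assumed shape)."""
    return {
        "system_prompt": f"You are an AI {role}. {guidance}".strip(),
        # Fewer tools and fewer inputs means fewer ways for the model to call the wrong thing.
        "tools": tools,
    }


agents = [
    make_agent("recruiter", RECRUITER_TOOLS),
    make_agent("copywriter", COPYWRITER_TOOLS, "Write compelling, concise copy."),
]

# Scaling is subdividing: the recruiter could later split into a "sourcing" agent
# and an "interview coordination" agent, each with an even narrower tool list.
```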
And lastly, it's just fun, right? Which is, you know, that's sort of a company branding question, but, like, 00:11:54.960 |
we like making our video game avatars, right? You spend more time making your, like, eyebrows correct 00:11:59.440 |
on, like, the Nintendo Switch than you do playing the game. We roll characters in D&D. So, like, 00:12:04.560 |
why not have fun, right? So that's sort of a personal perspective on this. Let's talk about the downsides. 00:12:10.640 |
And there's not going to be any cute pictures because, you know, this is, like, a sad part. So, 00:12:15.200 |
okay, so one of the big glaring ones is, like, we are just inviting inclusivity and stereotype 00:12:23.520 |
challenges, right? We were just asking for it. One thing that was fascinating is, you know, 00:12:28.160 |
all those avatars were generated, right? So a customer can pick their form and we auto-generate it. 00:12:33.360 |
100% of the software engineers are generated with, like, neckties, like, looking like men. They're just, 00:12:39.440 |
like, what comes right out of DALL-E. And, like, I'm not going to have any, like, perspectives, 00:12:42.640 |
but, like, it's just, like, all of them do. So we are just inviting this onto ourselves. And is that 00:12:48.480 |
worth it, right? Is that worth it in a work context to really just invite those questions? 00:12:53.440 |
Expectations of performance. I don't know what it is because we assume, like, software is going to 00:13:00.720 |
always just work. But there's this expectation that, like, these agents are going to perform 00:13:05.760 |
really, really well, right? There's just this bar. It's like, oh, well, my -- you know, even though our, 00:13:10.160 |
you know, our junior employees don't perform well, there's an expectation that these things are going 00:13:14.000 |
to perform at a very high bar. It's just what we've seen. It's like, yes, of course it's going to get 00:13:17.520 |
it right 100% of the time. The features I alluded to before, it's like, why are we spending our time 00:13:22.720 |
building, like, chat interfaces and all of these things that, like -- it's just -- it's strange. Like, 00:13:28.720 |
you almost have to if you're building this type of personified agent. But it's just because it's expected. 00:13:34.480 |
Certainly, as a startup, we haven't had to deal with this. But I'm going to guess that when we want to 00:13:41.520 |
walk this back and rebrand, like, holy shit, right? Like, there's just, like, walking back when all 00:13:46.640 |
of our customers have these -- it's going to be very, very difficult, right? This is not just, like, 00:13:50.400 |
changing some colors, right? This is a major, you know, stake in the ground that we would be planting. 00:13:55.520 |
And lastly, it could be a distraction, right? Like, all of this fun stuff that we're talking 00:14:01.520 |
about around personalities and, like, chatting and preferences -- it's a distraction from the actual 00:14:08.800 |
business value, which is, like, document review and data extraction -- the thing that 00:14:13.920 |
someone would actually pay for. It can be distracting from that. But at the end of the day, honestly, the biggest 00:14:19.040 |
downside right now is that it is just a very stark reminder that you're replacing jobs. And, like, 00:14:24.560 |
whether or not that is a good thing or a bad thing or an 00:14:29.920 |
inevitable thing is outside the scope of this talk, right? However, it is a reality that as soon as you bring this up with a prospect, 00:14:36.080 |
like, it's the first thing on their mind. And I'll just be real. I'll tell a story. So, very recently, 00:14:41.440 |
I was, you know, giving sort of a, you know, C-suite pitch, right, to a CEO. It was like, oh, this is -- you 00:14:46.880 |
can't scale your business. You don't have enough people. You could never hire enough people to scale 00:14:51.760 |
the business to meet your aspirations. Have we got a solution for you? It's also shit work. Like, no one wants to 00:14:56.080 |
do it. Like, it's great. And he was just loving it. He's like, yeah, this is, like, exactly what I need. 00:15:00.400 |
And at the cost, it'll be great. And so we set up our first design session. And we get into the meeting. 00:15:07.520 |
And he has, you know, one of his, like, ICs on the team with him. He's like, oh, yeah, well, I don't 00:15:12.720 |
do any actual work. I'm, like, an executive, right? I brought the person who knows what they're doing. 00:15:17.200 |
This is Ben. And he has software that has virtual employees. Can you please tell him what your job is? 00:15:23.760 |
Like, my jaw dropped. I was like, oh, shit. Like, I was not at all prepared for that, right? 00:15:28.400 |
Just like, oh, I can't talk to you with a straight -- like, whether or not I'm going to actually make 00:15:32.800 |
you more productive and you will do better in your -- like, just to lead with that in the conversation 00:15:36.960 |
was just, like, really, really hard. So, it's, like, very, very quick, like, on the fly, try to, like, 00:15:40.960 |
walk back that concept of, like, oh, no, no, no. Like, this is actually going to help you. And, like, 00:15:45.360 |
whether -- we don't know what the future is going to hold. But, like, we are definitely, like, 00:15:50.080 |
leaning heavily into this, and it's potentially a huge hurdle. 00:15:53.200 |
Got a couple more minutes. I want to talk about, you know, so this was the title, 00:15:58.880 |
Personality-Driven Development. And there's this -- I don't actually even have, like, the words for this. 00:16:03.280 |
I'll be curious afterwards if folks do. The software we're building and our ability to, you know, 00:16:08.640 |
create these roles, these virtual employees that have job descriptions and forms and personal 00:16:15.760 |
preferences and, like, there are no, like, checkboxes inside, like, the configuration. It's all very 00:16:20.800 |
prompt-driven, right? But it's a way to inject nuance and business logic into these agents with zero 00:16:29.120 |
configuration, which means that every single instance is, like, 100 percent -- excuse me -- bespoke for 00:16:35.600 |
each customer, which is a really wild concept. When you think about, like, oh, can I reproduce this bug? 00:16:40.800 |
Does it work on my machine? It's like, well, of course not. Like, of course this doesn't work. 00:16:43.840 |
It's -- every single instance is bespoke, right? So here's just, like -- I mean, this is not a real 00:16:48.720 |
one. This is an example, right? A giraffe, right, who has a personality. It's charismatic. It's entertaining. 00:16:53.280 |
You review resumes. You read cover letters. You know, you do that, like, operational work. 00:16:58.480 |
And here's where it gets super interesting: the preferences, right? Oh, as a hiring manager, 00:17:05.040 |
I want to tell my recruiter my priorities, and I like people who went to Ivy League schools, 00:17:10.160 |
and I don't like job hopper -- I sound like such an old man -- and I don't like people who hop jobs, 00:17:14.000 |
and, like, I like cover letters. Like, okay, well, I can train, you know, lowercase t, train my virtual 00:17:20.400 |
employee the way I want to work and the way I want them to work. And so this is just an amazing concept 00:17:26.960 |
that, like, again, I really don't have my head around, like, what it means for, like, the future of 00:17:30.960 |
software when an entire thing can be molded to meet the business's need, not just, like, oh, 00:17:36.560 |
what the PM of the SaaS platform, like, happened to, like, think was a good checkbox, right? And so 00:17:40.960 |
this is sort of, like, the future that I'm really, really excited to be working on. Period. Full stop.
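To ground that, here's a hypothetical sketch of the zero-checkbox idea from the giraffe recruiter example: the hiring manager's preferences are free text folded straight into the prompt, which is why every instance ends up bespoke. The function and field names are assumptions for illustration, not the actual Perpetual configuration format.

```python
def build_recruiter_prompt(form: str, personality: str, duties: list[str],
                           preferences: str) -> str:
    """Fold the hiring manager's free-text preferences straight into the prompt
    (a hypothetical format; the real configuration may look different)."""
    duty_lines = "\n".join(f"- {d}" for d in duties)
    return (
        f"You are an AI recruiter whose form is a {form}. "
        f"Your personality: {personality}.\n"
        f"Your duties:\n{duty_lines}\n"
        "Your hiring manager's preferences (apply them when screening):\n"
        f"{preferences}"
    )


# Every customer writes different preferences, so every resulting instance is bespoke.
prompt = build_recruiter_prompt(
    form="giraffe",
    personality="charismatic and entertaining",
    duties=["review resumes", "read cover letters", "coordinate interviews"],
    preferences=(
        "I like people who went to Ivy League schools, I don't like people "
        "who hop jobs, and I like a thoughtful cover letter."
    ),
)
print(prompt)
```

Because the business logic lives in prose rather than checkboxes, reproducing a bug means reproducing a specific customer's prompt, which is exactly the "does it work on my machine" problem described above.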