
Personality Driven Development: Exploring the Frontier of Agents with Attitude



00:00:00.000 | All right, everybody. My name is Ben. I'm going to talk about anthropomorphized agents. I'm calling
00:00:20.880 | it personality-driven development, which is kind of a cute name. And what we do at Perpetual
00:00:25.280 | is we build AI agents. We call them virtual teammates or AI employees. And we have really
00:00:31.920 | leaned into the idea of giving them forms and form factors. So, to level-set on terminology:
00:00:38.000 | anthropomorphization, which I had to practice saying a whole lot. I like A18N, which I don't
00:00:42.720 | know if it will catch on. That is when you give human traits to non-human entities, right? So
00:00:47.440 | Yogi Bear or Lightning McQueen or Nemo, right? These are all anthropomorphized creatures.
00:00:54.240 | Fun fact, zoomorphization is when you do the same thing in the other direction: you give animal traits to non-animal things. Not really relevant to this talk,
00:01:01.200 | but I thought it was kind of cool. Aslan is a good example of this, right? Aslan's a lion from Narnia.
00:01:06.160 | So he's a non-human entity -- or, whatever, a non-animal entity -- that got animal characteristics.
00:01:11.840 | You can tell your boss on Monday this is what you learned at the conference.
00:01:15.920 | So these are not new ideas, right? We've seen this in software for decades, right? Clippy.
00:01:20.320 | You know, Her, the movie. I actually put this on the slide before, you know, the recent
00:01:24.240 | OpenAI fiasco. But anthropomorphizing, giving technology a human form, is very, very common, right?
00:01:31.840 | And we're seeing this obviously a lot more with agents, but again, not new ideas.
00:01:37.040 | Right? So Mailchimp, you know, for 20 years they've had a monkey as the anthropomorphized face of a service that, like, sends email for you, right?
00:01:44.800 | So again, these are not new. But I would ask, right? So we have Siri, which is clearly not
00:01:51.040 | anthropomorphized, right? There's just no persona here. However, I'm going to make you raise your hands.
00:01:56.720 | How many think Siri, or you could do Alexa too, is female? Anyone think it's male? Got one. Anyone
00:02:06.240 | think it's non-binary or think this is just a stupid question? Right? It's very weird, right? And this is
00:02:11.360 | sort of a lot of like my learnings and my observations is that we're all people and we really like to ascribe
00:02:17.120 | human characteristics to things, right? So, well, suddenly this AI became female because it had a voice.
00:02:24.800 | And that's strange, right? So, you know, GPT-4o. Is this female? Male? What about when you're talking
00:02:35.040 | to it? Suddenly it's like, oh, well, now it has a voice. And so suddenly it has an ascribed gender.
00:02:40.640 | And these are just weird concepts, right? And they're normal, nothing profound here. But it's interesting
00:02:45.520 | when you're thinking about product development and software development, what your customers or what
00:02:50.080 | an audience is going to perceive on the other side. The foundational models, interestingly,
00:02:56.160 | have really moved away from any of these concepts. And you can guess some of the reasons, some of them
00:03:01.360 | we'll talk about. But these are like the most modern representations of the foundational models,
00:03:07.520 | at least the ones that have consumer experiences, right? You have Copilot, you have OpenAI, you have
00:03:12.320 | Meta AI, Gemini. This is how they're being represented, right? They clearly all share a single designer for some reason.
00:03:19.840 | Actually, I want to go back to the female thing for a second. I'll keep it here. When we got a Google Home a
00:03:27.440 | couple years ago, right? The little screen you put in your kitchen that shows your photos. And I was super
00:03:31.600 | excited. And I was showing my wife. And I said, hey, look, we can set a timer and it can play music and all these
00:03:37.840 | things. And she looked at me with just this, like, look in her eyes. She's like, you could not have a
00:03:43.280 | woman in the kitchen that you boss around. She's like, and she was dead serious. She was like, that is
00:03:47.680 | unacceptable. You can't just tell a woman what to do. And I was like, it's not a woman, right? And she's
00:03:54.080 | like, I don't care. And she was very, very serious. And I was like, wow, this is remarkable how, like, deeply
00:03:59.600 | ingrained this is. So I changed the voice to, like, an Australian man. And
00:04:03.440 | like, everyone's happy. But like, true story, right? So I think it's also helpful to think about the
00:04:10.080 | contrapositive, the opposite. Like, what is an AI like that doesn't have a form, right? So
00:04:15.440 | this is just a great example, right? The AI that's inside Google Photos is just mind blowing,
00:04:20.480 | right? Just like search for a dog, all the pictures of my beagle, like, lovely. But there's no form here.
00:04:26.320 | You don't think about Google Photos as having like, personality, right? It just, it just is.
00:04:30.640 | And the algorithm is like under the covers. Oh, another fun fact, as long as I'm talking about my
00:04:35.200 | family. So I showed this to my 11-year-old, Zeke. And he was like, oh, my God, Google Photos has ChatGPT
00:04:42.480 | inside. So if you wonder how, you know, Gemini's branding is going with the youth, it is not going well,
00:04:47.680 | not good at all. So at Perpetual, we've really leaned into this idea of
00:04:55.600 | giving agents forms and personalities. And we really took it a couple steps. And we tell our
00:05:00.080 | customers, you can give your agents their own forms. And so here's our tech lead, right? Kind of a cyborg,
00:05:07.680 | android type of, you know, persona. And, you know, a member of the team who writes code, does code reviews,
00:05:14.000 | things like that. But really leaned in to say, listen, they can have a personality, they can have a form
00:05:19.280 | factor, they can have preferences. And then things get real weird because our recruiter is like
00:05:24.880 | an artichoke. And it's like, oh, it's kind of like technology that's
00:05:28.960 | zoomorphized into a -- or whatever you call it when a vegetable now has human characteristics.
00:05:34.000 | And it's all very bizarre. But on the other hand, it's like, oh, it actually kind of makes sense.
00:05:37.520 | You're like, I mean, it makes no sense. But it's also like, I understand this, like, it's a recruiter
00:05:41.760 | who has this form. And we have teams of agents. And these are hamsters. And they run the business.
00:05:46.960 | And we have the general manager and the graphic designer. And they represent their own roles,
00:05:54.240 | right? And on one hand, it's very amusing. So the question would be like, why are we doing this?
00:05:58.320 | And, well, I'll get to why we're doing it in a second. Let me talk about the expectations,
00:06:01.680 | because this was surprising. So customer expectations, as soon as you put a form onto something,
00:06:08.720 | like, get real. And they get real very quickly. So assumption number one is that you can chat with
00:06:14.960 | it. And there's nothing about workflow that implies chat. At the end of the day, like our agents, all we're talking about
00:06:19.760 | is workflow. If we're being real, right? It's just like, it's just smart workflow. That's what we're all
00:06:23.200 | doing. There's no reason that you should be able to chat with it. But all of a sudden, oh, it has a face.
00:06:28.400 | I must be able to chat with it. Oh, do I talk with it in Slack or in Teams? Like, wait, why do you think
00:06:33.280 | that should even be a thing? But like, 100% it is. Personality. Everyone assumes it's going to have
00:06:39.600 | a personality, right? And generally, the baseline is like, you're a helpful assistant, right? So they assume,
00:06:43.920 | oh, you're going to be friendly, a helpful assistant. And we've very, very quickly adopted that mental
00:06:49.680 | model. By we, I mean, customers like have adopted this mental model of just like, oh, a helpful
00:06:53.920 | assistant. That's great. You know, we let customers make, you know, make them snarky, make them funny.
00:06:58.800 | But like, at the end of the day, there's an assumption. And again, that was also weird. Like,
00:07:02.480 | why does software have personality? And suddenly with a face, it just needs it.
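(To make that baseline concrete, here's a minimal sketch of how a personality might get layered onto a system prompt and then fronted by the chat interface customers expect. The helper and the prompt text are illustrative assumptions, not Perpetual's actual code.)

    # Hypothetical sketch: layering a customer-chosen personality onto the
    # "helpful assistant" baseline that users assume by default.
    BASELINE = "You are a friendly, helpful assistant."

    def compose_system_prompt(role: str, personality: str = "") -> str:
        """Baseline first, then the role, then any customer-chosen personality."""
        parts = [BASELINE, f"You work as the team's {role}."]
        if personality:
            parts.append(f"Personality: {personality}.")
        return " ".join(parts)

    # The snark is the customer's choice; the default stays friendly.
    system_prompt = compose_system_prompt("recruiter", "snarky, but kind about it")

    # Whatever chat surface this fronts (Slack, Teams, a web widget), the
    # system prompt above is what gives the workflow its "face."
    print(system_prompt)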
00:07:05.280 | Users have no patience with these things going wrong, right? We all know this stuff goes off the
00:07:11.440 | rails. It gets things wrong. But, like, all software fails all the time, right? But you never hear
00:07:15.120 | people being like, fuck you, Google Sheets, right? But like, they curse those hamsters. You better
00:07:20.480 | believe it, right? It's just like, it's weird. You suddenly have this thing that you can get mad at.
00:07:24.480 | And like, you can -- this is all just psychology. It's user psychology that is really, really innate.
00:07:29.920 | And thinking about this from the perspective of product development, of, you know, software
00:07:36.560 | development, it's like, well, is there any reason that we actually do it? Like, it seems like there's,
00:07:41.280 | like -- I mean, I already named some downsides, and I actually list a lot more at the end. But so,
00:07:45.040 | why do we -- why do we even bother? First off, these are really easy concepts to understand, right?
00:07:53.200 | When I tell someone, oh, you have an AI software engineer. Okay, like, you instantly know what we're
00:07:59.280 | talking about. Oh, it's an AI recruiter. Oh, okay, well, it'll probably, like, read resumes and it'll probably
00:08:04.240 | coordinate interviews. And without any additional words. And we found that to just be an incredibly
00:08:08.720 | powerful -- what's the word? Like a jargon. Not even jargon. It's just like the terse way to describe
00:08:15.120 | the things that we're all doing without getting into really complicated discussions about React
00:08:20.560 | frameworks, right? It's just great. Branding. This is -- I would say it's -- I don't know if branding is
00:08:27.200 | quite the right word, but having a handle, like something to describe what it is, is also really,
00:08:32.400 | really powerful, right? I think most people think of, like -- I should just talk about AI for a second.
00:08:36.960 | They call it, like, the algorithm, right? People, like, know the word algorithm. And everyone thinks
00:08:40.800 | an algorithm is just, like, why my news feed is not in order, right? Like, oh, it's the algorithm,
00:08:45.760 | right? It's just this concept that is just very hard to grasp, very hard to grok. But giving something
00:08:51.520 | a name, or a name that's something that we are familiar with, really, really powerful, because now you can
00:08:56.320 | talk about it. And I was really struck by, I don't know, for the Android folks, Google Now used to -- all
00:09:03.840 | the phones are going to buzz -- or, I guess, all the phones that haven't been updated in three years
00:09:06.960 | are going to buzz -- Google Now was essentially what Google Assistant became. And it did all the same
00:09:10.880 | stuff. It set your timers and your reminders. And you would talk to Google Now. But, like, what was
00:09:15.920 | it? It was, like -- it was just, like, this weird conceptual thing. But all of a sudden, it became a Google
00:09:21.280 | Assistant. And you're, like, oh, again, it's a thing that, like, works on my behalf. And it, like,
00:09:25.440 | you have this, like, almost corporeal understanding of, like, what it is. And I'd actually be curious
00:09:31.280 | if Gemini makes this better or worse. But, again, Google Now did the same thing. But just that one
00:09:36.400 | nuanced difference made it really easy to understand. Price anchoring. So this is an interesting one. I
00:09:43.600 | don't know that this is, like, well-tested in the field yet. All of this is so new. But if what we're
00:09:48.800 | talking about, again, is just, like, workflow and all of our agents are just doing, like, workflow,
00:09:52.560 | like, what is the mindset for what workflow should cost, right? We look at comps and it's, like, I don't
00:09:57.040 | know, 50 bucks a month. I mean, I'm, sort of, again, making up the numbers. But if you
00:10:01.600 | think about it as a percentage -- or if you're price-anchored on a junior employee, wow, it changes the
00:10:08.240 | conversation, right? So you talk to an executive and it's, like, yeah, 1/20th of the cost, 1/100th of the cost,
00:10:14.080 | right? Suddenly it changes that nature of that pricing conversation. Again, I don't know that this is fully
00:10:18.480 | battle-tested or if it will withstand the test of time, right? But, like, right now, it's fantastic.
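(To put invented numbers on that anchoring effect: a junior employee at $65,000 a year costs roughly $5,400 a month, so an agent priced at $270 a month reads as about 1/20th of the comp, and $54 a month as about 1/100th. Anchored against a $50-a-month SaaS tool instead, that same $270 looks expensive. Same price, different anchor.)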
00:10:24.480 | Is that a ferret? What is that? An otter?
00:10:29.920 | That's an awesome picture. So the other reason that this is a really helpful construct is it is a way to
00:10:44.480 | decompose problems, right? So thinking about if, you know, wearing an engineering hat for a moment,
00:10:48.880 | right? What is engineering? It is abstractions about the real world. It is getting the right
00:10:53.280 | levels of abstractions and it's about decomposing problems. Like, that's just all we do all day long,
00:10:57.440 | right? And in a sense, this is, like, an arbitrary way to break down a problem. On the other hand,
00:11:03.040 | it's a really useful way to break down a problem, which is what we've learned, right? So specialized agents --
00:11:07.520 | we've heard a bunch of talks today about specialized agents versus generalized ones and how
00:11:11.200 | specialized ones perform better, right? It's like, oh, we have a finite set of tools. We have a small
00:11:17.360 | number of inputs. Like, there's just less chance for LLMs that are trying to interpret or do tool calling
00:11:22.000 | to get things wrong. And so it just happens to be a really convenient way to break down problems and also
00:11:28.080 | to scale because we can keep subdividing, like, agent problems into more and more specializations.
00:11:34.560 | And so it's just very natural. Again, arbitrary, but, like, even for me, just like, oh, this is very
00:11:39.040 | helpful. I understand that my AI engineer writes code and I understand that my AI copywriter, you know,
00:11:43.920 | writes compelling copy and, like, great. I can get my head around that very, very easily.
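(A rough sketch of that decomposition, with made-up names -- the Agent shape and the tools below are illustrative, not Perpetual's implementation. The point is that each specialized agent carries a small, finite tool set, which leaves the model fewer ways to mis-route a tool call, and scaling means subdividing further.)

    # Illustrative only: narrow agents with small tool sets, instead of one
    # generalist agent holding every tool.
    from dataclasses import dataclass
    from typing import Callable

    @dataclass
    class Agent:
        role: str
        instructions: str
        tools: list[Callable]  # finite and small: fewer tool-calling mistakes

    def review_resume(text: str) -> str: ...
    def schedule_interview(candidate: str) -> str: ...
    def open_pull_request(diff: str) -> str: ...

    recruiter = Agent("recruiter",
                      "You read resumes and coordinate interviews.",
                      [review_resume, schedule_interview])
    engineer = Agent("engineer",
                     "You write code and do code reviews.",
                     [open_pull_request])

    # Scaling = subdividing: later, "recruiter" can split into a sourcing
    # agent, a screening agent, a scheduling agent, and so on.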
00:11:47.840 | And lastly, it's just fun, right? Which is, you know, that's sort of a company branding question, but, like,
00:11:54.960 | we like making our video game avatars, right? You spend more time making your, like, eyebrows correct
00:11:59.440 | in, like, the Nintendo Switch than you do playing the game. We roll characters in D&D. Like, so, like,
00:12:04.560 | why not have fun, right? So that's sort of a personal perspective on this. Let's talk about the downsides.
00:12:10.640 | And there's not going to be any cute pictures because, you know, this is, like, a sad part. So,
00:12:15.200 | okay, so one of the big glaring ones is, like, we are just inviting inclusivity and stereotype
00:12:23.520 | challenges, right? We were just asking for it. One thing that was fascinating is, you know,
00:12:28.160 | all those avatars were generated, right? So a customer can pick their form and we auto-generate it.
00:12:33.360 | 100% of the software engineers are generated with, like, neckties, like, looking like men. They're just,
00:12:39.440 | like, what comes right out of DALL-E. And, like, I'm not going to offer any, like, perspectives,
00:12:42.640 | but, like, it's just, like, all of them do. So we are just inviting this onto ourselves. And is that
00:12:48.480 | worth it, right? Is that worth it in a work context to really just invite those questions?
00:12:53.440 | Expectations of performance. I don't know what it is because we assume, like, software is going to
00:13:00.720 | always just work. But there's this expectation that, like, these agents are going to perform
00:13:05.760 | really, really well, right? There's just this bar. It's like, oh, well, my -- you know, even though our,
00:13:10.160 | you know, our junior employees don't always perform well, there's an expectation that these things are going
00:13:14.000 | to perform at a very high bar. It's just what we've seen. It's like, yes, of course it's going to get
00:13:17.520 | it right 100% of the time. The features I alluded to before, it's like, why are we spending our time
00:13:22.720 | building, like, chat interfaces and all of these things that, like -- it's just -- it's strange. Like,
00:13:28.720 | you almost have to if you're building this type of personified agent. But it's just because it's expected.
00:13:34.480 | Certainly, as a startup, we haven't had to deal with this. But I'm going to guess that when we want to
00:13:41.520 | walk this back and rebrand, like, holy shit, right? Like, there's just, like, walking back when all
00:13:46.640 | of our customers have these -- it's going to be very, very difficult, right? This is not just, like,
00:13:50.400 | changing some colors, right? This is a major, you know, stake in the ground that we would be planting.
00:13:55.520 | And lastly, it could be a distraction, right? Like, all of this fun stuff that we're talking
00:14:01.520 | about around personalities and, like, chatting and preferences -- it's a distraction from the actual
00:14:08.800 | business value, which is, like, document review and data extraction. The thing that, actually,
00:14:13.920 | someone would pay for, it can be distracting. But at the end of the day, honestly, the biggest
00:14:19.040 | downside right now is that it is just a very stark reminder that you're replacing jobs. And, like,
00:14:24.560 | the -- outside the scope of the talk, whether or not that is a good thing or a bad thing or an
00:14:29.920 | inevitable thing, right? However, it is a reality that as soon as you bring this up with a prospect,
00:14:36.080 | like, first thing on their mind. And I'll just be real. I'll tell a story. So, very recently,
00:14:41.440 | I was, you know, pitching sort of, you know, C-suite pitch, right, to a CEO. It was like, oh, this is -- you
00:14:46.880 | can't scale your business. You don't have enough people. You could never hire enough people to scale
00:14:51.760 | the business to meet your aspirations. Have we got a solution for you? It's also shit work. Like, no one wants to
00:14:56.080 | do it. Like, it's great. And he was just loving it. He's like, yeah, this is, like, exactly what I need.
00:15:00.400 | And at the cost, it'll be great. And so, set up our first design session. And we get into the meeting.
00:15:07.520 | And he has, you know, one of his, you know, like an IC on the team. He's like, oh, yeah, well, I don't
00:15:12.720 | do any actual work. I'm, like, an executive, right? I brought the person who knows what they're doing.
00:15:17.200 | This is Ben. And he has software that has virtual employees. Can you please tell him what your job is?
00:15:23.760 | Like, my jaw dropped. I was like, oh, shit. Like, I was not at all prepared for that, right?
00:15:28.400 | Just like, oh, I can't talk to you with a straight face -- like, whether or not I'm going to actually make
00:15:32.800 | you more productive and you will do better in your job -- like, just to lead with that in the conversation
00:15:36.960 | was just, like, really, really hard. So, very, very quickly, like, on the fly, I had to try to, like,
00:15:40.960 | walk back that concept of, like, oh, no, no, no. Like, this is actually going to help you. And, like,
00:15:45.360 | whether -- we don't know what the future is going to hold. But, like, we are definitely, like,
00:15:50.080 | leaning heavily into this, and it's potentially a huge hurdle.
00:15:53.200 | Got a couple more minutes. I want to talk about, you know, so this was the title,
00:15:58.880 | Personality-Driven Development. And there's this -- I don't actually even have, like, the words for this.
00:16:03.280 | I'll be curious afterwards if folks do. The software we're building and our ability to, you know,
00:16:08.640 | create these roles, these virtual employees that have job descriptions and forms and personal
00:16:15.760 | preferences and, like, there's no, like, checkboxes inside, like, the configuration. It's all very
00:16:20.800 | prompt-driven, right? But it's a way to inject nuance and business logic into these agents with zero
00:16:29.120 | configuration, which means that every single instance is, like, 100 percent -- excuse me -- bespoke for
00:16:35.600 | each customer, which is a really wild concept. When you think about, like, oh, can I reproduce this bug?
00:16:40.800 | Does it work on my machine? It's like, well, of course not. Like, of course this doesn't work.
00:16:43.840 | It's -- every single instance is bespoke, right? So here's just, like -- I mean, this is not a real
00:16:48.720 | one. This is an example, right? A giraffe, right, who has a personality. It's charismatic. It's entertaining.
00:16:53.280 | You review resumes. You read cover letters. You know, you do that, like, operational work.
00:16:58.480 | And here's where it gets super interesting is the preferences, right? Oh, as a hiring manager,
00:17:05.040 | I want to tell my recruiter my priorities, and I like people who went to Ivy League schools,
00:17:10.160 | and I don't like job hopper -- I sound like such an old man -- and I don't like people who hop jobs,
00:17:14.000 | and, like, I like cover letters. Like, okay, well, I can train, you know, lowercase t, train my virtual
00:17:20.400 | employee the way I want to work and the way I want them to work. And so this is just an amazing concept
00:17:26.960 | that, like, again, I really don't have my head around, like, what it means for, like, the future of
00:17:30.960 | software when an entire thing can be molded to meet the business's need, not just, like, oh,
00:17:36.560 | what the PM of the SaaS platform, like, happened to, like, think was a good checkbox, right? And so
00:17:40.960 | this is sort of, like, the future that I'm really, really excited to be working on. Period. Full stop.
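(One way to picture "zero configuration, all prompt-driven" -- a sketch with hypothetical names, not Perpetual's actual schema. The role, the form, and the hiring manager's free-text preferences compose into the agent's instructions, which is exactly why every deployed instance ends up bespoke.)

    # Hypothetical sketch of prompt-driven configuration: no checkboxes,
    # just composed natural language. Every customer writes different text,
    # so every instance is bespoke -- and "does it reproduce on my machine?"
    # stops being a meaningful question.
    ROLE = ("You are a charismatic, entertaining giraffe who works as a "
            "recruiter. You review resumes and read cover letters.")

    # Free-text "training" (lowercase t) supplied by the hiring manager:
    PREFERENCES = [
        "Favor candidates from Ivy League schools.",
        "Be skeptical of frequent job changes.",
        "Weigh a thoughtful cover letter heavily.",
    ]

    def build_instructions(role: str, preferences: list[str]) -> str:
        bullets = "\n".join(f"- {p}" for p in preferences)
        return f"{role}\n\nHiring manager preferences:\n{bullets}"

    print(build_instructions(ROLE, PREFERENCES))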
00:17:46.640 | Awesome. Thank you, everybody.
00:17:47.680 | Thank you.