Agents @ Work: Dust.tt — with Stanislas Polu
00:00:06.160 |
This is Alessio, partner and CTO at Decibel Partners, 00:00:09.120 |
and I'm joined by my co-host, Swyx, founder of Smol.ai. 00:00:12.160 |
- Hey, and today we're in the studio with Stan, welcome. 00:00:17.800 |
- And you have had a very distinguished career. 00:00:29.480 |
Oracle, Totems, Stripe, and then OpenAI pre-ChatGPT. 00:00:35.840 |
About two years ago, you left OpenAI to start Dust. 00:00:38.280 |
I think you were one of the first OpenAI alum founders. 00:00:41.800 |
- Yeah, I think it was about at the same time 00:00:43.320 |
as the Adept guys, so it was that first wave. 00:00:46.120 |
- Yeah, and people really loved the David episode. 00:01:05.640 |
You know, you were at Stripe for almost five years. 00:01:07.840 |
There are a lot of Stripe alums going into OpenAI. 00:01:12.120 |
- Yeah, so I think the buses of Stripe people 00:01:14.880 |
really started flowing in, I guess, after ChatGPT. 00:01:32.600 |
- You had a pretty high job at OpenAI at the time, 00:01:39.720 |
and you want to make them think it's awesome, 00:01:44.000 |
- I was, like, maybe 16, so it was 25 years ago. 00:01:47.360 |
Then the first big exposure to AI would be at Stanford. 00:01:50.680 |
And I'm going to, like, disclose how old I am 00:01:54.240 |
because, at the time, it was a class taught by Andrew Ng. 00:02:01.320 |
It was Haar features for vision and the A* algorithm. 00:02:11.680 |
But, you know, that cat face or the human face 00:02:16.760 |
Went to, hesitated doing a PhD, more in systems. 00:02:26.400 |
did a gazillion mistakes, got acquired by Stripe. 00:02:34.280 |
Felt like it was the time you had the Atari games, 00:02:36.440 |
you had the self-driving craziness at the time. 00:02:41.280 |
It felt like the Atari games were incredible, 00:03:01.400 |
Discovering new math would be very foundational. 00:03:04.480 |
but it's not as direct as driving people around. 00:03:10.320 |
kind of a bit of time where I started exploring. 00:03:15.000 |
on trying to get RC cars to drive autonomously. 00:03:25.080 |
because it was like probably very operational. 00:03:34.480 |
and what if, because of a bug I wrote, I killed a family? 00:03:39.880 |
And so I just decided, like, no, that's just too crazy. 00:03:45.800 |
We were trying to apply transformers to code fuzzing. 00:03:52.200 |
and tries to mutate the inputs of a library to find bugs. 00:03:58.320 |
and do reinforcement learning with the signal 00:04:03.960 |
Didn't work at all because the transformers are so slow 00:04:09.960 |
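The loop he describes, mutate a library's inputs and use new coverage as the reinforcement signal, can be sketched without any transformer in it. Everything below (the toy target, the mutation operators, the coverage-as-reward bookkeeping) is an illustrative assumption, not the original system:

```python
import random

def target(data: bytes) -> set[str]:
    """Toy library under test: returns the set of branches the input covers."""
    covered = {"entry"}
    if len(data) > 3:
        covered.add("len>3")
        if data[0] == 0x42:
            covered.add("magic")
            if data[1:3] == b"hi":
                covered.add("deep")
    return covered

def mutate(data: bytes) -> bytes:
    """Randomly flip, insert, or drop a byte (the mutation a model would propose)."""
    buf = bytearray(data or b"\x00")
    op = random.choice(["flip", "insert", "drop"])
    i = random.randrange(len(buf))
    if op == "flip":
        buf[i] = random.randrange(256)
    elif op == "insert":
        buf.insert(i, random.randrange(256))
    elif len(buf) > 1:
        del buf[i]
    return bytes(buf)

def fuzz(rounds: int = 5000, seed: int = 0) -> set[str]:
    """Coverage-guided loop: keep any input that exercises a new branch."""
    random.seed(seed)
    corpus = [b"AAAA"]
    seen: set[str] = set()
    for _ in range(rounds):
        child = mutate(random.choice(corpus))
        cov = target(child)
        if not cov <= seen:  # new coverage is the reward signal
            seen |= cov
            corpus.append(child)
    return seen

print(sorted(fuzz()))
```

Replacing `mutate` with a slow transformer forward pass per input is exactly where the throughput problem he mentions shows up.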
And then I started getting interested in math and AI. 00:04:15.320 |
And at the same time, OpenAI was kind of starting 00:04:17.320 |
the reasoning team that was tackling that project as well. 00:04:27.560 |
I don't know how much you want to dig into that. 00:04:29.360 |
The way to find your way to OpenAI when you're in Paris 00:04:31.480 |
was kind of an interesting adventure as well. 00:04:33.040 |
- Please, and I want to note, this was a two month journey. 00:04:51.520 |
- No, the truth is that I moved back to Paris through Stripe. 00:04:54.600 |
And I just felt the hardship of being remote from your team 00:05:11.600 |
obviously you had worked with Greg, but not anyone else. 00:05:20.640 |
that I was a good engineer through Greg, I presume, 00:05:29.320 |
"Hey, come pass interviews, it's gonna be fun." 00:05:35.800 |
So I go to SF, go through the interview process, 00:05:38.680 |
get an offer, and so I get Bob McGrew on the phone 00:05:42.640 |
"Hey, Stan, it's awesome, you've got an offer. 00:05:47.400 |
"I'm not coming to SF, I'm based in Paris 00:06:06.400 |
and that's how I kind of started working at OpenAI, 00:06:25.440 |
and in particular in the context of formal mathematics. 00:06:28.760 |
The motivation was simple, transformers are very creative, 00:06:33.880 |
and formal math systems have the ability to verify a proof, 00:06:38.760 |
and the tactics they can use to solve problems 00:06:42.320 |
are very mechanical, so you miss the creativity. 00:06:44.840 |
And so the idea was to try to explore both together, 00:06:53.040 |
A formal system, just to give a little bit of context, 00:07:04.000 |
If it type-checks, it means that the proof is correct. 00:07:15.320 |
So the truth is that what you code in involves tactics 00:07:18.720 |
that may involve computation to search for solutions, 00:07:28.200 |
The verification of the proof at the very low level 00:07:32.760 |
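As a concrete illustration of the proof-as-program idea, here is a tiny Lean 4 snippet (Lean is one such formal system; this exact example is mine, not from the episode). The kernel accepts a theorem exactly when the proof term type-checks, while tactics are the more mechanical search steps mentioned above:

```lean
-- A proof is a term: the checker accepts the theorem iff this term type-checks.
theorem add_comm_example (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b

-- The same statement via a tactic: a mechanical procedure searches for the term.
example (a b : Nat) : a + b = b + a := by
  omega
```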
- How quickly do you run into halting problem, 00:07:37.840 |
and possibilities where you're just like that? 00:07:41.480 |
It was really trying to solve very easy problems. 00:07:58.040 |
because that MATH benchmark includes AMC problems, 00:08:00.800 |
AMC 10, AMC 12, so these are the easy ones, 00:08:19.400 |
because I don't think we'll touch on this again. 00:08:23.520 |
and then more recently with DeepMind scoring silver 00:08:37.920 |
I mean, from my perspective, spent three years on that. 00:08:43.840 |
He was at FAIR, was working on some problems. 00:08:50.120 |
And we cracked a few problems here and there, 00:09:07.360 |
I think there's nothing too magical in their approach, 00:09:11.000 |
There's a Dan Silver talk from seven days ago 00:09:13.360 |
where it goes a little bit into more details. 00:09:22.920 |
So we can dig into what autoformalization means, 00:09:26.640 |
- Let's talk about the tail end, maybe, of the OpenAI. 00:09:30.520 |
"I'm gonna work on math and do all of these things." 00:09:33.000 |
I saw on one of your blog posts, you mentioned 00:09:47.680 |
And then you left just before ChatGPT was released, 00:09:49.920 |
but tell people a bit more about the research path 00:09:57.040 |
there's always been a large chunk of the compute 00:09:58.800 |
that was reserved to train the GPTs, which makes sense. 00:10:04.520 |
Most of the compute was going to a product called Nest, 00:10:09.440 |
And then you had a bunch of, let's say remote, 00:10:12.840 |
not core research teams that were trying to explore 00:10:23.040 |
where your question was going, is that in those labs, 00:10:27.480 |
So by definition, you shouldn't be managing them. 00:10:30.400 |
But in that space, there's a managing tool that is great, 00:10:35.280 |
Basically, by managing the compute allocation, 00:10:49.320 |
but if it was not aligned with OpenAI mission, 00:10:51.640 |
and that's fair, you wouldn't get the compute allocation. 00:10:55.160 |
As it happens, solving math was very much aligned 00:11:01.200 |
And so I was lucky to generally get the compute 00:11:05.920 |
- What do you need to show as incremental results 00:11:15.760 |
that it's going to be aligned with the company. 00:11:17.520 |
So it's much easier than to go into something 00:11:20.960 |
You have to show incremental progress, I guess. 00:11:23.080 |
It's like you ask for a certain amount of compute 00:11:33.880 |
And a strong negative result is actually often 00:11:40.320 |
And then it generally goes into, as any organization, 00:11:44.320 |
you would have kind of people finding your project 00:11:47.120 |
or any other project kind of cool and fancy. 00:12:02.400 |
because you're going in a different direction 00:12:15.600 |
like the results you were kind of bringing back to him 00:12:44.200 |
He would really coach me as a trainee researcher, I guess, 00:12:52.600 |
he was the one showing the North Star, right? 00:13:04.800 |
flock the different teams together towards an objective. 00:13:08.360 |
- I would say like the public perception of him 00:13:10.240 |
is that he was the strongest believer in scaling. 00:13:13.640 |
- He was, he has always pursued like the compression thesis. 00:13:19.320 |
What does the public not know about how he works? 00:13:22.400 |
- I think he's really focused on building the vision 00:13:25.160 |
and communicating the vision within the company, 00:13:28.680 |
I was personally surprised that he spent so much time, 00:13:31.760 |
you know, working on communicating that vision 00:13:34.160 |
and getting the teams to work together versus-- 00:13:40.040 |
it's the belief in compression and scaling compute. 00:13:43.560 |
I remember when I started working on the reasoning team, 00:14:02.400 |
- And was it according to the neural scaling laws, 00:14:13.360 |
basically at the time of GPT-3 being released 00:14:17.000 |
But before that, there really was a strong belief in scale. 00:14:20.960 |
I think it was just the belief that the transformer 00:14:26.960 |
and that this was just a question of scaling. 00:14:34.120 |
I didn't work, weirdly, I didn't work that much with Greg 00:14:43.400 |
One thing about Sam Altman, he really impressed me 00:14:46.000 |
because when I joined, he had joined not that long ago, 00:14:49.920 |
and it felt like he was kind of a very high-level CEO. 00:14:59.000 |
to go into the subjects within a year or something, 00:15:02.320 |
all the way to a situation where when I was having lunch 00:15:07.800 |
he would just quite know deeply what I was doing. 00:15:14.320 |
- Yeah, with no ML, but I didn't have any either, 00:15:19.560 |
But I think you can, it's a question about really, 00:15:24.320 |
the very technicalities of how things are done, 00:15:29.400 |
and what's being done, and what are the recent results, 00:15:35.240 |
and that really impressed me, given the size at the time 00:15:40.560 |
- Yeah, I mean, you've been, you were a founder before. 00:15:52.600 |
because most of the time, you operate at a very high level, 00:15:55.080 |
but being able to go deep down and being in the known 00:15:57.520 |
of what's happening on the ground is something 00:16:02.320 |
That's not a place in which I ever was as a founder, 00:16:05.440 |
because first company, we went all the way to 10 people. 00:16:17.280 |
I mean, Stripe was also like a huge rocket ship. 00:16:19.840 |
- Stripe, I was a founder, so I was, like at OpenAI, 00:16:34.960 |
This year, we've also had a similar management shakeup, 00:16:39.120 |
Can you compare what it was like going through that split 00:16:42.960 |
And then like, does that have any similarities now? 00:16:46.000 |
Like, are we gonna see a new Anthropic emerge 00:16:55.360 |
because they had been training GPT-3, it was a success. 00:17:03.080 |
What I understood of it is that there was a disagreement 00:17:11.640 |
was the fact that we started working on the API 00:17:14.160 |
and wanted to make those models available through an API. 00:17:32.480 |
And I think it's just because we were mostly a research org 00:17:37.960 |
that some divergence in some teams, some people leave, 00:17:46.200 |
- Yeah, very deep bench, like just a lot of talent. 00:17:49.640 |
- So that was the OpenAI part of the history. 00:17:53.280 |
- So then you leave OpenAI in September, 2022. 00:18:07.960 |
rather than going back into doing some more research 00:18:13.280 |
So going through OpenAI was really kind of the PhD 00:18:32.320 |
I'm not a trained, formally trained researcher. 00:18:35.080 |
And it wasn't kind of necessarily an ambition of mine 00:18:45.960 |
But at the time I decided that I wanted to go back 00:18:56.040 |
and if we believe the timelines might not be too long, 00:18:58.680 |
it's actually the last train leaving a station 00:19:01.640 |
After that, it's going to be computers all the way down. 00:19:12.680 |
And the motivation for starting a company was pretty simple. 00:19:20.360 |
So it was pre-ChatGPT, but GPT-4 was ready since, 00:19:23.800 |
I mean, it had been ready for a few months internally. 00:19:27.520 |
The capabilities are there to create an insane amount 00:19:34.360 |
The revenue of OpenAI at the time was ridiculously small 00:19:39.080 |
And so the thesis was there's probably a lot to be done 00:19:45.960 |
Let's talk a bit more about the form factor, maybe. 00:20:03.040 |
which was kind of like the browser extension. 00:20:19.440 |
It was almost inconceivable to just build a product 00:20:24.400 |
Though at the time there was a few companies doing that, 00:20:26.280 |
the one on marketing, I don't remember its name, Jasper. 00:20:56.760 |
I had the strong belief from my research time 00:21:05.680 |
Basically, if you just have one example, you overfit. 00:21:15.800 |
on a multi-step workflow, you start parallelizing stuff. 00:21:21.960 |
you just have like a messy stream of tokens going out 00:21:25.440 |
and it's very hard to observe what's going there. 00:21:33.080 |
the output of each interaction with the model 00:21:35.760 |
and dig into there through a new UI, which is-- 00:21:41.000 |
I mean, Dust is entirely open source even today. 00:21:47.080 |
The reason why is because we're not open source 00:21:48.600 |
because we're not doing an open source strategy. 00:21:53.080 |
We're open source because we can and it's fun. 00:21:59.680 |
- But I think that downside is a big fallacy. 00:22:04.800 |
but the value of Dust is not the current state. 00:22:22.120 |
you can be extremely transparent and just show the code. 00:22:28.600 |
you can just point to the issue, show the pull request. 00:22:33.120 |
Oh, PR welcome, that doesn't happen that much. 00:22:41.120 |
they really enjoy seeing the pull requests advancing 00:22:45.280 |
And then the downsides are mostly around security. 00:22:48.440 |
You never want to do security by obfuscation. 00:22:58.160 |
because if you're doing anything like bug bounties 00:22:58.160 |
you just give much more tools to the bug bounty hunters 00:23:01.840 |
I don't believe in the value of the code base per se. 00:23:13.400 |
- I think it's really the people that are on the code base 00:23:15.120 |
that have the value and the go-to-market and the product 00:23:18.040 |
and all of those things that are around the code base. 00:23:20.960 |
Obviously, that's not true for every code base. 00:23:28.560 |
I would buy that you don't want to be open source. 00:23:36.080 |
- I signed up for XP1, I was looking, January, 2023. 00:23:44.760 |
how did you feel having to push a product out 00:23:47.080 |
that was using this model that was so inferior? 00:23:55.200 |
that maybe doesn't quite work with the model today, 00:23:57.080 |
but you're just expecting the new model to be better? 00:23:59.360 |
- Yeah, so actually, XP1 was even on a smaller one 00:24:02.920 |
that was the post-ChatGPT release small version, 00:24:08.880 |
but it was the small version of ChatGPT, basically. 00:24:15.440 |
but at the same time, I think XP1 was designed, 00:24:18.080 |
was an experiment, but was designed as a way to be useful 00:24:22.560 |
If you just want to extract data from a LinkedIn page, 00:24:26.840 |
If you want to summarize an article on a newspaper, 00:24:31.000 |
And so it was really a question of trying to find a product 00:24:41.240 |
So that was kind of a, there's a bit of a frustration 00:24:44.880 |
and you know that you don't have access to it yet, 00:24:46.520 |
but it's also interesting to try to find a product 00:24:51.360 |
- And we highlighted XP1 in our Anatomy of Autonomy post 00:25:08.800 |
and then you kind of got to where Dust is today. 00:25:12.640 |
of what Dust is today and the core thesis behind it. 00:25:16.920 |
So Dust, we really want to build the infrastructure 00:25:19.280 |
so that companies can deploy agents within their teams. 00:25:25.680 |
because we strongly believe in the emergence of use cases 00:25:28.280 |
from the people having access to creating an agent 00:25:32.600 |
They have to be tinkerers, they have to be curious, 00:25:35.120 |
but they can, like anybody can create an agent 00:25:51.800 |
you have to build the pipes such that the agents 00:25:53.560 |
can take action, can access the web, et cetera. 00:25:58.040 |
Maintaining connections to Notion, Slack, GitHub, 00:26:04.280 |
It is boring work, boring infrastructure work, 00:26:06.840 |
but that's something that we know is extremely valuable 00:26:09.440 |
in the same way that Stripe is extremely valuable 00:26:18.640 |
And there it's fascinating because everything started 00:26:21.760 |
from the conversational interface, obviously, 00:26:26.160 |
but we're only scratching the surface, right? 00:26:29.280 |
I think we are at the Pong level of LLM productization, 00:26:41.640 |
So this is really, our mission is to really create 00:26:48.520 |
to just get away all the work that can be automated 00:26:54.080 |
- And can you just comment on different takes 00:26:57.320 |
So maybe at the most open, it's like auto-GPT. 00:27:01.200 |
It's just kind of like, just try and do anything. 00:27:09.440 |
They're super hands-on with each individual customer 00:27:15.880 |
between this is magic, this is exposed to you, 00:27:18.120 |
especially in a market where most people don't know 00:27:25.680 |
So the auto-GPT approach obviously is extremely exciting, 00:27:28.920 |
but we know that the agentic capability of models 00:27:37.760 |
Same with XP1, and where it works is pretty simple. 00:27:40.440 |
It's simple workflows that involve a couple of tools 00:27:51.120 |
you just want people to put it in the instructions. 00:27:57.200 |
pick up that document, do the work that I want 00:27:59.800 |
in the format I want, and give me the results. 00:28:06.080 |
it's mostly using English for people to program a workflow 00:28:16.000 |
would you say it's kind of like a LLM Zapier type of thing? 00:28:21.720 |
It's still very, you're programming with English? 00:28:25.760 |
So you're just saying, oh, do this, and then that. 00:28:31.320 |
You say, when I give you the command X, do this. 00:28:41.720 |
You just need to describe what the tasks are 00:28:41.720 |
supposed to be and make the tools available to the agent. 00:28:49.200 |
The tool can be querying into a structured database. 00:29:03.400 |
sending an email, clicking on a button in the admin, 00:29:09.320 |
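Mechanically, "make the tool available to the agent" usually means registering a described tool and dispatching the model's function call onto it. The registry, the tool names, and the simulated model output below are all hypothetical, not Dust's real API:

```python
import json

# Hypothetical tool registry: a description the model sees, a parameter
# schema, and the callable that actually runs the tool.
TOOLS = {
    "search_notion": {
        "description": "Semantic search over synced Notion pages.",
        "parameters": {"query": "string"},
        "run": lambda query: [f"page matching {query!r}"],
    },
    "send_email": {
        "description": "Send an email on the user's behalf.",
        "parameters": {"to": "string", "body": "string"},
        "run": lambda to, body: f"sent to {to}",
    },
}

def dispatch(model_output: str):
    """Parse the model's function call (a JSON blob) and execute the tool."""
    call = json.loads(model_output)
    tool = TOOLS[call["name"]]
    return tool["run"](**call["arguments"])

# Simulated model output: the model picked a tool and filled its parameters.
print(dispatch('{"name": "search_notion", "arguments": {"query": "Q3 OKRs"}}'))
```

With good instructions, the model's only jobs are choosing the `name` and filling `arguments`, which is why precise instructions make this reliable.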
- Today, we maintain most of the integrations. 00:29:17.000 |
But the reality is that, the reality of the market today 00:29:22.280 |
And so it's mostly us maintaining the integration. 00:29:25.400 |
As an example, a very good source of information 00:29:30.880 |
because Salesforce is basically a database and a UI, 00:29:43.200 |
And the type of support, or real native support, 00:29:46.040 |
will be slightly more complex than just OAuth-ing into it, 00:29:52.520 |
oh, you want to connect your Salesforce to us? 00:29:54.440 |
Give us the SOQL, that's the Salesforce Object Query Language. 00:29:58.600 |
Give us the queries you want us to run on it, 00:30:03.040 |
So that's interesting how not only integrations are cool, 00:30:06.200 |
and some of them require a bit of work on the user, 00:30:08.480 |
and for some of them that are really valuable to our users, 00:30:21.520 |
In that case, so we do have browser automation 00:30:24.240 |
for all the use cases and apply the public web, 00:30:35.560 |
- I mean, what I've been saying for a long time, 00:30:40.040 |
that you're gonna stand in front of your computer 00:30:50.760 |
And if the APIs are there, we should use them. 00:31:14.000 |
the scale-ups that are between 500 and 5,000 people, 00:31:17.040 |
tech companies, most of the SaaS they use have APIs. 00:31:21.080 |
Not as an interesting question for the open web, 00:31:26.280 |
that involve websites that don't necessarily have APIs. 00:31:29.240 |
And the current state of web integration from, 00:31:35.360 |
I don't even know if they have web navigation, 00:31:38.040 |
The current state of affair is really, really broken, 00:31:41.320 |
You have basically search and headless browsing. 00:31:44.000 |
But headless browsing, I think everybody's doing 00:31:46.840 |
basically body.innerText and feed that into the model. 00:31:56.200 |
that are exploring the capability of rendering a webpage 00:32:03.200 |
so that's basically the place where to click in the page 00:32:06.200 |
through that process, expose the actions to the model, 00:32:12.760 |
which is not a big page of a full DOM that is very noisy, 00:32:19.320 |
back to the original page and take the action. 00:32:24.000 |
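The naive baseline described above, grab `body.innerText` and hand the model one flat string, can be approximated with the stdlib HTML parser. This is a sketch of that baseline (not of the rendering-based approaches), and the sample page is made up:

```python
from html.parser import HTMLParser

class InnerText(HTMLParser):
    """Collect visible text, skipping script/style/head, like body.innerText."""
    SKIP = {"script", "style", "head"}

    def __init__(self):
        super().__init__()
        self.chunks: list[str] = []
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        if not self.skip_depth and data.strip():
            self.chunks.append(data.strip())

def inner_text(html: str) -> str:
    parser = InnerText()
    parser.feed(html)
    return "\n".join(parser.chunks)

page = ("<html><head><title>x</title></head><body>"
        "<h1>Pricing</h1><script>var a=1;</script><p>$29/mo</p></body></html>")
print(inner_text(page))
```

This keeps the words but throws away layout and clickable structure, which is exactly why the action-extraction approaches described above are more promising.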
and that will kind of change the level of things 00:32:29.720 |
That I feel exciting, but I also feel that the bulk 00:32:33.120 |
of the useful stuff that you can do within the company 00:32:36.000 |
can be done through API, the data can be retrieved by API, 00:32:40.640 |
- For listeners, I'll note that you're basically 00:32:44.520 |
- Exactly, exactly, I've seen it since summer. 00:32:47.200 |
- Adept is where it is, and Dust is where it is, 00:32:51.640 |
- Can we just quickly comment on function calling? 00:32:54.760 |
- You mentioned you don't need the models to be that smart 00:33:02.600 |
Is there any room for improvement left in function calling, 00:33:05.760 |
or do you feel you usually consistently get always 00:33:08.120 |
the right response, the right parameters, and all that? 00:33:12.200 |
because if the instructions are good and precise, 00:33:16.960 |
and the model just looks at the instructions and follows them 00:33:19.160 |
and says, oh, it's probably talking about that action, 00:33:21.360 |
and I'm gonna use it, and the parameters are kind of 00:33:28.520 |
kind of an AutoGPT-esque level in the instructions, 00:33:31.080 |
and provide 16 different tools to your model, 00:33:33.520 |
yes, we're seeing the models in that state making mistakes. 00:33:37.080 |
And there is obviously some progress that can be made 00:33:41.680 |
on the capabilities, but the interesting part 00:33:56.720 |
like pushing our users to create rather simple agents, 00:33:59.880 |
is that once you have those working really well, 00:34:03.040 |
you can create meta-agents that use the agents as actions, 00:34:06.040 |
and all of a sudden, you can kind of have a hierarchy 00:34:08.640 |
of responsibility that will probably get you almost 00:34:14.200 |
It requires the construction of intermediary artifacts, 00:34:24.400 |
in a specific channel, or shipped, or shared in Slack. 00:34:27.280 |
We have a weekly meeting where we have a table 00:34:32.040 |
We're not writing that weekly meeting table anymore. 00:34:34.240 |
We have an assistant that just goes and finds the right data 00:34:52.040 |
about our financials and our progress and our ARR, 00:34:57.320 |
those graphs directly, and those assistants work great. 00:35:00.720 |
By creating those assistants that cover those small parts 00:35:02.840 |
of that weekly meeting, slowly, we're getting to, 00:35:05.200 |
in a world where we'll have a weekly meeting assistant, 00:35:07.880 |
we'll just call it, you don't need to prompt it, 00:35:16.760 |
and that's an objective for us, to us using Dust, get there, 00:35:20.280 |
you're saving, I don't know, an hour of company time 00:35:24.400 |
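The meta-agent idea, simple agents exposed as actions to a higher-level agent, can be sketched as plain function composition. The keyword routing below stands in for the model's tool choice, and all names are illustrative:

```python
from typing import Callable

# An "agent" here is just a named callable from a task to a result.
Agent = Callable[[str], str]

def make_agent(name: str, handler: Callable[[str], str]) -> Agent:
    def run(task: str) -> str:
        return f"[{name}] {handler(task)}"
    return run

# Two simple agents of the kind users build first.
sales_agent = make_agent("sales", lambda t: f"ARR figures for {t}")
support_agent = make_agent("support", lambda t: f"ticket stats for {t}")

def make_meta_agent(name: str, sub_agents: dict[str, Agent]) -> Agent:
    # A real meta-agent would let the model pick sub-agents; keyword
    # routing keeps this sketch runnable and deterministic.
    def run(task: str) -> str:
        parts = [agent(task) for key, agent in sub_agents.items() if key in task]
        return f"[{name}] " + " | ".join(parts)
    return run

weekly_meeting = make_meta_agent(
    "weekly", {"sales": sales_agent, "support": support_agent}
)
print(weekly_meeting("sales and support, week 42"))
```

Once the small agents work reliably, the hierarchy of responsibility falls out of composing them, which is the point made above.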
- Yeah, that's my pet topic of NPM for agents. 00:35:27.480 |
It's like, how do you build dependency graphs of agents 00:35:31.560 |
Because why do I have to rebuild some of the smaller levels 00:35:36.280 |
- I have a quick follow-up question on agents 00:35:42.640 |
both from like Microsoft and even in startups. 00:35:53.440 |
I don't know, is there should be a protocol format? 00:35:56.000 |
- To be completely honest, the state we are at right now 00:36:00.280 |
So we haven't even explored yet the meta agents. 00:36:11.600 |
If you go to a company, random SaaS B2B company, 00:36:19.880 |
and you tell them build some tooling for yourself, 00:36:23.640 |
If you tell them build AutoGPT, they'll go, "Auto what?" 00:36:29.040 |
you're very much focused on non-technical users. 00:36:33.120 |
You mention instruction instead of system prompt, right? 00:36:40.680 |
who kind of pushed us to create a friendly product. 00:36:45.320 |
I was knee-deep into AI when I started, obviously. 00:36:48.600 |
And my co-founder, Gabriel, was at Stripe as well. 00:36:55.040 |
Was at Alan, a healthcare company in Paris, after that. 00:37:05.920 |
to make that technology not scary to end users. 00:37:17.880 |
And so we were very proactive and very deliberate 00:37:20.440 |
about creating a brand that feels not too scary 00:37:23.000 |
and creating a wording and a language, as you say, 00:37:31.200 |
- And another big point that David had about Adept 00:37:40.120 |
How's that different when you're interacting with APIs 00:37:48.760 |
- Yep, so I think that goes back to the DNA of the companies 00:38:00.080 |
and that's why they raised a large amount of money, 00:38:43.440 |
it is even for us human extremely hard to decide 00:38:52.600 |
So being extremely, extremely, extremely pragmatic here, 00:38:57.440 |
We have to build a product that satisfies the end users 00:39:04.240 |
person that is building the agent can iterate on it. 00:39:06.880 |
As a second step, maybe later when we start training model 00:39:10.920 |
we can optimize around that for each of those companies. 00:39:18.520 |
the same way all SaaS now kind of offers APIs 00:39:27.440 |
so that then you can use agents like Red Team, 00:39:34.760 |
I think it really going to depend on how much, 00:39:37.280 |
because you need to simulate to generate data, 00:39:44.880 |
or are we just going to be using frontier models as they are? 00:39:48.880 |
On that question, I don't have a strong opinion. 00:39:51.600 |
It might be the case that we'll be training models 00:39:59.360 |
that as you get big and you want to really own your product, 00:40:02.880 |
you're going to have to own the model as well. 00:40:05.680 |
Owning the model doesn't mean doing the pre-training, 00:40:09.440 |
but at least having an internal post-training 00:40:18.440 |
then there might be incentives for the SaaS' of the world 00:40:33.440 |
- So that's an incentive. - Yeah, they got to sell seats. 00:40:39.680 |
I'm sure you've used many, probably not just OpenAI. 00:40:42.320 |
Would you characterize some models as better than others? 00:40:47.600 |
What have been the trends in models over the last two years? 00:40:54.320 |
And at times it's the OpenAI model that is the best, 00:40:58.560 |
at times it's the Anthropic models that is the best. 00:41:06.440 |
- Yeah, so when you create an assistant or an agent, 00:41:08.400 |
you can just say, "Oh, I'm going to run it on GPT-4, 00:41:13.040 |
- Don't you think for the non-technical user, 00:41:18.320 |
So we move the default to the latest model that is cool, 00:41:26.600 |
you would have to go in advance and go pick your model. 00:41:37.240 |
- And do you care most about function calling 00:41:43.160 |
because there's nothing worse than a function call, 00:41:47.280 |
including incorrect parameters or being a bit off 00:41:49.800 |
because it just drives the whole interaction off. 00:41:55.640 |
- Yeah, these days, it's funny how the comparison 00:42:03.800 |
I personally don't have proof, but I know many people, 00:42:19.800 |
They kind of innovated in an interesting way, 00:42:23.200 |
but it's that they have that kind of chain of thought step 00:42:26.560 |
whenever you use a Claude model or Sonnet model 00:42:31.880 |
when you just interact with it just for answering questions, 00:42:35.040 |
but when you use function calling, you get that step, 00:42:36.720 |
and it really helps getting better function calling. 00:42:41.520 |
with the Berkeley team that runs that leaderboard this week. 00:42:45.960 |
It was V1 like two months ago, and then V2, V3. 00:42:51.240 |
And then the third place is xLAM from Salesforce, 00:43:01.920 |
- But arguably o1-mini has been in line for that. 00:43:15.560 |
- It's funny because I've been doing research for three years 00:43:25.080 |
is that when we manage to activate the company, 00:43:27.960 |
The highest penetration we have is 88% daily active users 00:43:34.880 |
The kind of average penetration and activation 00:43:39.960 |
is something like more like 60 to 70% weekly active. 00:43:43.760 |
So we basically have the entire company interacting with us. 00:43:54.400 |
because there is so many places where you can create products 00:43:57.840 |
or do stuff that will give you the 80% with the work you do, 00:44:02.560 |
whereas deciding if it's GPT-4 or GPT-4 Turbo or et cetera, 00:44:07.160 |
you know, it'll just give you the 5% improvement. 00:44:11.720 |
- But the reality is that you want to focus on the places 00:44:17.680 |
But that's something that we'll have to do eventually 00:44:20.840 |
- It's funny 'cause in some ways the model labs 00:44:33.880 |
You're not really limited by quality of model. 00:44:36.280 |
- Right now we are limited by, yes, the infrastructure part, 00:44:45.000 |
to all the data they need to do the job they want to do. 00:44:50.760 |
that are starting to provide integrations as a service, right? 00:44:57.840 |
about how you chunk stuff and how you process information 00:45:15.040 |
And the reality is that if you look at Notion, 00:45:24.320 |
to actually make it available to models in a useful way. 00:45:28.280 |
Because you get all the blocks, details, et cetera, 00:45:32.160 |
- Because also it's for data scientists and not for AI. 00:45:34.920 |
- The reality of Notion is that sometimes you have a, 00:45:37.920 |
so when you have a page, there's a lot of structure in it, 00:45:47.120 |
Sometimes those databases are real tabular data. 00:46:01.400 |
And to really get a very high-quality interaction 00:46:27.180 |
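A sketch of the flattening problem just described: Notion-style pages arrive as trees of typed blocks, and making them "available to models in a useful way" means linearizing that tree into text. The block shapes below are simplified stand-ins, not the real Notion API schema:

```python
def flatten(block: dict, depth: int = 0) -> str:
    """Linearize a tree of typed blocks into indented plain text for a model."""
    prefix = {"heading": "# ", "bullet": "- "}.get(block["type"], "")
    line = "  " * depth + prefix + block.get("text", "")
    children = [flatten(child, depth + 1) for child in block.get("children", [])]
    return "\n".join([line] + children)

# A toy page: a heading with a paragraph and a bullet nested under it.
page = {
    "type": "heading", "text": "Q3 Planning",
    "children": [
        {"type": "paragraph", "text": "Goals for the quarter."},
        {"type": "bullet", "text": "Ship Salesforce connector"},
    ],
}
print(flatten(page))
```

The hard part in practice is everything this sketch ignores: inline databases, mixed tabular and free-form content, and deciding what structure to keep versus drop.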
You know, he wants to put the AI in there, but, you know. 00:46:33.940 |
that are like sneakily hard that you're tackling 00:46:43.540 |
is really building the infra that works for those agents, 00:46:53.020 |
that will be useful to a non-negligible set of your users. 00:47:00.200 |
that shouldn't be conversational interactions, 00:47:04.020 |
Basically, know that we have the firehose of information 00:47:22.340 |
because you can just sift through much more information. 00:47:31.620 |
"I wanna be updated when there is a piece of information 00:47:40.140 |
"It says the opposite of what you have in that paragraph. 00:47:42.160 |
"Maybe you wanna update or just ping that person." 00:47:44.540 |
I think there is a lot to be explored on the product layer 00:47:56.660 |
- One thing you keep mentioning about infra work, 00:48:00.900 |
and serving that in a very consumer-friendly way. 00:48:04.560 |
You always talk about infra being additional sources, 00:48:09.180 |
But I'm also interested in the vertical infra. 00:48:11.180 |
There is an orchestrator underlying all these things, 00:48:20.580 |
you have to wait for something to be executed 00:48:24.740 |
I used to work on an orchestrator as well, Temporal. 00:48:42.800 |
And you would say, "Why is it so complicated?" 00:48:51.040 |
like managing the entire set of stuff that needs to happen 00:49:02.240 |
And whenever we see that piece of information goes through, 00:49:05.080 |
maybe trigger workflows because to run agents, 00:49:18.520 |
of replacing Temporal. - Building orchestrators. 00:49:28.040 |
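A toy version of what an orchestrator like Temporal provides around each step, durable retries with backoff, so syncing connectors and triggering workflows survives flaky upstream APIs. The flaky activity and parameters here are invented for illustration:

```python
import time

def run_activity(fn, *, retries: int = 3, backoff: float = 0.01):
    """Run one workflow activity, retrying with exponential backoff on failure."""
    for attempt in range(retries):
        try:
            return fn()
        except Exception:
            if attempt == retries - 1:
                raise  # out of retries: surface the failure to the workflow
            time.sleep(backoff * 2 ** attempt)

# A hypothetical connector-sync step that fails twice before succeeding.
calls = {"n": 0}
def flaky_sync():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("upstream API hiccup")
    return "synced"

print(run_activity(flaky_sync))
```

A real orchestrator adds durability (state survives process restarts) and scheduling on top, which is exactly the machinery that makes "buy" attractive here.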
I think in that case, when you're a high-growth company, 00:49:31.740 |
your buy-build trade-off is very much on the side of buy, 00:49:37.600 |
you can focus on your core competency, et cetera. 00:49:41.040 |
we're starting to see the post-high-growth company, 00:49:54.840 |
- No, no, I know, of course they say it's true, 00:50:32.760 |
And then it makes sense to just scratch the SaaS away. 00:50:46.600 |
you don't have the capabilities to reduce SaaS cost. 00:50:54.800 |
new category of companies that might remove some SaaS. 00:50:57.560 |
- Yeah, Alessio's firm has an interesting thesis 00:51:06.520 |
You know, ideally, it's all a labor interface 00:51:08.520 |
where you're asking somebody to do something for you, 00:51:15.400 |
- Are you paying for Temporal Cloud or are you self-hosting? 00:51:24.200 |
- That's why as a shareholder, I like to hear that. 00:51:29.960 |
I just want a list for other founders to think about. 00:51:34.680 |
anything interesting there that you build or buy? 00:51:37.560 |
- I mean, there's always an interesting question. 00:51:39.320 |
We've been building a lot around the interface 00:51:44.880 |
the original version was an orchestration platform, 00:51:55.240 |
and so we continued building upon, and we own it. 00:52:02.400 |
- I would say LiteLLM is the current open source consensus. 00:52:15.840 |
It started as pure JavaScript, not TypeScript, 00:52:18.160 |
and I think you want to, if you're wondering, 00:52:21.000 |
oh, I want to go fast, I'll do a little bit of JavaScript. 00:52:26.800 |
- So interesting, you are a research engineer 00:52:29.440 |
who came out of OpenAI and bet on TypeScript. 00:52:31.880 |
- Well, the reality is that if you're building a product, 00:52:34.400 |
you're going to be doing a lot of JavaScript, right? 00:52:39.160 |
It's a great platform, and our internal service 00:52:47.320 |
The Next.js story, it's interesting because Next.js 00:52:49.400 |
is obviously the king of the world in JavaScript land, 00:52:51.720 |
but recently, ChatGPT just rewrote from Next.js to Remix. 00:52:58.920 |
That is like the biggest news in front-end world in a while. 00:53:04.640 |
you predicted the first billion-dollar company 00:53:08.160 |
and you said that's basically like a sign of AGI, 00:53:10.160 |
once we get there, and you said it had already been started. 00:53:16.440 |
- That quote was probably independently invented, 00:53:25.920 |
I hypothesize it was maybe already being started, 00:53:34.240 |
I guess we're going to have to wait for it a little bit, 00:53:36.920 |
and I think it's because the Dusts of the world don't exist, 00:53:39.600 |
and so you don't have that thing that lets you run those, 00:54:04.160 |
with a lot of assistance from machines to achieve your job. 00:54:07.720 |
That would be great, and that I believe in a bit more. 00:54:15.720 |
but it's basically like so many people are focused on, 00:54:18.000 |
oh, it's kind of like displaced jobs and whatnot, 00:54:20.600 |
but I'm like, there's so much work that people don't do 00:54:24.240 |
and maybe the question is that you just don't scale 00:54:31.960 |
and then people using Dust will be two people. 00:54:31.960 |
- So my hot take is I actually know what vertical 00:54:39.960 |
- There's already two of us, so we're at max capacity. 00:54:46.840 |
but his team is, he's got about like 200 people, 00:54:58.680 |
he sold his company for 250 million to Spotify, 00:54:58.680 |
so he's not going to hit that billionaire status. 00:55:11.320 |
by a bunch of agents, dust agents, to do all this stuff, 00:55:16.000 |
because then ultimately it's just the brand, the curation. 00:55:27.280 |
I think it was the Pinterest or Dropbox founder who said at the time, 00:55:27.280 |
when you're CEO, you mostly have an editorial position. 00:55:46.360 |
Like I write commentary, I choose between four options. 00:55:52.240 |
you build up your brand through those many decisions. 00:56:00.680 |
you have an upcoming podcast with NotebookLM, 00:56:23.280 |
Any final kind of like call to action hiring? 00:56:25.560 |
It's like, obviously people should buy the product 00:56:29.040 |
And no, I think we didn't dive into the vertical 00:56:37.040 |
We spike at penetration and that's just awesome 00:56:53.360 |
But the potential within the company after that is limited. 00:56:58.000 |
We're true believers of the horizontal approach 00:57:03.400 |
But I think it's an interesting thing to think about 00:57:17.200 |
- Yeah, I'll provide you my response on that. 00:57:21.280 |
And it's basically your sense on the products 00:57:26.160 |
In other words, if you're trying to be as many things 00:57:33.640 |
And in the future, if we want to choose to spin off platforms 00:57:33.640 |
for other things, we can because we have that brand. 00:57:44.480 |
that like, here's the info that we use for search 00:57:48.960 |
you can always have lateral movement within companies, 00:57:48.960 |
I don't really mean the platform as the platform platform. 00:58:12.200 |
there are so many operations within the company. 00:58:12.200 |
Some of them have been extremely rationalized by the market, 00:58:24.760 |
But there are so many operations that make up a company