
Agents @ Work: Dust.tt — with Stanislas Polu



00:00:00.000 | (upbeat music)
00:00:02.580 | - Hey, everyone.
00:00:04.520 | Welcome to the "Latent Space" podcast.
00:00:06.160 | This is Alessio, partner and CTO at Decibel Partners,
00:00:09.120 | and I'm joined by my co-host, Swyx, founder of Smol.ai.
00:00:12.160 | - Hey, and today we're in the studio with Stan, welcome.
00:00:14.800 | - Thank you very much for having me.
00:00:16.120 | - Visiting from Paris. - Paris.
00:00:17.800 | - And you have had a very distinguished career.
00:00:21.240 | It's very hard to summarize,
00:00:22.680 | but you went to college in both
00:00:25.800 | Ecole Polytechnique and Stanford,
00:00:27.680 | and then you worked in a number of places,
00:00:29.480 | Oracle, Totems, Stripe, and then OpenAI pre-ChatGPT.
00:00:33.680 | We'll spend a little bit of time on that.
00:00:35.840 | About two years ago, you left OpenAI to start Dust.
00:00:38.280 | I think you were one of the first OpenAI alum founders.
00:00:41.800 | - Yeah, I think it was about at the same time
00:00:43.320 | as the Adept guys, so it was that first wave.
00:00:46.120 | - Yeah, and people really loved the David episode.
00:00:49.000 | We love a few sort of OpenAI stories,
00:00:51.600 | you know, from back in the day,
00:00:53.320 | like we were talking about pre-recording,
00:00:55.440 | probably the statute of limitations
00:00:56.560 | on some of those stories has expired.
00:00:58.560 | So you can talk a little bit more freely
00:01:00.240 | without them coming after you.
00:01:02.160 | But maybe we'll just talk about,
00:01:03.280 | like, what was your journey into AI?
00:01:05.640 | You know, you were at Stripe for almost five years.
00:01:07.840 | There are a lot of Stripe alums going into OpenAI.
00:01:09.520 | I think the Stripe culture has come
00:01:10.800 | into OpenAI quite a bit.
00:01:12.120 | - Yeah, so I think the buses of Stripe people
00:01:14.880 | really started flowing in, I guess, after ChatGPT.
00:01:18.600 | But yeah, my journey into AI is--
00:01:20.720 | - I mean, Greg Brockman.
00:01:21.840 | - Yeah, yeah, from Greg, of course.
00:01:25.680 | And Daniela, actually, back in the days.
00:01:28.120 | Daniela Amodei.
00:01:28.960 | - Yeah, she was COO.
00:01:31.280 | I mean, she is COO, yeah.
00:01:32.600 | - She had a pretty high job at OpenAI at the time,
00:01:35.000 | yeah, for sure.
00:01:35.960 | My journey started as anybody else.
00:01:38.200 | You're fascinated with computer science
00:01:39.720 | and you want to make computers think, it's awesome,
00:01:41.560 | but it doesn't work.
00:01:42.760 | I mean, it was a long time ago.
00:01:44.000 | I was, like, maybe 16, so it was 25 years ago.
00:01:47.360 | Then the first big exposure to AI would be at Stanford.
00:01:50.680 | And I'm going to, like, disclose how old I am
00:01:54.240 | because, at the time, it was a class taught by Andrew Ng.
00:01:59.240 | And at the time, there was no deep learning.
00:02:01.320 | It was Haar features for vision and the A* algorithm.
00:02:05.360 | So it was fun.
00:02:06.280 | But it was the early days of deep learning.
00:02:08.440 | At the time, I think a few years after,
00:02:09.880 | there was the first project at Google,
00:02:11.680 | you know, that cat face or the human face
00:02:14.800 | trained from many images.
00:02:16.760 | Then I hesitated about doing a PhD, more in systems.
00:02:19.920 | Eventually decided to go get a job.
00:02:24.600 | Went to Oracle, started a company,
00:02:26.400 | made a gazillion mistakes, got acquired by Stripe.
00:02:29.200 | Worked with Greg Brockman there.
00:02:30.680 | And toward the end of Stripe,
00:02:32.040 | I started getting interested in AI again.
00:02:34.280 | Felt like it was the time you had the Atari games,
00:02:36.440 | you had the self-driving craziness at the time.
00:02:39.480 | And I started exploring projects.
00:02:41.280 | It felt like the Atari games were incredible,
00:02:44.000 | but they were still games.
00:02:45.000 | And I was looking into exploring projects
00:02:46.600 | that would have an impact on the world.
00:02:48.920 | And so I decided to explore three things,
00:02:50.840 | self-driving cars, cybersecurity and AI,
00:02:54.120 | and math and AI.
00:02:55.720 | It's like, I think, by a decreasing order
00:02:57.880 | of impact on the world, I guess.
00:03:00.120 | - Yeah.
00:03:01.400 | Discovering new math would be very foundational.
00:03:03.480 | - It is extremely foundational,
00:03:04.480 | but it's not as direct as driving people around.
00:03:07.040 | - Sorry, you're doing this at Stripe.
00:03:08.440 | You were like thinking about your next move.
00:03:09.280 | - No, it was like, at the end of Stripe,
00:03:10.320 | I had kind of a bit of time where I started exploring.
00:03:13.400 | Did a bunch of work with friends
00:03:15.000 | on trying to get RC cars to drive autonomously.
00:03:18.640 | Almost started a company in France or Europe
00:03:21.160 | about self-driving trucks.
00:03:23.520 | We decided to not go for it
00:03:25.080 | because it was like probably very operational.
00:03:27.880 | And I think the idea and DNA of the company,
00:03:30.400 | of the team, wasn't there.
00:03:31.600 | And also I realized that if I wake up a day
00:03:34.480 | and because of a bug I wrote, I killed a family,
00:03:37.440 | it would be a bad experience.
00:03:39.880 | And so just decided like, no, that's just too crazy.
00:03:43.120 | Then I explored cybersecurity with a friend.
00:03:45.800 | We were trying to apply transformers to code fuzzing.
00:03:48.120 | So in code fuzzing, you have, sorry,
00:03:50.440 | an algorithm that goes really fast
00:03:52.200 | and tries to mutate the inputs of a library to find bugs.
00:03:56.440 | And we try to apply a transformer to that
00:03:58.320 | and do reinforcement learning with the signal
00:04:00.600 | of how much you propagate within the binary.
00:04:03.960 | Didn't work at all because the transformers are so slow
00:04:06.440 | compared to evolutionary algorithms
00:04:08.400 | that it kind of didn't work.
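A minimal, self-contained Python sketch of the evolutionary, coverage-guided fuzzing loop being described here, which is the fast baseline the transformer-based mutator had to compete with. The toy target and its coverage instrumentation are invented for illustration and are not taken from any real fuzzer:

```python
import random

def target(data: bytes, coverage: set) -> None:
    """Toy library under test; records which branches it reaches in `coverage`."""
    coverage.add(0)
    if len(data) > 3 and data[0] == ord("F"):
        coverage.add(1)
        if data[1] == ord("U"):
            coverage.add(2)
            if data[2] == ord("Z"):
                coverage.add(3)
                if data[3] == ord("Z"):
                    raise RuntimeError("bug reached")

def mutate(data: bytes) -> bytes:
    """Cheap random mutation: flip a bit, insert a byte, or delete a byte."""
    buf = bytearray(data)
    op = random.choice(("flip", "insert", "delete"))
    if op == "flip" and buf:
        buf[random.randrange(len(buf))] ^= 1 << random.randrange(8)
    elif op == "insert":
        buf.insert(random.randrange(len(buf) + 1), random.randrange(256))
    elif op == "delete" and buf:
        del buf[random.randrange(len(buf))]
    return bytes(buf)

corpus = [b"hello"]      # seed inputs
seen: set = set()        # coverage observed so far across all runs

for i in range(1_000_000):
    candidate = mutate(random.choice(corpus))
    cov: set = set()
    try:
        target(candidate, cov)
    except RuntimeError:
        print(f"crash found after {i} iterations with input {candidate!r}")
        break
    if not cov <= seen:  # keep any input that reaches a new branch
        seen |= cov
        corpus.append(candidate)
else:
    print("no crash found; branches covered:", sorted(seen))
```

The per-iteration cost is microseconds, which is the point he makes: replacing `mutate` with a transformer forward pass makes each step orders of magnitude slower than this loop.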
00:04:09.960 | And then I got interested in math and AI.
00:04:12.800 | And started working on SAT solving with AI.
00:04:15.320 | And at the same time, OpenAI was kind of starting
00:04:17.320 | the reasoning team that were tackling that project as well.
00:04:21.000 | I was in touch with Greg
00:04:22.840 | and eventually got in touch with Ilya
00:04:24.800 | and finally found my way to OpenAI.
00:04:27.560 | I don't know how much you want to dig into that.
00:04:29.360 | The way to find your way to OpenAI when you're in Paris
00:04:31.480 | was kind of an interesting adventure as well.
00:04:33.040 | - Please, and I want to note, this was a two month journey.
00:04:36.600 | You did all this in two months, the search.
00:04:39.040 | - The search for what, sorry?
00:04:40.160 | - Your search for your next thing.
00:04:41.280 | 'Cause you left in July, 2019,
00:04:43.760 | and then you joined OpenAI September.
00:04:45.640 | - I'm gonna be ashamed to say that.
00:04:47.640 | - You were searching before, yeah.
00:04:48.760 | - I was searching before.
00:04:49.920 | - I mean, it's normal, it's normal.
00:04:51.520 | - No, the truth is that I moved back to Paris through Stripe.
00:04:54.600 | And I just felt the hardship of being remote from your team
00:04:58.080 | nine hours away.
00:04:59.200 | And so it kind of freed a bit of time
00:05:01.680 | for me to start the exploration before.
00:05:03.720 | Sorry, Patrick, sorry, John.
00:05:05.000 | (all laughing)
00:05:06.000 | - Hopefully they're listening.
00:05:07.520 | Joining OpenAI from Paris and from like,
00:05:11.600 | obviously you had worked with Greg, but not anyone else.
00:05:14.360 | - No, yeah, so I didn't work with,
00:05:15.520 | I had worked with Greg, but not Ilya,
00:05:17.040 | but I had started chatting with Ilya,
00:05:18.680 | and Ilya was kind of excited because he knew
00:05:20.640 | that I was a good engineer through Greg, I presume,
00:05:22.680 | but I was not a trained researcher,
00:05:24.200 | didn't do a PhD, never did research.
00:05:26.200 | And I started chatting and he was excited
00:05:27.760 | all the way to the point where he was like,
00:05:29.320 | "Hey, come pass interviews, it's gonna be fun."
00:05:31.960 | I think he didn't care where I was,
00:05:33.240 | he just wanted to try working together.
00:05:35.800 | So I go to SF, go through the interview process,
00:05:38.680 | get an offer, and so I get Bob McGrew on the phone
00:05:41.640 | for the first time, he's like,
00:05:42.640 | "Hey, Stan, it's awesome, you've got an offer.
00:05:44.480 | "When are you coming to SF?"
00:05:45.960 | I'm like, "Hey, it's awesome,
00:05:47.400 | "I'm not coming to the SF, I'm based in Paris
00:05:50.640 | "and we just moved."
00:05:51.920 | He was like, "Hey, it's awesome.
00:05:52.920 | "Well, you don't have an offer anymore."
00:05:55.520 | - Oh my God.
00:05:56.600 | - No, it wasn't as hard as that,
00:05:58.040 | but that's basically the idea.
00:05:59.640 | And it took me maybe a bit more time
00:06:02.440 | of chatting, and they eventually decided
00:06:04.880 | to try a contractor set up,
00:06:06.400 | and that's how I kind of started working at OpenAI,
00:06:09.600 | officially as a contractor, but in practice,
00:06:12.360 | really felt like being an employee.
00:06:13.840 | - What did you work on?
00:06:15.120 | - So it was solely focused on math and AI,
00:06:18.760 | and in particular in the application,
00:06:20.720 | so the study of the large language models'
00:06:23.360 | mathematical reasoning capabilities,
00:06:25.440 | and in particular in the context of formal mathematics.
00:06:28.760 | The motivation was simple, transformers are very creative,
00:06:31.920 | but yet they make mistakes,
00:06:33.880 | and formal math systems have the ability to verify a proof,
00:06:38.760 | and the tactics they can use to solve problems
00:06:42.320 | are very mechanical, so you miss the creativity.
00:06:44.840 | And so the idea was to try to explore both together,
00:06:47.720 | you would get the creativity of the LLMs
00:06:49.680 | and the kind of verification capabilities
00:06:51.800 | of the formal system.
00:06:53.040 | A formal system, just to give a little bit of context,
00:06:55.120 | is a system in which a proof is a program,
00:06:58.600 | and the formal system is a type system,
00:07:00.800 | a type system that's so evolved
00:07:02.120 | that you can verify the program.
00:07:04.000 | If the type checks, it means that the program is correct.
00:07:06.400 | - Is the verification much faster
00:07:09.720 | than actually executing the program?
00:07:12.440 | It is, right?
00:07:13.280 | - Verification is instantaneous, basically.
00:07:15.320 | So the truth is that what you code in involves tactics
00:07:18.720 | that may involve computation to search for solutions,
00:07:22.800 | so it's not instantaneous.
00:07:24.160 | You do have to do the computation
00:07:25.640 | to expand the tactics into the actual proof.
00:07:28.200 | The verification of the proof at the very low level
00:07:31.280 | is instantaneous.
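A tiny Lean 4 sketch of the point being made here, that a proof is a program and verification is just type checking; the particular lemma is only an example:

```lean
-- The statement `a + b = b + a` is a type; a proof is a term of that type.
theorem my_add_comm (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b

-- "Verification is instantaneous": checking a proof is type checking.
-- If this file compiles, the proofs are correct; nothing is executed.
example : 2 + 2 = 4 := rfl
```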
00:07:32.760 | - How quickly do you run into halting problem,
00:07:35.720 | P versus NP type things,
00:07:37.840 | and impossibilities where you're just stuck?
00:07:39.840 | - I mean, you don't run into it.
00:07:41.480 | At the time, it was really trying to solve very easy problems.
00:07:45.720 | So I think the-
00:07:47.080 | - Can you give an example of easy?
00:07:48.600 | - Yeah, so that's the MATH benchmark
00:07:50.640 | that everybody knows today.
00:07:51.800 | - The Dan Hendrycks one.
00:07:52.640 | - The Dan Hendrycks one, yeah.
00:07:53.880 | And I think it was the low-end part
00:07:56.200 | of the MATH benchmark at the time,
00:07:58.040 | because that MATH benchmark includes AMC problems,
00:08:00.800 | AMC 8, 10, 12, so these are the easy ones,
00:08:03.880 | then AIME problems, somewhat harder,
00:08:06.240 | and some IMO problems like-
00:08:07.960 | - For our listeners, we covered this
00:08:08.960 | in our Benchmarks 101 episode.
00:08:10.400 | AMC is literally the grade of high school,
00:08:13.560 | grade eight, grade 10, grade 12.
00:08:15.560 | So you can solve this.
00:08:16.680 | (laughs)
00:08:17.960 | Just briefly to mention this,
00:08:19.400 | because I don't think we'll touch on this again.
00:08:21.320 | There's a bit of work with Lean,
00:08:23.520 | and then more recently with DeepMind scoring silver
00:08:28.520 | on the IMO.
00:08:29.720 | Any commentary on how math has evolved
00:08:31.720 | from your early work to today?
00:08:34.120 | - I mean, that result is mind-blowing.
00:08:37.920 | I mean, from my perspective, spent three years on that.
00:08:40.120 | At the same time, Guillaume Lample in Paris,
00:08:42.400 | we were both in Paris, actually.
00:08:43.840 | He was at FAIR, was working on some problems.
00:08:46.120 | We were pushing the boundaries,
00:08:47.960 | and the goal was the IMO.
00:08:50.120 | And we cracked a few problems here and there,
00:08:52.680 | but the idea of getting a medal at an IMO
00:08:56.120 | was just remote.
00:08:58.200 | - Yeah.
00:08:59.040 | - So this is an impressive result.
00:09:00.360 | And we can, I think the DeepMind team
00:09:03.840 | just did a good job of scaling.
00:09:07.360 | I think there's nothing too magical in their approach,
00:09:09.480 | even if it hasn't been published.
00:09:11.000 | There's a Dan Silver talk from seven days ago
00:09:13.360 | where it goes a little bit into more details.
00:09:15.520 | It feels like there's nothing magical there.
00:09:17.200 | It's really applying reinforcement learning
00:09:19.440 | and scaling up the amount of data
00:09:21.120 | it can generate through autoformalization.
00:09:22.920 | So we can dig into what autoformalization means,
00:09:24.960 | if you want.
00:09:25.800 | (laughs)
00:09:26.640 | - Let's talk about the tail end, maybe, of the OpenAI.
00:09:29.560 | So you joined and you're like,
00:09:30.520 | "I'm gonna work on math and do all of these things."
00:09:33.000 | I saw on one of your blog posts, you mentioned
00:09:34.880 | you fine-tuned over 10,000 models at OpenAI
00:09:37.440 | using 10 million A100 hours.
00:09:40.600 | How did the research evolve from the GPT-2
00:09:45.360 | and then getting closer to DaVinci 003?
00:09:47.680 | And then you left just before ChatGPT was released,
00:09:49.920 | but tell people a bit more about the research path
00:09:52.840 | that took you there.
00:09:53.680 | - Yeah, I can give you my perspective of it.
00:09:55.520 | I think at OpenAI,
00:09:57.040 | there's always been a large chunk of the compute
00:09:58.800 | that was reserved to train the GPTs, which makes sense.
00:10:02.360 | So it was pre-entropic splits.
00:10:04.520 | Most of the compute was going to a product called Nest,
00:10:07.240 | which was basically GPT-3.
00:10:09.440 | And then you had a bunch of, let's say remote,
00:10:12.840 | not core research teams that were trying to explore
00:10:17.040 | maybe more specific problems
00:10:18.520 | or maybe the algorithm part of it.
00:10:20.920 | The interesting part, I don't know if it was
00:10:23.040 | where your question was going, is that in those labs,
00:10:25.800 | you're managing researchers.
00:10:27.480 | So by definition, you shouldn't be managing them.
00:10:30.400 | But in that space, there's a managing tool that is great,
00:10:33.680 | which is compute allocation.
00:10:35.280 | Basically, by managing the compute allocation,
00:10:37.240 | you can message the team
00:10:39.400 | of where you think the priority should go.
00:10:41.920 | And so it was really a question of,
00:10:44.960 | you were free as a researcher
00:10:46.720 | to work on whatever you wanted,
00:10:49.320 | but if it was not aligned with OpenAI mission,
00:10:51.640 | and that's fair, you wouldn't get the compute allocation.
00:10:55.160 | As it happens, solving math was very much aligned
00:10:58.280 | with the direction of OpenAI.
00:11:01.200 | And so I was lucky to generally get the compute
00:11:03.880 | I needed to make good progress.
00:11:05.920 | - What do you need to show as incremental results
00:11:08.960 | to get funded for further results?
00:11:11.160 | - It's an imperfect process.
00:11:12.720 | If you're working on math and AI,
00:11:14.160 | obviously there's kind of a prior
00:11:15.760 | that it's going to be aligned with the company.
00:11:17.520 | So it's much easier than to go into something
00:11:19.720 | much riskier, I guess.
00:11:20.960 | You have to show incremental progress, I guess.
00:11:23.080 | It's like you ask for a certain amount of compute
00:11:25.880 | and you deliver a few weeks after,
00:11:27.680 | and you demonstrate that you have made progress.
00:11:29.600 | Progress might be a positive result.
00:11:31.200 | Progress might be a strong negative result.
00:11:33.880 | And a strong negative result is actually often
00:11:36.240 | much harder to get or much more interesting
00:11:39.000 | than a positive result.
00:11:40.320 | And then it generally goes, as in any organization,
00:11:44.320 | into people finding your project
00:11:47.120 | or any other project kind of cool and fancy.
00:11:50.240 | And so you would have that kind of phase
00:11:51.760 | of growing compute allocation for it
00:11:54.160 | all the way to a point.
00:11:55.400 | And then maybe you reach an apex
00:12:01.280 | and then maybe you go back mostly to zero
00:12:01.280 | and restart the process
00:12:02.400 | because you're going in a different direction
00:12:03.760 | or something else.
00:12:04.600 | That's how I felt.
00:12:05.440 | - Explore, exploit.
00:12:06.320 | - Yeah, yeah, exactly.
00:12:07.400 | Exactly, exactly.
00:12:08.600 | (laughing)
00:12:09.440 | It's a reinforcement learning approach.
00:12:10.280 | - Classic PhD student search process.
00:12:13.560 | - And you were reporting to Ilya,
00:12:15.600 | like the results you were kind of bringing back to him
00:12:17.880 | or like, what's the structure?
00:12:18.960 | It's almost like when you're doing
00:12:20.080 | such cutting edge research,
00:12:21.520 | you need to report to somebody
00:12:22.840 | who is actually really smart
00:12:23.920 | to understand the directions, right?
00:12:25.640 | - So we had a reasoning team,
00:12:27.080 | which was working on reasoning, obviously,
00:12:29.760 | and some math in general.
00:12:31.280 | So, and that team had a manager,
00:12:33.240 | but Ilya was extremely involved in the team
00:12:35.160 | as an advisor, I guess.
00:12:37.240 | Since he brought me into OpenAI,
00:12:39.080 | I was lucky,
00:12:40.320 | mostly during the first years,
00:12:41.680 | to have kind of direct access to him.
00:12:44.200 | He would really coach me as a trainee researcher, I guess,
00:12:47.960 | with good engineering skills.
00:12:50.120 | And Ilya, I think at OpenAI,
00:12:52.600 | he was the one showing the North Star, right?
00:12:55.600 | He was, his job,
00:12:56.800 | and I think he really enjoyed that
00:12:58.360 | and he did it super well,
00:12:59.600 | was going through the teams and saying,
00:13:01.960 | this is where we should be going
00:13:03.720 | and trying to, you know,
00:13:04.800 | flock the different teams together towards an objective.
00:13:08.360 | - I would say like the public perception of him
00:13:10.240 | is that he was the strongest believer in scaling.
00:13:12.800 | - Oh yeah.
00:13:13.640 | - He was, he has always pursued like the compression thesis.
00:13:17.120 | - Yep.
00:13:17.960 | - You have worked with him personally.
00:13:19.320 | What does the public not know about how he works?
00:13:22.400 | - I think he's really focused on building the vision
00:13:25.160 | and communicating the vision within the company,
00:13:27.240 | which was extremely useful.
00:13:28.680 | I was personally surprised that he spent so much time,
00:13:31.760 | you know, working on communicating that vision
00:13:34.160 | and getting the teams to work together versus--
00:13:36.560 | - To be specific, vision is AGI.
00:13:38.080 | - Oh yeah, vision is like, yeah,
00:13:40.040 | it's the belief in compression and scaling compute.
00:13:43.560 | I remember when I started working on the reasoning team,
00:13:45.840 | it was, the excitement was really
00:13:47.720 | about scaling the compute on reasoning.
00:13:49.840 | And that was really the belief
00:13:51.280 | he wanted to ingrain in the team.
00:13:52.760 | And that's what has been useful to the team.
00:13:55.440 | And the DeepMind results,
00:13:57.240 | together with the success of GPT-4 and stuff,
00:13:58.880 | show that it was
00:14:01.040 | the right approach.
00:14:02.400 | - And was it according to the neural scaling laws,
00:14:05.280 | the Kaplan paper that was published?
00:14:08.120 | - I think it was before that,
00:14:09.440 | because those ones came with GPT-3,
00:14:13.360 | basically at the time of GPT-3 being released
00:14:15.360 | or being ready internally.
00:14:17.000 | But before that, there really was a strong belief in scale.
00:14:20.960 | I think it was just the belief that the transformer
00:14:23.120 | was a generic enough architecture
00:14:25.560 | that you could learn anything,
00:14:26.960 | and that this was just a question of scaling.
00:14:28.960 | - Any other fun stories you want to tell?
00:14:31.360 | - David, Sam Altman, Greg, you know, any.
00:14:34.120 | I didn't work, weirdly, I didn't work that much with Greg
00:14:37.000 | when I was at OpenAI.
00:14:38.400 | He was, he had always been mostly focused
00:14:40.560 | on training the GPTs, and rightfully so.
00:14:43.400 | One thing about Sam Altman, he really impressed me
00:14:46.000 | because when I joined, he had joined not that long ago,
00:14:49.920 | and it felt like he was kind of a very high-level CEO.
00:14:54.720 | And I was mind-blown by how deep he was able
00:14:59.000 | to go into the subjects within a year or something,
00:15:02.320 | all the way to a situation where, when I was having lunch
00:15:05.640 | with him at OpenAI by year two,
00:15:07.800 | he would just know quite deeply what I was doing.
00:15:10.920 | - With no ML background, like, you know.
00:15:14.320 | - Yeah, with no ML, but I didn't have any either,
00:15:17.120 | so I guess that explains why.
00:15:19.560 | But I think you can, it's a question about really,
00:15:22.360 | you don't necessarily need to understand
00:15:24.320 | the very technicalities of how things are done,
00:15:27.240 | but you need to understand what's the goal
00:15:29.400 | and what's being done, and what are the recent results,
00:15:31.840 | and all of that, and we could have
00:15:33.440 | kind of a very productive discussion,
00:15:35.240 | and that really impressed me, given the size at the time
00:15:38.560 | of OpenAI, which was not negligible.
00:15:40.560 | - Yeah, I mean, you've been, you were a founder before.
00:15:42.520 | - Yep. - You're a founder now.
00:15:43.480 | - Yep. - And you've seen Sam
00:15:44.520 | as a founder.
00:15:45.360 | How has he affected you as a founder?
00:15:47.400 | - I think having that capability of changing
00:15:50.040 | the scale of your attention in the company,
00:15:52.600 | because most of the time, you operate at a very high level,
00:15:55.080 | but being able to go deep down and being in the known
00:15:57.520 | of what's happening on the ground is something
00:15:59.600 | that I feel is really enlightening.
00:16:02.320 | That's not a place in which I ever was as a founder,
00:16:05.440 | because first company, we went all the way to 10 people.
00:16:08.520 | Current company, there's 25 of us,
00:16:10.400 | so the high level, the sky and the ground
00:16:13.280 | are pretty much at the same place.
00:16:14.440 | (all laughing)
00:16:16.000 | - Yeah, you're being too humble.
00:16:17.280 | I mean, Stripe was also like a huge rocket ship.
00:16:19.840 | - Stripe, I was a founder, so I was, like at OpenAI,
00:16:22.360 | I was really happy being on the ground,
00:16:24.640 | pushing the machine, making it work.
00:16:26.680 | - Yeah.
00:16:27.520 | - Last OpenAI question. - Yep.
00:16:28.800 | - The Anthropic split you mentioned,
00:16:30.640 | you were around for that, very dramatic.
00:16:32.240 | David also left around that time, you left.
00:16:34.960 | This year, we've also had a similar management shakeup,
00:16:38.240 | let's just call it.
00:16:39.120 | Can you compare what it was like going through that split
00:16:41.840 | during that time?
00:16:42.960 | And then like, does that have any similarities now?
00:16:46.000 | Like, are we gonna see a new Anthropic emerge
00:16:48.520 | from these folks that you just left?
00:16:50.720 | - That, I really, really don't know.
00:16:52.640 | At the time, the split was pretty surprising
00:16:55.360 | because they had been training GPT-3, and it was a success.
00:16:59.080 | And to be completely transparent,
00:17:00.600 | I wasn't in the weeds of the splits.
00:17:03.080 | What I understood of it is that there was a disagreement
00:17:06.400 | of the commercialization of that technology.
00:17:09.480 | I think the focal point of that disagreement
00:17:11.640 | was the fact that we started working on the API
00:17:14.160 | and wanted to make those models available through an API.
00:17:17.080 | Is that really the core disagreement?
00:17:20.240 | I don't know.
00:17:21.080 | - Was it safety, was it commercialization?
00:17:22.560 | - Exactly.
00:17:23.400 | - Or did they just wanna start a company?
00:17:24.240 | - Exactly, exactly, that I don't know.
00:17:26.000 | But I think what I was surprised by
00:17:29.040 | is how quickly OpenAI recovered at the time.
00:17:32.480 | And I think it's just because we were mostly a research org
00:17:36.400 | and the mission was so clear
00:17:37.960 | that some divergence in some teams, some people leave,
00:17:41.680 | the mission is still there.
00:17:42.680 | We have the compute, we have a site.
00:17:44.640 | So it just keep going.
00:17:46.200 | - Yeah, very deep bench, like just a lot of talent.
00:17:48.800 | - Yeah.
00:17:49.640 | - So that was the OpenAI part of the history.
00:17:52.440 | - Exactly.
00:17:53.280 | - So then you leave OpenAI in September, 2022.
00:17:55.880 | And I would say in Silicon Valley,
00:17:57.840 | the two hottest companies at the time
00:17:59.400 | were you and LangChain.
00:18:01.000 | What was that start like?
00:18:02.720 | And what did you decide to start
00:18:03.920 | with a more developer focus,
00:18:05.640 | kind of like a AI engineer tool,
00:18:07.960 | rather than going back into doing some more research
00:18:10.240 | on something else?
00:18:11.080 | - Yeah, first, I'm not a trained researcher.
00:18:13.280 | So going through OpenAI was really kind of the PhD
00:18:15.840 | I always wanted to do.
00:18:17.200 | But research is hard.
00:18:18.440 | You're digging into a field all day long
00:18:21.200 | for weeks and weeks and weeks.
00:18:23.320 | And you find something,
00:18:25.000 | you get super excited for 12 seconds.
00:18:28.000 | And at the 13th second, you're like,
00:18:29.360 | "Oh yeah, that was obvious."
00:18:30.720 | And you go back to digging.
00:18:32.320 | I'm not a trained, formally trained researcher.
00:18:35.080 | And it wasn't necessarily an ambition of mine
00:18:37.360 | to have a research career.
00:18:40.280 | And I felt the hardness of it.
00:18:42.560 | I enjoyed it a ton.
00:18:45.960 | But at the time I decided that I wanted to go back
00:18:49.160 | to something more productive.
00:18:51.200 | And the other fun motivation was like,
00:18:54.200 | I mean, if we believe in AGI
00:18:56.040 | and if we believe the timelines might not be too long,
00:18:58.680 | it's actually the last train leaving a station
00:19:00.560 | to start a company.
00:19:01.640 | After that, it's going to be computers all the way down.
00:19:04.880 | And so that was kind of the true motivation
00:19:06.840 | for like trying to go there.
00:19:09.720 | So that's kind of the core motivation
00:19:11.240 | at the beginning of personally.
00:19:12.680 | And the motivation for starting a company was pretty simple.
00:19:16.360 | I had seen GPT-4 internally at the time.
00:19:18.560 | It was September, 2022.
00:19:20.360 | So it was pre-ChatGPT, but GPT-4 was ready since,
00:19:23.800 | I mean, it had been ready for a few months internally.
00:19:26.040 | I was like, okay, that's obvious.
00:19:27.520 | The capabilities are there to create an insane amount
00:19:30.200 | of value to the world.
00:19:31.440 | And yet the deployment is not there yet.
00:19:34.360 | The revenue of OpenAI at the time was ridiculously small
00:19:37.680 | compared to what it is today.
00:19:39.080 | And so the thesis was there's probably a lot to be done
00:19:42.080 | at the product level to unlock the usage.
00:19:45.120 | - Yep.
00:19:45.960 | Let's talk a bit more about the form factor, maybe.
00:19:48.240 | I think one of the first successes you have
00:19:50.720 | was kind of like the WebGPT-like thing,
00:19:52.880 | like using the models to traverse the web
00:19:55.440 | and like summarize thing.
00:19:56.280 | And the browser was really the interface.
00:19:58.840 | Why did you start with the browser?
00:20:00.720 | Like what was it important?
00:20:01.760 | And then you built XB1,
00:20:03.040 | which was kind of like the browser extension.
00:20:05.200 | - So the starting point at the time was,
00:20:07.080 | so if you wanted to talk about LLMs,
00:20:09.720 | it was still a rather small community,
00:20:12.960 | a community of mostly researchers,
00:20:15.400 | and to some extent, very early adopters,
00:20:17.920 | very early engineers.
00:20:19.440 | It was almost inconceivable to just build a product
00:20:22.680 | and go sell it to the enterprise.
00:20:24.400 | Though at the time there were a few companies doing that,
00:20:26.280 | the one on marketing, I don't remember its name, Jasper.
00:20:29.800 | But so the natural first intention,
00:20:32.240 | the first, first, first intention
00:20:33.480 | was to go to the developers
00:20:34.960 | and try to create tooling for them
00:20:36.960 | to create product on top of those models.
00:20:39.520 | And so that's what Dust was originally.
00:20:41.560 | It was quite different than LangChain,
00:20:43.240 | and LangChain just beat the shit out of us,
00:20:46.920 | which is great.
00:20:48.080 | - It's a choice.
00:20:49.280 | You were cloud, in closed source.
00:20:51.400 | They were open source.
00:20:52.240 | - Yeah, so technically we were open source
00:20:54.080 | and we still are open source,
00:20:55.040 | but I think that doesn't really matter.
00:20:56.760 | I had the strong belief from my research time
00:20:59.480 | that you cannot create an LLM-based workflow
00:21:04.200 | on just one example.
00:21:05.680 | Basically, if you just have one example, you overfit.
00:21:08.320 | So as you develop your interaction,
00:21:10.280 | your orchestration around the LLM,
00:21:11.880 | you need a dozen examples.
00:21:14.000 | Obviously, if you're running a dozen examples
00:21:15.800 | on a multi-step workflow, you start parallelizing stuff.
00:21:19.240 | And if you do that in the console,
00:21:21.960 | you just have like a messy stream of tokens going out
00:21:25.440 | and it's very hard to observe what's going there.
00:21:28.720 | And so the idea was to go with a new UI
00:21:30.520 | so that you could kind of introspect easily
00:21:33.080 | the output of each interaction with the model
00:21:35.760 | and dig into there through a new UI, which is--
00:21:38.200 | - Was that open source?
00:21:39.320 | I actually didn't--
00:21:40.160 | - Oh yeah, it was.
00:21:41.000 | I mean, Dust is entirely open source even today.
00:21:43.520 | We're not going for--
00:21:44.360 | - If it matters, I didn't know that.
00:21:45.520 | - No, no, no, no, no.
00:21:47.080 | The reason why is because we're not open source
00:21:48.600 | because we're not doing an open source strategy.
00:21:51.160 | It's not an open source go-to-market at all.
00:21:53.080 | We're open source because we can and it's fun.
00:21:55.240 | - Open source is marketing.
00:21:56.360 | You have all the downsides of open source,
00:21:57.680 | which is that people can clone you.
00:21:59.680 | - But I think that downside is a big fallacy.
00:22:02.320 | - Okay.
00:22:03.160 | - Yes, anybody can clone Dust today,
00:22:04.800 | but the value of Dust is not the current state.
00:22:07.320 | The value of Dust is the number of eyeballs
00:22:09.400 | and hands of developers
00:22:11.000 | that are building on it in the future.
00:22:13.440 | And so, yes, anybody can clone us today,
00:22:15.360 | but that wouldn't change anything.
00:22:17.720 | There is some value in being open source.
00:22:19.960 | In a discussion with the security team,
00:22:22.120 | you can be extremely transparent and just show the code.
00:22:24.880 | When you have discussion with users
00:22:26.520 | and there's a bug or a feature missing,
00:22:28.600 | you can just point to the issue, show the pull request.
00:22:31.280 | - PR welcome.
00:22:32.120 | - Show the, exactly.
00:22:33.120 | Oh, PR welcome, that doesn't happen that much.
00:22:36.160 | But you can show the progress.
00:22:38.000 | If the person that you're chatting with
00:22:40.280 | is a little bit technical,
00:22:41.120 | they really enjoy seeing the pull requests advancing
00:22:43.520 | and seeing all the way to deploy.
00:22:45.280 | And then the downsides are mostly around security.
00:22:48.440 | You never want to do security by obfuscation.
00:22:50.960 | But the truth is that your vector of attack
00:22:54.080 | is facilitated by you being open source.
00:22:56.760 | But at the same time, it's a good thing
00:22:58.160 | because if you're doing anything like bug bounties
00:23:00.520 | or stuff like that,
00:23:01.840 | you just give many more tools to the bug bounty hunters
00:23:05.200 | so that their output is much better.
00:23:07.280 | So there's many, many, many trade-offs.
00:23:09.360 | I don't believe in the value of the code base per se.
00:23:12.560 | - Wow.
00:23:13.400 | - I think it's really the people that are on the code base
00:23:15.120 | that have the value and the go-to-market and the product
00:23:18.040 | and all of those things that are around the code base.
00:23:20.960 | Obviously, that's not true for every code base.
00:23:23.120 | If you're working on a very secret kernel
00:23:25.280 | to accelerate the inference of LLMs,
00:23:28.560 | I would buy that you don't want to be open source.
00:23:31.240 | But for product stuff,
00:23:32.520 | I really think there's very little risk.
00:23:35.240 | - Yeah.
00:23:36.080 | - I signed up for XP1, I was looking, January, 2023.
00:23:39.720 | I think at the time you were on DaVinci 003.
00:23:42.720 | Given that you had seen GPT-4,
00:23:44.760 | how did you feel having to push a product out
00:23:47.080 | that was using this model that was so inferior?
00:23:49.520 | And you're like, "Please, just use it today.
00:23:51.120 | "I promise it's going to get better."
00:23:52.360 | It's like just overall as a founder,
00:23:54.360 | how do you build something
00:23:55.200 | that maybe doesn't quite work with the model today,
00:23:57.080 | but you're just expecting the new model to be better?
00:23:59.360 | - Yeah, so actually, XP1 was even on a smaller one
00:24:02.920 | that was the post-ChatGPT-release small version,
00:24:05.880 | so it was--
00:24:06.720 | - Ada, Babbage.
00:24:07.560 | - No, no, no, not that far away,
00:24:08.880 | but it was the small version of ChatGPT, basically.
00:24:12.240 | I don't remember its name.
00:24:13.600 | Yes, you have a frustration there,
00:24:15.440 | but at the same time, I think XP1 was designed,
00:24:18.080 | was an experiment, but was designed as a way to be useful
00:24:20.520 | at the current capability of the model.
00:24:22.560 | If you just want to extract data from a LinkedIn page,
00:24:25.240 | that model was just fine.
00:24:26.840 | If you want to summarize an article on a newspaper,
00:24:29.480 | that model was just fine.
00:24:31.000 | And so it was really a question of trying to find a product
00:24:33.720 | that works with the current capability,
00:24:36.000 | knowing that you will always have tailwinds
00:24:39.080 | as models get better and faster and cheaper.
00:24:41.240 | So that was kind of a, there's a bit of a frustration
00:24:43.760 | because you know what's out there
00:24:44.880 | and you know that you don't have access to it yet,
00:24:46.520 | but it's also interesting to try to find a product
00:24:49.480 | that works with the current capability.
00:24:51.360 | - And we highlighted XP1 in our Anatomy of Autonomy post
00:24:54.840 | in April of last year, which was, you know,
00:24:57.400 | where are all the agents, right?
00:24:59.160 | So now we spent 30 minutes getting
00:25:01.120 | to what you're building now.
00:25:02.960 | So you basically had a developer framework,
00:25:06.160 | then you had a browser extension,
00:25:07.840 | then you had all these things,
00:25:08.800 | and then you kind of got to where Dust is today.
00:25:10.960 | So maybe just give people an overview
00:25:12.640 | of what Dust is today and the core thesis behind it.
00:25:16.080 | - Yeah, of course.
00:25:16.920 | So Dust, we really want to build the infrastructure
00:25:19.280 | so that companies can deploy agents within their teams.
00:25:23.280 | We are horizontal by nature
00:25:25.680 | because we strongly believe in the emergence of use cases
00:25:28.280 | from the people having access to creating an agent
00:25:30.760 | that don't need to be developers.
00:25:32.600 | They have to be tinkerers, they have to be curious,
00:25:35.120 | but they can, like anybody can create an agent
00:25:37.440 | that will solve an operational thing
00:25:39.960 | that they're doing in their day-to-day job.
00:25:42.000 | And to make those agents useful,
00:25:44.360 | there's two focus, which is interesting.
00:25:46.760 | The first one is an infrastructure focus.
00:25:48.520 | You have to build the pipes
00:25:49.840 | so that the agent has access to the data,
00:25:51.800 | you have to build the pipes such that the agents
00:25:53.560 | can take action, can access the web, et cetera.
00:25:55.800 | So that's really an infrastructure play.
00:25:58.040 | Maintaining connections to Notion, Slack, GitHub,
00:26:01.240 | all of them is a lot of work.
00:26:04.280 | It is boring work, boring infrastructure work,
00:26:06.840 | but that's something that we know is extremely valuable
00:26:09.440 | in the same way that Stripe is extremely valuable
00:26:11.280 | because it maintains the pipes.
00:26:13.120 | And we have that dual focus
00:26:15.080 | because we're also building the product
00:26:17.240 | for people to use it.
00:26:18.640 | And there it's fascinating because everything started
00:26:21.760 | from the conversational interface, obviously,
00:26:24.240 | which is a great starting point,
00:26:26.160 | but we're only scratching the surface, right?
00:26:29.280 | I think we are at the Pong level of LLM productization,
00:26:33.960 | and we haven't invented the C3,
00:26:35.960 | we haven't invented Counter-Strike,
00:26:37.520 | we haven't invented Cyberpunk 2077.
00:26:41.640 | So this is really, our mission is to really create
00:26:46.080 | the product that let people equip themselves
00:26:48.520 | to just get away all the work that can be automated
00:26:52.120 | or assisted by LLMs.
00:26:54.080 | - And can you just comment on different takes
00:26:56.480 | that people had?
00:26:57.320 | So maybe at the most open, it's like auto-GPT.
00:27:01.200 | It's just kind of like, just try and do anything.
00:27:03.000 | It's like, it's all magic.
00:27:04.240 | There's no way for you to do anything.
00:27:06.000 | Then you had Adept.
00:27:08.160 | We had David on the podcast.
00:27:09.440 | They're very super hands-on with each individual customer
00:27:12.280 | to build super tailored.
00:27:14.160 | How do you decide where to draw the line
00:27:15.880 | between this is magic, this is exposed to you,
00:27:18.120 | especially in a market where most people don't know
00:27:20.520 | how to build with AI at all.
00:27:21.880 | So if you expect them to do the thing,
00:27:23.560 | they're probably not going to do it.
00:27:24.840 | - Yeah, exactly.
00:27:25.680 | So the auto-GPT approach obviously is extremely exciting,
00:27:28.920 | but we know that the agentic capability of models
00:27:32.200 | are not quite there yet.
00:27:33.520 | It just gets lost.
00:27:35.680 | So we're starting where it works.
00:27:37.760 | Same with XP1, and where it works is pretty simple.
00:27:40.440 | It's like simple workflows that involve a couple of tools
00:27:45.440 | where you don't even need to have the model
00:27:48.320 | decide which tools it's using, in the sense that
00:27:51.120 | you just want people to put it in the instructions.
00:27:53.920 | It's like, take that page, do that search,
00:27:57.200 | pick up that document, do the work that I want
00:27:59.800 | in the format I want, and give me the results.
00:28:01.760 | There's no smartness there, right?
00:28:03.640 | In terms of orchestrating the tools,
00:28:06.080 | it's mostly using English for people to program a workflow
00:28:10.360 | where you don't have the constraint
00:28:11.600 | of having compatible API between the two.
00:28:14.000 | - That kind of personal automation,
00:28:16.000 | would you say it's kind of like a LLM Zapier type of thing?
00:28:18.800 | Like if this, then that, and then, you know,
00:28:20.880 | do this, then this.
00:28:21.720 | It's still very, you're programming with English?
00:28:24.520 | - So you're programming with English.
00:28:25.760 | So you're just saying, oh, do this, and then that.
00:28:28.720 | You can even create some form of APIs.
00:28:31.320 | You say, when I give you the command X, do this.
00:28:34.480 | When I give you the command Y, do this,
00:28:36.360 | and you describe the workflow.
00:28:37.920 | But you don't have to create boxes
00:28:39.800 | and create the workflow explicitly.
00:28:41.720 | You just need to describe what the tasks are
00:28:43.720 | supposed to be and make the tools available to the agent.
00:28:47.360 | Tool can be a semantic search.
00:28:49.200 | The tool can be querying into a structured database.
00:28:52.080 | The tool can be searching on the web.
00:28:54.640 | And obviously, the interesting tools
00:28:57.040 | that we're only starting to scratch
00:28:58.800 | are actually creating external actions,
00:29:00.880 | like reimbursing something on Stripe,
00:29:03.400 | sending an email, clicking on a button in the admin,
00:29:06.680 | or something like that.
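A rough sketch of what "describe the task in English and make the tools available" can look like in practice, using the OpenAI-style function-calling format. The tool names, schemas, and the instructions below are invented for illustration; this is not Dust's actual API.

```python
from openai import OpenAI  # assumes OPENAI_API_KEY is set in the environment

client = OpenAI()

# The "program" is plain English written by the agent builder, not code.
instructions = (
    "When asked for the weekly incident table: search the #incidents Slack "
    "channel over the last 7 days, then return a Markdown table with the "
    "columns Date, Title, Severity."
)

# Tools are only described; the model decides when to call them and with what
# arguments. The names and schemas here are hypothetical.
tools = [
    {
        "type": "function",
        "function": {
            "name": "semantic_search",
            "description": "Search connected data sources (Slack, Notion, Drive...).",
            "parameters": {
                "type": "object",
                "properties": {
                    "query": {"type": "string"},
                    "channel": {"type": "string"},
                    "days_back": {"type": "integer"},
                },
                "required": ["query"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "run_table_query",
            "description": "Query a structured, tabular data source with SQL.",
            "parameters": {
                "type": "object",
                "properties": {"sql": {"type": "string"}},
                "required": ["sql"],
            },
        },
    },
]

response = client.chat.completions.create(
    model="gpt-4o",  # any tool-calling model works here
    messages=[
        {"role": "system", "content": instructions},
        {"role": "user", "content": "Build this week's incident table."},
    ],
    tools=tools,
)

# With precise instructions, the model reliably emits one sensible tool call,
# e.g. semantic_search(query="incident", channel="#incidents", days_back=7).
print(response.choices[0].message.tool_calls)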
00:29:07.560 | - Do you maintain all these integrations?
00:29:09.320 | - Today, we maintain most of the integrations.
00:29:11.720 | We do always have an escape hatch
00:29:13.600 | for people to kind of custom integrate.
00:29:17.000 | But the reality is that, the reality of the market today
00:29:19.560 | is that people just want it to work, right?
00:29:22.280 | And so it's mostly us maintaining the integration.
00:29:25.400 | As an example, a very good source of information
00:29:27.480 | that is tricky to productize is Salesforce,
00:29:30.880 | because Salesforce is basically a database and a UI,
00:29:33.480 | and they do the fuck they want with it.
00:29:35.320 | (all laughing)
00:29:37.120 | And so every company has different models
00:29:39.160 | and stuff like that.
00:29:40.000 | So right now, we don't support it natively.
00:29:43.200 | And the type of support, or real native support,
00:29:46.040 | will be slightly more complex than just OAuthing into it,
00:29:48.720 | like is the case with Slack, as an example,
00:29:50.880 | because it's probably gonna be,
00:29:52.520 | oh, you want to connect your Salesforce to us?
00:29:54.440 | Give us the SOQL, that's the Salesforce query language.
00:29:58.600 | Give us the queries you want us to run on it,
00:30:00.840 | and inject in the context of Dust.
00:30:03.040 | So it's interesting how not only are integrations cool,
00:30:06.200 | but some of them require a bit of work on the user's side,
00:30:08.480 | and for some that are really valuable to our users
00:30:10.480 | but that we don't support yet,
00:30:11.400 | they can just build them internally
00:30:13.160 | and push the data to us.
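For concreteness, the customer-side script he describes might look roughly like this: a SOQL query run with a common third-party Salesforce client, with the rows then pushed into Dust by whatever connector the workspace uses. The query, credentials, and push step are placeholders, not a documented Dust integration.

```python
from simple_salesforce import Salesforce  # common third-party Salesforce client

sf = Salesforce(
    username="ops@example.com",   # placeholder credentials
    password="...",
    security_token="...",
)

# SOQL: Salesforce's SQL-like query language over its object model.
soql = """
    SELECT Name, StageName, Amount, CloseDate
    FROM Opportunity
    WHERE CloseDate = THIS_QUARTER
"""

rows = sf.query(soql)["records"]
for row in rows:
    print(row["Name"], row["StageName"], row["Amount"])
    # ...then push these rows into Dust as documents (mechanism elided here).
```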
00:30:14.600 | - I think I understand the Salesforce thing,
00:30:16.440 | but let me just clarify it.
00:30:17.440 | Are you using browser automation,
00:30:19.200 | because there's no API for something?
00:30:20.680 | - No, no, no, no, no.
00:30:21.520 | In that case, so we do have browser automation
00:30:24.240 | for all the use cases and apply the public web,
00:30:27.240 | but for most of the integration
00:30:28.480 | with the internal system of the company,
00:30:29.880 | it really runs through API.
00:30:31.520 | - Haven't you felt the pull to RPA,
00:30:33.880 | browser automation, that kind of stuff?
00:30:35.560 | - I mean, what I've been saying for a long time,
00:30:37.280 | maybe I'm wrong,
00:30:38.120 | is that if the future is
00:30:40.040 | that you're gonna stand in front of your computer
00:30:41.640 | and looking at an agent clicking on stuff,
00:30:44.160 | then I'll eat my computer.
00:30:45.880 | And my computer is a big Lenovo, it's black,
00:30:48.320 | doesn't sound good at all compared to a Mac.
00:30:50.760 | And if the APIs are there, we should use them.
00:30:53.480 | There's always gonna be a long tail of stuff
00:30:56.000 | that don't have APIs,
00:30:56.920 | but as the world is moving forward,
00:30:59.000 | that's disappearing.
00:31:00.480 | So the core RPA value in the past
00:31:04.080 | has really been, oh, this old '90s product
00:31:07.160 | doesn't have an API,
00:31:08.000 | so I need to use the UI to automate.
00:31:09.840 | I think for most of the ICP companies,
00:31:12.640 | the companies that are ICP for us,
00:31:14.000 | the scale-ups that are between 500 and 5,000 people,
00:31:17.040 | tech companies, most of the SaaS they use have APIs.
00:31:21.080 | Now it's an interesting question for the open web,
00:31:24.240 | because there's stuff that you wanna do
00:31:26.280 | that involves websites that don't necessarily have APIs.
00:31:29.240 | And the current state of web integration from us,
00:31:32.720 | and from OpenAI and Anthropic,
00:31:35.360 | I don't even know if they have web navigation,
00:31:37.040 | but I don't think so.
00:31:38.040 | The current state of affairs is really, really broken,
00:31:40.480 | because you have what?
00:31:41.320 | You have basically search and headless browsing.
00:31:44.000 | But headless browsing, I think everybody's doing
00:31:46.840 | basically body.innerText and feeding that into the model.
00:31:51.840 | Right?
00:31:52.960 | - There's parsers into Markdown and stuff.
00:31:54.720 | - We're super excited by the companies
00:31:56.200 | that are exploring the capability of rendering a webpage
00:31:59.360 | into a way that is compatible for a model,
00:32:01.280 | being able to maintain the selector,
00:32:03.200 | so that's basically the place where to click in the page
00:32:06.200 | through that process, expose the actions to the model,
00:32:09.200 | have the model select an action
00:32:10.680 | in a way that is compatible with model,
00:32:12.760 | which is not a big page of a full DOM that is very noisy,
00:32:16.760 | and then being able to decompress that
00:32:19.320 | back to the original page and take the action.
00:32:22.080 | And that's something that is really exciting
00:32:24.000 | and that will kind of change the level of things
00:32:27.160 | that agents can do on the web.
00:32:29.720 | That I feel exciting, but I also feel that the bulk
00:32:33.120 | of the useful stuff that you can do within the company
00:32:36.000 | can be done through API, the data can be retrieved by API,
00:32:38.800 | the actions can be taken through API.
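The "body.innerText" baseline he mentions is roughly the following; a minimal sketch with Playwright, where the URL is just a placeholder:

```python
from playwright.sync_api import sync_playwright

def page_text(url: str) -> str:
    """Fetch a page headlessly and return its visible text, the naive way."""
    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        page = browser.new_page()
        page.goto(url, wait_until="networkidle")
        text = page.evaluate("() => document.body.innerText")
        browser.close()
    return text

# The result is a flat wall of text: fine for summarizing, but it loses the
# selectors and structure an agent would need to actually click or fill anything.
print(page_text("https://example.com")[:500])
```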
00:32:40.640 | - For listeners, I'll note that you're basically
00:32:42.480 | completely disagreeing with David Luan.
00:32:44.520 | - Exactly, exactly, I've seen it since summer.
00:32:47.200 | - Adept is where it is, and Dust is where it is,
00:32:49.800 | so Dust is still standing.
00:32:51.640 | - Can we just quickly comment on function calling?
00:32:53.920 | - Yeah.
00:32:54.760 | - You mentioned you don't need the models to be that smart
00:32:56.120 | to actually pick the tools.
00:32:57.480 | Have you seen the models not be good enough,
00:32:59.240 | or is it just like, you just don't wanna put
00:33:01.120 | the complexity in there?
00:33:02.600 | Is there any room for improvement left in function calling,
00:33:05.760 | or do you feel you usually consistently get always
00:33:08.120 | the right response, the right parameters, and all that?
00:33:10.240 | - So that's a tricky product question,
00:33:12.200 | because if the instructions are good and precise,
00:33:14.800 | then you don't have any issue,
00:33:15.800 | because it's scripted for you,
00:33:16.960 | and the model just looks at the script and follows it
00:33:19.160 | and says, oh, he's probably talking about that action,
00:33:21.360 | and I'm gonna use it, and the parameters are kind of
00:33:23.360 | deduced from the state of the conversation,
00:33:24.880 | I'll just go with it.
00:33:26.200 | If you provide a very high level,
00:33:28.520 | kind of a auto-GPT-esque level in the instructions,
00:33:31.080 | and provide 16 different tools to your model,
00:33:33.520 | yes, we're seeing the models in that state making mistakes.
00:33:37.080 | And there is obviously some progress that can be made
00:33:41.680 | on the capabilities, but the interesting part
00:33:44.360 | is that there is already so much work
00:33:46.880 | that can be assisted, augmented, accelerated
00:33:49.680 | by just going with pretty simple
00:33:52.240 | scripted-action agents.
00:33:54.600 | What I'm excited about, by starting with
00:33:56.720 | pushing our users to create rather simple agents,
00:33:59.880 | is that once you have those working really well,
00:34:03.040 | you can create meta-agents that use the agents as actions,
00:34:06.040 | and all of a sudden, you can kind of have a hierarchy
00:34:08.640 | of responsibility that will probably get you almost
00:34:12.080 | to the point of the auto-GPT value.
00:34:14.200 | It requires the construction of intermediary artifacts,
00:34:17.280 | but you're probably gonna be able to achieve
00:34:19.720 | something great.
00:34:20.600 | I'll give you some example.
00:34:21.680 | We have, our incidents are shared in Slack,
00:34:24.400 | in a specific channel, and our ships are shared in Slack too.
00:34:27.280 | We have a weekly meeting where we have a table
00:34:29.440 | about incidents and shipped stuff.
00:34:32.040 | We're not writing that weekly meeting table anymore.
00:34:34.240 | We have an assistant that just goes and finds the right data
00:34:36.320 | on Slack and creates the table for us.
00:34:38.680 | And that assistant works perfectly.
00:34:40.800 | It's trivially simple, right?
00:34:42.720 | Take one week of data from that channel
00:34:44.760 | and just create the table.
00:34:46.480 | And then we have, in that weekly meeting,
00:34:48.760 | some, obviously, some graphs and reporting
00:34:52.040 | about our financials and our progress and our ARR,
00:34:55.240 | and we've created assistants to generate
00:34:57.320 | those graphs directly, and those assistants work great.
00:35:00.720 | By creating those assistants that cover those small parts
00:35:02.840 | of that weekly meeting, slowly, we're getting to
00:35:05.200 | a world where we'll have a weekly meeting assistant,
00:35:07.880 | we'll just call it, you don't need to prompt it,
00:35:09.640 | you don't need to say anything.
00:35:10.680 | It's gonna run those different assistants
00:35:12.440 | and get that Notion page just ready.
00:35:14.440 | And by doing that, if you get there,
00:35:16.760 | and that's an objective for us, to us using Dust, get there,
00:35:20.280 | you're saving, I don't know, an hour of company time
00:35:23.520 | every time you run it.
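A rough sketch of the composition pattern described here: a "meta-agent" whose tools are simply other, already-working agents. The agent names and the run_agent helper are hypothetical stand-ins, not Dust's API.

```python
def run_agent(name: str, prompt: str) -> str:
    """Stand-in for 'invoke a simple, already-working agent and get its text back'."""
    # In reality this would call whatever agent platform you use; stubbed here.
    return f"[{name} output for: {prompt}]"

def weekly_meeting_page() -> str:
    """Meta-agent: no clever reasoning of its own, it just sequences sub-agents."""
    sections = {
        "Incidents & ships": run_agent(
            "incident-table", "Table of last week's incidents and ships from Slack."
        ),
        "Financials": run_agent(
            "finance-graphs", "Latest ARR and progress charts."
        ),
    }
    return "\n\n".join(f"## {title}\n{body}" for title, body in sections.items())

# One call drafts the whole weekly-meeting page that used to be assembled by hand.
print(weekly_meeting_page())
```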
00:35:24.400 | - Yeah, that's my pet topic of NPM for agents.
00:35:27.480 | It's like, how do you build dependency graphs of agents
00:35:30.120 | and how do you share them?
00:35:31.560 | Because why do I have to rebuild some of the smaller levels
00:35:35.040 | of what you built already?
00:35:36.280 | - I have a quick follow-up question on agents
00:35:38.800 | managing other agents.
00:35:39.920 | It's a topic of a lot of research,
00:35:42.640 | both from like Microsoft and even in startups.
00:35:45.520 | What have you discovered as best practice for,
00:35:48.280 | let's say, a manager agent
00:35:49.760 | controlling a bunch of small agents?
00:35:51.840 | Is it two-way communication?
00:35:53.440 | I don't know, should there be a protocol format?
00:35:56.000 | - To be completely honest, the state we are at right now
00:35:58.960 | is creating the simple agents.
00:36:00.280 | So we haven't even explored yet the meta agents.
00:36:02.440 | We know it's there.
00:36:03.280 | We know it's gonna be valuable.
00:36:04.520 | We know it's gonna be awesome.
00:36:05.760 | But we're starting there
00:36:07.040 | because it's the simplest place to start.
00:36:09.200 | And it's also what the market understands.
00:36:11.600 | If you go to a company, random SaaS B2B company,
00:36:15.920 | not necessarily specialized in AI,
00:36:17.560 | and you take an operational team
00:36:19.880 | and you tell them build some tooling for yourself,
00:36:22.000 | they'll understand the small agents.
00:36:23.640 | If you tell them build AutoGPT, they'll go, "Auto what?"
00:36:27.640 | - And I noticed that in your language,
00:36:29.040 | you're very much focused on non-technical users.
00:36:31.160 | You don't really mention API here.
00:36:33.120 | You mention instruction instead of system prompt, right?
00:36:36.560 | That's very conscious.
00:36:37.760 | - Yeah, it's very conscious.
00:36:38.800 | It's a mark of our designer, Ed,
00:36:40.680 | who kind of pushed us to create a friendly product.
00:36:45.320 | I was knee-deep into AI when I started, obviously.
00:36:48.600 | And my co-founder, Gabriel, was at Stripe as well.
00:36:51.880 | We started a company, Glaser,
00:36:52.920 | that got acquired by Stripe 15 years ago.
00:36:55.040 | He was at Alan, a healthcare company in Paris, after that.
00:36:58.960 | He was a little bit less knee-deep in AI,
00:37:01.560 | but really focused on product.
00:37:03.360 | And I didn't realize how important it is
00:37:05.920 | to make that technology not scary to end users.
00:37:10.240 | It didn't feel scary to me,
00:37:11.720 | but it was really seen by Ed, our designer,
00:37:15.000 | that it was feeling scary to the users.
00:37:17.880 | And so we were very proactive and very deliberate
00:37:20.440 | about creating a brand that feels not too scary
00:37:23.000 | and creating a wording and a language, as you say,
00:37:25.680 | that really tried to communicate the fact
00:37:28.400 | that it's gonna be fine, it's gonna be easy,
00:37:30.200 | you're gonna make it.
00:37:31.200 | - And another big point that David had about ADAPT
00:37:33.960 | is we need to build an environment
00:37:35.920 | for the agents to act.
00:37:37.240 | And then if you have the environment,
00:37:38.200 | you can simulate what they do.
00:37:40.120 | How's that different when you're interacting with APIs
00:37:42.880 | and you're kind of touching systems
00:37:44.680 | that you cannot really simulate?
00:37:45.880 | Like, if you call it the Salesforce API,
00:37:47.400 | you're just calling it.
00:37:48.760 | - Yep, so I think that goes back to the DNA of the companies
00:37:52.120 | that are very different.
00:37:53.240 | ADAPT, I think, was a product company
00:37:55.520 | with a very strong research DNA,
00:37:56.920 | and they were still doing research.
00:37:58.040 | One of their goal was building a model,
00:38:00.080 | and that's why they raised a large amount of money,
00:38:01.760 | et cetera.
00:38:02.760 | We are 100% deliberately product company.
00:38:06.640 | We don't do research, we don't train models,
00:38:09.040 | we don't even run GPUs.
00:38:10.640 | We're using the models that exist,
00:38:11.960 | and we try to push the product boundary
00:38:13.920 | as far as possible with the existing models.
00:38:16.360 | So that creates an issue.
00:38:17.720 | Indeed, so to answer your question,
00:38:19.360 | when you're interacting with the real world,
00:38:20.840 | where you cannot simulate,
00:38:22.480 | you cannot improve the models,
00:38:24.400 | and even improving your instructions
00:38:27.480 | is complicated for a builder.
00:38:29.520 | The hope is that you can use models
00:38:31.240 | to evaluate the conversations
00:38:33.080 | so that you can get at least feedback
00:38:34.680 | and you can get quantitative information
00:38:36.520 | about the performance of your assistants.
00:38:38.320 | But if you take an actual trace of interactions
00:38:41.520 | of humans with those agents,
00:38:43.440 | it is extremely hard, even for us humans, to decide
00:38:45.960 | whether it was a productive interaction
00:38:47.400 | or a really bad interaction.
00:38:49.040 | You don't know why the person left,
00:38:50.480 | you don't know if they left happy or not.
00:38:52.600 | So being extremely, extremely, extremely pragmatic here,
00:38:56.080 | it becomes a product issue.
00:38:57.440 | We have to build a product that gets the end users
00:39:00.720 | to provide feedback so that, as a first step,
00:39:04.240 | the person that is building the agent can iterate on it.
00:39:06.880 | As a second step, maybe later when we start training models
00:39:09.440 | and post-training, et cetera,
00:39:10.920 | we can optimize around that for each of those companies.
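To make the model-based evaluation idea concrete, here is a minimal sketch of a conversation-level judge, assuming the official `openai` Node SDK; the rubric, the model choice, and the `scoreConversation` name are illustrative, not Dust's actual evaluation code.

```typescript
// Sketch: score a conversation trace with a model acting as a judge.
// The prompt and the 1-5 rubric are illustrative assumptions.
import OpenAI from "openai";

const client = new OpenAI();

type Turn = { role: "user" | "assistant"; content: string };

async function scoreConversation(turns: Turn[]): Promise<number> {
  const transcript = turns.map((t) => `${t.role}: ${t.content}`).join("\n");
  const response = await client.chat.completions.create({
    model: "gpt-4o", // any capable model works here
    messages: [
      {
        role: "system",
        content:
          "Rate how well the assistant helped the user on a 1-5 scale. " +
          "Reply with a single integer only.",
      },
      { role: "user", content: transcript },
    ],
  });
  const raw = response.choices[0].message.content ?? "";
  return parseInt(raw.trim(), 10);
}
```

Such a score is only a rough signal, which is the tension described above: even humans struggle to label these traces, so a judge like this feeds builder iteration rather than replacing it.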
00:39:13.440 | - Yeah.
00:39:14.280 | Do you see in the future products offering
00:39:16.200 | kind of like a simulation environment,
00:39:18.520 | the same way all SaaS now kind of offers APIs
00:39:21.560 | to build programmatically?
00:39:22.680 | Like in cybersecurity,
00:39:24.200 | there are a lot of companies working
00:39:25.880 | on building simulative environments
00:39:27.440 | so that then you can use agents like Red Team,
00:39:29.560 | but I haven't really seen that.
00:39:31.040 | - Yeah, no, me neither.
00:39:32.480 | That's a super interesting question.
00:39:34.760 | I think it's really going to depend,
00:39:37.280 | because you need to simulate to generate data,
00:39:39.520 | and you need data to train models.
00:39:41.280 | And the question at the end is,
00:39:43.360 | are we going to be training models
00:39:44.880 | or are we just going to be using frontier models as they are?
00:39:48.880 | On that question, I don't have a strong opinion.
00:39:51.600 | It might be the case that we'll be training models
00:39:53.760 | because in all of those AI first products,
00:39:56.400 | the model is so close to the product surface
00:39:59.360 | that as you get big and you want to really own your product,
00:40:02.880 | you're going to have to own the model as well.
00:40:05.680 | Owning the model doesn't mean doing the pre-training,
00:40:08.280 | that would be crazy,
00:40:09.440 | but at least having an internal post-training
00:40:12.120 | realignment loop, it makes a lot of sense.
00:40:14.760 | And so if we see many companies
00:40:16.080 | going towards that over time,
00:40:18.440 | then there might be incentives for the SaaSes of the world
00:40:23.240 | to provide assistance in getting there.
00:40:25.600 | But at the same time, there's a tension
00:40:26.600 | because those SaaSes,
00:40:27.720 | they don't want to be interacted with by agents.
00:40:31.240 | They want the human to click on the button.
00:40:33.440 | - So that's an incentive. - Yeah, they got to sell seats.
00:40:35.320 | - Yeah, exactly. - Exactly.
00:40:37.560 | - Just a quick question on models.
00:40:39.680 | I'm sure you've used many, probably not just OpenAI.
00:40:42.320 | Would you characterize some models as better than others?
00:40:45.680 | Do you use any open source models?
00:40:47.600 | What have been the trends in models over the last two years?
00:40:50.080 | - We've seen over the past two years
00:40:51.520 | kind of a bit of a race between models.
00:40:54.320 | And at times it's the OpenAI model that is the best,
00:40:58.560 | at times it's the Anthropic model that is the best.
00:41:01.400 | Our take on that is that we are agnostic
00:41:03.000 | and we let our users pick their model.
00:41:05.600 | - Oh, they choose?
00:41:06.440 | - Yeah, so when you create an assistant or an agent,
00:41:08.400 | you can just say, "Oh, I'm going to run it on GPT-4,
00:41:11.720 | "GPT-4 Turbo," or...
00:41:13.040 | - Don't you think for the non-technical user,
00:41:14.600 | that is actually an abstraction
00:41:15.680 | that you should take away from them?
00:41:16.840 | - We have a sane default.
00:41:18.320 | So we move the default to the latest model that is cool,
00:41:22.200 | and we have a sane default,
00:41:23.240 | and it's actually not very visible.
00:41:24.760 | In our flow to create an agent,
00:41:26.600 | you would have to go into the advanced settings and pick your model.
00:41:29.440 | So this is something
00:41:30.280 | that the technical person will care about,
00:41:33.360 | but that's something that obviously
00:41:34.840 | is a bit too complicated for the...
00:41:37.240 | - And do you care most about function calling
00:41:38.840 | or instruction following or something else?
00:41:40.880 | - I think we care most for function calling
00:41:43.160 | because there's nothing worse than a function call
00:41:47.280 | with incorrect parameters or being a bit off,
00:41:49.800 | because it just drives the whole interaction off.
00:41:53.480 | - So you go to the Berkeley function calling leaderboard.
00:41:55.640 | - Yeah, these days, it's funny how the comparison
00:41:59.480 | between GPT-4o and GPT-4 Turbo
00:42:01.400 | is still up in the air on function calling.
00:42:03.800 | I personally don't have proof, but I know many people,
00:42:05.640 | and I'm probably one of them,
00:42:07.040 | who think that GPT-4 Turbo is still better
00:42:09.200 | than GPT-4o on function calling.
00:42:10.720 | - Wow.
00:42:11.560 | - We'll see what comes out of the O1 class
00:42:14.920 | if it ever gets function calling.
00:42:17.360 | And Claude 3.5 Sonnet is great as well.
00:42:19.800 | They kind of innovated in an interesting way,
00:42:21.600 | which was never quite publicized,
00:42:23.200 | but it's that they have that kind of chain of thought step
00:42:26.560 | whenever you use a Claude model or Sonnet model
00:42:29.440 | with function calling.
00:42:30.640 | That chain of thought step doesn't exist
00:42:31.880 | when you just interact with it just for answering questions,
00:42:35.040 | but when you use function calling, you get that step,
00:42:36.720 | and it really helps getting better function calling.
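For reference, here is a minimal sketch of tool use (function calling) with the Anthropic TypeScript SDK, where the response can interleave a reasoning text block with the `tool_use` block described above; the `search_notion` tool is invented for illustration and is not a real Dust tool.

```typescript
// Sketch: a single tool definition passed to Claude; the model may emit a
// short reasoning step and then a `tool_use` block with parsed arguments.
import Anthropic from "@anthropic-ai/sdk";

const anthropic = new Anthropic();

async function askWithTool(question: string) {
  const message = await anthropic.messages.create({
    model: "claude-3-5-sonnet-20240620",
    max_tokens: 1024,
    tools: [
      {
        name: "search_notion", // illustrative tool, not a real Dust tool
        description: "Search the company Notion workspace for relevant pages.",
        input_schema: {
          type: "object",
          properties: { query: { type: "string" } },
          required: ["query"],
        },
      },
    ],
    messages: [{ role: "user", content: question }],
  });
  // Content blocks may mix text (the model's reasoning) and tool_use blocks.
  return message.content.filter((block) => block.type === "tool_use");
}
```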
00:42:39.440 | - Yeah, we actually just recorded a podcast
00:42:41.520 | with the Berkeley team that runs that leaderboard this week.
00:42:44.160 | So they just released V3.
00:42:45.960 | It was V1 like two months ago, and then V2, V3.
00:42:48.400 | Turbo is on top.
00:42:49.400 | - Turbo is on top.
00:42:50.240 | - Turbo is over 4o.
00:42:51.240 | And then the third place is XLAM from Salesforce,
00:42:54.360 | which is a large action model
00:42:55.960 | that they've been trying to popularize.
00:42:58.000 | O1 Mini is actually on here, I think.
00:43:00.240 | O1 Mini is number 11.
00:43:01.920 | - But arguably O1 Mini has been in a line for that.
00:43:05.360 | - Do you use leaderboards?
00:43:06.520 | Do you have your own evals?
00:43:07.880 | I mean, this is kind of intuitive, right?
00:43:09.720 | Like using the older model is better.
00:43:11.440 | I think most people just upgrade.
00:43:13.280 | Yeah.
00:43:14.280 | What's the eval process like?
00:43:15.560 | - It's funny because I've been doing research for three years
00:43:18.080 | and we have bigger stuff to cook.
00:43:21.080 | - Yeah.
00:43:21.920 | - When you're deploying in a company,
00:43:23.400 | one thing where we really spike
00:43:25.080 | is that when we manage to activate the company,
00:43:26.640 | we have a crazy penetration.
00:43:27.960 | The highest penetration we have is 88% daily active users
00:43:32.480 | across the entire employee base of the company.
00:43:34.880 | The kind of average penetration and activation
00:43:37.760 | we have in our current enterprise customers
00:43:39.960 | is more like 60 to 70% weekly active.
00:43:43.760 | So we basically have the entire company interacting with us.
00:43:47.000 | And when you're there,
00:43:48.640 | there is so much stuff that matters more
00:43:51.120 | than getting evals or getting the best model,
00:43:54.400 | because there are so many places where you can create products
00:43:57.840 | or do stuff that will give you 80% of the value with the work you do,
00:44:02.560 | whereas deciding if it's GPT-4 or GPT-4 Turbo, et cetera,
00:44:07.160 | you know, will just give you the 5% improvement.
00:44:10.880 | - Yeah, yeah, yeah.
00:44:11.720 | - But the reality is that you want to focus on the places
00:44:13.200 | where you can really change the direction
00:44:14.680 | or change the interaction more drastically.
00:44:17.680 | But that's something that we'll have to do eventually
00:44:19.400 | because we still want to be serious people.
00:44:20.840 | - It's funny 'cause in some ways the model labs
00:44:23.440 | are competing for you, right?
00:44:25.600 | You don't have to make any effort.
00:44:26.840 | You just switch model and then it'll grow.
00:44:29.280 | What are you really limited by?
00:44:30.760 | Is it additional sources?
00:44:32.720 | It's not models, right?
00:44:33.880 | You're not really limited by quality of model.
00:44:36.280 | - Right now we are limited by, yes, the infrastructure part,
00:44:41.520 | which is the ability for users to connect easily
00:44:45.000 | to all the data they need to do the job they want to do.
00:44:48.000 | - Because you maintain all your own stuff.
00:44:49.440 | You know, there are companies out there
00:44:50.760 | that are starting to provide integrations as a service, right?
00:44:53.760 | I used to work in an integrations company.
00:44:54.960 | - Yeah, yeah, I know, I know, I know.
00:44:55.960 | It's just that there are some intricacies
00:44:57.840 | about how you chunk stuff and how you process information
00:45:01.160 | from one platform to the other.
00:45:02.920 | If you look at the end of the spectrum,
00:45:04.680 | you could think of, you could say,
00:45:06.040 | "Oh, I'm going to support Airbyte,"
00:45:07.480 | and Airbyte kind of has--
00:45:08.520 | - I used to work at Airbyte, yeah.
00:45:09.360 | - Oh, really?
00:45:10.600 | - They were French founders as well.
00:45:11.440 | - French, yeah, I was going to say.
00:45:12.280 | - I know Jean very well.
00:45:14.120 | I'm seeing him today.
00:45:15.040 | And the reality is that if you look at Notion,
00:45:18.400 | Airbyte does the job of taking Notion
00:45:20.520 | and putting it in a structured way,
00:45:22.360 | but that's a way that is not really usable
00:45:24.320 | to actually make it available to models in a useful way.
00:45:28.280 | Because you get all the blocks, details, et cetera,
00:45:30.520 | which is useful for many use cases.
00:45:32.160 | - Because also it's for data scientists and not for AI.
00:45:34.920 | - The reality of Notion is that sometimes you have a,
00:45:37.920 | so when you have a page, there's a lot of structure in it,
00:45:40.180 | and you want to capture the structure
00:45:42.640 | and chunk the information
00:45:43.840 | in a way that respects that structure.
00:45:45.640 | In Notion, you have databases.
00:45:47.120 | Sometimes those databases are real tabular data.
00:45:49.680 | Sometimes those databases are full of text.
00:45:52.120 | You want to get the distinction
00:45:54.480 | and understand that this database
00:45:55.880 | should be considered like text information,
00:45:58.840 | whereas this other one
00:45:59.680 | is actually quantitative information.
00:46:01.400 | And to really get a very high-quality interaction
00:46:04.200 | with that piece of information,
00:46:05.880 | I haven't found a solution that will work
00:46:08.120 | without us owning the connection end-to-end.
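A sketch of the kind of heuristic being described for Notion databases, deciding whether rows should be treated as tabular data or chunked as text; the `NotionRow` type, the length threshold, and the chunking strategy are all illustrative assumptions, not Dust's connector code.

```typescript
// Sketch: decide whether a Notion database looks like quantitative/tabular
// data or like a collection of prose entries, then chunk accordingly.
type NotionRow = Record<string, string>;

function isTabular(rows: NotionRow[]): boolean {
  if (rows.length === 0) return false;
  const cells = rows.flatMap((row) => Object.values(row));
  const avgLength = cells.reduce((sum, c) => sum + c.length, 0) / cells.length;
  // Short, uniform cells look like tabular data; long free-form cells look
  // like text that should be chunked as prose.
  return avgLength < 80;
}

function chunkDatabase(rows: NotionRow[]): string[] {
  if (isTabular(rows)) {
    // Keep the table together (e.g. as CSV) so a model can treat it as data.
    const header = Object.keys(rows[0]).join(",");
    const body = rows.map((r) => Object.values(r).join(",")).join("\n");
    return [header + "\n" + body];
  }
  // Otherwise chunk entry by entry, preserving the structure around each one.
  return rows.map((row) =>
    Object.entries(row)
      .map(([key, value]) => `${key}: ${value}`)
      .join("\n")
  );
}
```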
00:46:09.840 | That's why I don't invest in this Composio.
00:46:12.380 | There's All Hands from Graham Neubig.
00:46:15.340 | There's all these other companies that are,
00:46:16.660 | like, we will do the integrations for you.
00:46:18.260 | You just, we have the open-source community.
00:46:20.020 | We'll do it off the shelf.
00:46:20.900 | But then you are so specific in your needs
00:46:24.180 | that you want to own it.
00:46:25.340 | - Yeah, exactly.
00:46:26.180 | - You can talk to Michel about that.
00:46:27.180 | You know, he wants to put the AI in there, but, you know.
00:46:29.940 | - I will, I will.
00:46:31.340 | - Cool.
00:46:32.180 | - What are we missing?
00:46:33.020 | You know, what are like the things
00:46:33.940 | that are like sneakily hard that you're tackling
00:46:37.040 | that maybe people don't even realize
00:46:38.820 | they're like really hard?
00:46:39.960 | - The hard parts, as we kind of touched on
00:46:42.420 | throughout the conversation,
00:46:43.540 | are really building the infra that works for those agents,
00:46:46.760 | because it's tedious work.
00:46:48.720 | It's an evergreen piece of work
00:46:51.020 | because you always have an extra integration
00:46:53.020 | that will be useful to a non-negligible set of your users.
00:46:56.700 | What I'm super excited about is that
00:46:58.940 | there's so many interactions
00:47:00.200 | that shouldn't be conversational interactions,
00:47:02.560 | and that could be very useful.
00:47:04.020 | Basically, know that we have the firehose of information
00:47:07.220 | of those companies,
00:47:08.300 | and there's not gonna be that many companies
00:47:09.780 | that capture the firehose of information.
00:47:11.580 | When you have the firehose of information,
00:47:13.220 | you can do a ton of stuff with models
00:47:16.020 | that are not just accelerating people,
00:47:18.700 | but giving them superhuman capability,
00:47:20.900 | even with the current model capability,
00:47:22.340 | because you can just sift through much more information.
00:47:24.660 | An example is documentation repair.
00:47:26.620 | If I have the firehose of Slack messages
00:47:28.500 | and new Notion pages,
00:47:29.980 | if somebody says, "I own that page,
00:47:31.620 | "I wanna be updated when there is a piece of information
00:47:34.500 | "that should update that page,"
00:47:35.820 | this is now possible.
00:47:36.960 | You get an email, a summary saying,
00:47:38.520 | "Oh, look at that Slack message.
00:47:40.140 | "It says the opposite of what you have in that paragraph.
00:47:42.160 | "Maybe you wanna update or just ping that person."
00:47:44.540 | I think there is a lot to be explored on the product layer
00:47:49.540 | in terms of what it means to interact
00:47:51.500 | productively with those models,
00:47:52.740 | and that's a problem that is extremely hard
00:47:55.460 | and extremely exciting.
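A sketch of the documentation-repair idea described above: checking an incoming Slack message against pages people have claimed ownership of, and alerting the owner when a model thinks the page is now stale. The `OwnedPage` type, the prompt, and the helper names are invented for illustration, and this assumes the same `openai` SDK as the earlier judge sketch.

```typescript
// Sketch: when a new Slack message arrives, compare it against owned Notion
// pages and collect alerts for the owners of pages that may need an update.
import OpenAI from "openai";

const client = new OpenAI();

type OwnedPage = { owner: string; title: string; content: string };
type Alert = { owner: string; title: string };

async function checkMessageAgainstPages(
  slackMessage: string,
  pages: OwnedPage[]
): Promise<Alert[]> {
  const alerts: Alert[] = [];
  for (const page of pages) {
    const res = await client.chat.completions.create({
      model: "gpt-4o",
      messages: [
        {
          role: "system",
          content:
            "Does the message contradict or update the page? Answer YES or NO.",
        },
        {
          role: "user",
          content: `Page "${page.title}":\n${page.content}\n\nMessage:\n${slackMessage}`,
        },
      ],
    });
    const answer = (res.choices[0].message.content ?? "").trim();
    if (answer.startsWith("YES")) {
      alerts.push({ owner: page.owner, title: page.title });
    }
  }
  return alerts;
}
```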
00:47:56.660 | - One thing you keep mentioning about infra work,
00:47:58.560 | obviously, Dust is building that infra
00:48:00.900 | and serving that in a very consumer-friendly way.
00:48:04.560 | You always talk about infra being additional sources,
00:48:07.140 | additional connectors.
00:48:08.340 | That is very important.
00:48:09.180 | But I'm also interested in the vertical infra.
00:48:11.180 | There is an orchestrator underlying all these things,
00:48:13.500 | where you're doing asynchronous work.
00:48:15.100 | For example, the simplest one is a cron job.
00:48:17.700 | You just schedule things.
00:48:19.060 | But also, for if-this-then-that,
00:48:20.580 | you have to wait for something to be executed
00:48:22.740 | and proceed to the next task.
00:48:24.740 | I used to work on an orchestrator as well, Temporal.
00:48:27.460 | - We use Temporal.
00:48:28.300 | - Oh, you use Temporal? - Yeah.
00:48:29.300 | - Oh, how was the experience?
00:48:30.460 | I need the NPS.
00:48:31.500 | (all laughing)
00:48:32.680 | - We're doing a self-discovery call now?
00:48:34.900 | - No, but you can also complain to me
00:48:36.240 | 'cause I don't work there anymore.
00:48:38.400 | - No, we love Temporal.
00:48:39.240 | There's some edges that are a bit rough,
00:48:41.800 | surprisingly rough.
00:48:42.800 | And you would say, "Why is it so complicated?"
00:48:44.600 | - Is it versioning?
00:48:45.440 | It's always versioning.
00:48:46.760 | - Stuff like that.
00:48:47.760 | But we really love it.
00:48:48.920 | And we use it for exactly what you said,
00:48:51.040 | like managing the entire set of stuff that needs to happen
00:48:55.200 | so that in semi-real time,
00:48:57.280 | we get all the updates from Slack or Notion
00:48:59.880 | or GitHub into the system.
00:49:02.240 | And whenever we see a piece of information go through,
00:49:05.080 | we maybe trigger workflows to run agents,
00:49:08.520 | because they need to provide alerts to users
00:49:10.160 | and stuff like that.
00:49:11.000 | And Temporal is great.
00:49:12.360 | Love it.
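A minimal sketch of the kind of Temporal workflow described here, using the `@temporalio/workflow` TypeScript SDK; the activity names and the polling interval are illustrative assumptions, not Dust's actual sync code.

```typescript
// Sketch: a long-running Temporal workflow that pulls updates from one
// connector (Slack, Notion, GitHub, ...) and kicks off agent runs on new
// documents. Activities run outside the workflow sandbox and can call APIs.
import { proxyActivities, sleep, continueAsNew } from "@temporalio/workflow";

type Activities = {
  fetchUpdatedDocuments(connector: string): Promise<string[]>;
  upsertDocument(docId: string): Promise<void>;
  triggerAgentRuns(docId: string): Promise<void>;
};

const { fetchUpdatedDocuments, upsertDocument, triggerAgentRuns } =
  proxyActivities<Activities>({ startToCloseTimeout: "5 minutes" });

export async function syncConnectorWorkflow(connector: string): Promise<void> {
  // Temporal persists workflow state, so crashes or deploys don't lose progress.
  for (let i = 0; i < 1000; i++) {
    const docIds = await fetchUpdatedDocuments(connector);
    for (const docId of docIds) {
      await upsertDocument(docId); // chunk + index for retrieval
      await triggerAgentRuns(docId); // e.g. alert agents watching this doc
    }
    await sleep("1 minute");
  }
  // Restart with fresh history to keep the workflow's event log bounded.
  await continueAsNew<typeof syncConnectorWorkflow>(connector);
}
```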
00:49:13.200 | - You haven't evaluated others.
00:49:14.480 | You don't want to build your own.
00:49:15.940 | You're happy with-
00:49:16.880 | - Oh no, we're not in the business
00:49:18.520 | of replacing Temporal. - Building orchestrators.
00:49:20.360 | - And Temporal is so,
00:49:21.540 | I mean, or any other competitive product,
00:49:23.680 | they're very general.
00:49:24.840 | If anything, there's an interesting theory
00:49:26.600 | about buy versus build.
00:49:28.040 | I think in that case, when you're a high-growth company,
00:49:31.740 | your buy-build trade-off is very much on the side of buy,
00:49:35.400 | because if you have the capability,
00:49:36.240 | you're just going to be saving time,
00:49:37.600 | you can focus on your core competency, et cetera.
00:49:39.840 | And it's funny because we're seeing,
00:49:41.040 | we're starting to see the post-high-growth company,
00:49:43.880 | post-SKF company,
00:49:45.240 | going back on that trade-off, interestingly.
00:49:47.320 | So that's the Klarna news
00:49:48.800 | about removing Zendesk and Salesforce.
00:49:51.080 | - Do you believe that, by the way?
00:49:52.080 | - Yeah, I do share the pockets with them.
00:49:54.000 | - Oh yeah? - It's true, yeah.
00:49:54.840 | - No, no, I know, of course they say it's true,
00:49:56.640 | but like also how well is it going to go?
00:49:58.360 | - So I'm not talking about deflecting
00:50:01.160 | the customer traffic.
00:50:02.660 | I'm talking about building AI
00:50:04.800 | on top of Salesforce and Zendesk,
00:50:06.640 | basically, if I understand correctly.
00:50:08.360 | And all of a sudden,
00:50:09.540 | your product surface becomes much smaller
00:50:12.640 | because you're interacting with an AI system
00:50:15.000 | that will take some actions.
00:50:16.400 | And so all of a sudden,
00:50:17.320 | you don't need the product layer anymore.
00:50:19.080 | And you realize that,
00:50:19.920 | oh, those things are just databases
00:50:21.500 | that I pay a hundred times the price for, right?
00:50:24.840 | Because you're a post-SKF company,
00:50:27.480 | and you have tech capabilities,
00:50:29.400 | you are incentivized to reduce your costs
00:50:31.400 | and you have the capability to do so.
00:50:32.760 | And then it makes sense to just scratch the SaaS away.
00:50:35.000 | So it's interesting that we might see
00:50:36.760 | kind of a bad time for SaaS
00:50:39.040 | in post-hyper-growth tech companies.
00:50:42.240 | So it's still a big market,
00:50:43.840 | but it's not that big
00:50:44.680 | because if you're not a tech company,
00:50:46.600 | you don't have the capabilities to reduce SaaS cost.
00:50:48.600 | If you're a high-growth company,
00:50:50.160 | you're always going to be buying
00:50:51.040 | because you go faster with that.
00:50:52.800 | But that's an interesting new space,
00:50:54.800 | new category of companies that might remove some SaaS.
00:50:57.560 | - Yeah, Alessio's firm has an interesting thesis
00:50:59.420 | on the future of SaaS in AI.
00:51:01.240 | - Service as a software, we call it.
00:51:02.840 | It's basically like,
00:51:03.680 | well, the most extreme is like,
00:51:05.040 | why is there any software at all?
00:51:06.520 | You know, ideally, it's all a labor interface
00:51:08.520 | where you're asking somebody to do something for you,
00:51:10.480 | whether that's a person, an AI agent or not.
00:51:12.760 | - Yeah, yeah, that's interesting.
00:51:14.560 | I have to ask.
00:51:15.400 | - Are you paying for Temporal Cloud or are you self-hosting?
00:51:17.840 | - Oh, no, no, we're paying, we're paying.
00:51:19.080 | - Oh, okay, interesting.
00:51:20.120 | - We're paying way too much.
00:51:21.680 | It's crazy expensive, but that makes us--
00:51:24.200 | - That's why as a shareholder, I like to hear that.
00:51:26.320 | - Makes us go faster, so we're happy to pay.
00:51:28.720 | - Other things in the infrastack,
00:51:29.960 | I just want a list for other founders to think about.
00:51:32.080 | Ops, API Gateway, evals, you know,
00:51:34.680 | anything interesting there that you build or buy?
00:51:37.560 | - I mean, there's always an interesting question.
00:51:39.320 | We've been building a lot around the interface
00:51:41.440 | between us and the models, because Dust,
00:51:44.880 | the original version, was an orchestration platform,
00:51:47.480 | and we basically provide a unified interface
00:51:50.200 | to every model provider.
00:51:52.160 | - That's what I call gateway.
00:51:53.200 | - That we had because Dust was that,
00:51:55.240 | and so we continued building upon it, and we own it.
00:51:58.160 | But that's an interesting question,
00:51:59.360 | whether you want to build that or buy it.
00:52:02.400 | - I would say LiteLLM is the current open source consensus.
00:52:05.320 | - Exactly, yeah.
00:52:06.240 | There's an interesting question there.
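A sketch of what a thin, provider-agnostic gateway interface can look like in TypeScript, assuming the official `openai` and `@anthropic-ai/sdk` clients; the `ModelProvider` interface and adapter names are illustrative, and LiteLLM plays a similar role off the shelf.

```typescript
// Sketch: one tiny interface the rest of the product talks to, with one
// adapter per provider behind it. Names are illustrative.
import OpenAI from "openai";
import Anthropic from "@anthropic-ai/sdk";

interface ModelProvider {
  complete(model: string, prompt: string): Promise<string>;
}

const openaiProvider: ModelProvider = {
  async complete(model, prompt) {
    const client = new OpenAI();
    const res = await client.chat.completions.create({
      model,
      messages: [{ role: "user", content: prompt }],
    });
    return res.choices[0].message.content ?? "";
  },
};

const anthropicProvider: ModelProvider = {
  async complete(model, prompt) {
    const client = new Anthropic();
    const res = await client.messages.create({
      model,
      max_tokens: 1024,
      messages: [{ role: "user", content: prompt }],
    });
    const block = res.content[0];
    return block.type === "text" ? block.text : "";
  },
};

// Agents pick a provider/model pair; everything else stays provider-agnostic.
const providers: Record<string, ModelProvider> = {
  openai: openaiProvider,
  anthropic: anthropicProvider,
};
```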
00:52:08.080 | - Ops, Datadog, just tracking.
00:52:10.320 | - Oh yeah, so Datadog is an obvious one.
00:52:12.640 | What are the mistakes that I regret?
00:52:14.960 | (laughing)
00:52:15.840 | I started it as pure JavaScript, not TypeScript,
00:52:18.160 | and if you're wondering,
00:52:21.000 | oh, I want to go fast, I'll do a little bit of JavaScript.
00:52:22.760 | No, no, just start with TypeScript.
00:52:24.320 | - I see, I see, okay.
00:52:25.960 | - That is--
00:52:26.800 | - So interesting, you are a research engineer
00:52:29.440 | that came out of OpenAI that bet on TypeScript.
00:52:31.880 | - Well, the reality is that if you're building a product,
00:52:34.400 | you're going to be doing a lot of JavaScript, right?
00:52:36.680 | And Next.js, we're using Next.js, for example.
00:52:39.160 | It's a great platform, and our internal service
00:52:42.880 | is actually not built in Python either.
00:52:44.760 | It's built in Rust.
00:52:46.000 | - That's another fascinating choice.
00:52:47.320 | The Next.js story, it's interesting because Next.js
00:52:49.400 | is obviously the king of the world in JavaScript land,
00:52:51.720 | but recently, ChatGPT just rewrote from Next.js to Remix.
00:52:56.480 | We are going to be having them on
00:52:57.640 | to talk about the big rewrite.
00:52:58.920 | That is like the biggest news in front-end world in a while.
00:53:02.280 | - All right, just to wrap, in 2023,
00:53:04.640 | you predicted the first billion-dollar company
00:53:06.760 | with just one person running it,
00:53:08.160 | and you said that's basically like a sign of AGI,
00:53:10.160 | once we get there, and you said it had already been started.
00:53:13.640 | Any 2024 updates on the take?
00:53:16.440 | - That quote, I probably independently invented it,
00:53:19.160 | but Sam Altman stole it from me eventually.
00:53:23.040 | But anyway, it's a good quote.
00:53:25.920 | I hypothesize it was maybe already being started,
00:53:28.480 | but if it's a uni-person company,
00:53:30.480 | it would probably grow really fast,
00:53:32.080 | and so we should probably see it already.
00:53:34.240 | I guess we're going to have to wait for it a little bit,
00:53:36.920 | and I think it's because the Dusts of the world don't exist,
00:53:39.600 | and so you don't have that thing that lets you run those,
00:53:43.200 | just do anything with models.
00:53:45.040 | But one thing that is exciting is maybe that
00:53:48.800 | we're going to be able to scale a team
00:53:51.520 | much further than before.
00:53:53.280 | Our generation of company might be
00:53:54.560 | the first billion-dollar companies
00:53:56.600 | with engineering teams of 20 people.
00:53:58.240 | That would be so exciting as well.
00:53:59.960 | That would be so great, you know?
00:54:01.200 | You don't have the management hurdle,
00:54:02.680 | you're just 20 focused people
00:54:04.160 | with a lot of assistance from machines to achieve your job.
00:54:07.720 | That would be great, and that I believe in a bit more.
00:54:10.320 | - Yeah, I've written a post
00:54:11.560 | called "Maximum Enterprise Utilization,"
00:54:13.760 | kind of like you have MFU for GPUs,
00:54:15.720 | but it's basically like so many people are focused on,
00:54:18.000 | oh, it's kind of like displaced jobs and whatnot,
00:54:20.600 | but I'm like, there's so much work that people don't do
00:54:22.840 | because they don't have the people,
00:54:24.240 | and maybe the question is that you just don't scale
00:54:26.000 | to that size, you know, to begin with,
00:54:28.160 | and maybe everybody will use Dust,
00:54:29.760 | and Dust is only going to be 20 people,
00:54:31.960 | and then people using Dust will be two people.
00:54:34.720 | - So my hot take is I actually know what vertical
00:54:37.120 | they will be in.
00:54:37.960 | They'll be content creators and podcasters.
00:54:39.960 | - There's already two of us, so we're at max capacity.
00:54:43.320 | - Most people would regard Jimmy Donaldson,
00:54:45.200 | like Mr. Beast, as a billionaire,
00:54:46.840 | but his team is, he's got about like 200 people,
00:54:49.400 | so he's not a single person company.
00:54:50.920 | The closer one actually is Joe Rogan,
00:54:52.760 | where he basically just has like a guy.
00:54:54.760 | - Hey, Jamie, put it on the screen.
00:54:56.200 | - Exactly, exactly, but Joe, I don't think,
00:54:58.680 | he sold his future for 250 million to Spotify,
00:55:01.480 | so he's not going to hit that billionaire status.
00:55:03.400 | The non-consensus one will be the hot girl,
00:55:05.520 | who just started a podcast, anyway,
00:55:08.480 | but like you want creators who are empowered
00:55:11.320 | by a bunch of agents, dust agents, to do all this stuff,
00:55:16.000 | because then ultimately it's just the brand, the curation.
00:55:19.880 | What is the role of the human then?
00:55:21.080 | What is that one person supposed to do
00:55:22.800 | if you have all these agents?
00:55:24.160 | - That's a good question.
00:55:25.720 | I mean, I think it was,
00:55:27.280 | I think it was the Pinterest or Dropbox founder who said at the time,
00:55:32.080 | when you're CEO, you mostly have an editorial position.
00:55:35.360 | You're here to say yes and no
00:55:37.080 | to the things you are supposed to do.
00:55:38.480 | - Okay, so I make a daily AI newsletter,
00:55:41.840 | where I just, it's 99% AI generated,
00:55:44.640 | but I serve the role as the editor.
00:55:46.360 | Like I write commentary, I choose between four options.
00:55:49.440 | - You decide what goes in and goes out.
00:55:51.120 | And ultimately, as you said,
00:55:52.240 | you build up your brand through those many decisions.
00:55:54.960 | - So you should pursue creators.
00:55:56.960 | (laughs)
00:55:57.880 | - Yeah, that's true.
00:55:58.720 | And I think you have
00:56:00.680 | an upcoming podcast with NotebookLM,
00:56:02.760 | which has been doing crazy stuff.
00:56:04.120 | - Oh yeah, that is exciting.
00:56:05.480 | They were just in here yesterday.
00:56:06.760 | I'll tell you one agent that we need,
00:56:08.200 | if you want to pursue the creator market,
00:56:09.680 | the one agent that we haven't paid for
00:56:11.280 | is our video editor agent.
00:56:13.160 | So if you want, you need to, you know,
00:56:15.320 | wrap FFmpeg in a GPT.
00:56:17.800 | (laughs)
00:56:20.040 | - Awesome, this was great.
00:56:21.640 | Anything we missed?
00:56:23.280 | Any final kind of like call to action hiring?
00:56:25.560 | It's like, obviously people should buy the product
00:56:27.360 | and pay you.
00:56:28.200 | - Yeah, obviously.
00:56:29.040 | And no, I think we didn't dive into the vertical
00:56:31.840 | versus horizontal approach to AI agents.
00:56:34.720 | - Quick take on that, yeah.
00:56:35.960 | - We mentioned a few things.
00:56:37.040 | We spike at penetration and that's just awesome
00:56:39.400 | because we carry the tool
00:56:40.320 | that the entire company has and uses.
00:56:42.720 | So we create a ton of value,
00:56:44.200 | but it makes our go-to-market much harder.
00:56:46.080 | Vertical solutions have a go-to-market
00:56:47.600 | that is much easier because they're like,
00:56:49.400 | "Oh, I'm going to solve the lawyer stuff."
00:56:53.360 | But the potential within the company after that is limited.
00:56:56.320 | So there's really a nice tension there.
00:56:58.000 | We're true believers of the horizontal approach
00:57:01.560 | and we'll see how that plays out.
00:57:03.400 | But I think it's an interesting thing to think about
00:57:06.280 | when as a founder or as a technical person
00:57:08.840 | working with agents,
00:57:10.120 | what do you want to solve?
00:57:11.160 | Do you want to solve something general
00:57:12.520 | or do you want to solve something specific?
00:57:13.960 | And it has a lot of impact on eventually
00:57:15.920 | what type of company you're going to build.
00:57:17.200 | - Yeah, I'll provide you my response on that.
00:57:19.040 | So I've gone the other way.
00:57:19.880 | I've gone products over platform.
00:57:21.280 | And it's basically that your sense of the product
00:57:23.520 | drives your platform development.
00:57:26.160 | In other words, if you're trying to be as many things
00:57:28.480 | to as many people as possible,
00:57:29.680 | we're just trying to be one thing.
00:57:30.800 | We build our brand in one specific niche.
00:57:33.640 | And in future, if we want to choose to spin off platforms
00:57:36.360 | for other things, we can because we have that brand.
00:57:38.840 | So for example, Perplexity,
00:57:40.600 | we went for products in search, right?
00:57:42.760 | But then we also have Perplexity Labs
00:57:44.480 | that's like, here's the infra that we use for search
00:57:46.200 | and whatever.
00:57:47.040 | - The contrary argument to that is that
00:57:48.960 | you always can have lateral movement within companies,
00:57:51.600 | but if you're Zendesk,
00:57:54.200 | you're not going to be...
00:57:55.600 | - Zendesk...
00:57:56.440 | - Serving...
00:57:57.280 | - Web services.
00:57:58.120 | (laughing)
00:57:59.080 | There are a few, you know,
00:58:00.080 | there's success stories on both sides.
00:58:01.600 | There's Amazon and Amazon Web Services.
00:58:03.040 | - And sorry by platform,
00:58:04.400 | I don't really mean the platform as the platform platform.
00:58:06.920 | I mean like the product that is useful
00:58:09.120 | to everybody within the company.
00:58:10.560 | And my take on that is that
00:58:12.200 | there is so many operations within the company.
00:58:14.280 | Some of them have been extremely rationalized by the market,
00:58:17.040 | like salespeople, like support
00:58:19.200 | has been extremely rationalized.
00:58:20.400 | And so you can probably create
00:58:21.800 | very powerful vertical product around that.
00:58:24.760 | But there is so many operations that make up a company
00:58:27.200 | that are specific to the company
00:58:28.720 | that you need a product to help people
00:58:32.640 | get assisted on those operations.
00:58:33.960 | And that's kind of the bet we have.
00:58:35.480 | - Excellent.
00:58:36.320 | - Awesome, man. Thanks again for the time.
00:58:37.520 | - Thank you very much for having me.
00:58:38.360 | It was so much fun.
00:58:39.240 | - Yeah, great discussion.
00:58:40.400 | - Thank you.
00:58:41.760 | (upbeat music)
00:58:44.360 | (upbeat music)
00:58:46.960 | (upbeat music)
00:58:49.560 | (gentle music)