
Open sourcing the AI ecosystem ft. Arthur Mensch of Mistral AI and Matt Miller



00:00:00.000 | [MUSIC PLAYING]
00:00:03.200 | I'm excited to introduce our first speaker, Arthur
00:00:06.760 | from Mistral.
00:00:08.200 | Arthur is the founder and CEO of Mistral AI.
00:00:11.440 | Despite just being nine months old as a company
00:00:14.880 | and having many fewer resources than some
00:00:17.600 | of the large foundation model companies so far,
00:00:20.040 | I think they've really shocked everybody by putting out
00:00:22.360 | incredibly high-quality models approaching GPT-4 in caliber
00:00:26.000 | out into the open.
00:00:26.800 | So we're thrilled to have Arthur with us today,
00:00:29.400 | all the way from France, to share
00:00:30.880 | more about the opportunity behind building in open source.
00:00:35.160 | And please-- interviewing Arthur will be my partner,
00:00:38.440 | Matt Miller, who is dressed in his best French wear
00:00:41.400 | to honor Arthur today and helps lead our efforts in Europe.
00:00:46.220 | So please welcome Matt and Arthur.
00:00:48.320 | [APPLAUSE]
00:00:53.640 | With all the efficiency of a French train, right?
00:00:56.520 | Just--
00:00:57.680 | Right on time.
00:00:58.400 | Right on time.
00:00:59.040 | We were sweating a little bit back there
00:01:00.720 | because you just walked in the door.
00:01:03.040 | But good to see you.
00:01:03.840 | Thanks for coming all this way.
00:01:05.360 | Thanks for being with us here at AI Ascent today.
00:01:08.040 | Thank you for hosting us.
00:01:09.760 | Yeah, absolutely.
00:01:11.120 | We'd love to maybe start with the background
00:01:13.480 | story of why you chose to start Mistral.
00:01:17.480 | And maybe just take us to the beginning.
00:01:20.440 | We all know about your successful career
00:01:22.640 | at DeepMind, your work on the Chinchilla paper.
00:01:25.920 | Tell us, maybe share with us-- we always
00:01:27.560 | love to hear at Sequoia, and I know that our founder community
00:01:30.140 | also loves to hear that spark that gave you
00:01:32.160 | the idea to launch, to break out, and start
00:01:35.320 | your own company.
00:01:36.480 | Yeah, sure.
00:01:37.880 | So we started the company in April,
00:01:39.760 | but I guess the idea was out there for a couple of months
00:01:42.920 | before.
00:01:44.680 | Timothée and I were in a master's program together.
00:01:46.840 | Guillaume and I were in school together.
00:01:48.480 | So we knew each other from before.
00:01:50.120 | And we had been in the field for 10 years doing research.
00:01:55.480 | And so we loved the way AI progressed
00:01:57.880 | because of the open exchanges that
00:01:59.840 | occurred between academic labs, industrial labs,
00:02:03.960 | and how everybody was able to build on top of one another.
00:02:08.280 | And it was still the case, I guess,
00:02:11.720 | even in the beginning of the LLM era,
00:02:13.720 | where OpenAI and DeepMind were actually
00:02:19.120 | contributing to one another's roadmaps.
00:02:23.360 | And this kind of stopped in 2022.
00:02:26.000 | So basically, one of the last papers
00:02:29.040 | making important changes to the way we train models
00:02:31.480 | was Chinchilla.
00:02:32.360 | And that was the last important model
00:02:35.080 | in the field that Google ever
00:02:38.280 | published.
00:02:39.760 | And so for us, it was a bit of a shame
00:02:42.560 | that the field stopped doing open contributions
00:02:48.120 | that early in the AI journey because we were very
00:02:50.180 | far away from finishing it.
00:02:53.160 | And so when we saw ChatGPT at the end of the year--
00:02:56.800 | and I think we reflected on the fact
00:03:00.480 | that there was some opportunity for doing things differently,
00:03:03.160 | for doing things from France.
00:03:04.840 | Because in France, as it turned out,
00:03:07.320 | there were a lot of talented people
00:03:09.320 | who were a bit bored in big tech companies.
00:03:12.400 | And so that's how we figured out that there
00:03:14.600 | was an opportunity for building very strong open source
00:03:17.800 | models, going very fast with a lean team of experienced people,
00:03:22.600 | and trying to correct the direction that the field was taking.
00:03:27.280 | So we wanted to push the open source models much more.
00:03:31.520 | And I think we did a good job at that
00:03:33.060 | because we've been followed by various companies
00:03:36.720 | in our trajectory.
00:03:38.720 | Wonderful.
00:03:39.360 | And so it was really a lot of the open source movement
00:03:42.000 | was a lot of the drive behind starting the company.
00:03:45.440 | Yeah, that was one of the drivers--
00:03:50.480 | our intention and the mission that we gave ourselves
00:03:52.720 | is really to bring AI to the hands of every developer.
00:03:55.800 | And the way it was done and the way
00:03:57.280 | it is still done by our competitors is very closed.
00:04:00.400 | And so we want to push a much more open platform.
00:04:03.120 | And we want to spread the adoption
00:04:04.680 | and accelerate the adoption through that strategy.
00:04:07.680 | So that's very much at the core--
00:04:11.760 | the reason why we started the company.
00:04:13.920 | Wonderful.
00:04:14.720 | And just recently, I mean, fast forward to today,
00:04:18.200 | you released Mistral Large.
00:04:19.520 | You've been on this tear of amazing partnerships
00:04:22.120 | with Microsoft, Snowflake, Databricks, and others.
00:04:25.120 | So how do you balance what you're
00:04:27.540 | going to do open source with what you're
00:04:29.200 | going to do commercially?
00:04:30.280 | And how are you going to think about the trade-off?
00:04:33.160 | Because that's something that many open source companies
00:04:35.640 | contend with.
00:04:36.560 | How do they keep their community thriving?
00:04:38.320 | But then how do they also build a successful business
00:04:40.120 | to contribute to their community?
00:04:41.840 | Yeah, it's a hard question.
00:04:43.200 | And the way we've addressed it is currently
00:04:45.040 | through two families of models.
00:04:46.880 | But this might evolve with time.
00:04:49.320 | We intend to stay the leader in open source.
00:04:51.160 | So that kind of puts pressure on the open source family,
00:04:54.320 | because there's obviously some contenders out there.
00:04:58.840 | I think compared to how various software providers that played
00:05:02.280 | this strategy developed, we need to go faster.
00:05:06.200 | Because AI develops actually faster than software.
00:05:08.240 | It develops faster than databases.
00:05:10.560 | MongoDB played a very good game at that.
00:05:12.480 | And this is a good example of what we could do.
00:05:16.480 | But we need to adapt faster.
00:05:17.880 | So yeah, there's obviously this tension.
00:05:21.080 | And we are constantly thinking about how we should contribute
00:05:23.880 | to the community, but also how we should show and start
00:05:27.720 | getting some commercial adoption, enterprise deals,
00:05:31.000 | et cetera.
00:05:31.520 | And there's obviously a tension.
00:05:34.160 | And for now, I think we've done a good job at doing it.
00:05:36.600 | But it's a very dynamic thing to think through.
00:05:39.960 | So it's basically every week we think
00:05:41.560 | of what we should release next on both families.
00:05:44.720 | And you have been the fastest in developing models,
00:05:48.600 | fastest reaching different benchmarking levels,
00:05:51.800 | one of the leanest in expenditure
00:05:54.560 | to reach these benchmarks out of any of the foundational model
00:05:58.640 | companies.
00:05:59.160 | What do you think is giving you that advantage
00:06:01.640 | to move quicker than your predecessors
00:06:05.040 | and more efficiently?
00:06:06.240 | I think we like to get our hands dirty.
00:06:12.200 | Machine learning has always been about crunching numbers,
00:06:15.820 | looking at your data, doing a lot of extract, transform,
00:06:19.600 | and load, and things that are oftentimes not fascinating.
00:06:23.440 | And so we hired people who were willing to do that stuff.
00:06:27.800 | And I think that has been critical to our speed.
00:06:32.240 | And that's something that we want to keep up.
00:06:34.520 | Awesome.
00:06:35.440 | And in addition to the large model,
00:06:37.900 | you also have several small models
00:06:39.360 | that are extremely popular.
00:06:41.200 | When would you tell people that they should spend their time
00:06:43.320 | working with you on the small models?
00:06:44.680 | When would you tell them to work on the large models?
00:06:46.520 | And where do you think the economic opportunity
00:06:48.840 | for Mistral lies?
00:06:49.840 | Is it in doing more of the big or doing more of the small?
00:06:53.920 | And I think this is an observation
00:06:56.560 | that every LLM provider has made,
00:06:59.640 | that one size does not fit all.
00:07:02.800 | And depending on what you want to--
00:07:06.000 | when you make an application, you typically
00:07:07.840 | have different large language model calls.
00:07:10.220 | And some should be low latency, because they don't
00:07:12.400 | require a lot of intelligence.
00:07:13.680 | But some should be higher latency
00:07:15.140 | and require more intelligence.
00:07:16.820 | And an efficient application should leverage both of them,
00:07:20.040 | potentially using the large models as an orchestrator
00:07:23.200 | for the small ones.
00:07:25.480 | And I think the challenge here is,
00:07:27.160 | how do you make sure that everything works?
00:07:28.880 | So you end up with a system that is not only a model,
00:07:31.580 | but it's really two models plus an outer loop
00:07:33.720 | of calling your model, calling systems, calling functions.
00:07:37.440 | And I think some of the developer challenges
00:07:40.900 | that we also want to address is, how do you
00:07:43.580 | make sure that this works, that you can evaluate it properly?
00:07:46.340 | How do you make sure that you can do continuous integration?
00:07:48.940 | How do you change--
00:07:50.520 | how do you move from one version to another of a model
00:07:52.720 | and make sure that your application has actually
00:07:54.820 | improved and not deteriorated?
00:07:56.860 | So all of these things are addressed by various companies.
00:08:00.140 | But these are also things that we
00:08:01.900 | think should be core to our value proposition.
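A rough sketch of the pattern Arthur describes here: a large model orchestrating cheap, low-latency small models, plus an evaluation gate before migrating an application to a new model version. This is a minimal illustration under assumed names; the `chat` helper, model IDs, and `score` function are hypothetical placeholders, not Mistral's actual API.

```python
# Minimal sketch of the two-tier pattern described above. `chat`, the model
# names, and `score` are hypothetical placeholders; swap in a real SDK.

SMALL_MODEL = "small-latest"   # low latency: extraction, classification
LARGE_MODEL = "large-latest"   # higher latency: planning, synthesis

def chat(model: str, prompt: str) -> str:
    """Stand-in for a real completion call (e.g. an HTTP request)."""
    raise NotImplementedError

def route(task: str) -> str:
    """Use the large model as an orchestrator: decide per call whether a
    sub-task needs deep reasoning or can go to the cheap small model."""
    verdict = chat(LARGE_MODEL,
                   "Answer SIMPLE or COMPLEX only. Does this task need "
                   f"multi-step reasoning?\nTask: {task}")
    return LARGE_MODEL if "COMPLEX" in verdict.upper() else SMALL_MODEL

def answer(task: str) -> str:
    return chat(route(task), task)

def safe_to_migrate(eval_set, old_model, new_model, score) -> bool:
    """The continuous-integration gate mentioned above: move the application
    to a new model version only if eval scores did not deteriorate."""
    old = sum(score(chat(old_model, x), y) for x, y in eval_set)
    new = sum(score(chat(new_model, x), y) for x, y in eval_set)
    return new >= old
```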
00:08:04.980 | And what are some of the most exciting things
00:08:07.420 | you see being built on Mistral?
00:08:08.800 | What are the things that you get really excited about,
00:08:11.040 | that you see the community doing or customers doing?
00:08:13.280 | I think pretty much every young startup in the Bay Area
00:08:17.000 | has been using it for fine-tuning purposes,
00:08:20.400 | for fast application building.
00:08:22.720 | So really, I think part of the value of Mistral, for instance,
00:08:26.240 | is that it's very fast.
00:08:27.240 | And so you can make applications that are more involved.
00:08:31.240 | And so we've seen web search companies using us.
00:08:35.740 | We've seen all of the standard enterprise stuff
00:08:39.900 | as well, like knowledge management, marketing.
00:08:44.140 | The fact that you have access to the weights
00:08:45.940 | means that you can pour in your editorial tone much more.
00:08:48.940 | So that's-- yeah, we see the typical use cases.
00:08:52.300 | I think the value of the open
00:08:55.580 | source part is that developers have control,
00:08:58.980 | so they can deploy it everywhere.
00:09:00.400 | They can have very high quality of service
00:09:02.100 | because they can use their dedicated instances,
00:09:05.380 | for instance.
00:09:06.180 | And they can modify the weights to suit their needs
00:09:08.820 | and to bump the performance to a level which
00:09:10.960 | is close to that of the largest models,
00:09:14.100 | while being much cheaper.
00:09:15.940 | And what's the next big thing do you
00:09:17.820 | think that we're going to get to see from you guys?
00:09:19.660 | Can you give us a sneak peek of what might be coming soon,
00:09:22.080 | or what we should be expecting from Mistral?
00:09:24.340 | Yeah, for sure.
00:09:24.980 | So we have-- so Mistral Large was good, but not good enough.
00:09:28.780 | So we are working on improving it quite heavily.
00:09:31.540 | We have interesting open source models
00:09:35.140 | on various vertical domains that we'll be announcing very soon.
00:09:40.660 | We have-- the platform is currently just APIs,
00:09:43.580 | like serverless APIs.
00:09:45.340 | And so we are working on making customization part of it,
00:09:47.740 | so the fine tuning part.
00:09:50.940 | And obviously, and I think as many other companies,
00:09:53.860 | we're heavily betting on multilingual data
00:09:57.740 | and multilingual model.
00:09:59.700 | Because as a European company, we're also well-positioned.
00:10:02.780 | And this is a demand of our customers
00:10:05.100 | that I think is higher than here.
00:10:08.220 | And then, yeah, eventually, in the months to come,
00:10:11.740 | we will also release some multimodal models.
00:10:15.340 | OK, exciting.
00:10:16.420 | We'll look forward to that.
00:10:18.300 | As you mentioned, many of the people in this room
00:10:20.300 | are using Mistral models.
00:10:21.420 | Many of the companies we work with every day
00:10:23.220 | here in the Silicon Valley ecosystem
00:10:25.020 | are already working with Mistral.
00:10:27.020 | How should they work with you?
00:10:28.540 | And how should they work with the company?
00:10:30.260 | And what's the best way for them to work with you?
00:10:34.000 | Well, they can reach out.
00:10:35.300 | So we have some developer relations
00:10:37.620 | that are really pushing the community forward,
00:10:40.620 | making guides, also gathering use cases
00:10:45.140 | to showcase what you can build with Mistral models.
00:10:47.900 | So this is-- we're investing a lot
00:10:51.060 | in the community.
00:10:52.900 | Something that basically makes the models better,
00:10:56.020 | and that we are trying to set up, is ways for us
00:11:00.340 | to get evaluations, benchmarks, and actual use cases
00:11:04.060 | on which we can evaluate our models.
00:11:05.660 | And so having a mapping of what people
00:11:07.980 | are building with our model is also a way for us
00:11:09.940 | to make a better generation of new open source models.
00:11:13.620 | And so please engage with us to discuss how we can help,
00:11:18.300 | and to discuss your use cases.
00:11:19.860 | We can advertise it.
00:11:21.380 | We can also gather some insight into the new evaluations
00:11:25.340 | that we should add to our evaluation suite
00:11:27.020 | to verify that our models are getting better over time.
00:11:29.940 | And on the commercial side, our models
00:11:32.380 | are available on our platform.
00:11:33.580 | So the commercial models are actually
00:11:35.180 | working better than the open source ones.
00:11:38.660 | They're also available on various cloud providers
00:11:40.860 | so that it facilitates adoption for enterprises.
00:11:44.500 | And customization capabilities like fine-tuning,
00:11:46.620 | which really drove the value of the open source models,
00:11:49.020 | are actually coming very soon.
00:11:50.700 | Wonderful.
00:11:51.420 | And you talked a little bit about the benefits
00:11:53.500 | of being in Europe.
00:11:55.140 | You touched on it briefly.
00:11:56.660 | You're already this global example
00:11:59.140 | of the great innovations that can come from Europe
00:12:01.340 | and are coming from Europe.
00:12:03.460 | Talk a little bit more about the advantages of building
00:12:06.180 | a business from France and building
00:12:07.780 | this company from Europe.
00:12:09.100 | The advantages and drawbacks, I guess.
00:12:11.060 | Yeah, both, both.
00:12:12.980 | I guess one advantage is that you
00:12:14.900 | have a very strong junior pool of talent.
00:12:18.020 | So there's a lot of people coming
00:12:20.460 | from masters in France, in Poland, in the UK
00:12:24.260 | that we can train in like three months and get them up to speed,
00:12:27.260 | get them basically producing as much as a million-dollar
00:12:32.020 | engineer in the Bay Area at a tenth of the cost.
00:12:36.100 | So that's kind of efficient.
00:12:37.780 | Shh, don't tell them all that.
00:12:39.020 | They're going to all hire people in France.
00:12:40.820 | I'm sure.
00:12:43.300 | Like the workforce is very good, engineers and machine
00:12:47.300 | learning engineers.
00:12:50.020 | Generally speaking, we have a lot of support from the state,
00:12:53.580 | which is actually more important in Europe than in the US.
00:12:56.620 | They tend to over-regulate a bit too fast.
00:12:59.380 | So we've been telling them not to,
00:13:00.980 | but they don't always listen.
00:13:03.660 | And then generally, I mean, yeah,
00:13:06.780 | like European companies like to work with us
00:13:08.700 | because we're European and we are better
00:13:11.820 | in European languages, as it turns out.
00:13:13.740 | In French, Mistral Large
00:13:16.100 | is actually probably the strongest French-language model out
00:13:18.340 | there.
00:13:19.380 | So yeah, I guess that's not an advantage,
00:13:22.420 | but at least there's a lot of opportunities
00:13:24.300 | that are geographical and that we're leveraging.
00:13:26.300 | Wonderful.
00:13:27.180 | And paint the picture for us five years from now.
00:13:29.540 | Like I know that this world's moving so fast.
00:13:31.460 | I mean, just think of all the things
00:13:32.960 | you've gone through in the two years.
00:13:35.020 | It's not even two years old as a company.
00:13:36.820 | It's almost two years old as a company.
00:13:39.580 | But five years from now, where does Mistral sit?
00:13:42.540 | What do you think you have achieved?
00:13:44.040 | What does this landscape look like?
00:13:46.180 | So our bet is that basically the platform
00:13:49.620 | and the infrastructure of artificial intelligence
00:13:53.500 | will be open.
00:13:54.700 | And based on that, we'll be able to create assistants and then
00:13:59.040 | potentially autonomous agents.
00:14:00.820 | And we believe that we can become this platform
00:14:04.420 | by being the most open platform out there,
00:14:06.940 | by being independent from cloud providers, et cetera.
00:14:09.420 | So in five years from now, I have literally no idea
00:14:12.260 | of what this is going to look like.
00:14:13.720 | If you looked at the field in 2019,
00:14:17.160 | I don't think you could bet on where we would be today.
00:14:19.660 | But we are evolving toward more and more autonomous agents.
00:14:22.580 | They can do more and more tasks.
00:14:24.100 | I think the way we work is going to be changed profoundly.
00:14:26.860 | And making such agents and assistants
00:14:30.060 | is going to be easier and easier.
00:14:31.460 | So right now, we're focusing on the developer world.
00:14:33.740 | But I expect that AI technology is, in itself,
00:14:39.580 | so easily controllable through human language
00:14:43.660 | that potentially, at some point,
00:14:45.820 | the developer becomes the user.
00:14:47.760 | And so we are evolving toward any user
00:14:51.880 | being able to create their own assistant
00:14:54.520 | or their own autonomous agent.
00:14:56.080 | I'm pretty sure that in five years from now,
00:14:58.000 | this will be something that you learn to do at school.
00:15:02.000 | Awesome.
00:15:02.960 | Well, we have about five minutes left.
00:15:04.760 | Just want to open up in case there's any questions
00:15:06.840 | from the audience.
00:15:09.120 | Don't be shy.
00:15:09.880 | Sonia's got a question.
00:15:12.320 | How do you see the future of open source
00:15:14.120 | versus commercial models playing out for your company?
00:15:16.440 | I think you made a huge splash with open source at first.
00:15:18.820 | As you mentioned, some of the commercial models
00:15:20.780 | are even better now.
00:15:21.820 | How do you imagine that plays out
00:15:23.160 | over the next handful of years?
00:15:25.100 | Well, I guess the one thing we optimize for
00:15:26.860 | is to be able to continuously produce
00:15:28.480 | open models with a sustainable business model
00:15:30.800 | to actually fuel the development of the next generation.
00:15:36.760 | And so I think that, as I've said,
00:15:39.040 | this is going to evolve with time.
00:15:41.240 | But in order to stay relevant, we
00:15:42.880 | need to stay the best at producing open source
00:15:45.280 | models, at least on some part of the spectrum.
00:15:47.520 | So that can be the small models, that
00:15:49.020 | can be the very big models.
00:15:50.760 | And so that's very much something
00:15:52.400 | that sets the constraints of whatever we can do.
00:15:56.020 | Staying relevant in the open source world,
00:15:58.160 | staying the best solution for developers
00:16:01.320 | is really our mission, and we'll keep doing it.
00:16:05.320 | David?
00:16:06.820 | There's got to be questions from more than just
00:16:08.860 | the Sequoia partners, guys.
00:16:10.080 | Come on.
00:16:11.080 | Can you talk to us a little bit about Llama 3 and Facebook
00:16:14.960 | and how you think about competition with them?
00:16:17.360 | Well, Meta is working on Llama 3, I guess, making models.
00:16:21.040 | I'm not sure they will be open source.
00:16:22.640 | I have no idea of what's going on there.
00:16:24.520 | So far, I think we've been delivering
00:16:26.880 | faster and smaller models.
00:16:28.320 | So we expect to be continuing doing it.
00:16:30.520 | But generally, the good thing about open source
00:16:33.100 | is that it's never too much of a competition, because if you
00:16:36.560 | have several actors,
00:16:39.120 | normally that should actually benefit everybody.
00:16:43.280 | And so there should be some--
00:16:45.280 | if they turn out to be very strong,
00:16:46.840 | there will be some cross-pollination,
00:16:48.340 | and we'll welcome it.
00:16:49.920 | One thing that's made you guys different
00:16:51.680 | from other proprietary model providers
00:16:53.440 | is the partnerships with Snowflake and Databricks,
00:16:56.260 | for example, and running natively in their clouds,
00:16:58.720 | as opposed to just having API connectivity.
00:17:02.040 | Curious if you can talk about why you did those deals,
00:17:05.280 | and then also what you see as the future of, say,
00:17:07.760 | Databricks or Snowflake in the brave new LLM world.
00:17:11.500 | I guess you should ask them.
00:17:12.800 | But I think, generally speaking, AI models become very strong
00:17:16.680 | if they are connected to data and grounding information.
00:17:21.360 | As it turns out, the enterprise data
00:17:23.840 | is oftentimes either on Snowflake or on Databricks,
00:17:26.160 | or sometimes on AWS.
00:17:28.680 | And so for customers, being
00:17:32.560 | able to deploy the technology exactly where their data is
00:17:37.120 | is, I think, quite important.
00:17:38.520 | I expect that this will continue being the case,
00:17:44.440 | especially as, I believe, we'll move on to more
00:17:46.600 | stateful AI deployment.
00:17:47.920 | So today, we deploy serverless APIs with not much state.
00:17:52.240 | It's really like lambda functions.
00:17:55.560 | But as we go forward and as we make models more and more
00:17:58.040 | specialized, as we make them more tuned to use cases,
00:18:01.680 | and as we make them self-improving,
00:18:05.240 | you will have to manage state.
00:18:06.600 | And those could actually be part of the data cloud.
00:18:09.880 | So there's an open question of where do you put the AI state.
00:18:12.720 | And I think that's--
00:18:14.840 | my understanding is that Snowflake and Databricks
00:18:16.880 | would like it to be on their data cloud.
00:18:19.840 | - And I think there was a question right behind him,
00:18:22.440 | the gray sweatshirt.
00:18:24.280 | - I'm curious where you draw the line between openness
00:18:27.640 | and proprietary.
00:18:28.620 | So you're releasing the weights.
00:18:30.640 | Would you also be comfortable sharing more
00:18:32.840 | about how you train the models, the recipe for how you collect
00:18:35.600 | the data, how you do mixture of experts training?
00:18:37.720 | Or do you draw the line at, like, we release the weights
00:18:40.040 | and the rest is proprietary?
00:18:41.640 | - That's where we draw the line.
00:18:42.800 | And I think the reason for that is
00:18:44.240 | that it's a very competitive landscape.
00:18:46.400 | And so it's similar to the tension
00:18:51.160 | there is between having some form of revenue
00:18:54.000 | to sustain the next generation.
00:18:56.000 | And there's also a tension between what you actually
00:18:59.520 | disclose and what you keep--
00:19:02.600 | yeah, in order to stay ahead of the curve
00:19:05.680 | and not give your recipe to your competitors.
00:19:08.920 | And so, again, this is a moving line.
00:19:12.280 | There's also some game theory at stake.
00:19:14.760 | Like, if everybody starts doing it, then we could do it.
00:19:17.920 | But for now, we are not taking this risk, indeed.
00:19:22.720 | - I'm curious, when another company releases weights
00:19:26.680 | for a model like Grok, for example,
00:19:29.800 | and you only see the weights, what kinds of practices
00:19:33.400 | do you guys do internally to see what you can learn from it?
00:19:36.800 | - You can't learn a lot of things from weights.
00:19:39.080 | We don't even look at them.
00:19:40.440 | It's actually too big for us to deploy.
00:19:42.120 | Grok is quite big.
00:19:45.480 | - Or was there any architecture learning?
00:19:48.920 | - I guess they are using, like, a mixture of experts,
00:19:51.760 | pretty standard setting, with a couple of tricks
00:19:56.000 | that I knew about, actually.
00:19:57.640 | Yeah, there's not a lot of things
00:20:01.720 | to learn about the recipe itself
00:20:03.520 | by looking at the weights.
00:20:04.640 | You can try to infer things, but that's--
00:20:07.200 | like, reverse engineering is not that easy.
00:20:09.960 | It's basically compressing information,
00:20:11.600 | and it compresses information enough
00:20:14.240 | that you can't really find out what's going on.
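For context on the term: below is a toy, from-scratch illustration of what a mixture-of-experts layer does. It is a sketch only, with invented router weights and expert callables, not Grok's or Mistral's actual implementation.

```python
# Toy mixture-of-experts layer: a learned router scores experts per input,
# and only the top-k experts actually run, which keeps inference cheap.
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def moe_forward(x, experts, router_weights, k=2):
    """Mix the outputs of the k most relevant experts for input vector x.
    `experts` are callables mapping a vector to a same-sized vector;
    `router_weights` has one weight row per expert (all invented here)."""
    scores = [sum(w * xi for w, xi in zip(row, x)) for row in router_weights]
    gates = softmax(scores)
    top_k = sorted(range(len(experts)), key=lambda i: gates[i], reverse=True)[:k]
    norm = sum(gates[i] for i in top_k)
    out = [0.0] * len(x)
    for i in top_k:  # only k experts execute per input
        y = experts[i](x)
        out = [o + (gates[i] / norm) * yi for o, yi in zip(out, y)]
    return out
```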
00:20:16.960 | - The cube is coming.
00:20:26.040 | - It's OK.
00:20:27.680 | Yeah, I'm just curious about, like,
00:20:29.440 | what are you guys going to focus on?
00:20:31.520 | The model sizes, your opinions on that.
00:20:33.400 | Are you, like, going to still go small?
00:20:35.840 | Or, yeah, go with the larger ones, basically?
00:20:39.000 | - So model sizes are kind of set by, like, scaling laws.
00:20:43.200 | So it depends on, like, the compute you have.
00:20:45.760 | Based on the compute you have, based on the learning
00:20:48.640 | infrastructure you want to go to, you make some choices.
00:20:51.800 | And so you optimize for training cost and for inference cost.
00:20:54.360 | And then there's obviously
00:20:57.680 | a trade-off in between-- it
00:21:01.480 | depends on the weight that you put on the training cost
00:21:03.800 | amortization.
00:21:05.920 | The more you amortize it, the more you can compress models.
00:21:10.040 | But basically, our goal is to be low latency
00:21:14.400 | and to be relevant on the reasoning front.
00:21:17.480 | So that means having a family of models
00:21:19.880 | that goes from the small ones to the very large ones.
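The arithmetic behind that answer can be sketched with the approximations from the Chinchilla paper (training compute C ≈ 6·N·D FLOPs, with compute-optimal training around 20 tokens per parameter). The inference-amortization rule below is an illustrative assumption, not Mistral's recipe: the more training is amortized over inference, the more it pays to over-train a smaller model.

```python
# Back-of-the-envelope sketch using Chinchilla-style approximations:
# training FLOPs C ~ 6*N*D, compute-optimal training sets D ~ 20*N.
# The amortization weighting below is an illustrative assumption only.

def compute_optimal(flops: float, tokens_per_param: float = 20.0):
    """Split a FLOP budget into parameters N and tokens D with D = k*N,
    solving C = 6*N*(k*N) for N."""
    n = (flops / (6.0 * tokens_per_param)) ** 0.5
    return n, tokens_per_param * n

def inference_aware(flops: float, amortization: float):
    """More amortization over inference -> train a smaller model on more
    tokens, trading training compute for cheaper serving."""
    return compute_optimal(flops, tokens_per_param=20.0 * (1.0 + amortization))

if __name__ == "__main__":
    budget = 1e24  # training FLOPs
    for amort in (0.0, 1.0, 4.0):
        n, d = inference_aware(budget, amort)
        print(f"amortization={amort}: N ~ {n:.2e} params, D ~ {d:.2e} tokens")
```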
00:21:27.160 | - Hi.
00:21:27.640 | Are there any plans for Mistral to expand
00:21:30.320 | into the application stack?
00:21:32.280 | So, for example, when OpenAI released the custom GPTs
00:21:35.480 | and the Assistants API, is that the direction
00:21:37.840 | that you think that Mistral will take in the future?
00:21:40.040 | - Yeah.
00:21:40.560 | So I think, as I've said, we're really
00:21:43.040 | focusing on the developer first.
00:21:45.120 | But there's many-- like, the frontier
00:21:48.280 | is pretty thin between developers and users
00:21:50.280 | for this technology.
00:21:51.200 | So that's the reason why we released an assistant
00:21:54.840 | demonstrator called Le Chat, which is "the cat" in English.
00:21:57.920 | And the point here is to expose it to enterprises as well
00:22:02.960 | and make them able to connect their data,
00:22:06.280 | connect their context.
00:22:08.760 | I think that answers some need from our customers:
00:22:14.120 | many of the people we've been talking to
00:22:17.260 | are willing to adopt the technology,
00:22:18.760 | but they need an entry point.
00:22:19.920 | So if you just give them APIs, they're
00:22:21.800 | going to say, OK, but I need an integrator.
00:22:24.400 | And then if you don't have an integrator at hand,
00:22:26.440 | and oftentimes this is the case, it's
00:22:28.240 | good if you have an off-the-shelf solution
00:22:30.040 | at least to get them into the technology
00:22:32.200 | and show them what they could build for their core business.
00:22:34.740 | So that's the reason why we now have two product offerings.
00:22:37.240 | The first one, which is the platform,
00:22:38.780 | and then we have Le Chat, which should evolve into an enterprise
00:22:41.480 | off-the-shelf solution.
00:22:42.480 | - More over there.
00:22:47.800 | Just wondering, where would you be drawing the line
00:22:50.000 | between stop doing prompt engineering
00:22:52.320 | and start doing fine-tuning?
00:22:53.680 | Because a lot of my friends and our customers
00:22:55.760 | are suffering from where they should be,
00:22:57.960 | stop doing more prompt engineering.
00:23:00.440 | - I think that's the number one pain point
00:23:02.800 | that is hard to solve from a product standpoint.
00:23:08.680 | The question is, normally your workflow
00:23:11.400 | should be what should you evaluate on.
00:23:13.580 | And based on that, have your model find out
00:23:17.520 | a way of solving your task.
00:23:20.220 | And so right now, this is still a bit manual.
00:23:22.600 | You go and you have several versions of prompting.
00:23:26.120 | But this is something that actually AI can help solving.
00:23:29.660 | And I expect that this is going to become more and more
00:23:31.900 | automatic over time.
00:23:33.940 | And this is something that we'd love to try and enable.
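One way to read that advice is to make the eval set the decision point. A minimal sketch, with a hypothetical `complete` call and an assumed accuracy target: score your prompt variants against a fixed eval set, and only consider fine-tuning once the best prompt plateaus below the bar.

```python
# Minimal sketch of the evaluation-first workflow described above.
# `complete` and the target threshold are hypothetical placeholders.

def complete(instruction: str, example: str) -> str:
    """Stand-in for a model call that applies `instruction` to `example`."""
    raise NotImplementedError

def accuracy(instruction: str, eval_set) -> float:
    """Exact-match accuracy of one prompt variant over (input, label) pairs."""
    hits = sum(complete(instruction, x).strip() == y for x, y in eval_set)
    return hits / len(eval_set)

def prompt_or_finetune(variants, eval_set, target: float = 0.9):
    """Keep prompt engineering while it still clears the bar; otherwise the
    best prompt has plateaued and fine-tuning is the next lever."""
    scores = {p: accuracy(p, eval_set) for p in variants}
    best = max(scores, key=scores.get)
    if scores[best] >= target:
        return "ship prompt", best
    return "consider fine-tuning", best
```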
00:23:38.580 | - I wanted to ask a bit more of a personal question.
00:23:40.780 | As a founder in the cutting edge of AI,
00:23:43.700 | how do you balance your time between explore and exploit?
00:23:46.140 | How do you yourself stay on top of a field that's
00:23:48.460 | rapidly evolving and becoming larger and deeper every day?
00:23:52.100 | How do you stay on top?
00:23:53.580 | - So I think this question has--
00:23:56.340 | I mean, we explore on the science part, on the product part,
00:23:58.780 | and on the business part.
00:24:00.700 | And the way you balance it is effectively hard.
00:24:04.180 | For a startup, you do have to exploit a lot
00:24:06.460 | because you need to ship fast.
00:24:09.100 | But on the science part, for instance,
00:24:10.660 | you have two or three people that
00:24:12.060 | are working on the next generation of models.
00:24:14.820 | And sometimes they waste time.
00:24:16.060 | But if you don't do that, you are
00:24:17.500 | at risk of becoming irrelevant.
00:24:19.380 | And this is very true for the product side as well.
00:24:21.900 | So right now, we have a very simple product.
00:24:24.340 | But being able to try out new features
00:24:26.900 | and see how they pick up is something that we need to do.
00:24:31.420 | And on the business part, you never
00:24:33.180 | know who is actually mature enough
00:24:35.060 | to use your technology.
00:24:36.460 | So yeah, the balance between exploitation and exploration
00:24:41.580 | is something that we master well at the science level
00:24:43.820 | because we've been doing it for years.
00:24:46.060 | And somehow, it translates into the product and the business.
00:24:48.580 | But I guess we are currently still
00:24:50.580 | learning to do it properly.
00:24:53.740 | - So one more question from me, and then I think we'll be done.
00:24:56.540 | We're out of time.
00:24:57.340 | But in the span of two years, models big,
00:25:01.780 | models small that have taken the world by storm,
00:25:04.660 | killer go-to-market partnerships,
00:25:07.220 | just tremendous momentum at the center of the AI ecosystem.
00:25:11.260 | What advice would you give to founders here?
00:25:13.220 | What you have achieved and the pace
00:25:14.700 | at which you have achieved it is truly extraordinary.
00:25:17.700 | What advice would you give to people
00:25:19.260 | here who are at different levels of starting and running
00:25:21.080 | and building their own businesses
00:25:22.460 | around the AI opportunity?
00:25:25.100 | - I would say it's always day one.
00:25:27.020 | So I guess, yeah, we are--
00:25:30.220 | I mean, we got some mind share.
00:25:31.940 | But I mean, there's still many proof points
00:25:34.700 | that we need to establish.
00:25:36.540 | And so, yeah, like, being a founder
00:25:39.220 | is basically waking up every day and figuring out
00:25:42.420 | that you need to build everything from scratch
00:25:45.220 | every time, all the time.
00:25:46.540 | So it's, I guess, a bit exhausting.
00:25:49.420 | But it's also exhilarating.
00:25:51.740 | And so I would recommend being quite ambitious, usually.
00:25:54.860 | Being more ambitious--
00:25:57.420 | I mean, ambition can get you very far.
00:26:00.900 | And so, yeah, you should dream big.
00:26:04.260 | That would be my advice.
00:26:06.180 | - Awesome.
00:26:06.740 | Thank you, Arthur.
00:26:07.460 | Thanks for being with us today.
00:26:09.780 | (applause)