
Open sourcing the AI ecosystem ft. Arthur Mensch of Mistral AI and Matt Miller



00:00:00.000 | [MUSIC PLAYING]
00:00:03.200 | I'm excited to introduce our first speaker, Arthur
00:00:06.760 | from Mistral.
00:00:08.200 | Arthur is the founder and CEO of Mistral AI.
00:00:11.440 | Despite just being nine months old as a company
00:00:14.880 | and having many fewer resources than some
00:00:17.600 | of the large foundation model companies so far,
00:00:20.040 | I think they've really shocked everybody by putting out
00:00:22.360 | incredibly high-quality models approaching GPT-4 in caliber
00:00:26.000 | out into the open.
00:00:26.800 | So we're thrilled to have Arthur with us today,
00:00:29.400 | all the way from France, to share
00:00:30.880 | more about the opportunity behind building in open source.
00:00:35.160 | And please-- interviewing Arthur will be my partner,
00:00:38.440 | Matt Miller, who is dressed in his best French wear
00:00:41.400 | to honor Arthur today and helps lead our efforts in Europe.
00:00:46.220 | So please welcome Matt and Arthur.
00:00:48.320 | [APPLAUSE]
00:00:53.640 | With all the efficiency of a French train, right?
00:00:56.520 | Just--
00:00:57.680 | Right on time.
00:00:58.400 | Right on time.
00:00:59.040 | We were sweating a little bit back there
00:01:00.720 | because you just walked in the door.
00:01:03.040 | But good to see you.
00:01:03.840 | Thanks for coming all this way.
00:01:05.360 | Thanks for being with us here at AI Ascent today.
00:01:08.040 | Thank you for hosting us.
00:01:09.760 | Yeah, absolutely.
00:01:11.120 | We'd love to maybe start with the background
00:01:13.480 | story of why you chose to start Mistral.
00:01:17.480 | And maybe just take us to the beginning.
00:01:20.440 | We all know about your successful career
00:01:22.640 | at DeepMind, your work on the Chinchilla paper.
00:01:25.920 | Tell us, maybe share with us-- we always
00:01:27.560 | love to hear at Sequoia, and I know that our founder community
00:01:30.140 | also loves to hear that spark that gave you
00:01:32.160 | the idea to launch, to break out, and start
00:01:35.320 | your own company.
00:01:36.480 | Yeah, sure.
00:01:37.880 | So we started the company in April,
00:01:39.760 | but I guess the idea was out there for a couple of months
00:01:42.920 | before.
00:01:44.680 | Timothée and I were in a master's program together.
00:01:46.840 | Guillaume and I were in school together.
00:01:48.480 | So we knew each other from before.
00:01:50.120 | And we had been in the field for 10 years doing research.
00:01:55.480 | And so we loved the way AI progressed
00:01:57.880 | because of the open exchanges that
00:01:59.840 | occurred between academic labs, industrial labs,
00:02:03.960 | and how everybody was able to build on top of one another.
00:02:08.280 | And it was still the case, I guess,
00:02:11.720 | even in the beginning of the LLM era,
00:02:13.720 | where OpenAI and DeepMind were actually
00:02:19.120 | contributing to one another's roadmaps.
00:02:23.360 | And this kind of stopped in 2022.
00:02:26.000 | So basically, one of the last papers
00:02:29.040 | making important changes to the way we train models
00:02:31.480 | was Chinchilla.
00:02:32.360 | And that was the last important model
00:02:35.080 | in the field that Google ever
00:02:38.280 | published.
00:02:39.760 | And so for us, it was a bit of a shame
00:02:42.560 | that the field stopped doing open contributions
00:02:48.120 | that early in the AI journey because we were very
00:02:50.180 | far away from finishing it.
00:02:53.160 | And so when we saw ChatGPT at the end of the year--
00:02:56.800 | and I think we reflected on the fact
00:03:00.480 | that there was some opportunity for doing things differently,
00:03:03.160 | for doing things from France.
00:03:04.840 | Because in France, as it turned out,
00:03:07.320 | there were a lot of talented people
00:03:09.320 | who were a bit bored in big tech companies.
00:03:12.400 | And so that's how we figured out that there
00:03:14.600 | was an opportunity for building very strong open source
00:03:17.800 | models, going very fast with a lean team of experienced people,
00:03:22.600 | and trying to correct the direction that the field was taking.
00:03:27.280 | So we wanted to push the open source models much more.
00:03:31.520 | And I think we did a good job at that
00:03:33.060 | because we've been followed by various companies
00:03:36.720 | in our trajectory.
00:03:38.720 | Wonderful.
00:03:39.360 | And so it was really a lot of the open source movement
00:03:42.000 | was a lot of the drive behind starting the company.
00:03:45.440 | Yeah, that was one of the drivers--
00:03:50.480 | our intention and the mission that we gave ourselves
00:03:52.720 | is really to bring AI to the hands of every developer.
00:03:55.800 | And the way it was done and the way
00:03:57.280 | it is still done by our competitors is very closed.
00:04:00.400 | And so we want to push a much more open platform.
00:04:03.120 | And we want to spread the adoption
00:04:04.680 | and accelerate the adoption through that strategy.
00:04:07.680 | So that's very much at the core--
00:04:11.760 | the reason why we started the company.
00:04:13.920 | Wonderful.
00:04:14.720 | And just recently, I mean, fast forward to today,
00:04:18.200 | you released Mistral Large.
00:04:19.520 | You've been on this tear of amazing partnerships
00:04:22.120 | with Microsoft, Snowflake, Databricks, and others.
00:04:25.120 | So how do you balance what you're
00:04:27.540 | going to do open source with what you're
00:04:29.200 | going to do commercially?
00:04:30.280 | And how are you going to think about the trade-off?
00:04:33.160 | Because that's something that many open source companies
00:04:35.640 | contend with.
00:04:36.560 | How do they keep their community thriving?
00:04:38.320 | But then how do they also build a successful business
00:04:40.120 | to contribute to their community?
00:04:41.840 | Yeah, it's a hard question.
00:04:43.200 | And the way we've addressed it is currently
00:04:45.040 | through two families of models.
00:04:46.880 | But this might evolve with time.
00:04:49.320 | We intend to stay the leader in open source.
00:04:51.160 | So that kind of puts pressure on the open source family,
00:04:54.320 | because there's obviously some contenders out there.
00:04:58.840 | I think compared to how various software providers that played
00:05:02.280 | this strategy developed, we need to go faster.
00:05:06.200 | Because AI develops actually faster than software.
00:05:08.240 | It develops faster than databases.
00:05:10.560 | MongoDB played a very good game at that.
00:05:12.480 | And this is a good example of what we could do.
00:05:16.480 | But we need to adapt faster.
00:05:17.880 | So yeah, there's obviously this tension.
00:05:21.080 | And we are constantly thinking about how we should contribute
00:05:23.880 | to the community, but also how we should show and start
00:05:27.720 | getting some commercial adoption, enterprise deals,
00:05:31.000 | et cetera.
00:05:31.520 | And there's obviously a tension.
00:05:34.160 | And for now, I think we've done a good job at doing it.
00:05:36.600 | But it's a very dynamic thing to think through.
00:05:39.960 | So it's basically every week we think
00:05:41.560 | of what we should release next on both families.
00:05:44.720 | And you have been the fastest in developing models,
00:05:48.600 | fastest reaching different benchmarking levels,
00:05:51.800 | one of the leanest in expenditure
00:05:54.560 | to reach these benchmarks out of any of the foundational model
00:05:58.640 | companies.
00:05:59.160 | What do you think is giving you that advantage
00:06:01.640 | to move quicker than your predecessors
00:06:05.040 | and more efficiently?
00:06:06.240 | I think we like to get our hands dirty.
00:06:12.200 | Machine learning has always been about crunching numbers,
00:06:15.820 | looking at your data, doing a lot of extract, transform,
00:06:19.600 | and load, and things that are oftentimes not fascinating.
00:06:23.440 | And so we hired people who were willing to do that stuff.
00:06:27.800 | And I think that has been critical to our speed.
00:06:32.240 | And that's something that we want to keep up.
00:06:34.520 | Awesome.
00:06:35.440 | And in addition to the large model,
00:06:37.900 | you also have several small models
00:06:39.360 | that are extremely popular.
00:06:41.200 | When would you tell people that they should spend their time
00:06:43.320 | working with you on the small models?
00:06:44.680 | When would you tell them to work on the large models?
00:06:46.520 | And where do you think the economic opportunity
00:06:48.840 | for Mistral lies?
00:06:49.840 | Is it in doing more of the big or doing more of the small?
00:06:53.920 | And I think this is an observation
00:06:56.560 | that every LLM provider has made,
00:06:59.640 | that one size does not fit all.
00:07:02.800 | And depending on what you want to--
00:07:06.000 | when you make an application, you typically
00:07:07.840 | have different large language model calls.
00:07:10.220 | And some should be low latency, because they don't
00:07:12.400 | require a lot of intelligence.
00:07:13.680 | But some should be higher latency
00:07:15.140 | and require more intelligence.
00:07:16.820 | And an efficient application should leverage both of them,
00:07:20.040 | potentially using the large models as an orchestrator
00:07:23.200 | for the small ones.
00:07:25.480 | And I think the challenge here is,
00:07:27.160 | how do you make sure that everything works?
00:07:28.880 | So you end up with a system that is not only a model,
00:07:31.580 | but it's really two models plus an outer loop
00:07:33.720 | of calling your model, calling systems, calling functions.
00:07:37.440 | And I think some of the developer challenges
00:07:40.900 | that we also want to address is, how do you
00:07:43.580 | make sure that this works, that you can evaluate it properly?
00:07:46.340 | How do you make sure that you can do continuous integration?
00:07:48.940 | How do you change--
00:07:50.520 | how do you move from one version to another of a model
00:07:52.720 | and make sure that your application has actually
00:07:54.820 | improved and not deteriorated?
00:07:56.860 | So all of these things are addressed by various companies.
00:08:00.140 | But these are also things that we
00:08:01.900 | think should be core to our value proposition.
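A rough sketch of the pattern Arthur describes here: a large model orchestrating cheap, low-latency small models, plus an evaluation gate before migrating an application to a new model version. This is a minimal illustration under assumed names; the `chat` helper, model IDs, and `score` function are hypothetical placeholders, not Mistral's actual API.

```python
# Minimal sketch of the two-tier pattern described above. `chat`, the model
# names, and `score` are hypothetical placeholders; swap in a real SDK.

SMALL_MODEL = "small-latest"   # low latency: extraction, classification
LARGE_MODEL = "large-latest"   # higher latency: planning, synthesis

def chat(model: str, prompt: str) -> str:
    """Stand-in for a real completion call (e.g. an HTTP request)."""
    raise NotImplementedError

def route(task: str) -> str:
    """Use the large model as an orchestrator: decide per call whether a
    sub-task needs deep reasoning or can go to the cheap small model."""
    verdict = chat(LARGE_MODEL,
                   "Answer SIMPLE or COMPLEX only. Does this task need "
                   f"multi-step reasoning?\nTask: {task}")
    return LARGE_MODEL if "COMPLEX" in verdict.upper() else SMALL_MODEL

def answer(task: str) -> str:
    return chat(route(task), task)

def safe_to_migrate(eval_set, old_model, new_model, score) -> bool:
    """The continuous-integration gate mentioned above: move the application
    to a new model version only if eval scores did not deteriorate."""
    old = sum(score(chat(old_model, x), y) for x, y in eval_set)
    new = sum(score(chat(new_model, x), y) for x, y in eval_set)
    return new >= old
```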
00:08:04.980 | And what are some of the most exciting things
00:08:07.420 | you see being built on Mistral?
00:08:08.800 | What are the things that you get really excited about,
00:08:11.040 | that you see the community doing or customers doing?
00:08:13.280 | I think pretty much every young startup in the Bay Area
00:08:17.000 | has been using it for fine-tuning purposes,
00:08:20.400 | for fast application building.
00:08:22.720 | So really, I think part of the value of Mistral, for instance,
00:08:26.240 | is that it's very fast.
00:08:27.240 | And so you can make applications that are more involved.
00:08:31.240 | And so we've seen web search companies using us.
00:08:35.740 | We've seen all of the standard enterprise stuff
00:08:39.900 | as well, like knowledge management, marketing.
00:08:44.140 | The fact that you have access to the weights
00:08:45.940 | means that you can pour in your editorial tone much more.
00:08:48.940 | So that's-- yeah, we see the typical use cases.
00:08:52.300 | I think the value of the open
00:08:55.580 | source part is that developers have control,
00:08:58.980 | so they can deploy it everywhere.
00:09:00.400 | They can have very high quality of service
00:09:02.100 | because they can use their dedicated instances,
00:09:05.380 | for instance.
00:09:06.180 | And they can modify the weights to suit their needs
00:09:08.820 | and to bump the performance to a level which
00:09:10.960 | is close to that of the largest models,
00:09:14.100 | while being much cheaper.
00:09:15.940 | And what's the next big thing do you
00:09:17.820 | think that we're going to get to see from you guys?
00:09:19.660 | Can you give us a sneak peek of what might be coming soon,
00:09:22.080 | or what we should be expecting from Mistral?
00:09:24.340 | Yeah, for sure.
00:09:24.980 | So we have-- so Mistral Large was good, but not good enough.
00:09:28.780 | So we are working on improving it quite heavily.
00:09:31.540 | We have interesting open source models
00:09:35.140 | on various vertical domains that we'll be announcing very soon.
00:09:40.660 | We have-- the platform is currently just APIs,
00:09:43.580 | like serverless APIs.
00:09:45.340 | And so we are working on making customization part of it,
00:09:47.740 | so the fine tuning part.
00:09:50.940 | And obviously, and I think as many other companies,
00:09:53.860 | we're heavily betting on multilingual data
00:09:57.740 | and multilingual model.
00:09:59.700 | Because as a European company, we're also well-positioned.
00:10:02.780 | And this is a demand of our customers
00:10:05.100 | that I think is higher than here.
00:10:08.220 | And then, yeah, eventually, in the months to come,
00:10:11.740 | we will also release some multimodal models.
00:10:15.340 | OK, exciting.
00:10:16.420 | We'll look forward to that.
00:10:18.300 | As you mentioned, many of the people in this room
00:10:20.300 | are using Mistral models.
00:10:21.420 | Many of the companies we work with every day
00:10:23.220 | here in the Silicon Valley ecosystem
00:10:25.020 | are already working with Mistral.
00:10:27.020 | How should they work with you?
00:10:28.540 | And how should they work with the company?
00:10:30.260 | And what's the best way for them to work with you?
00:10:34.000 | Well, they can reach out.
00:10:35.300 | So we have some developer relations
00:10:37.620 | that are really pushing the community forward,
00:10:40.620 | making guides, also gathering use cases
00:10:45.140 | to showcase what you can build with Mistral models.
00:10:47.900 | So this is-- we're investing a lot
00:10:51.060 | in the community.
00:10:52.900 | Something that basically makes the models better,
00:10:56.020 | and that we are trying to set up, is ways for us
00:11:00.340 | to get evaluations, benchmarks, and actual use cases
00:11:04.060 | on which we can evaluate our models.
00:11:05.660 | And so having a mapping of what people
00:11:07.980 | are building with our model is also a way for us
00:11:09.940 | to make a better generation of new open source models.
00:11:13.620 | And so please engage with us to discuss how we can help,
00:11:18.300 | and to discuss your use cases.
00:11:19.860 | We can advertise it.
00:11:21.380 | We can also gather some insight into the new evaluations
00:11:25.340 | that we should add to our evaluation suite
00:11:27.020 | to verify that our models are getting better over time.
00:11:29.940 | And on the commercial side, our models
00:11:32.380 | are available on our platform.
00:11:33.580 | So the commercial models are actually
00:11:35.180 | working better than the open source ones.
00:11:38.660 | They're also available on various cloud providers
00:11:40.860 | so that it facilitates adoption for enterprises.
00:11:44.500 | And customization capabilities like fine-tuning,
00:11:46.620 | which really drove the value of the open source models,
00:11:49.020 | are actually coming very soon.
00:11:50.700 | Wonderful.
00:11:51.420 | And you talked a little bit about the benefits
00:11:53.500 | of being in Europe.
00:11:55.140 | You touched on it briefly.
00:11:56.660 | You're already this global example
00:11:59.140 | of the great innovations that can come from Europe
00:12:01.340 | and are coming from Europe.
00:12:03.460 | Talk a little bit more about the advantages of building
00:12:06.180 | a business from France and building
00:12:07.780 | this company from Europe.
00:12:09.100 | The advantages and drawbacks, I guess.
00:12:11.060 | Yeah, both, both.
00:12:12.980 | I guess one advantage is that you
00:12:14.900 | have a very strong junior pool of talent.
00:12:18.020 | So there's a lot of people coming
00:12:20.460 | from masters in France, in Poland, in the UK
00:12:24.260 | that we can train in like three months and get them up to speed,
00:12:27.260 | get them basically producing as much as a million-dollar
00:12:32.020 | engineer in the Bay Area at a tenth of the cost.
00:12:36.100 | So that's kind of efficient.
00:12:37.780 | Shh, don't tell them all that.
00:12:39.020 | They're going to all hire people in France.
00:12:40.820 | I'm sure.
00:12:43.300 | Like the workforce is very good, engineers and machine
00:12:47.300 | learning engineers.
00:12:50.020 | Generally speaking, we have a lot of support from the state,
00:12:53.580 | which is actually more important in Europe than in the US.
00:12:56.620 | They tend to over-regulate a bit too fast.
00:12:59.380 | So we've been telling them not to,
00:13:00.980 | but they don't always listen.
00:13:03.660 | And then generally, I mean, yeah,
00:13:06.780 | like European companies like to work with us
00:13:08.700 | because we're European and we are better
00:13:11.820 | in European languages, as it turns out.
00:13:13.740 | In French, Mistral Large
00:13:16.100 | is actually probably the strongest French-language model out
00:13:18.340 | there.
00:13:19.380 | So yeah, I guess that's not an advantage,
00:13:22.420 | but at least there's a lot of opportunities
00:13:24.300 | that are geographical and that we're leveraging.
00:13:26.300 | Wonderful.
00:13:27.180 | And paint the picture for us five years from now.
00:13:29.540 | Like I know that this world's moving so fast.
00:13:31.460 | I mean, just think of all the things
00:13:32.960 | you've gone through in the two years.
00:13:35.020 | It's not even two years old as a company.
00:13:36.820 | It's almost two years old as a company.
00:13:39.580 | But five years from now, where does Mistral sit?
00:13:42.540 | What do you think you have achieved?
00:13:44.040 | What does this landscape look like?
00:13:46.180 | So our bet is that basically the platform
00:13:49.620 | and the infrastructure of artificial intelligence
00:13:53.500 | will be open.
00:13:54.700 | And based on that, we'll be able to create assistants and then
00:13:59.040 | potentially autonomous agents.
00:14:00.820 | And we believe that we can become this platform
00:14:04.420 | by being the most open platform out there,
00:14:06.940 | by being independent from cloud providers, et cetera.
00:14:09.420 | So in five years from now, I have literally no idea
00:14:12.260 | of what this is going to look like.
00:14:13.720 | If you looked at the field in 2019,
00:14:17.160 | I don't think you could bet on where we would be today.
00:14:19.660 | But we are evolving toward more and more autonomous agents.
00:14:22.580 | They can do more and more tasks.
00:14:24.100 | I think the way we work is going to be changed profoundly.
00:14:26.860 | And making such agents and assistants
00:14:30.060 | is going to be easier and easier.
00:14:31.460 | So right now, we're focusing on the developer world.
00:14:33.740 | But I expect that AI technology is, in itself,
00:14:39.580 | so easily controllable through human language
00:14:43.660 | that potentially, at some point,
00:14:45.820 | the developer becomes the user.
00:14:47.760 | And so we are evolving toward any user
00:14:51.880 | being able to create their own assistant
00:14:54.520 | or their own autonomous agent.
00:14:56.080 | I'm pretty sure that in five years from now,
00:14:58.000 | this will be something that you learn to do at school.
00:15:02.000 | Awesome.
00:15:02.960 | Well, we have about five minutes left.
00:15:04.760 | Just want to open up in case there's any questions
00:15:06.840 | from the audience.
00:15:09.120 | Don't be shy.
00:15:09.880 | Sonia's got a question.
00:15:12.320 | How do you see the future of open source
00:15:14.120 | versus commercial models playing out for your company?
00:15:16.440 | I think you made a huge splash with open source at first.
00:15:18.820 | As you mentioned, some of the commercial models
00:15:20.780 | are even better now.
00:15:21.820 | How do you imagine that plays out
00:15:23.160 | over the next handful of years?
00:15:25.100 | Well, I guess the one thing we optimize for
00:15:26.860 | is to be able to continuously produce
00:15:28.480 | open models with a sustainable business model
00:15:30.800 | to actually fuel the development of the next generation.
00:15:36.760 | And so I think that, as I've said,
00:15:39.040 | this is going to evolve with time.
00:15:41.240 | But in order to stay relevant, we
00:15:42.880 | need to stay the best at producing open source
00:15:45.280 | models, at least on some part of the spectrum.
00:15:47.520 | So that can be the small models, that
00:15:49.020 | can be the very big models.
00:15:50.760 | And so that's very much something
00:15:52.400 | that sets the constraints of whatever we can do.
00:15:56.020 | Staying relevant in the open source world,
00:15:58.160 | staying the best solution for developers
00:16:01.320 | is really our mission, and we'll keep doing it.
00:16:05.320 | David?
00:16:06.820 | There's got to be questions from more than just
00:16:08.860 | the Sequoia partners, guys.
00:16:10.080 | Come on.
00:16:11.080 | Can you talk to us a little bit about Llama 3 and Facebook
00:16:14.960 | and how you think about competition with them?
00:16:17.360 | Well, Meta is working on Llama 3, I guess, making models.
00:16:21.040 | I'm not sure they will be open source.
00:16:22.640 | I have no idea of what's going on there.
00:16:24.520 | So far, I think we've been delivering
00:16:26.880 | faster and smaller models.
00:16:28.320 | So we expect to be continuing doing it.
00:16:30.520 | But generally, the good thing about open source
00:16:33.100 | is that it's never too much of a competition, because if you
00:16:36.560 | have several actors,
00:16:39.120 | normally that should actually benefit everybody.
00:16:43.280 | And so there should be some--
00:16:45.280 | if they turn out to be very strong,
00:16:46.840 | there will be some cross-pollination,
00:16:48.340 | and we'll welcome it.
00:16:49.920 | One thing that's made you guys different
00:16:51.680 | from other proprietary model providers
00:16:53.440 | is the partnerships with Snowflake and Databricks,
00:16:56.260 | for example, and running natively in their clouds,
00:16:58.720 | as opposed to just having API connectivity.
00:17:02.040 | Curious if you can talk about why you did those deals,
00:17:05.280 | and then also what you see as the future of, say,
00:17:07.760 | Databricks or Snowflake in the brave new LLM world.
00:17:11.500 | I guess you should ask them.
00:17:12.800 | But I think, generally speaking, AI models become very strong
00:17:16.680 | if they are connected to data and grounding information.
00:17:21.360 | As it turns out, the enterprise data
00:17:23.840 | is oftentimes either on Snowflake or on Databricks,
00:17:26.160 | or sometimes on AWS.
00:17:28.680 | And so for customers, being
00:17:32.560 | able to deploy the technology exactly where their data is
00:17:37.120 | is, I think, quite important.
00:17:38.520 | I expect that this will continue being the case,
00:17:44.440 | especially as, I believe, we'll move on to more
00:17:46.600 | stateful AI deployment.
00:17:47.920 | So today, we deploy serverless APIs with not much state.
00:17:52.240 | It's really like lambda functions.
00:17:55.560 | But as we go forward and as we make models more and more
00:17:58.040 | specialized, as we make them more tuned to use cases,
00:18:01.680 | and as we make them self-improving,
00:18:05.240 | you will have to manage state.
00:18:06.600 | And those could actually be part of the data cloud.
00:18:09.880 | So there's an open question of where do you put the AI state.
00:18:12.720 | And I think that's--
00:18:14.840 | my understanding is that Snowflake and Databricks
00:18:16.880 | would like it to be on their data cloud.
00:18:19.840 | - And I think there was a question right behind him,
00:18:22.440 | the gray sweatshirt.
00:18:24.280 | - I'm curious where you draw the line between openness
00:18:27.640 | and proprietary.
00:18:28.620 | So you're releasing the weights.
00:18:30.640 | Would you also be comfortable sharing more
00:18:32.840 | about how you train the models, the recipe for how you collect
00:18:35.600 | the data, how you do mixture of experts training?
00:18:37.720 | Or do you draw the line at, like, we release the weights
00:18:40.040 | and the rest is proprietary?
00:18:41.640 | - That's where we draw the line.
00:18:42.800 | And I think the reason for that is
00:18:44.240 | that it's a very competitive landscape.
00:18:46.400 | And so it's similar to the tension
00:18:51.160 | there is between having some form of revenue
00:18:54.000 | to sustain the next generation.
00:18:56.000 | And there's also a tension between what you actually
00:18:59.520 | disclose and what you keep--
00:19:02.600 | yeah, in order to stay ahead of the curve
00:19:05.680 | and not give your recipe to your competitors.
00:19:08.920 | And so, again, this is a moving line.
00:19:12.280 | There's also some game theory at stake.
00:19:14.760 | Like, if everybody starts doing it, then we could do it.
00:19:17.920 | But for now, we are not taking this risk, indeed.
00:19:22.720 | - I'm curious, when another company releases weights
00:19:26.680 | for a model like Grok, for example,
00:19:29.800 | and you only see the weights, what kinds of practices
00:19:33.400 | do you guys do internally to see what you can learn from it?
00:19:36.800 | - You can't learn a lot of things from weights.
00:19:39.080 | We don't even look at them.
00:19:40.440 | It's actually too big for us to deploy.
00:19:42.120 | Grok is quite big.
00:19:45.480 | - Or was there any architecture learning?
00:19:48.920 | - I guess they are using, like, a mixture of experts,
00:19:51.760 | pretty standard setting, with a couple of tricks
00:19:56.000 | that I knew about, actually.
00:19:57.640 | Yeah, there's not a lot of things
00:20:01.720 | to learn about the recipe itself
00:20:03.520 | by looking at the weights.
00:20:04.640 | You can try to infer things, but that's--
00:20:07.200 | like, reverse engineering is not that easy.
00:20:09.960 | It's basically compressing information,
00:20:11.600 | and it compresses information enough
00:20:14.240 | that you can't really find out what's going on.
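For context on the term: below is a toy, from-scratch illustration of what a mixture-of-experts layer does. It is a sketch only, with invented router weights and expert callables, not Grok's or Mistral's actual implementation.

```python
# Toy mixture-of-experts layer: a learned router scores experts per input,
# and only the top-k experts actually run, which keeps inference cheap.
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def moe_forward(x, experts, router_weights, k=2):
    """Mix the outputs of the k most relevant experts for input vector x.
    `experts` are callables mapping a vector to a same-sized vector;
    `router_weights` has one weight row per expert (all invented here)."""
    scores = [sum(w * xi for w, xi in zip(row, x)) for row in router_weights]
    gates = softmax(scores)
    top_k = sorted(range(len(experts)), key=lambda i: gates[i], reverse=True)[:k]
    norm = sum(gates[i] for i in top_k)
    out = [0.0] * len(x)
    for i in top_k:  # only k experts execute per input
        y = experts[i](x)
        out = [o + (gates[i] / norm) * yi for o, yi in zip(out, y)]
    return out
```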
00:20:16.960 | - The cube is coming.
00:20:26.040 | - It's OK.
00:20:27.680 | Yeah, I'm just curious about, like,
00:20:29.440 | what are you guys going to focus on?
00:20:31.520 | The model sizes, your opinions on that.
00:20:33.400 | Are you, like, going to still go small?
00:20:35.840 | Or, yeah, go with the larger ones, basically?
00:20:39.000 | - So model sizes are kind of set by, like, scaling laws.
00:20:43.200 | So it depends on, like, the compute you have.
00:20:45.760 | Based on the compute you have, based on the learning
00:20:48.640 | infrastructure you want to go to, you make some choices.
00:20:51.800 | And so you optimize for training cost and for inference cost.
00:20:54.360 | And then there's obviously
00:20:57.680 | a trade-off in between-- it
00:21:01.480 | depends on the weight that you put on the training cost
00:21:03.800 | amortization.
00:21:05.920 | The more you amortize it, the more you can compress models.
00:21:10.040 | But basically, our goal is to be low latency
00:21:14.400 | and to be relevant on the reasoning front.
00:21:17.480 | So that means having a family of models
00:21:19.880 | that goes from the small ones to the very large ones.
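The arithmetic behind that answer can be sketched with the approximations from the Chinchilla paper (training compute C ≈ 6·N·D FLOPs, with compute-optimal training around 20 tokens per parameter). The inference-amortization rule below is an illustrative assumption, not Mistral's recipe: the more training is amortized over inference, the more it pays to over-train a smaller model.

```python
# Back-of-the-envelope sketch using Chinchilla-style approximations:
# training FLOPs C ~ 6*N*D, compute-optimal training sets D ~ 20*N.
# The amortization weighting below is an illustrative assumption only.

def compute_optimal(flops: float, tokens_per_param: float = 20.0):
    """Split a FLOP budget into parameters N and tokens D with D = k*N,
    solving C = 6*N*(k*N) for N."""
    n = (flops / (6.0 * tokens_per_param)) ** 0.5
    return n, tokens_per_param * n

def inference_aware(flops: float, amortization: float):
    """More amortization over inference -> train a smaller model on more
    tokens, trading training compute for cheaper serving."""
    return compute_optimal(flops, tokens_per_param=20.0 * (1.0 + amortization))

if __name__ == "__main__":
    budget = 1e24  # training FLOPs
    for amort in (0.0, 1.0, 4.0):
        n, d = inference_aware(budget, amort)
        print(f"amortization={amort}: N ~ {n:.2e} params, D ~ {d:.2e} tokens")
```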
00:21:27.160 | - Hi.
00:21:27.640 | Are there any plans for Mistral to expand
00:21:30.320 | into the application stack?
00:21:32.280 | So, for example, when OpenAI released the custom GPTs
00:21:35.480 | and the Assistants API, is that the direction
00:21:37.840 | that you think that Mistral will take in the future?
00:21:40.040 | - Yeah.
00:21:40.560 | So I think, as I've said, we're really
00:21:43.040 | focusing on the developer first.
00:21:45.120 | But there's many-- like, the frontier
00:21:48.280 | is pretty thin between developers and users
00:21:50.280 | for this technology.
00:21:51.200 | So that's the reason why we released an assistant
00:21:54.840 | demonstrator called Le Chat, which is "the cat" in English.
00:21:57.920 | And the point here is to expose it to enterprises as well
00:22:02.960 | and make them able to connect their data,
00:22:06.280 | connect their context.
00:22:08.760 | I think that answers some need from our customers:
00:22:14.120 | many of the people we've been talking to
00:22:17.260 | are willing to adopt the technology,
00:22:18.760 | but they need an entry point.
00:22:19.920 | So if you just give them APIs, they're
00:22:21.800 | going to say, OK, but I need an integrator.
00:22:24.400 | And then if you don't have an integrator at hand,
00:22:26.440 | and oftentimes this is the case, it's
00:22:28.240 | good if you have an off-the-shelf solution
00:22:30.040 | at least to get them into the technology
00:22:32.200 | and show them what they could build for their core business.
00:22:34.740 | So that's the reason why we now have two product offerings.
00:22:37.240 | The first one, which is the platform,
00:22:38.780 | and then we have Le Chat, which should evolve into an enterprise
00:22:41.480 | off-the-shelf solution.
00:22:42.480 | - More over there.
00:22:47.800 | Just wondering, where would you be drawing the line
00:22:50.000 | between stop doing prompt engineering
00:22:52.320 | and start doing fine-tuning?
00:22:53.680 | Because a lot of my friends and our customers
00:22:55.760 | are suffering from where they should be,
00:22:57.960 | stop doing more prompt engineering.
00:23:00.440 | - I think that's the number one pain point
00:23:02.800 | that is hard to solve from a product standpoint.
00:23:08.680 | The question is, normally your workflow
00:23:11.400 | should be what should you evaluate on.
00:23:13.580 | And based on that, have your model find out
00:23:17.520 | a way of solving your task.
00:23:20.220 | And so right now, this is still a bit manual.
00:23:22.600 | You go and you have several versions of prompting.
00:23:26.120 | But this is something that actually AI can help solving.
00:23:29.660 | And I expect that this is going to become more and more
00:23:31.900 | automatic over time.
00:23:33.940 | And this is something that we'd love to try and enable.
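One way to read that advice is to make the eval set the decision point. A minimal sketch, with a hypothetical `complete` call and an assumed accuracy target: score your prompt variants against a fixed eval set, and only consider fine-tuning once the best prompt plateaus below the bar.

```python
# Minimal sketch of the evaluation-first workflow described above.
# `complete` and the target threshold are hypothetical placeholders.

def complete(instruction: str, example: str) -> str:
    """Stand-in for a model call that applies `instruction` to `example`."""
    raise NotImplementedError

def accuracy(instruction: str, eval_set) -> float:
    """Exact-match accuracy of one prompt variant over (input, label) pairs."""
    hits = sum(complete(instruction, x).strip() == y for x, y in eval_set)
    return hits / len(eval_set)

def prompt_or_finetune(variants, eval_set, target: float = 0.9):
    """Keep prompt engineering while it still clears the bar; otherwise the
    best prompt has plateaued and fine-tuning is the next lever."""
    scores = {p: accuracy(p, eval_set) for p in variants}
    best = max(scores, key=scores.get)
    if scores[best] >= target:
        return "ship prompt", best
    return "consider fine-tuning", best
```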
00:23:38.580 | - I wanted to ask a bit more of a personal question.
00:23:40.780 | As a founder in the cutting edge of AI,
00:23:43.700 | how do you balance your time between explore and exploit?
00:23:46.140 | How do you yourself stay on top of a field that's
00:23:48.460 | rapidly evolving and becoming larger and deeper every day?
00:23:52.100 | How do you stay on top?
00:23:53.580 | - So I think this question has--
00:23:56.340 | I mean, we explore on the science part, on the product part,
00:23:58.780 | and on the business part.
00:24:00.700 | And the way you balance it is effectively hard.
00:24:04.180 | For a startup, you do have to exploit a lot
00:24:06.460 | because you need to ship fast.
00:24:09.100 | But on the science part, for instance,
00:24:10.660 | you have two or three people that
00:24:12.060 | are working on the next generation of models.
00:24:14.820 | And sometimes they waste time.
00:24:16.060 | But if you don't do that, you are
00:24:17.500 | at risk of becoming irrelevant.
00:24:19.380 | And this is very true for the product side as well.
00:24:21.900 | So right now, we have a very simple product.
00:24:24.340 | But being able to try out new features
00:24:26.900 | and see how they pick up is something that we need to do.
00:24:31.420 | And on the business part, you never
00:24:33.180 | know who is actually mature enough
00:24:35.060 | to use your technology.
00:24:36.460 | So yeah, the balance between exploitation and exploration
00:24:41.580 | is something that we master well at the science level
00:24:43.820 | because we've been doing it for years.
00:24:46.060 | And somehow, it translates into the product and the business.
00:24:48.580 | But I guess we are currently still
00:24:50.580 | learning to do it properly.
00:24:53.740 | - So one more question from me, and then I think we'll be done.
00:24:56.540 | We're out of time.
00:24:57.340 | But in the span of two years, models big,
00:25:01.780 | models small that have taken the world by storm,
00:25:04.660 | killer go-to-market partnerships,
00:25:07.220 | just tremendous momentum at the center of the AI ecosystem.
00:25:11.260 | What advice would you give to founders here?
00:25:13.220 | What you have achieved and the pace
00:25:14.700 | at which you have achieved it is truly extraordinary.
00:25:17.700 | What advice would you give to people
00:25:19.260 | here who are at different levels of starting and running
00:25:21.080 | and building their own businesses
00:25:22.460 | around the AI opportunity?
00:25:25.100 | - I would say it's always day one.
00:25:27.020 | So I guess, yeah, we are--
00:25:30.220 | I mean, we got some mind share.
00:25:31.940 | But I mean, there's still many proof points
00:25:34.700 | that we need to establish.
00:25:36.540 | And so, yeah, like, being a founder
00:25:39.220 | is basically waking up every day and figuring out
00:25:42.420 | that you need to build everything from scratch
00:25:45.220 | every time, all the time.
00:25:46.540 | So it's, I guess, a bit exhausting.
00:25:49.420 | But it's also exhilarating.
00:25:51.740 | And so I would recommend being quite ambitious, usually.
00:25:54.860 | Being more ambitious--
00:25:57.420 | I mean, ambition can get you very far.
00:26:00.900 | And so, yeah, you should dream big.
00:26:04.260 | That would be my advice.
00:26:06.180 | - Awesome.
00:26:06.740 | Thank you, Arthur.
00:26:07.460 | Thanks for being with us today.
00:26:09.780 | (applause)