Cohere vs. OpenAI embeddings — multilingual search
Chapters
0:00 What are Cohere embeddings
0:46 Cohere v OpenAI on cost
4:37 Cohere v OpenAI on performance
6:37 Implementing Cohere multilingual model
7:55 Data prep and embedding
10:45 Creating a vector index with Pinecone
14:07 Embedding and indexing everything
17:24 Making multilingual queries
21:55 Final thoughts on Cohere and OpenAI
00:00:00.000 |
Today, we're going to take a look at Cohere's multilingual embedding model. 00:00:04.560 |
For those of you that are not aware of Cohere, 00:00:07.600 |
they are kind of similar to OpenAI in that they are essentially a service provider 00:00:13.360 |
of large language models and all of the services that come with that. 00:00:18.080 |
Now, right now they are not as well known as OpenAI, 00:00:22.840 |
which is understandable, OpenAI has been around for a bit longer, 00:00:26.120 |
but Cohere is actually a really good company that offers a lot of really good tooling 00:00:31.600 |
that is actually very much comparable to what OpenAI offers. 00:00:36.400 |
And that's actually the first thing I want to look at here. 00:00:38.600 |
I just want to show you a few comparison points 00:00:41.600 |
between Cohere and OpenAI in terms of embedding models. 00:00:46.200 |
Okay, so we're going to first take a look at the cost between these two. 00:00:49.640 |
OpenAI's sort of premier embedding model right now is Ada-002, 00:00:58.120 |
Cohere doesn't have a per-1,000-tokens cost, 00:01:03.280 |
it actually goes with $1 per 1,000 embeddings. 00:01:09.360 |
Well, basically every call, or every chunk of text that you ask Cohere to embed, counts as one embedding. 00:01:16.480 |
So one embedding, the maximum size of that is actually just over 4,000 tokens. 00:01:26.080 |
So if you max that out, as in you are sending 4,000 tokens with every embedding call, 00:01:31.720 |
then that means you would be getting this comparable price here, 00:01:36.400 |
which is actually half price, which is pretty good. 00:01:41.320 |
Now, if we kind of translate this into something that's a bit more understandable, 00:01:45.520 |
we have like 13 paragraphs is roughly about 1,000 tokens. 00:01:51.440 |
So with Ada, with OpenAI, it's $1 per 32,500 paragraphs. 00:02:02.720 |
With Cohere, if you max out each embedding call, you cover far more paragraphs per dollar, which is really good, but there is obviously a catch. 00:02:15.360 |
The chances are you're probably not going to send 4,000 tokens with every embedding call. 00:02:22.400 |
So 2,000 tokens, well, that's probably like 26 paragraphs. 00:02:29.120 |
realistically, you're probably going to do much less, right? 00:02:32.800 |
So if, let's say, you're going for more like 1,000 tokens per embedding, 00:02:41.600 |
then Cohere is actually double the price of OpenAI in this instance. 00:02:48.200 |
So it kind of depends on what you're doing there, 00:02:51.200 |
as to whether you are throwing a load of text into each embedding call or not. 00:02:59.200 |
Cohere can be cheaper, but it can also be more expensive, 00:03:13.280 |
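To sanity-check that cost math, here is a quick back-of-envelope calculation, a sketch that assumes the prices as quoted in the video ($1 per 32,500 paragraphs for Ada-002, roughly 13 paragraphs per 1,000 tokens, and $1 per 1,000 embed calls for Cohere); check the current pricing pages before relying on these numbers:

```python
# Prices as quoted in the video; both may have changed since recording.
OPENAI_PER_1K_TOKENS = 1 / 2500   # $1 / 32,500 paragraphs ~= $0.0004 per 1k tokens
COHERE_PER_EMBEDDING = 1 / 1000   # $1 per 1,000 embedding calls

def cost_per_1k_tokens_cohere(tokens_per_call: int) -> float:
    """Effective Cohere cost per 1,000 tokens when each embed call
    carries `tokens_per_call` tokens of text."""
    return COHERE_PER_EMBEDDING * 1000 / tokens_per_call

for n in (4000, 2000, 1000):
    print(n, "tokens/call ->", cost_per_1k_tokens_cohere(n), "per 1k tokens")
# At ~4,000 tokens per call Cohere works out cheaper than Ada-002;
# at ~1,000 tokens per call it works out more expensive.
```

The crossover point is simply wherever the tokens-per-call figure makes the two per-1,000-token costs equal, which is why packing more text into each call tilts the comparison toward Cohere.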
Essentially, you can run your own AWS instance. 00:03:22.720 |
in the time that it would take you to encode 1 billion paragraphs, 00:03:32.120 |
It's also a lot quicker, and there are the other benefits as well. 00:03:46.200 |
this is a good indicator of how much it's going to cost you. 00:03:53.760 |
The larger your vectors, the more storage you need to store all of your embeddings. 00:04:03.840 |
So Cohere is half the size of OpenAI in this case. 00:04:10.440 |
you would probably actually be saving money with Cohere 00:04:14.200 |
with this embedding size if you're storing a lot of vectors. 00:04:18.920 |
So, you know, that's definitely something to consider. 00:04:22.120 |
Like if you consider this with the embedding cost initially, 00:04:25.720 |
you know, maybe you're actually saving money with Cohere, 00:04:28.440 |
even if you're just embedding like 1,000 tokens at a time. 00:04:34.320 |
Long-term, you're probably going to end up saving money. 00:04:50.120 |
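As a rough sketch of that storage saving, assuming Cohere's multilingual embeddings are 768-dimensional, Ada-002's are 1,536-dimensional, and both are stored as 32-bit floats (index overhead excluded):

```python
BYTES_PER_FLOAT32 = 4

def storage_gb(num_vectors: int, dims: int) -> float:
    """Raw storage for `num_vectors` dense float32 vectors, in gigabytes."""
    return num_vectors * dims * BYTES_PER_FLOAT32 / 1e9

n = 10_000_000  # ten million vectors
print("Cohere (768d): ", storage_gb(n, 768), "GB")
print("Ada-002 (1536d):", storage_gb(n, 1536), "GB")
# The 768-dim vectors need exactly half the raw storage of the 1,536-dim ones.
```

Since vector-database pricing scales with how many vectors fit on a pod, halving the dimensionality roughly halves the storage bill, which is the saving being described here.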
And, okay, I mean, Cohere for sure is coming out on top here, 00:04:56.640 |
though it's hard to say whether this is representative across the board or not. 00:05:01.640 |
But nonetheless, the two models that are comparable here are Cohere's multilingual model 00:05:08.040 |
and OpenAI's Ada-002 model, which is English. 00:05:15.320 |
So it's pretty interesting that OpenAI's best English language model 00:05:19.080 |
is comparable to Cohere's multilingual model. 00:05:28.320 |
It's like imagine you retrieve all of your items 00:05:34.320 |
and you feed them into like a transformer model, 00:05:40.640 |
but generally speaking, it will be more accurate. 00:05:43.960 |
So I think they are pretty interesting results. 00:05:54.600 |
but it seems like Cohere, at least from what I've seen here, 00:05:58.560 |
is slightly ahead of OpenAI in terms of performance, 00:06:05.960 |
which is not the best comparison, in all fairness, 00:06:19.120 |
and it's going to depend a lot on your particular use case. 00:06:23.720 |
So it's not that Cohere is better than OpenAI. 00:06:26.280 |
It's just that in some cases, they probably are better. 00:06:30.480 |
And in some cases, they're probably cheaper as well. 00:06:37.040 |
Now, how do we actually use Cohere for embeddings? 00:06:41.160 |
So we're going to be focusing on the Cohere multilingual model. 00:06:44.360 |
And this example we're going to be running through is 00:06:53.600 |
based on a webinar that we are doing together, 00:06:59.440 |
and I've just kind of reformatted it in a way 00:07:05.160 |
that lets me show you, and kind of focus on, the multilingual search component of Cohere. 00:07:15.840 |
Right, so the first thing we need to do is our pip installs. 00:07:22.240 |
We need the Hugging Face datasets library, and the Cohere and Pinecone clients. 00:07:25.200 |
We're using the gRPC client so that we can upsert things faster. 00:07:35.720 |
So a couple of things to point out with Cohere's multilingual model: 00:07:42.960 |
it supports around 100 languages, but I think the benchmarks that they've tested it on 00:07:47.120 |
cover 16 of those languages or something around there. 00:07:51.520 |
And of course, you can create embeddings for longer chunks of text. 00:07:55.360 |
And this is the dataset we're going to be using. 00:07:58.280 |
It's some scraped data from Wikipedia that Nils put together, I believe. 00:08:04.120 |
And it's just hosted under Cohere on Hugging Face datasets. 00:08:11.560 |
For now, we're just going to look at the English and Italian subsets, 00:08:13.800 |
and we're going to see how we would embed those and create a search with them. 00:08:17.800 |
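As a sketch of the loading step, assuming the dataset in question is the Wikipedia dump Cohere hosts on Hugging Face (the dataset name `Cohere/wikipedia-22-12` here is my assumption; check the notebook used in the video), streaming one language at a time might look like this, with a tiny made-up offline sample as a fallback:

```python
import os

def load_wiki(lang: str):
    """Stream Wikipedia passages for one language. Falls back to a
    hard-coded illustrative sample when streaming is not enabled, so
    the rest of the pipeline can be tried offline."""
    if os.environ.get("STREAM_HF_DATA"):
        from datasets import load_dataset  # pip install datasets
        return load_dataset("Cohere/wikipedia-22-12", lang,
                            split="train", streaming=True)
    # Illustrative offline sample; these are not real dataset rows.
    return [
        {"id": 0, "title": "Sample", "text": f"A sample {lang} passage.",
         "url": "https://example.org"},
    ]

for record in load_wiki("en"):
    print(record["text"])
```

Streaming mode avoids downloading the full dump up front, which matters for a dataset of this size.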
And then what I'm going to do is switch across to an example 00:08:38.040 |
So if we're embedding these chunks one at a time, 00:08:41.000 |
maybe it would be more expensive using Cohere. 00:08:43.680 |
But I think, in reality, we could put a lot more of these together. 00:08:48.280 |
So we could put together like five of these chunks or more 00:09:18.880 |
From there, you take your API key and you just put it in here. 00:09:29.920 |
Cool. Then this is how you would embed something, right? 00:09:33.520 |
So we have a list of texts that we would like to embed 00:09:43.000 |
So co is just the client that we've initialized up here. 00:09:46.040 |
So co.embed takes your texts, and then you have your model. 00:09:50.120 |
This is the only multilingual model that Cohere offers at the moment. 00:09:54.920 |
But, I mean, if you compare that to OpenAI right now, they don't offer a multilingual embedding model at all. 00:10:01.200 |
So I think Cohere have taken the lead with that, which is pretty cool. 00:10:10.440 |
It gives us a response and it has a lot of information in there. 00:10:17.560 |
And then we see the dimensionality of those embeddings, which is 768, 00:10:25.400 |
And then we have two of those vector embeddings there, right? 00:10:34.720 |
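Putting that together, a minimal sketch of the embed call, assuming the model name `multilingual-22-12` (the multilingual model Cohere offered at the time of recording) and falling back to a deterministic stub when no `COHERE_API_KEY` is set, so the shape logic can be checked offline:

```python
import os

def embed(texts):
    """Embed with Cohere's multilingual model; uses a zero-vector stub
    when no API key is set so the pipeline can be exercised offline."""
    api_key = os.environ.get("COHERE_API_KEY")
    if api_key:
        import cohere  # pip install cohere
        co = cohere.Client(api_key)
        # Model name as used at the time of recording; check current docs.
        return co.embed(texts=texts, model="multilingual-22-12").embeddings
    # Offline stub: 768 dimensions, matching the model's output size.
    return [[0.0] * 768 for _ in texts]

vecs = embed(["Hello world", "Ciao mondo"])
print(len(vecs), "vectors of dimension", len(vecs[0]))
```

The response object also carries other information, but `.embeddings` is the part we need downstream.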
All right, now that's how we would use Cohere's embedding model. 00:10:40.080 |
But before we move on to actually creating our index, 00:10:43.120 |
where we're going to store all of those embeddings, 00:10:47.320 |
So we're going to be using a vector database called Pinecone for this. 00:10:52.000 |
Now, for Pinecone, again, we need an API key, which we can get from over here. 00:11:06.920 |
So come over here, I can already see I have a couple of indexes in here. 00:11:11.480 |
If this is your first time using Pinecone, it will be empty, 00:11:14.400 |
and that's fine because we're going to create the index in the code. 00:11:23.920 |
You would copy your API key, take it over into your notebook, and you would paste it here. 00:11:23.920 |
Now, your environment is next to the API key in the console, right? 00:11:40.480 |
Your environment is not necessarily going to be the same as mine. 00:11:45.720 |
Okay, great. So that has initialized, and then we come down here, 00:11:49.920 |
and what we're going to do here is initialize an index, 00:11:53.080 |
which is where we're going to store all of these embeddings. 00:11:56.160 |
Now, you give your index a name. It doesn't matter what you call it. 00:12:02.240 |
But there are a few things that are important here 00:12:09.080 |
Dimension is the dimensionality of your embedding, which is 768 for this model. 00:12:25.000 |
The smaller that is, the cheaper it's going to be to store all of your vectors. 00:12:30.400 |
So we need that, and our index needs to know this value. 00:12:34.520 |
So it needs to know the expected dimensionality 00:12:39.200 |
Then we have our metric, which is dot product. 00:12:42.200 |
This is needed by Cohere's multilingual model. 00:12:47.520 |
If you look on the, I think, the about page for the multilingual model, it tells you to use dot product. 00:12:54.080 |
And then these parameters here, you can actually leave them empty. 00:13:03.040 |
So S1 is basically the storage-optimized pod for Pinecone, 00:13:08.560 |
which means you can put in about 5 million vectors in here 00:13:14.480 |
And then there's also P1, which is like the speed-optimized version, 00:13:19.280 |
which enables you to put in around 1 million vectors for free. 00:13:25.240 |
And then pods is the number of those pods you need. 00:13:28.040 |
So if you needed 10 million vectors, we'd say, "Okay, we need two pods here." 00:13:33.000 |
Cool, but we just need one. We're not putting that much in there. 00:13:39.640 |
We use this GRPCIndex; we could also use the standard Index, 00:13:44.040 |
but the gRPC index is just more stable and a bit faster. 00:13:51.040 |
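A sketch of the index setup described above, assuming the `pinecone-client` 2.x API in use at the time of recording (newer client versions differ); the real calls only run when a `PINECONE_API_KEY` is present, and the environment value here is just a placeholder:

```python
import os

index_name = "cohere-wiki"   # the name doesn't matter
dimension = 768              # must match the embedding model's output size
metric = "dotproduct"        # recommended for Cohere's multilingual model

if os.environ.get("PINECONE_API_KEY"):
    import pinecone  # pip install "pinecone-client[grpc]"
    pinecone.init(
        api_key=os.environ["PINECONE_API_KEY"],
        # Your environment sits next to the API key in the console;
        # "us-east1-gcp" is only a placeholder.
        environment=os.environ.get("PINECONE_ENV", "us-east1-gcp"),
    )
    if index_name not in pinecone.list_indexes():
        pinecone.create_index(index_name, dimension=dimension,
                              metric=metric, pod_type="s1", pods=1)
    index = pinecone.GRPCIndex(index_name)
    # On a freshly created index this reports a vector count of zero.
    print(index.describe_index_stats())
```

The key detail is that `dimension` and `metric` are fixed at creation time, so they have to agree with the embedding model before anything is upserted.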
And then we're going to describe the index stats. 00:13:56.760 |
So for you, when you're running through this the first time, the total vector count here will be zero. 00:14:07.640 |
Now, with the embedding model and vector index itself, 00:14:11.920 |
we can move on to actually indexing everything. 00:14:14.360 |
So basically, we're just going to loop through our dataset, 00:14:20.760 |
So we're going to embed things with Cohere, 00:14:23.760 |
and then what we're going to do is with those embeddings, 00:14:28.840 |
Actually, I don't think I showed you how we do that, 00:14:47.680 |
The line limit, so this is the number of records 00:14:50.480 |
from each language that we would like to include, 00:14:55.320 |
We have our data here, so I'm just formatting this 00:15:02.240 |
And errors, and this is just so we can store a few errors, 00:15:05.440 |
because every now and again, we might hit one, 00:15:09.960 |
It's not necessary, but there are ways to avoid it, basically, 00:15:13.120 |
and they're not that hard, but for simplicity's sake, we'll just log them. 00:15:16.960 |
So here, I'm just saying, don't go over the line limit, 00:15:21.520 |
and then we're going through English and Italian one at a time. 00:15:29.360 |
So it's actually just the iterable of the data, 00:15:49.200 |
ID, and also including text in there, as well. 00:15:53.800 |
Then what we do is we create this metadata list of dictionaries. 00:15:58.320 |
Now, each dictionary is going to contain some text, 00:16:03.920 |
the URL of the record, and also the language, 00:16:08.520 |
Then what we do is we add everything like this. 00:16:15.160 |
There's nothing too complicated going on there. 00:16:19.160 |
The one thing that I have added in there is that occasionally some of these text chunks are very long. 00:16:30.600 |
For some reason, not all of them are split evenly, 00:16:37.960 |
and they actually exceed the metadata limit in Pinecone. 00:16:45.080 |
So basically, we can add up to around 10 kilobytes of text to each record's metadata, 00:16:53.440 |
but some of them go over that, and they will throw an error. 00:16:56.680 |
So I'm actually, for now, I'm just skipping those. 00:17:00.840 |
The better way of handling it is you would chunk those larger chunks of text 00:17:03.920 |
into smaller chunks, and then just add them individually. 00:17:24.440 |
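The indexing loop described above can be sketched like this, with a stub in place of the real Cohere embed call (labeled as such), batched upserts, and a skip for any record whose metadata would exceed roughly 10 kB; the batch size and exact limit are assumptions:

```python
import json

METADATA_LIMIT_BYTES = 10_000   # approximate Pinecone per-record metadata limit
BATCH_SIZE = 64

def build_batches(records, lang):
    """Yield batches of (id, metadata) pairs, skipping any record whose
    metadata would exceed the limit (better: split it into smaller chunks)."""
    batch, skipped = [], []
    for rec in records:
        meta = {"text": rec["text"], "url": rec["url"], "language": lang}
        if len(json.dumps(meta).encode("utf-8")) > METADATA_LIMIT_BYTES:
            skipped.append(rec["id"])   # logged, not upserted
            continue
        batch.append((f"{lang}-{rec['id']}", meta))
        if len(batch) == BATCH_SIZE:
            yield batch
            batch = []
    if batch:
        yield batch

def embed_stub(texts):
    """Stand-in for co.embed(texts=..., model=...).embeddings."""
    return [[0.0] * 768 for _ in texts]

records = [{"id": i, "text": f"passage {i}", "url": "https://example.org"}
           for i in range(3)]
for batch in build_batches(records, "en"):
    vectors = embed_stub([meta["text"] for _, meta in batch])
    upserts = [(rec_id, vec, meta)
               for (rec_id, meta), vec in zip(batch, vectors)]
    # index.upsert(vectors=upserts)   # the real call, needs the Pinecone index
    print("would upsert", len(upserts), "records")
```

Prefixing the IDs with the language code keeps English and Italian records distinct even when the source IDs overlap.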
Now, what we're going to do, and this is the more interesting part, is querying. 00:17:30.760 |
So to search through, what we do is we take a query, 00:17:39.760 |
So embed is exactly the same as what we did before 00:17:43.280 |
with cohere, and then we query with that embedding, xq here. 00:17:48.800 |
And we return the top three most similar items. 00:18:01.160 |
And then we return it in this kind of format. 00:18:27.840 |
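The query side can be sketched like this; the fake response below is illustrative data in the shape Pinecone returns (in the real notebook you would embed the query with Cohere and call `index.query` instead), and `format_results` is just a hypothetical helper that flattens it into readable rows:

```python
def format_results(response):
    """Flatten a Pinecone-style query response into (score, language, text) rows."""
    return [
        (round(m["score"], 2), m["metadata"]["language"], m["metadata"]["text"])
        for m in response["matches"]
    ]

# In the real notebook:
#   xq = co.embed(texts=[query], model="multilingual-22-12").embeddings
#   response = index.query(xq, top_k=3, include_metadata=True)
fake_response = {
    "matches": [
        {"score": 0.91,
         "metadata": {"language": "it", "text": "L'arancino e una specialita siciliana..."}},
        {"score": 0.87,
         "metadata": {"language": "en", "text": "Arancini are Sicilian rice balls..."}},
    ]
}
for row in format_results(fake_response):
    print(row)
```

Because both the query and the stored passages go through the same multilingual model, an English query can surface Italian passages, which is the behavior demonstrated next.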
because we don't have that much data in here, 00:19:02.320 |
but if you look on Wikipedia for him in English, 00:19:08.440 |
but there isn't really that much information there. 00:19:12.720 |
we're just getting like Italian results here. 00:19:17.760 |
So this is another one that I think in the English Wikipedia, 00:19:35.480 |
So let's switch across to the larger data set, 00:19:38.240 |
and I'll show you what the results look like there, 00:20:18.680 |
Right, we get this, literally three paragraphs. 00:20:29.200 |
or why being able to search the Italian stuff is useful, 00:20:39.520 |
So I'm going to ask what is Arancino, but I'm going to spell it wrong 00:20:43.720 |
just to point out the fact that it can actually handle that. 00:20:59.920 |
I kind of half expected to get it wrong anyway. 00:21:02.760 |
All right, so it's, we can go on here, see what it says. 00:21:07.160 |
So Arancino is a speciality of Sicilian cuisine. 00:21:22.840 |
So Arancino, pizza, and fiori di zucca, it's amazing. 00:21:55.760 |
Okay, so that's it for this introduction to Cohere. 00:21:59.680 |
I feel like it was a bit longer than I had intended it to be, 00:22:03.560 |
but that's fine, I'm hoping that it was at least useful 00:22:06.400 |
and we kind of went through a lot of things there. 00:22:19.120 |
I think that is very much going to depend on your use case, 00:22:22.120 |
what you're doing, and many other factors, right? 00:22:29.400 |
then you're probably going to get some pretty good performance as well. 00:22:36.600 |
is actually the multilingual aspect of this model. 00:22:40.320 |
At the moment, OpenAI doesn't have any multilingual embedding models, 00:22:46.160 |
although some of their models, I think, can handle multilingual queries relatively well, 00:22:56.760 |
especially when you're dealing with multinational companies 00:22:59.880 |
or just companies that are not American or British or Australian. 00:23:07.560 |
The rest of the world speaks different languages, 00:23:10.640 |
so having this multilingual model is pretty good. 00:23:16.120 |
So, yeah, I mean, this is still very early days for Cohere. 00:23:21.520 |
I'm pretty excited. I know they have a lot planned 00:23:31.840 |
I hope all this has been useful and interesting.