AI Engineer World’s Fair 2024 - Open Models track

00:00:00.000 |
to do this out of the box, but the community works around our models, and we learn from 00:00:07.000 |
the community how our models can be customized or deployed in a few seconds. So how are these 00:00:13.000 |
open source models trained? So I'll give you a very high level overview of the different 00:00:20.000 |
stages of LLM training. And typically LLMs are trained in three stages: pre-training, 00:00:26.000 |
instruction tuning, and learning from human feedback. So the idea behind pre-training is very simple. You take 00:00:32.000 |
a piece of text and you pass word by word or token by token to the large language model 00:00:42.000 |
and ask the model to predict the next token. So the idea itself is very simple. The task 00:00:51.000 |
is the next token prediction. Each token is roughly 0.75 words. The vocabulary size is 00:00:56.000 |
roughly tens of thousands of tokens or sometimes hundreds of thousands. And each token is basically 00:01:02.000 |
represented as an integer and it has an embedding associated with it. And so the task of the model 00:01:07.000 |
is to take in a sequence of token embeddings and predict the next token. Although the concept 00:01:14.000 |
is very simple, in practice it's actually very hard. Why is it hard? Because it requires 00:01:21.000 |
a lot of effort in building the data sets. The data sets are huge. They are on the order of 00:01:26.000 |
trillions of tokens, tens of trillions of tokens. This requires pre-processing, cleaning, deduplication, 00:01:31.000 |
curation. And again, a common belief is that more data leads to better performance, but that's not 00:01:44.000 |
necessarily the case. If you have noise in your data, that can actually hurt the model performance. It 00:01:49.000 |
requires a lot of investment. These models are huge. You can go up to hundreds of billions 00:01:56.000 |
or even trillions of parameters. Each model takes tens to hundreds of millions of dollars to 00:02:02.000 |
train. And the hardest part is you don't get multiple chances to train the model. Because 00:02:11.000 |
it's so expensive, if something goes wrong in your training, it's very difficult to get the 00:02:21.000 |
investment to do another training run. Because typically for small companies, you don't get 00:02:26.000 |
that kind of budget if you do a model run and it's not successful. It becomes harder to get 00:02:32.000 |
the funding for the next run. And this is hard because the best hyperparameters for a smaller model 00:02:40.000 |
might not be the best for a larger model. Here I'm showing you some hyperparameters for the Llama 1 00:02:50.000 |
model family sizes. And you might ask why is the number of layers 80 and not 82 in Llama 65B? And 00:03:03.000 |
the answer is we don't know. There's a lot of things that have been decided by intuition. And it's not 00:03:13.000 |
exact science. So you need a lot of experience and intuition working with these models to come up with 00:03:20.000 |
things that are very likely to work. But we don't -- we are still not very mature with the science of what is the best way to train the model or what's the best architecture. 00:03:31.000 |
What's the best data set mixture? So can we use this pre-trained model? So let's say you want to use this pre-trained model and ask it to write a Python function to find whether the input number is prime or not. And the model might give you a response like this. 00:03:49.000 |
It continues the text, gives an example, and like describes the approach, but it might not give you the code. And this is because the model is trained to do this. It's trained to predict the next token. So it predicts the most likely token from the text data it's been trained on. 00:04:07.000 |
But there is a way to prompt the model. If you give this input as a Python function definition and a docstring for the same function, the model actually produces the code. And so this shows you that the model actually knows the answer, but it is not aligned with human preferences. It's not trained to interact with humans in the way humans want to. And this is why we need the next two stages. 00:04:35.000 |
So in the instruction tuning stage, instead of just a string of text, we have prompt-response pairs. So here we are giving the prompt in the way humans want to interact with the model. For example, this prompt is "write a Python function" and the response is directly the code, because that's what humans want as a response. 00:05:03.000 |
So the technique is very simple. Again, we are doing next token prediction, but the only difference is we are going to mask the prompt itself. We are going to do prediction only for the response. So the dataset is made of prompt-response pairs. We typically use hundreds of thousands of instructions. The task is next word prediction; we just mask the input instruction. It requires way less compute. 00:05:31.000 |
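To make the prompt-masking idea concrete, here is a minimal sketch, not the speaker's actual training code, using Hugging Face Transformers: prompt positions in the labels are set to -100 so the cross-entropy loss is computed only over the response tokens.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Any causal LM works the same way; gpt2 is just a small, convenient stand-in.
tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

prompt = "Write a Python function that checks whether a number is prime.\n"
response = "def is_prime(n):\n    return n > 1 and all(n % d for d in range(2, int(n**0.5) + 1))"

prompt_ids = tokenizer(prompt, return_tensors="pt").input_ids
response_ids = tokenizer(response, return_tensors="pt").input_ids
input_ids = torch.cat([prompt_ids, response_ids], dim=1)

# Same next-token objective as pre-training, but the prompt positions are
# masked with -100, which the loss ignores, so only the response is learned.
labels = input_ids.clone()
labels[:, : prompt_ids.shape[1]] = -100

loss = model(input_ids=input_ids, labels=labels).loss
print(float(loss))
```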
And then the last step is learning from human feedback. And here the idea is that human preferences are cheaper or easier to obtain than full human annotation. If I give you a prompt like this and two responses, it's much easier for a human to decide which response is better than to write the whole response from scratch. 00:05:59.000 |
And so this allows us to scale data faster. And there are two main techniques: reinforcement learning from human feedback and direct preference optimization, where we use this kind of preference data to fine-tune the model further. Just to summarize, these are the three stages. 00:06:45.380 |
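As a rough illustration of the preference-based objective mentioned here, this is a sketch of the standard DPO loss, not the speaker's implementation: the model is pushed to prefer the chosen response over the rejected one, relative to a frozen reference model.

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Each argument is log p(response | prompt), summed over response tokens,
    under either the trained policy or the frozen reference model."""
    # How much more the policy prefers "chosen" over "rejected" than the reference does.
    margin = (policy_chosen_logp - policy_rejected_logp) - (ref_chosen_logp - ref_rejected_logp)
    # Maximize the probability that the chosen response wins this comparison.
    return -F.logsigmoid(beta * margin).mean()

# Toy usage with made-up log-probabilities for one preference pair.
loss = dpo_loss(torch.tensor([-12.0]), torch.tensor([-15.0]),
                torch.tensor([-13.0]), torch.tensor([-14.0]))
print(float(loss))
```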
but I'll show you this nice graph of performance to cost ratio, 00:06:50.380 |
which kind of shows that we really try to optimize 00:06:59.300 |
We try to get the best performance out of our models 00:07:03.140 |
So here on the x-axis, we have the active parameters, 00:07:10.780 |
And on the y-axis, we have a popular benchmark, MMLU. 00:07:29.400 |
And again, we are trying to optimize performance and speed. 00:07:33.980 |
It's fluent in 80 plus programming languages. 00:07:36.720 |
And it has both instruct and fill in the middle mode, 00:07:41.420 |
which means that you can use it for code completion 00:07:43.920 |
in your code editor, just like GitHub Copilot. 00:08:01.300 |
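As an illustration of how fill-in-the-middle completion is typically called, here is a hedged sketch: the `Mistral` client, the `fim.complete` method, and the `codestral-latest` model name are taken from the public Python SDK and may differ by version. You pass the code before and after the cursor, and the model fills in the middle.

```python
import os
from mistralai import Mistral  # assumed SDK entry point; check your installed version

client = Mistral(api_key=os.environ["MISTRAL_API_KEY"])

# Code before and after the cursor in the editor.
prefix = "def is_prime(n: int) -> bool:\n    if n < 2:\n        return False\n"
suffix = "\n    return True\n"

# Fill-in-the-middle request: the model generates only the missing middle part.
response = client.fim.complete(
    model="codestral-latest",
    prompt=prefix,
    suffix=suffix,
)
print(response.choices[0].message.content)
```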
Code Llama 70B, while being a significantly smaller model. 00:08:04.540 |
So again, we are getting more performance out of a model 00:08:17.060 |
We trained it with more than 80 programming languages. 00:08:25.160 |
it tends to perform better than the other models. 00:08:33.140 |
We also have the API access available on La Plateforme, 00:08:43.160 |
And here it's also free to use till I believe end of July. 00:08:47.900 |
We also have integration with VS Code and JetBrains. 00:08:52.640 |
So you can download a plugin in VS Code or JetBrains 00:08:57.800 |
and use it as a coding assistant for code completion. 00:09:09.760 |
because these are some commonly asked questions 00:09:19.240 |
So if you have a particular application in mind 00:09:28.500 |
and you could also do retrieval augmented generation 00:09:30.800 |
because commercial models typically don't allow you 00:09:38.780 |
you can do task-specific fine tuning as well. 00:09:42.460 |
You need a little bit of data and compute for this. 00:09:47.620 |
the choice is between how do you balance performance 00:09:53.160 |
Commercial models have a higher general purpose performance. 00:09:58.200 |
if you are trying to build a new application. 00:10:38.580 |
We are always looking for talented researchers, 00:11:39.120 |
and you can tell it's not cheap to do those things. 00:13:43.360 |
You go to OpenAI, you prompt the model to death, 00:13:55.260 |
but this has like very fundamental implications 00:14:05.220 |
because we can trust the APIs or the pieces of code. 00:14:09.260 |
And here, as you know and as you've probably witnessed, 00:14:12.560 |
you can't actually trust a large language model. 00:14:16.720 |
And, you know, in short, the technology for agents 00:14:23.900 |
So the good news is that structured generation, 00:14:48.400 |
And then try to convince you that you should use it today 00:14:52.460 |
for, you know, most of the workflows that you have to deal with. 00:14:56.300 |
And sort of a very short glimpse into the near future. 00:14:59.800 |
So Outlines, a Python library, emphasis on library. 00:15:05.000 |
You can actually include Outlines in your workflow. 00:15:07.740 |
And it's not like frameworks where you have to shape your workflow around them, 00:15:13.320 |
So I think as a result, it's been adopted by vLLM and TGI 00:15:19.760 |
And if you use function calling in either of these libraries, 00:15:22.560 |
you're actually using Outlines under the hood. 00:15:24.840 |
So I'm a co-author, but Outlines would be nothing without 00:15:34.640 |
I think it might be 60-odd contributors, I don't remember. 00:15:37.140 |
And so Outlines would be nothing without all these people 00:15:42.380 |
People thought we were crazy about a year ago 00:15:46.320 |
when we were talking about trusted generation. 00:15:48.660 |
But since then, I'm pretty happy, because it looks like people have sort of 00:15:53.200 |
caught up with the topic and realized that you can actually, 00:15:56.300 |
you know, you can actually trust the output, so. 00:16:04.440 |
So usually generating text happens in three stages. 00:16:07.480 |
The first step is that you need a model and then create it. 00:16:11.180 |
So Outlines is purely focused on open-source models. 00:16:14.280 |
We have integration with six different model providers, 00:16:17.260 |
transformers, llama.cpp, and also recently we added MLX. 00:16:25.600 |
but that's mostly for us to compare the results 00:16:28.140 |
that we get with open models with the results given by OpenAI. 00:16:32.480 |
The second step is to, I mean, generate text. 00:16:35.940 |
What you do is that you instantiate a generator 00:16:39.940 |
Here we just want to, you know, return a single sentence. 00:16:48.720 |
then you call the generator with your prompt. 00:16:51.420 |
Here it is: "Describe the benefits of structured generation in one sentence." 00:16:55.000 |
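A minimal sketch of those two steps, based on the Outlines API around the time of this talk (module paths may differ in later versions; the model name is an arbitrary choice):

```python
import outlines

# Step 1: load an open-source model through the transformers integration.
model = outlines.models.transformers("mistralai/Mistral-7B-Instruct-v0.2")

# Step 2: instantiate a plain text generator and call it with the prompt.
generator = outlines.generate.text(model)
answer = generator("Describe the benefits of structured generation in one sentence.")
print(answer)
```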
And you'll have to wait for 70 minutes, hopefully less. 00:17:00.280 |
Okay, now we're getting to structured generation. 00:17:06.040 |
if you ask what is the IP address of Google's public DNS servers, 00:17:13.420 |
Then generally it will yap for a long time. 00:17:34.000 |
this is the structure that the output should follow. 00:17:36.520 |
And as you see, you kind of remove the yapping, 00:17:38.780 |
you just call generate.regex. 00:17:49.660 |
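A hedged sketch of that regex-constrained call; the IPv4 pattern below is illustrative, not necessarily the one shown on the slide.

```python
import outlines

model = outlines.models.transformers("mistralai/Mistral-7B-Instruct-v0.2")

# Constrain the output to an IPv4-shaped string: four 1-3 digit groups separated by dots.
ip_regex = r"((25[0-5]|2[0-4]\d|[01]?\d?\d)\.){3}(25[0-5]|2[0-4]\d|[01]?\d?\d)"
generator = outlines.generate.regex(model, ip_regex)

answer = generator("What is the IP address of Google's public DNS servers?")
print(answer)  # only strings matching the regex can be produced, e.g. 8.8.8.8
```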
Regular expressions are not the only way to define structure. 00:17:53.940 |
Something that people need a lot in practice is actually JSON. 00:17:57.640 |
It allows you to generate text that, you know, 00:18:06.280 |
The way you specify the structure is using JSON schema. 00:18:13.220 |
Now you might notice on the flight information, 00:18:18.500 |
you're extracting flight information from the email. 00:18:20.900 |
I could have used string as a type for origin and destination, 00:18:25.300 |
It's actually a custom type that we implemented in Outlines, 00:18:27.940 |
and the reason is that origin and destination 00:18:34.380 |
it's an airport code that has three letters, it's standardized, 00:18:37.420 |
and you can actually specify more and more structure, 00:18:40.340 |
all the structure that you have in your problem, basically. 00:18:48.260 |
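A hedged sketch of that JSON-constrained extraction; the Pydantic model and the regex-constrained airport-code fields below are illustrative stand-ins for the custom type mentioned in the talk.

```python
import outlines
from pydantic import BaseModel, Field

class FlightInfo(BaseModel):
    # Three-letter airport codes, approximated here with a simple regex pattern.
    origin: str = Field(pattern=r"^[A-Z]{3}$")
    destination: str = Field(pattern=r"^[A-Z]{3}$")
    date: str

model = outlines.models.transformers("mistralai/Mistral-7B-Instruct-v0.2")

# The generator is constrained by the JSON schema derived from the Pydantic model,
# so the output always parses into a FlightInfo instance.
generator = outlines.generate.json(model, FlightInfo)
email = "Hi, just booked my trip: flying from SFO to JFK on 2024-06-28. See you there!"
flight = generator(f"Extract the flight information from this email: {email}")
print(flight.origin, flight.destination, flight.date)
```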
So here we took, I think it's a picture from Wikipedia of a dish. 00:18:53.800 |
We tell the model what is the JSON that we expect as an output, 00:19:03.200 |
and then pass the image along with the prompt to the generator, 00:19:10.480 |
and you get the benefits of structured generation, 00:19:16.020 |
Now I'm going to try to very quickly explain how it works. 00:19:20.660 |
So models themselves, what the big model providers of this world are doing, 00:19:35.500 |
It's a probability distribution over the next token. 00:19:40.640 |
the first step is that you have a logits processor, 00:19:44.640 |
You probably use this every day actually without noticing it. 00:20:11.800 |
and we say, if I add this token to the current generation, 00:20:23.880 |
What is really hard is doing that efficiently. 00:20:28.280 |
and that's what makes it different from the other libraries, 00:20:30.380 |
like Guidance or others, when it comes to structured generation. 00:20:47.540 |
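Conceptually, the biasing step looks like the sketch below. This is an illustrative logits mask, not Outlines' actual implementation, which precompiles the allowed-token sets into a finite-state machine so this step adds almost no overhead.

```python
import torch

def mask_logits(logits: torch.Tensor, allowed_token_ids: list[int]) -> torch.Tensor:
    """Keep only tokens that are valid continuations of the structure.

    `logits` has shape (vocab_size,); every token outside `allowed_token_ids`
    gets -inf so it can never be sampled.
    """
    mask = torch.full_like(logits, float("-inf"))
    mask[allowed_token_ids] = 0.0
    return logits + mask

# Toy usage: a 10-token vocabulary where only tokens 2 and 7 keep the structure valid.
logits = torch.randn(10)
masked = mask_logits(logits, allowed_token_ids=[2, 7])
next_token = torch.argmax(masked).item()
print(next_token)  # always 2 or 7
```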
The first reason is that the text we expect is structured. 00:20:54.240 |
But here I just took the GSM8K dataset itself. 00:21:04.500 |
You can actually see that it's highly structured. 00:21:07.080 |
It's always a few sentences of text until a question mark, 00:21:15.620 |
and you could actually express it in Outlines, 00:21:17.580 |
and just let it generate the answer at the end. 00:21:21.480 |
So there's a lot of structured text out there. 00:21:31.060 |
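As an illustration of expressing that kind of question-and-answer structure as a constraint, here is an invented regex, not the one from the talk, for a GSM8K-style answer.

```python
import outlines

model = outlines.models.transformers("mistralai/Mistral-7B-Instruct-v0.2")

# Illustrative pattern: a few reasoning sentences followed by "The answer is <number>."
answer_regex = r"([A-Za-z0-9,\+\-\*/= ]+\. ){1,6}The answer is \d+\."
generator = outlines.generate.regex(model, answer_regex)

question = "Tom has 3 boxes with 12 apples each. How many apples does he have?"
print(generator(f"Q: {question}\nA: "))
```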
Of course, the second benefit is that you get valid structure, 00:21:41.460 |
It's just crazy stuff to get valid JSON as an output. 00:21:45.840 |
And here, with Outlines, you just sample what you want. 00:21:51.800 |
it's actually an experiment that Predibase ran. 00:21:55.740 |
They used a version of CoNLL that they modified 00:22:00.120 |
What they found is Mistral-7B-v0.1 only gets valid JSON 00:22:06.000 |
When you add structured generation on top of it, 00:22:08.300 |
you get 99.9%, and that's more like it. 00:22:15.460 |
The nice thing is it also adds negligible overhead, 00:22:28.580 |
Here, we compared the overhead introduced by guidance 00:22:34.980 |
as a function of the number of generated tokens, 00:22:39.120 |
Outlines stays approximately zero until the end 00:22:56.400 |
it is faster to generate text with structured generation. 00:23:07.920 |
I don't need to ask the model to generate those tokens. 00:23:13.920 |
only five out of ten tokens need to be generated, 00:23:23.660 |
And this is the example that we took at the beginning. 00:23:26.680 |
So here, I asked ChatGPT, like, a good model, 00:23:31.700 |
What is the IP address, like, of Google's public DNS servers? 00:23:51.180 |
So that's a subtle way in which it accelerates 00:24:03.320 |
So here what you're looking at is the accuracy 00:24:16.400 |
so the number of examples that you give to the model 00:24:23.440 |
for structured versus normal, one shot is worth an 8-shot. 00:24:28.360 |
But what we found with structured is that you actually 00:24:31.760 |
get in the same ballpark 00:24:37.660 |
which is surprising for a machine learning person. 00:24:39.760 |
Like, you would think that examples are there 00:24:45.300 |
to teach the model about the structure of the problem. 00:24:49.240 |
There's more investigation to be done on this, 00:24:55.120 |
And after faster, something a lot of people care about here: 00:25:03.260 |
Here, what you're looking at is the Berkeley function 00:25:07.340 |
calling leaderboard simple function benchmark, 00:25:25.020 |
So this, out of the factory, is actually a pretty good model. 00:25:47.440 |
And the second thing is that we have open models 00:26:01.860 |
And that's why I'm really bullish on open models. 00:26:05.420 |
you can actually extract a lot more out of these models. 00:26:12.160 |
The work that I just showed you is what we did 00:26:17.100 |
Since then, we've generalized from regular expressions 00:26:23.080 |
Context-free grammars are used to define code. 00:26:27.440 |
I mean, and to define as well what I showed you earlier 00:26:32.220 |
So we can do the same thing, structured generation 00:26:34.820 |
with no overhead, with this context-free grammar. 00:26:42.600 |
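A hedged sketch of grammar-constrained generation, assuming the experimental `outlines.generate.cfg` entry point; the Lark-style grammar below is illustrative, not one from the talk.

```python
import outlines

model = outlines.models.transformers("mistralai/Mistral-7B-Instruct-v0.2")

# A tiny Lark-style grammar for arithmetic expressions.
arithmetic_grammar = """
?start: expr
?expr: expr "+" term | expr "-" term | term
?term: term "*" factor | term "/" factor | factor
?factor: NUMBER | "(" expr ")"
%import common.NUMBER
"""

generator = outlines.generate.cfg(model, arithmetic_grammar)
print(generator("Write an arithmetic expression that evaluates to 42: "))
```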
like adding some semantic constraints to the generation. 00:26:52.480 |
Usually what they get wrong is the table and column names. 00:26:56.180 |
And internally we were able to get perfectly valid SQL. 00:27:00.980 |
So I can't guarantee you that the query will be correct 00:27:12.500 |
Oh yeah, and we're also starting to bubble up computations 00:27:16.400 |
from the structured generation into the model architecture. 00:27:19.640 |
Because when you think about it, we're biasing logits. 00:27:23.600 |
the model is actually doing computations for nothing. 00:27:31.700 |
And that's all work that we actually published 00:27:33.440 |
in a blog post, I think, in the next couple of weeks. 00:27:37.220 |
So all that to say is that if you're doing -- 00:27:42.420 |
there's a really good chance that you will be using 00:27:46.620 |
it's just a matter of time until you adopt it, I think. 00:29:24.800 |
That was a little mind-blowing that you could go all the way down into the logits and get all sorts of structure and speed out of that. 00:29:30.880 |
So, still thinking about that for days after this. 00:29:40.720 |
She does an awesome podcast on YouTube that you should definitely go check out after this. 00:29:44.860 |
But excited to hear what they're cooking at Cohere. 00:30:09.160 |
It's my first time at AI Engineer and honestly, I'm loving it. 00:30:15.020 |
On the way from the airport, I came from Warsaw, Poland. 00:30:18.920 |
I saw so many billboards mentioning AI in it, but I really felt like I'm at home. 00:30:25.740 |
And yeah, just meeting you here and seeing some of your faces and some of your company is really awesome. 00:30:35.280 |
So what if I told you that we have just handed you the keys to state-of-the-art model, 00:30:45.500 |
which excels at structured, advanced RAG, at sequential reasoning, and you can run it locally on your machine. 00:30:59.240 |
So it's competitive against GPT-4 Turbo, Claude Opus, and it's much smaller. 00:31:08.780 |
We've been really hard at work at Cohere, working on our family of models. 00:31:15.020 |
And today, I'd like to talk to you about some of the stuff that we've done, the decisions that we've made when it comes to the model design, 00:31:26.200 |
and also what we're cooking when it comes to the future of the model. 00:31:32.380 |
So this year, we've been working really hard to push the boundaries of what's possible with LLMs. 00:31:45.980 |
Three months ago, on March 11th, we've released Command-R. 00:31:53.260 |
Command-R is a model optimized for retrieval augmented generation, and it's scalable. 00:32:05.620 |
We followed it up with Command-R plus, and this model is optimized for tool use, 00:32:16.860 |
advanced retrieval augmented generation, and has become a very popular model in the open source community. 00:32:24.780 |
Within a few days of the release, we've climbed the LMSYS arena. 00:32:33.520 |
Your response as a community using the model has been incredible. 00:32:47.800 |
Within two weeks of the release, the model has been downloaded 00:32:52.340 |
150,000 times from HuggingFace, which is wild. 00:32:55.880 |
Folks at HuggingFace actually like the model so much, 00:33:01.820 |
especially when it comes to the tool use, that they decided to use it as a base model for HuggingChat. 00:33:21.000 |
So today, almost half a million of developers and researchers are using the R family. 00:33:31.540 |
It looks like you guys got really excited to get your hands on the model 00:33:39.320 |
and to be able to play with the weights and look under the hood. 00:33:45.420 |
We keep hearing your feedback and the love and support keeps pouring in. 00:33:52.700 |
And I've seen some super cool stuff built with R+ since then. 00:33:57.340 |
Some of my favorite ones I want to shout out here are The Coding Assistant by Daniel Sun 00:34:02.840 |
and a new generative search demo by Complexity. 00:34:10.080 |
We'll see how the tech goes, but I'll give you a sneak peek. 00:34:15.120 |
Another one that's my favorite is two Discord server bots that are powering our Discord community. 00:34:26.500 |
One of them is fine-tuned to be playful and to demo the model capabilities. 00:34:36.200 |
It's grounded in our docs, and it's focused on the information coming from the API. 00:34:41.400 |
So, I want to share the journey of building the R models. 00:34:52.440 |
And to show you that we've committed ourselves to build the top RAG tools for AI builders. 00:35:04.420 |
So, we know firsthand that building RAG is excruciatingly hard, tough work. 00:35:12.460 |
When you set out to do that, you're going to face challenges. 00:35:20.240 |
Challenge number one is that models are highly prompt sensitive. 00:35:24.380 |
And when you want to use the model in the RAG context, you need to prompt it to not only look for the information, 00:35:34.220 |
but also know where to look, and know how to differentiate between the conversation history that the model has with the user, 00:35:45.200 |
Another problem is overcoming model's natural bias towards focusing on the beginning of the document. 00:35:54.880 |
You've seen it with multiple RAG benchmarks and evaluation tests, you know, needle in the haystack and whatnot, 00:36:02.920 |
that are really showing the problem of models not focusing on the most accurate information retrieval, 00:36:10.860 |
but rather becoming a little bit lazy and focusing on the beginning, mostly. 00:36:18.000 |
Another challenge is steering an ongoing battle that's happening within the model between its pre-training knowledge and what it encounters in prompts. 00:36:31.040 |
For RAG use cases, you want the model to be able to tap into the knowledge that's not baked into the model parameters. 00:36:39.960 |
And temporal information is a great example: when you're asking the model who the current president of the United States is, 00:36:51.840 |
you want the model to be able to tap into the up-to-date information. 00:36:57.640 |
So through post-training, we've been able to optimize the model behavior to be able to address these and to decide when the external information is needed in the first place. 00:37:11.120 |
Sometimes it isn't, sometimes the pre-trained knowledge is enough. 00:37:15.120 |
Then operate the retrieval system smoothly to be able to run search queries successfully, retrieve the information, 00:37:26.100 |
hopefully the most accurate one, and then use that information as a grounded context for the conversation that the model is having with the user. 00:37:35.400 |
We optimized all of this for you, the model behavior, so that you don't really have to think about it. 00:37:43.820 |
It's really good at it out of the box, but it was hard work. 00:37:53.520 |
We're big on citations, we believe that allowing the user to verify where the information comes from, and whether it's trustworthy, it's really important. 00:38:03.100 |
So we're spending extra time to make these citations very fine-grained, and thanks to that you can experience low hallucination and reliable context use. 00:38:16.540 |
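A hedged sketch of what grounded generation with citations looks like through the Cohere Python SDK; the document fields and response attributes below follow the public API at the time but may differ by SDK version.

```python
import cohere

co = cohere.Client("YOUR_API_KEY")  # placeholder key

# Pass the retrieved snippets as documents; the model answers from them and
# returns fine-grained citations tying spans of the answer to the documents.
response = co.chat(
    model="command-r-plus",
    message="Who is the current president of the United States?",
    documents=[
        {"title": "News snippet", "snippet": "Joe Biden is the current US president."},
        {"title": "Background", "snippet": "The US president serves a four-year term."},
    ],
)

print(response.text)
for citation in response.citations or []:
    print(citation.start, citation.end, citation.document_ids)
```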
We tested Command R and R+ on some standard RAG datasets, like KILT, and they exhibit best-in-class performance. 00:38:25.520 |
They're small enough to be affordable, but powerful enough to cover a lot of your use cases. 00:38:33.320 |
They have a great balance of token efficiency, and to achieve this level of performance, normally you would have to line up a big pipeline of LLMs. 00:38:44.140 |
We've also heard from you that creating a UX and UI for RAG and tool use is super painful. 00:38:53.720 |
It's not a small feat, and we know it firsthand because we've spent considerable amount of time working on it ourselves. 00:39:10.040 |
I think it has everything a modern UI, modern chat UI, needs to have, so you're able to have a conversation history, you're able to have fine-grained citations, you're able to upload documents there, you're able to plug it into different types of tools. 00:39:25.520 |
So, spending so much time on it and knowing how much you're struggling either way, we decided that it's going to be a good idea to open source the UI. 00:39:41.900 |
I feel like not many people know about it, but our UI is out there and you can download it and start building with it. 00:39:49.680 |
So this is a toolkit repo, that's how we call it. 00:39:53.220 |
It has plug-and-play components and source code for an interface app that we've built with Next.js. 00:40:01.260 |
It has a small SQL database for conversation history. 00:40:07.100 |
There is a model component which lets you customize how you're accessing command R models. 00:40:12.520 |
You can do it via cloud providers, you can do it via the Cohere platform, you can do it locally, you can do it via Hugging Face, your pick. 00:40:21.580 |
Then there is retrieval component and here you can customize access to tools and data sources. 00:40:29.860 |
So, out of the box we've built an example data retriever built off of LangChain. 00:40:36.020 |
It has document upload and it's using web search, but honestly, you can add support for any tools and any data sources that you're interested in. 00:40:47.980 |
Lately, we've been focused on optimizing tool use, particularly in the enterprise context, that's our game. 00:40:57.980 |
It's kind of an extension of this RAG formula I mentioned earlier, where we began by training the models to be really good with vector databases and retrieval systems. 00:41:08.140 |
And then it naturally progressed into broader tool use, training the model to use any tools and ideally in a zero shot context. 00:41:20.300 |
That's kind of our ideal scenario that we're working towards. 00:41:32.460 |
It's really useful for situations where you have a single action to be performed or a set of independent actions. 00:41:40.460 |
It could be searching for documents or sending out an email. 00:41:46.380 |
Multi-step on the other hand, it's really good for scenarios where you have to carry out a sequence of actions with each action building on top of the previous one. 00:41:58.540 |
So, in the same example, it would be searching for that document, being able to compare it against another document, 00:42:08.700 |
creating a summary of that comparison and then sending it out via an email. 00:42:15.980 |
In sequential reasoning in multi-step, you want the system to be able to reflect and correct errors, if there are any, on the way. 00:42:26.620 |
And we are teaching the models to retrieve the information many times over from these different data sources. 00:42:41.340 |
Most of the time when people use the term agents and multi-step, they mean the same thing. 00:42:46.220 |
It's essentially a scenario where software is performing a sequence of actions with each action building on the previous step. 00:42:54.940 |
Last week, we released multi-step API, super hyped about it. 00:43:00.700 |
We want it to be user-friendly, and so, all you need to do is describe the tools that the model has at hand, 00:43:10.300 |
what these tools do, and then some parameters. 00:43:15.740 |
After a user request is made, the model is going to create a plan. 00:43:19.500 |
And it's going to figure out how to use these tools to fulfill the user request. 00:43:25.260 |
And once it calls each tool, it's going to reflect on the content, and it's going to adapt the initial plan, if it's necessary. 00:43:34.940 |
So, for example, if the model is calling an API and it returns an error, it's going to automatically retry calling it again and coming up with a new plan. 00:43:43.900 |
We've outlined this behavior in the huge multi-step preamble. 00:43:51.980 |
Essentially, it's a massive prompt that explains to the model what it needs to do in order to get the job done. 00:44:05.900 |
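A hedged sketch of calling the multi-step tool-use API with the Cohere Python SDK; the tool names and parameters are made up for illustration, and the exact fields may differ by SDK version.

```python
import cohere

co = cohere.Client("YOUR_API_KEY")  # placeholder key

# Describe the tools the model has at hand: a name, what it does, and its parameters.
tools = [
    {
        "name": "search_documents",
        "description": "Searches the internal document store and returns relevant passages.",
        "parameter_definitions": {
            "query": {"description": "The search query.", "type": "str", "required": True},
        },
    },
    {
        "name": "send_email",
        "description": "Sends an email with the given subject and body.",
        "parameter_definitions": {
            "subject": {"description": "Email subject line.", "type": "str", "required": True},
            "body": {"description": "Email body.", "type": "str", "required": True},
        },
    },
]

# The model plans which tools to call, and in which order, to fulfill the request.
response = co.chat(
    model="command-r-plus",
    message="Find the Q1 and Q2 reports, compare them, and email a summary to the team.",
    tools=tools,
)

# Execute the requested calls, then pass the results back so the model can
# reflect, adapt its plan if necessary, and continue until the task is done.
for call in response.tool_calls or []:
    print(call.name, call.parameters)
```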
We've trained command R and R+ to generate claims that are verifiable through citations. 00:44:12.940 |
And, again, big on citations, we really believe that when you can explain which tool has been used by the model for each response, 00:44:23.580 |
it's going to make a difference and it's going to make the system better. 00:44:29.500 |
Command R+ has competitive performance to Claude Opus, GPT-4 Turbo, but it is three to five times cheaper. 00:44:39.020 |
So, that's a massive difference when it comes to scalability and being able to use it in production. 00:44:44.860 |
We test the R family on standard complex reasoning benchmarks and Command R+ is close to or on par with GPT-4 Turbo. 00:45:01.260 |
We're going to keep hammering on the multi-step and, yeah, stay tuned. 00:45:06.380 |
Can we turn the mic or here, I'll give you the sound? 00:45:21.100 |
I still have two minutes, so I can do a demo if you're interested. 00:45:30.700 |
So, this is Complexity, this new generative search engine I mentioned. 00:46:09.820 |
And here you're going to receive the answer that's grounded in multiple sources. 00:46:16.940 |
Sources are outlined here, but also you can click on particular 00:46:25.660 |
information taken from a particular website and, yeah, and go and verify it. 00:46:46.300 |
So, we're going to give it a really hard task. 00:46:48.700 |
What are the three largest companies in the world in terms of the market cap? 00:46:58.460 |
As you can see, you can check out the model behavior step by step. 00:47:05.020 |
So, first, it's going to search for the companies. 00:47:06.940 |
Then it's going to search for a number of employees. 00:47:09.660 |
And now it's running Python to create a graph. 00:47:17.740 |
And now we're waiting for the tweet, I guess. 00:47:49.820 |
And then you can find our support and playground bots on the Discord server. 00:48:14.140 |
Well, AI Engineer might be happening, but who knows what it is. 00:48:45.100 |
So, it doesn't -- it didn't find the AI engineer. 00:49:45.260 |
That concludes the morning session for open models. 00:49:49.900 |
We have Liquid AI, Unsloth, and one more that I should know off the top of my head, 01:19:14.660 |