No-code fine-tuning: Mark Hennings

Today, we're going to talk about fine-tuning large language models without any code.
Fine-tuning is training a foundation model for a specialized task.
Some examples of these specialized tasks are writing any kind of copy, scrubbing fake emails from a list, detecting fraud, or flagging inappropriate content. These are very common tasks that businesses do every day.
And something they have in common is that traditional programming or rule-based approaches struggle with them, whereas large language models perform them easily and can capture the nuance in the text that you're working with.
You can do almost all of these things with a prompt, so why fine-tune? It's faster and cheaper, because you can train a lighter model to match the quality of what you were doing with a prompt.
It reduces the size of your prompts, allowing for longer completions.
Training examples allow you to cover edge cases and collaborate better as a team.
And it's naturally resistant to prompt injection attacks.
How much faster and cheaper is it? Well, if you take GPT-4 and its response time per token, it's about 196 milliseconds, give or take, from the OpenAI API. And taking an example of GPT-4 versus a fine-tuned GPT-3.5, you can save 88.6%.
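The exact figure will depend on current pricing and on how your prompts change, but the arithmetic behind a savings claim like that is simple. Here's a minimal Python sketch using hypothetical per-token prices (not actual OpenAI pricing):

```python
# Rough sketch of where a "you save X%" figure comes from.
# These per-token prices are hypothetical placeholders, not real pricing --
# plug in current prices to run your own comparison.
GPT4_PRICE_PER_1K = 0.06           # hypothetical $ per 1K output tokens
FINETUNED_35_PRICE_PER_1K = 0.006  # hypothetical $ per 1K output tokens

tokens_per_completion = 800        # example completion length

cost_gpt4 = tokens_per_completion / 1000 * GPT4_PRICE_PER_1K
cost_finetuned = tokens_per_completion / 1000 * FINETUNED_35_PRICE_PER_1K

savings = 1 - cost_finetuned / cost_gpt4
print(f"Savings: {savings:.1%}")   # 90.0% with these placeholder prices
```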
Well, then how much shorter do the prompts actually get? I'll give you one example, because it's going to vary depending on your prompt. But here's what a typical engineered prompt might look like.
It has some instructions saying to, you know, write a blog post on this topic, how to write it, what tone to use, what to do, what not to do.
Well, with a fine-tuned model, it learns how we write. So we're just giving it the one thing that's unique about this prompt versus another prompt, which in this case is the topic. And in this very conservative example, it's 90% shorter.
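To make that concrete, here's a hypothetical before-and-after in Python. The wording is illustrative, not the actual prompt from the talk:

```python
# Hypothetical engineered prompt vs. the prompt a fine-tuned model needs.
engineered_prompt = """You are an expert content writer for our brand.
Write a blog post on the topic below. Use a friendly but professional tone,
short paragraphs, and an engaging hook in the first sentence. Do not use
jargon, do not invent statistics, and always end with a call to action.

Topic: how to choose a CRM for a small business"""

# After fine-tuning, the tone and structure live in the model's weights,
# so the prompt only needs the part that changes per request.
finetuned_prompt = "Topic: how to choose a CRM for a small business"

reduction = 1 - len(finetuned_prompt) / len(engineered_prompt)
print(f"The prompt is {reduction:.0%} shorter")
```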
Now let's talk about collaborating as a team. Well, with fine-tuning, it's like having multiple files the way we're used to in software, where developers can work on this section of code or that section of code.
So your training data is this layer that your team can work on and add to and edit and improve, and then that feeds into the fine-tuned model.
So the main point is: if you can get equal or better output, why wouldn't you fine-tune a model?
Now, fine-tuning is kind of a dev job right now. If you go online and you look up how to do fine-tuning, you can find articles that talk about how to spin up GPU servers for training and inference. And you've got to format your data with these ad hoc Python scripts, configure these parameters, and then make API calls. It just looks like a dev job, but if you really break it down, why can't we just automate all of that with a user interface?
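For contrast, the code-first path looks roughly like this with OpenAI's hosted fine-tuning. This is a minimal sketch using the openai Python library; the file name is a placeholder, and it skips validation files, hyperparameters, and polling the job for completion:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Upload a JSONL file of chat-formatted training examples
# (an example of the format appears further below).
training_file = client.files.create(
    file=open("training_data.jsonl", "rb"),  # placeholder file name
    purpose="fine-tune",
)

# Kick off the fine-tuning job.
job = client.fine_tuning.jobs.create(
    training_file=training_file.id,
    model="gpt-3.5-turbo",
)
print(job.id, job.status)
```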
And the bar is lower than most people think to get started doing this.
If you can get 20 examples of what you want your fine-tuned model to do, you can fine-tune a model. This is not traditional machine learning, where you need thousands of examples to get started and the data set is this impossible barrier to get past. No, these are examples you could handwrite if you wanted to.
One way to think about this is as an extension of few-shot learning. Let's say you can have five examples of what you want a model to do in your prompt. Well, with fine-tuning, your training data set can be as long as you want. So instead of five examples, you can now have 20 or 100. It seems intuitive that with more examples, the model would be able to get closer to what we want it to do.
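For reference, here's what one of those training examples looks like under the hood in OpenAI's chat-style JSONL format. The press-release content is made up for illustration:

```python
import json

# One training example in OpenAI's chat fine-tuning format.
# The content is made up for illustration.
example = {
    "messages": [
        {"role": "system",
         "content": "Write a polished press release from a list of facts."},
        {"role": "user",
         "content": "Facts:\n- Acme launches the Widget 3000\n- Ships June 1"},
        {"role": "assistant",
         "content": "FOR IMMEDIATE RELEASE\n\nAcme Launches the Widget 3000..."},
    ]
}

# A fine-tuning data set is one of these objects per line in a .jsonl file.
with open("training_data.jsonl", "a") as f:
    f.write(json.dumps(example) + "\n")
```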
So here's what I propose for a dev lifecycle for large language models. We start with prompt engineering, which allows us to create a prototype to validate the concept, and we can also use it to create our initial data sets for fine-tuning.
Once we have those data sets, we should fine-tune a model and evaluate it to make sure that it actually is better than the prompt-engineered version. Then we can test which models we can get to perform at the same level. The fine-tuned model can go into production, and from production we can capture feedback from our users and log the examples. With those examples, we can continuously improve our fine-tuned model, because now all of a sudden we have real examples that we can add back into our data set.
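One way to close that loop in practice is to log every production input/output pair along with user feedback, then fold the approved ones back into the training data. A minimal sketch; the function and file names here are hypothetical, not part of any particular product:

```python
import json

def log_production_example(facts: str, press_release: str, approved: bool,
                           path: str = "production_log.jsonl") -> None:
    """Append a real production input/output pair so it can be reviewed
    and, if approved, added back into the fine-tuning data set."""
    record = {
        "messages": [
            {"role": "user", "content": facts},
            {"role": "assistant", "content": press_release},
        ],
        "approved": approved,  # e.g. from a thumbs-up button in your product
    }
    with open(path, "a") as f:
        f.write(json.dumps(record) + "\n")
```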
So in terms of roles, I think there's a huge opportunity for people who are not developers to get into prompt engineering and fine-tuning. Yes, if you're a developer, you can fine-tune. But you shouldn't have to be the only person who can fine-tune. I'm a co-founder at Entry Point, and we have built the modern tooling to make this easy.
Here we are on the dashboard, and I'm going to open the press release writer project. The way I created these 20 examples for a press release generator was I went online and found 20 press releases that looked really good. They came from blog articles about the best press releases that you can write. However, I didn't have input data, so my data set was incomplete.
So I used ChatGPT with GPT-4 to take each press release and write the list of facts that a professional writer would have needed in order to write it. You know, large language models aren't great at facts. So providing the facts as the input makes sense to me: I want to give it a list of facts and then have it write something really polished that would be a good first draft of a press release.
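That reverse step, generating the missing inputs from finished outputs, might look something like this with the openai Python library. This is a sketch; the prompt wording is mine, not the exact prompt used in the talk:

```python
from openai import OpenAI

client = OpenAI()

def extract_facts(press_release: str) -> str:
    """Ask GPT-4 to reverse-engineer the list of facts a writer would
    have needed to produce this press release (illustrative prompt)."""
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{
            "role": "user",
            "content": (
                "List the facts a professional writer would need to be given "
                "in order to write the following press release. "
                "Return them as a plain bulleted list.\n\n" + press_release
            ),
        }],
    )
    return response.choices[0].message.content
```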
With this user interface, I have a lot of visibility into the data that I'm actually putting into my fine-tuned model, which I think is really important. The way this works is that we have a structured data approach. So when you import something like a CSV into Entry Point, each column becomes a field. Here I have the facts and here I have the press release, and you can use these fields in a template, just like when you're writing a mass email and you want to insert somebody's first name or personalize the emails with information about a contact record.
You can reference these fields with the Handlebars templating language, which provides a really intuitive way to format your input and your output. And with GPT-3.5 Turbo, when you fine-tune it, you can actually use the system prompt, which is where you can include instructions as well. That creates a really interesting hybrid between prompt engineering and fine-tuning, where you have a small data set for fine-tuning but can also give the model some instructions to help.
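To give a feel for the templating idea, here's a small Python sketch that substitutes field values into a Handlebars-style {{field}} template. It illustrates the concept only; it is not Entry Point's actual implementation:

```python
import re

def render(template: str, fields: dict[str, str]) -> str:
    """Replace {{field}} references with values, Handlebars-style.
    Illustrative only -- not Entry Point's actual implementation."""
    return re.sub(r"\{\{(\w+)\}\}",
                  lambda m: fields.get(m.group(1), ""),
                  template)

prompt_template = "Write a press release from these facts:\n{{facts}}"
completion_template = "{{press_release}}"

row = {  # one row from an imported CSV; contents are made up
    "facts": "- Acme launches the Widget 3000\n- Ships June 1",
    "press_release": "FOR IMMEDIATE RELEASE\n\nAcme Launches the Widget 3000...",
}

print(render(prompt_template, row))
print(render(completion_template, row))
```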
Once we have a data set like this, we can go to our fine-tunes and press the add button.
Select the model and the platform, because this is cross-platform, and then we count your tokens and estimate your cost for you. This one is going to be about a dollar, so hold on tight, press start, and that will get it started.
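Behind a cost estimate like that is essentially a token count multiplied by a per-token training price. A rough sketch with the tiktoken library; the price constant is a placeholder, not actual pricing, and a real estimate would also account for per-message overhead tokens and the number of epochs:

```python
import json
import tiktoken

# Placeholder training price -- check current pricing for real estimates.
PRICE_PER_1K_TRAINING_TOKENS = 0.008

enc = tiktoken.encoding_for_model("gpt-3.5-turbo")

total_tokens = 0
with open("training_data.jsonl") as f:
    for line in f:
        example = json.loads(line)
        for message in example["messages"]:
            total_tokens += len(enc.encode(message["content"]))

estimated_cost = total_tokens / 1000 * PRICE_PER_1K_TRAINING_TOKENS
print(f"~{total_tokens} tokens, ~${estimated_cost:.2f} per training epoch")
```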
But I have some here that are already trained, so let's go into one and use the Entry Point playground to see if we can actually generate a press release with our fine-tuned model. The list of facts here I actually wrote about the AI Engineer Summit, and we'll see if we can make a press release for it. Let's go. All right, so this fine-tuned model created a title here and made it look like a press release.
What I found to be a really cool workflow is to create a list of facts, generate an article, read the article and get ideas from it, then go back to my list of facts and refine those. That becomes an iterative process to get really cool results. It takes a lot of the boilerplate out of the prompt, so you can just focus on what's important for the results you want, and the rest is taken care of by your training data.
Entry Point has a lot of other cool features, like data synthesis and tools to compare the performance of your fine-tuned models. Unfortunately, we don't have time to go into all of that today, but I hope you will check it out. It's entrypointai.com, and it was a pleasure speaking to you.