Hey, my name is Mark Hennings. I'm a serial entrepreneur, and I'm super excited to talk to you today about fine-tuning large language models without any code. Let's begin. For our purposes today, fine-tuning is training a foundation model for a specialized task. Some examples of these specialized tasks are writing any kind of copy: emails, blog articles, product descriptions.
It could be scrubbing fake emails from a list, extracting or normalizing data, translating, paraphrasing, rewriting, qualifying a sales lead, ranking priority of support issues, detecting fraud, or flagging inappropriate content. These are very common tasks that businesses do every day. And something they have in common is that traditional programming or rule-based approaches do not work well for them, but large language models are great at them.
They perform them easily, and they can capture the nuance in the text you're working with. So why should we fine-tune? I mean, prompt engineering is great, right? You can do almost all of these things with a prompt. Well, I'll tell you, fine-tuning is awesome. It's faster and cheaper, because you can train a lighter model to match the quality of what you were doing with a prompt.
It reduces the size of your prompts, allowing for longer completions. Training examples allow you to cover edge cases and collaborate better as a team. And it's naturally resistant to prompt injection attacks. So let's dive into some of these. How much faster is it really? Well, if you take GPT-4, its response time per token is about 196 milliseconds, give or take, from the OpenAI API.
On the same API, GPT-3.5 is about 73 milliseconds per token. That's nearly three times faster. How much cheaper is it? Well, in one example comparing GPT-4 against a fine-tuned GPT-3.5, you can save 88.6%.
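Where does a number like that come from? It's just per-token prices times token counts. Here's a back-of-the-envelope sketch; the prices and token counts below are illustrative assumptions, not current pricing, so your exact savings will vary:

```python
# Back-of-the-envelope cost comparison: GPT-4 with a long engineered
# prompt vs. a fine-tuned GPT-3.5 with a short prompt. All prices and
# token counts are illustrative assumptions, not current pricing.
GPT4_IN, GPT4_OUT = 0.03, 0.06     # assumed $ per 1K tokens
FT35_IN, FT35_OUT = 0.012, 0.016   # assumed $ per 1K tokens

def request_cost(in_tokens, out_tokens, in_price, out_price):
    """Dollar cost of one request at per-1K-token prices."""
    return in_tokens / 1000 * in_price + out_tokens / 1000 * out_price

gpt4 = request_cost(1000, 500, GPT4_IN, GPT4_OUT)  # long engineered prompt
ft35 = request_cost(100, 500, FT35_IN, FT35_OUT)   # much shorter prompt

print(f"GPT-4: ${gpt4:.4f}, fine-tuned GPT-3.5: ${ft35:.4f}")
print(f"savings: {1 - ft35 / gpt4:.1%}")  # exact figure depends on your numbers
```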
Then how much shorter do the prompts actually get? I'll give you just one example, because it's going to vary depending on your prompt. Here's what a typical engineered prompt might look like: it has instructions saying to write a blog post on this topic, how to write it, what tone to use, what to do, what not to do. Well, a fine-tuned model learns how we write.
So we don't need all of those instructions; it learns them from our training examples. We're just giving it the one thing that's unique about this prompt versus another, which is the topic we want to write on. And in this very conservative example, the prompt is 90% shorter.
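To make the before-and-after concrete, here's a hypothetical pair of prompts; the wording is invented for illustration, not the actual prompts from the project:

```python
# Hypothetical engineered prompt: the instructions travel with every request.
engineered_prompt = """Write a blog post about {topic}.
Use a friendly, conversational tone. Open with a hook, keep paragraphs
short, include concrete examples, and end with a call to action.
Avoid jargon. Keep it under 800 words. Never mention competitors."""

# After fine-tuning, the style rules live in the model's weights,
# learned from training examples. Only the unique part remains:
fine_tuned_prompt = "Topic: {topic}"
```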
Now let's talk about collaborating as a team, because none of us work in a vacuum; we work with other people. Imagine a GitHub repo where your whole code base is just one file. That's like your epic prompt. Well, with fine-tuning, you can have multiple files, like we're used to, where developers can work on this section of code or that section of code.
But we're not talking about code; we're talking about training examples. Your training data is this layer that your team can work on, add to, edit, and improve, and that feeds into the fine-tuned model. So the main point is: if you can get equal or better output, why wouldn't you fine-tune a model?
Now, fine-tuning is kind of a dev job right now, okay? Let's be real. If you go online and look up how to do fine-tuning, you'll find articles about how to spin up GPU servers for training and inference. And you've got to format your data with ad hoc Python scripts, configure parameters, and then make API calls.
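To make that concrete, here's roughly what the script side of that looks like with OpenAI's Python SDK. Treat it as a minimal sketch, assuming a v1-style client and a prepared train.jsonl file:

```python
# Minimal sketch of the "dev job" version of fine-tuning with the
# OpenAI Python SDK (v1-style client). Assumes train.jsonl already
# holds your training examples in chat format.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# 1. Upload the training file.
training_file = client.files.create(
    file=open("train.jsonl", "rb"),
    purpose="fine-tune",
)

# 2. Kick off the fine-tuning job.
job = client.fine_tuning.jobs.create(
    training_file=training_file.id,
    model="gpt-3.5-turbo",
)
print(job.id, job.status)  # poll the job until it finishes
```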
It just looks like a dev job. But if you really break it down, why can't we automate all of that with a user interface? Is that possible? It is. And the bar is lower than most people think: if you can get 20 examples of what you want your fine-tuned model to do, you can fine-tune a model.
This is not traditional machine learning, where you need thousands of examples to get started and the data set is this impossible barrier to get past. No, this is something you could handwrite if you wanted to. One way to think about fine-tuning is as an extension of few-shot learning.
Let's say you can fit five examples of what you want a model to do in your prompt. With fine-tuning, your training data set can be as long as you want, so instead of five examples, you can have 20 or 100. It seems intuitive that with more examples, the model can get closer to doing what we want it to do.
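Concretely, for a chat model each training example is one line of JSONL in OpenAI's chat fine-tuning format. A minimal sketch that writes one example (the content is invented for illustration):

```python
# One fine-tuning example in OpenAI's chat JSONL format. Each line of
# train.jsonl is a complete example; the content here is invented.
import json

example = {
    "messages": [
        {"role": "user", "content": "Topic: how to collaborate on prompts"},
        {"role": "assistant", "content": "None of us work in a vacuum..."},
    ]
}

with open("train.jsonl", "a") as f:
    f.write(json.dumps(example) + "\n")
```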
So here's what I propose for a dev lifecycle for large language models. We start with prompt engineering. Prompt engineering is a powerful tool: it allows us to create a prototype to validate the concept, and we can also use it to create our initial data sets for fine-tuning. Once we have those data sets, we should fine-tune a model and evaluate it, to make sure it actually is better than the prompt-engineered version.
Then we can test which models we can get to perform at the same level. The fine-tuned model can go into production, and from production we can capture feedback from our users and log the examples. With those examples, we can continuously improve our fine-tuned model, because now, all of a sudden, we have real examples that we can add back into our data set.
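That loop can be as simple as logging production inputs and outputs, keeping the ones users approved, and appending them to the training set. A minimal sketch, where the log format and the `approved` flag are hypothetical stand-ins for whatever feedback your app actually captures:

```python
# Fold approved production examples back into the training set.
# The log format and "approved" flag are hypothetical stand-ins.
import json

def harvest_approved(log_path="production_log.jsonl",
                     train_path="train.jsonl"):
    """Append user-approved production examples to the training data."""
    with open(log_path) as log, open(train_path, "a") as train:
        for line in log:
            entry = json.loads(line)  # {"input", "output", "approved"}
            if entry["approved"]:
                example = {"messages": [
                    {"role": "user", "content": entry["input"]},
                    {"role": "assistant", "content": entry["output"]},
                ]}
                train.write(json.dumps(example) + "\n")
```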
So in terms of roles, I think there's a huge opportunity for people who are not developers to get into prompt engineering and fine-tuning. Yes, if you're a developer, you can fine-tune. Absolutely. But you shouldn't have to be the only person who can. I'm a co-founder at Entry Point, and we have built modern tooling to make this easy.
Let's take a look at how it works. Here we are on the dashboard and I'm going to open the press release writer project. Let's take a look at my 20 examples. The way I created these 20 examples for a press release generator was I went online and I found 20 press releases that looked really good.
They came from blog articles about the best press releases. However, I didn't have input data, so my data set was incomplete. So I used GPT-4 to take each press release and write the list of facts that a professional writer would need to actually produce it.
You know, large language models aren't great at facts, so providing the facts as the input makes sense to me: I give the model a list of facts, and it writes something really polished that makes a really good first draft of a press release.
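You could do that reverse step with a small script, too. A minimal sketch of the idea, assuming the v1 OpenAI Python client (the prompt wording here is illustrative):

```python
# Build training inputs by working backwards: ask GPT-4 to extract the
# facts a writer would have needed to produce an existing press release.
from openai import OpenAI

client = OpenAI()

def extract_facts(press_release: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[
            {"role": "system",
             "content": "List the facts a professional writer would need "
                        "to write this press release, one per line."},
            {"role": "user", "content": press_release},
        ],
    )
    return response.choices[0].message.content

# Each (facts, press_release) pair becomes one training example.
```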
With this user interface, I have a lot of visibility into the data I'm actually putting into my fine-tuned model, which I think is really important. The way this works is that we take a structured data approach: when you import a CSV into Entry Point, each column becomes a field.
Here I have the facts, and here I have the press release, and you can use these fields in a template. It's just like writing a mass email, where you insert somebody's first name or personalize the message with information from a contact record. You reference the fields with the Handlebars templating language, which provides a really intuitive way to format your input and output. And with GPT-3.5 Turbo, when you fine-tune it, you can actually use the system prompt, where you can include instructions as well. That creates a really interesting hybrid between prompt engineering and fine-tuning: you can have a small data set for fine-tuning, but also give the model some instructions to help.
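Outside of any particular tool, field substitution looks like this. A minimal sketch using Python's chevron library, a Mustache renderer whose basic `{{field}}` syntax matches Handlebars for simple substitutions; it's a stand-in, not Entry Point's actual engine:

```python
# Render a prompt template from structured fields, mass-email style.
# chevron implements Mustache, whose {{field}} syntax matches
# Handlebars for simple substitutions.  pip install chevron
import chevron

template = "Facts:\n{{facts}}\n\nWrite a press release."

row = {  # one row of the imported CSV: each column is a field
    "facts": "- Company X launches product Y on June 1\n"
             "- Pricing starts at $99",
}

print(chevron.render(template, row))
```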
Once we have a data set like this, we can go to our fine-tunes and press the add button, select the model and the platform (because this is cross-platform), and then we count your tokens and estimate your cost for you.
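An estimate like that is just token counting times a per-token training price. Here's a rough sketch with OpenAI's tiktoken library; the price is an assumed placeholder, and real counts add a few overhead tokens per message:

```python
# Rough training-cost estimate: count the tokens in the data set and
# multiply by a per-1K-token training price. The price here is an
# illustrative assumption -- check the current pricing page.
import json
import tiktoken

enc = tiktoken.encoding_for_model("gpt-3.5-turbo")
TRAIN_PRICE_PER_1K = 0.008  # assumed $ per 1K training tokens

total_tokens = 0
with open("train.jsonl") as f:
    for line in f:
        for message in json.loads(line)["messages"]:
            total_tokens += len(enc.encode(message["content"]))

print(f"{total_tokens} tokens, about "
      f"${total_tokens / 1000 * TRAIN_PRICE_PER_1K:.2f} per epoch")
```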
This one's going to cost about a dollar, so hold on tight: press start, and that will get it going. But I have some models here that are already trained, so let's go into one and use the Entry Point playground to see if we can actually generate a press release with our fine-tuned model.
The list of facts here I actually wrote about the AI Engineer Summit, so we'll see if we can make a press release for it. Let's go. All right, so this fine-tuned model created a title here and made it look like a press release. What I've found to be a really cool workflow is to create a list of facts, generate an article, read the article, get ideas from it, and then go back to my list of facts and refine those.
That becomes an iterative process that gets really cool results. So I really enjoy fine-tuning. It takes a lot of the boilerplate out of the prompt, so you can just focus on what's important for the results you want; the rest is taken care of by your training data.
Entry Point has a lot of other cool features, like data synthesis and tools to compare the performance of your fine-tuned models. Unfortunately, we don't have time to go into all of that today, but I hope you'll check it out at entrypointai.com. It was a pleasure speaking to you.
I'll see you next time. Bye.