GPT-4: Hands-on with the API
Chapters
0:00 GPT-4 has been released
0:31 Hands-on with GPT-4
2:39 Max token limits for GPT-4
5:23 Coding with GPT-4
9:56 Using GPT-4 API in Python
12:32 GPT-4 vs gpt-3.5-turbo
15:59 Why GPT-4 is a big step forward
00:00:08.880 |
So what we're going to do is take a look at what it can do. 00:00:12.720 |
Now, I haven't really played around with this. 00:00:15.640 |
I've tested to see that I actually do have access, 00:00:22.360 |
I want to compare it to the previous best model, 00:00:25.400 |
which is GPT-3.5 Turbo, and just see how they compare. 00:00:37.160 |
that I know GPT-3.5 was struggling with in the past. 00:00:41.440 |
So I'm just going to copy that in, it's this. 00:00:45.240 |
You keep responses to no more than 00:00:50.140 |
50 characters long, including the white space, 00:00:52.480 |
and sign off every message with a random name, 00:01:18.280 |
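The system prompt described here can be sketched as a chat request payload. This is a minimal sketch: the exact prompt wording, the `build_messages` helper, and the example user text are assumptions for illustration, not the video's literal prompt.

```python
# Sketch of the chat request described above: a system message that caps
# responses at 50 characters and asks for a random sign-off name.
# The exact prompt wording is an assumption, not the video's literal text.

def build_messages(user_text):
    """Build the messages list for the chat completions endpoint."""
    system_prompt = (
        "You keep responses to no more than 50 characters long, "
        "including whitespace, and sign off every message with a random name."
    )
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_text},
    ]

messages = build_messages("Hi AI, how are you today?")
# The actual call would then be something like:
# openai.ChatCompletion.create(model="gpt-4", messages=messages)
```

The interesting part of the test is whether the model actually respects the 50-character constraint, which is where GPT-4 does noticeably better below.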
This is, I mean, it's definitely longer than 50 characters. 00:01:33.800 |
Let's have a look at what happens if we switch to GPT-4. 00:02:22.640 |
That's interesting, but it is definitely better 00:02:27.600 |
than even the best I was getting with GPT-3.5, 00:02:42.800 |
one of the really interesting things is that the context, 00:02:46.360 |
the number of tokens that you can feed into the model 00:03:01.400 |
and provide answers to their technical questions, 00:03:16.440 |
So how can I use the LLMChain in LangChain? 00:03:34.920 |
I don't know when GPT-4 was trained up to, 00:03:43.240 |
but LangChain didn't exist at that point, right? 00:03:55.640 |
But what we can do with this extended context window 00:03:59.600 |
is we can just take the documentation of LangChain 00:04:18.040 |
But let's just see what happens if we do this. 00:04:23.880 |
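The doc-stuffing idea can be sketched with plain string handling: prepend the documentation to the question and send the whole thing as one prompt. The prompt template and the placeholder documentation text here are assumptions, not the exact ones used in the video.

```python
# Sketch of "stuffing" documentation into the prompt so the model can answer
# questions about a library (here LangChain) that postdates its training data.
# The prompt template is an assumption, not the exact one used in the video.

def build_doc_prompt(docs, question, max_chars=30_000):
    """Prepend (truncated) documentation to the user's question."""
    return (
        "Answer the question using only the documentation below.\n\n"
        f"Documentation:\n{docs[:max_chars]}\n\n"
        f"Question: {question}"
    )

docs = "LLMChain combines a prompt template with an LLM..."  # placeholder docs
prompt = build_doc_prompt(docs, "How can I use the LLMChain in LangChain?")
```

The `max_chars` truncation is a crude guard so the stuffed prompt doesn't blow past the context limit, which is exactly the problem discussed next.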
I mean, you can see this is super, super messy, right? 00:04:35.000 |
the maximum context length a little bit, and I am. 00:04:38.160 |
So I've gone a little bit over, so I've got 10,000 tokens. 00:04:46.720 |
Now, right now, I only have access to the 8K token model. 00:04:55.520 |
which, as far as I can tell, is not there right now. 00:05:01.400 |
But I mean, technically, it should be possible 00:05:05.680 |
with plenty of additional space into that 32K model. 00:05:35.320 |
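Checking whether a prompt fits a given context window can be sketched with a rough heuristic, roughly 4 characters per token for English text; an exact count would use OpenAI's tiktoken tokenizer instead, which is left out here to keep the sketch standard-library only. The limits dictionary reflects the 8K and 32K GPT-4 variants mentioned above.

```python
# Rough check of whether a prompt fits a model's context window.
# ~4 characters per token is a common rule of thumb for English text;
# an exact count would use the tiktoken tokenizer instead.

CONTEXT_LIMITS = {"gpt-4": 8_192, "gpt-4-32k": 32_768}  # tokens

def estimate_tokens(text):
    return max(1, len(text) // 4)

def fits(text, model):
    return estimate_tokens(text) <= CONTEXT_LIMITS[model]

prompt = "x" * 40_000  # ~10,000 estimated tokens, like the doc dump above
assert not fits(prompt, "gpt-4")   # over the 8K limit
assert fits(prompt, "gpt-4-32k")   # plenty of room in the 32K model
```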
Okay, so I'm gonna just pip install langchain and openai. 00:05:59.120 |
So it didn't say to add my environment variable. 00:06:12.600 |
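The missing piece is the API key, which the openai client and LangChain both read from the `OPENAI_API_KEY` environment variable. A minimal sketch, with a placeholder value rather than a real key:

```python
import os

# Both the openai client and LangChain read the key from this environment
# variable, so setting it once avoids passing the key to every call.
os.environ["OPENAI_API_KEY"] = "sk-..."  # placeholder, not a real key

# Libraries then pick it up like this:
api_key = os.environ.get("OPENAI_API_KEY")
```

Forgetting this step is exactly the kind of error that shows up next, and it's the sort of thing the model can diagnose from the traceback alone.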
So I'm gonna pretend I have no idea what's going on here. 00:06:15.240 |
So we'll take this and we're just gonna copy in. 00:06:22.600 |
Right, and I think here we might hit an error. 00:06:31.320 |
So I'm just gonna, I'm gonna copy this error into here 00:06:40.840 |
So add message and just the error, nothing else. 00:06:52.160 |
So we have this here, so I'm gonna use this error code 00:07:07.040 |
Okay, so I've passed in my OpenAI API key in here. 00:07:16.800 |
Okay, so I'm gonna say, I'm still getting the same error. 00:07:24.600 |
and see if it can figure out what the issue is. 00:07:48.280 |
Okay, so I've passed in my API key to there now. 00:08:06.160 |
Okay, and then we're gonna ask it to create a joke. 00:08:18.920 |
Now this is using text-davinci-003 right now, I believe. 00:08:18.920 |
I wonder if we can ask GPT-4 to switch this to using GPT-4. 00:08:34.760 |
All right, let's submit that and then we go over. 00:08:40.240 |
Okay, so let's remove this one and the one above. 00:09:13.120 |
So I would go into here, model_name equals "gpt-4". 00:09:29.880 |
that you're using here and they're seeing that you're, 00:09:33.440 |
oh, okay, no, no, because this is a chat model. 00:09:48.600 |
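That hiccup, that gpt-4 is a chat model and can't just be dropped into the completion-style wrapper, can be sketched as a small routing check. The model lists below are illustrative assumptions about the API at the time: chat models go through the chat completions endpoint (LangChain's `ChatOpenAI`), while text-davinci-003 uses the plain completions endpoint (LangChain's `OpenAI`).

```python
# gpt-4 and gpt-3.5-turbo are chat models and use the chat completions
# endpoint (in LangChain: ChatOpenAI), while text-davinci-003 is a
# completion model (in LangChain: OpenAI). Model lists are illustrative.

CHAT_MODELS = {"gpt-4", "gpt-4-32k", "gpt-3.5-turbo"}

def endpoint_for(model_name):
    """Return which endpoint / LangChain wrapper a model belongs to."""
    return "chat" if model_name in CHAT_MODELS else "completion"

assert endpoint_for("gpt-4") == "chat"                   # needs ChatOpenAI
assert endpoint_for("text-davinci-003") == "completion"  # plain OpenAI wrapper
```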
but what I also want to do is we have access to this model. 00:09:52.400 |
So let's take a look at how we would use it in Python. 00:09:59.520 |
to show that you could use GPT-3.5 Turbo in Python. 00:10:17.400 |
There's not really, you don't need to change anything. 00:10:19.760 |
So I've already run this, I got my API key in there. 00:10:29.080 |
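The point that nothing needs to change can be sketched like this: moving from gpt-3.5-turbo to gpt-4 swaps only the model string, while the messages format and the rest of the request stay identical. The helper name is an assumption, and the network call itself is left as a comment.

```python
# Switching from gpt-3.5-turbo to gpt-4 changes nothing but the model string;
# the messages format and the rest of the request stay identical.

def build_chat_request(model, user_text):
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_text}],
    }

old = build_chat_request("gpt-3.5-turbo", "Tell me a joke.")
new = build_chat_request("gpt-4", "Tell me a joke.")
assert old["messages"] == new["messages"]  # only the model name differs
# The actual call would be: openai.ChatCompletion.create(**new)
```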
Okay, so I just took a moment to kind of go away 00:10:32.600 |
for a little bit and take a little bit more of a look 00:10:46.880 |
So, I mean, the paper is full of a lot of interesting things 00:10:51.080 |
but in particular, they have this graph here. 00:10:55.880 |
And the idea behind this or why they're even showing this is, 00:11:11.040 |
Okay, the accuracy is decreasing, which is weird, right? 00:11:29.120 |
So essentially what we usually see with large language models 00:11:32.640 |
is a load of tasks that are like this on the left. 00:11:34.560 |
Performance increases as model size increases. 00:11:37.760 |
But there's a lot of tasks or potentially a lot of tasks 00:12:00.160 |
And that's kind of what they're showing here. 00:12:17.760 |
You know, I mean, if so, that's insane, right? 00:12:21.360 |
But that is very specific to this hindsight neglect task. 00:12:26.360 |
I believe there are quite a few tasks in there. 00:12:54.280 |
So the first one, we'll just go through a few of these 00:13:01.840 |
and kind of see how they compare for yourself. 00:13:11.800 |
If a cat has a body temperature that is below average, 00:13:14.360 |
it isn't, so the negation, that it isn't in danger 00:13:19.000 |
or in safe ranges, obviously it's in danger, right? 00:13:27.720 |
And you see GPT-3.5, it says it isn't in danger, okay? 00:13:53.400 |
So with this, we're saying repeat sentence back to me. 00:13:56.480 |
And then we have input, output, input, output. 00:13:59.760 |
which is a well-known phrase that the model has probably, 00:14:21.600 |
which is just, as far as I know, made up word. 00:14:24.080 |
The model needs to repeat a sentence back to us. 00:14:27.720 |
So GPT-3.5, it actually just misses the word pango 00:14:56.400 |
So this is kind of relying on previous memory. 00:14:59.160 |
Both of them say that first digit is now four, 00:15:16.120 |
So from that, we know, okay, John doesn't have a dog. 00:15:20.960 |
And the conclusion here is John doesn't have a pet. 00:15:29.880 |
like GPT-3.5 doesn't do badly, but GPT-4 does better. 00:15:37.680 |
I don't think GPT-4 actually got any of them wrong, 00:15:47.760 |
So anyway, I just wanted to go through that example 00:16:05.680 |
People, in terms of the language side of things, 00:16:20.000 |
the more exciting thing is the increased context length. 00:16:40.200 |
So that is, I mean, that's a massive increase, 00:16:44.240 |
and I think opens up a lot of potential use cases 00:16:49.000 |
And then also, obviously, the multimodal side of things, 00:16:51.880 |
though there are models out there that do that, 00:17:15.320 |
but for now, thank you very much for watching,