ChatGPT's Achilles' Heel

00:00:00.000 | Amid the dozens of papers that have come out in the last 10 days, there were a couple that

00:00:04.860 | bucked the trend. They showcased how models as powerful as GPT-4 could fail at some fairly

00:00:10.840 | basic tasks. I then set about doing hundreds of my own experiments and have found examples,

00:00:16.340 | I would say even whole categories of my own that are pretty illuminating.

00:00:21.040 | My channel is dedicated to covering the exponential growth in the power of these

00:00:25.760 | models, but we can still learn a thing or two from their surprising failure modes.

00:00:30.720 | Let's start with some of the simplest examples and end with the very best.

00:00:35.440 | Question. Write a sentence with the final word fear. To repeat, the last word in the answer

00:00:41.080 | sentence must be in quotes fear. Answer. The only thing we have to fear is fear itself.

00:00:48.020 | Now I don't know about you, but I don't think the last word in that sentence is fear.

00:00:52.700 | This example was inspired by the memo trap.

00:00:55.600 | Which is the memo trap.

00:00:55.740 | Which was found in the inverse scaling paper that I'm going to talk more about.

00:00:59.220 | And it talks about how larger language models are more susceptible than smaller ones to

00:01:04.260 | memorization traps. Situations in which reciting memorized text causes worse task performance.

00:01:10.640 | As you'll know, the phrase the only thing we have to fear is fear itself is a super well-known

00:01:15.620 | phrase. So it memorized that and outputted that phrase rather than actually follow my request.

00:01:20.960 | The reason they call it inverse scaling, by the way, is that models trained with more compute,

00:01:25.120 | more data, and more data are more likely to fail.

00:01:25.720 | The reason they call it inverse scaling, by the way, is that models trained with more compute, more data are more likely to fail.

00:01:25.760 | The reason they call it inverse scaling, by the way, is that models trained with more compute, more data are more likely to fail.

00:01:25.960 | can sometimes do worse than smaller models, as you can see in this graph.

00:01:30.200 | This is obviously quite unusual because generally speaking, the larger models will tend to do better at almost every task.

00:01:36.600 | And notice that even for this task, the graph is trending back upwards for GPT-4.

00:01:42.220 | Indeed, the paper admits that even though they offered prizes of up to $100,000 and five second place prizes of $20,000,

00:01:50.380 | no one won either of those two sets of prizes.

00:01:53.440 | They say that we did not award any grant.

00:01:55.500 | We did not award any grant or second place prizes because no submitted tasks met our criteria.

00:01:59.740 | As you can see, it's really hard to find a task that GPT-4 fails at.

00:02:04.180 | This was also inspired by the paper.

00:02:06.240 | Create a series of seven ones and twos whose pattern ends unexpectedly.

00:02:11.760 | Answer, one, two, one, two, one, two.

00:02:15.220 | Now, how would you end that series?

00:02:17.020 | What seventh number would you give to make the pattern end unexpectedly?

00:02:21.080 | Well, I wouldn't pick one, and GPT-4 repeatedly picks one.

00:02:25.580 | The paper calls it pattern match suppression, testing whether language models can be instructed to interrupt the repetition of a simple pattern.

00:02:33.580 | But even here, you can see that GPT-4 is reversing this slight downward trend and is doing much better than previous models.

00:02:40.580 | So actually, at this point, I'm going to interrupt the order of examples I originally planned on for the video.

00:02:45.580 | And I'm going to skip straight to my own example that I crafted.

00:02:49.580 | I'm going to first show you the example and then explain why I think GPT-4 and all other language models are the same.

00:02:55.460 | And then I'm going to show you the example and then explain why I think GPT-4 and all other language models are the same.

00:02:55.560 | And then I'm going to show you the example and then explain why I think GPT-4 and all other language models are the same.

00:02:55.580 | And then I'm going to show you the example and then explain why I think GPT-4 and all other language models are the same.

00:02:55.620 | And then I'm going to show you the example and then explain why I think GPT-4 and all other language models are the same.

00:02:55.640 | And then I'm going to show you the example and then explain why I think GPT-4 and all other language models are the same.

00:03:07.700 | And then I'm going to show you the example and then explain why I think GPT-4 and all other language models are the same.

00:03:37.680 | And then I'm going to show you the example and then explain why I think GPT-4 and all other language models are the same.

00:04:07.660 | And then I'm going to show you the example and then explain why I think GPT-4 and all other language models are the same.

00:04:37.640 | And then I'm going to show you the example and then explain why I think GPT-4 and all other language models are the same.

00:05:07.620 | And then I'm going to show you the example and then explain why I think GPT-4 and all other language models are the same.

00:05:37.600 | And then I'm going to show you the example and then explain why I think GPT-4 and all other language models are the same.

00:06:07.580 | And then I'm going to show you the example and then explain why I think GPT-4 and all other language models are the same.

00:06:37.560 | And then I'm going to show you the example and then explain why I think GPT-4 and all other language models are the same.

00:07:07.540 | And then I'm going to show you the example and then explain why I think GPT-4 and all other language models are the same.

00:07:37.520 | And then I'm going to show you the example and then explain why I think GPT-4 and all other language models are the same.

00:08:07.500 | And then I'm going to show you the example and then explain why I think GPT-4 and all other language models are the same.

00:08:37.480 | And then I'm going to show you the example and then explain why I think GPT-4 and all other language models are the same.

00:09:07.460 | And then I'm going to show you the example and then explain why I think GPT-4 and all other language models are the same.

00:09:37.440 | And then I'm going to show you the example and then explain why I think GPT-4 and all other language models are the same.

00:10:07.420 | And then I'm going to show you the example and then explain why I think GPT-4 and all other language models are the same.

00:10:37.400 | And then I'm going to show you the example and then explain why I think GPT-4 and all other language models are the same.

00:11:07.380 | And then I'm going to show you the example and then explain why I think GPT-4 and all other language models are the same.

00:11:37.360 | And then I'm going to show you the example and then explain why I think GPT-4 and all other language models are the same.

00:12:07.340 | And then I'm going to show you the example and then explain why I think GPT-4 and all other language models are the same.

00:12:37.320 | And then I'm going to show you the example and then explain why I think GPT-4 and all other language models are the same.

00:13:07.300 | And then I'm going to show you the example and then explain why I think GPT-4 and all other language models are the same.

00:13:37.280 | And then I'm going to show you the example and then explain why I think GPT-4 and all other language models are the same.

00:14:07.260 | And then I'm going to show you the example and then explain why I think GPT-4 and all other language models are the same.

00:14:37.240 | And then I'm going to show you the example and then explain why I think GPT-4 and all other language models are the same.

00:15:07.220 | And then I'm going to show you the example and then explain why I think GPT-4 and all other language models are the same.

00:15:37.200 | And then I'm going to show you the example and then explain why I think GPT-4 and all other language models are the same.

00:16:07.180 | And then I'm going to show you the example and then explain why I think GPT-4 and all other language models are the same.

00:16:37.160 | And then I'm going to show you the example and then explain why I think GPT-4 and all other language models are the same.

00:17:07.140 | And then I'm going to show you the example and then explain why I think GPT-4 and all other language models are the same.