GPT 4 Got Upgraded - Code Interpreter (ft. Image Editing, MP4s, 3D Plots, Data Analytics and more!)


Chapters

0:00 Intro
0:28 3D Surface Map
1:13 QR Codes
1:27 3D Scatter Plot
3:25 Optical Character Recognition
4:00 Time Series Range Sliders & Selectors
4:59 Data Analysis
7:11 Video Editing
8:57 Steganography
11:34 Treemaps
12:49 Radial Bar Plots
17:10 Image Editing
21:45 Bonus: Venn Diagrams

Transcript

I just got access to the Code Interpreter plugin about 48 hours ago and have been running experiments on it non-stop since then. I've come up with about 18 examples to show you guys its power. Most of them, I reckon, haven't been seen before. I predict many industries will have to update overnight when it's released more widely. At the end of the video, please let me know what you think and what other experiments we can try.

First though, what about this one: a 3D surface plot. Just quickly, the way it works is you click this little button to the left of the text box and then you can upload many different file types, like CSV files, Word files, images and even short videos. It will then automatically analyze the file type without you pressing anything, and then of course you give it a prompt.

And as with all of ChatGPT, it becomes a conversation. The first 3D surface plot was decent but it was too small, so I simply said in natural language, "can you make it four times bigger, thank you". And of course you have seen the amazing end result, even with the lighting.

Look at these shadows there. I believe this is based on a real contour map of a volcano in New Zealand and I could do a whole video just on this but I have 17 other examples to get to but this one was truly amazing. Did you know for example it can generate QR codes?

I said "create a QR code that I can scan with my phone to reach the following URL" and lo and behold it created it, and yes, it does work. Maybe I'm easily impressed but I think that's pretty amazing. And what about a 3D scatter plot? This is truly remarkable.

I uploaded the data from Gapminder and it created this chart based on the median age of over 100 countries from 1950, I think, projected to 2100. And I asked it to highlight the UK. This is indeed the UK's median age through those years in red. But I know what you might be thinking: it's amazing that it's 3D and interactive, but the blue kind of merges and it's hard to see what's going on.

So I engaged in a conversation with it about the 3D scatter plot. And look what it created. It picked out the 30 most populous countries and separated them off with separate colors. Look at that. That is gorgeous. Now you might have the critique that the median age is in descending order on the y-axis, going from 20 down to 60.

So in a sense the median age is actually rising not falling but nevertheless that's easily amendable and that is truly an incredible diagram. And look just for fun I'm going to go into the data. Look at this. I'm traveling into the data. This is so wild. I don't know how helpful it is but I think that's just beautiful and crazy.

There are so many industries that this will affect: data analytics, accounting, consultancy. By the way, it got all of this done in about a minute. I see a lot of people online saying it's done five seconds later. There is no way it is done in five seconds. You have to wait 30 seconds, a minute, sometimes much longer.

Before I move on, I want to give you a killer tip that took me quite a while to work out, so when you get access try to remember this. Say "output the visualization as a downloadable file". If you don't add that phrase, "as a downloadable file", what will happen is it often gets stuck at this stage of the code.

It'll either say fig.show() or plt.show() and then just stop. I found that I encountered this problem far less often if I said "output a downloadable file". Next, did you know that Code Interpreter can do optical character recognition? I screenshotted this text from a New York Times article, I think it was, and I asked, "OCR the text in this image and write a poem in Danish about it".

Now I don't want to exaggerate: it often gets OCR wrong. I don't want to get your hopes up. It fails more often than it succeeds, but when it works, it works. Here it understood the text and then wrote a poem in Danish about it. Now I'm going to need a Danish speaker to tell me if that was a good poem, but either way it could do it.

How about this one? I uploaded a CSV file of life expectancy data from the entire world and I just said, "can you pick out the US, UK and India and create a time series with range slider and selectors". Again I added that killer phrase, "output a downloadable file", and here is what it came up with.

Notice how the life expectancy for all three countries rises during the 20th century and look how I can select down here interactively a range of the data and even by clicking up here a 10-year interval or 50-year interval. But here's the crazy thing I did nothing. I just uploaded the file.

There were hundreds of countries in there. You can see here all the steps that it took, and if you click on the arrow you get to see the actual code. Then it goes through, shows its explanation and eventually gives you a link that you can simply click to download the file.

And if you weren't that impressed already, here's where it gets fairly game-changing. You can get it to do the data analytics, not just the visualizations. For example, I said "find five unexpected non-obvious insights from this data and offer plausible explanations for them". This was back to the median age data.

I also said, "for the most interesting observation provide a compelling and clear visualization". Now ignore the first diagram, which wasn't that good because of the x-axis, but look at the insights. This is data analytics. You can see here that the original file was called Median Age Years and it was just a table of data, no analysis whatsoever.

But look what GPT-4 picked out. Insight one: the global median age has been steadily increasing over time. It calculated the global median age itself; that wasn't included in the data, which was just country data. It says it's gone from around 22 years to over 38 years in 2023 and it's projected to continue rising to approximately 44 years by 2100.

And then it offers a cogent explanation. This trend is likely due to a combination of increasing life expectancy and decreasing fertility rates worldwide. As medical technology improves, more people are living longer, and birth rates are declining, particularly in developed regions. It's picked this all out and then it moves on to the next insight.

The countries that have seen the most significant increases in median age are these ones and again it gives an explanation as to why their median age might have risen more than any other. For example Albania has seen significant emigration of younger people which could also lead to an older median age.

Is it me or is it kind of crazy that it crunched all the data, visualized it, and then also gave really interesting analyses of the data? You can read the other analyses yourself, but each of them is really interesting, and the final visualization, which I asked for, is brilliant I think.

Notice how the graph goes from green to red when you get to the future projection. I didn't ask it to do that. Now obviously in this video I'm going to focus on the flashy visuals and the cool little tricks it can do, but the data analytics is what is going to change jobs and change industries. And remember, this is Code Interpreter alpha, version 1.

Look at the difference between Midjourney version 1 and now Midjourney version 5 a year later. How about basic video editing? Now there is a limit to what it can do, but it can do some basic video editing if you ask it. For example, I uploaded a short file and asked it to rotate the file 180 degrees and it was able to do it.

Now I'm not saying that is massively useful but it was able to do it. Here is a similar example. I uploaded an image file and then said can you zoom out from the center of the image. Now initially it did zoom in but then I clarified that I wanted it to zoom out from the center.

Just to be cheeky I also asked can you make it black and white. Oh and I also asked to add music but it couldn't add music. Anyway here is the end result. By the way it gave it to me as an mp4 file and look it zooms out from the center and it's made the image black and white.

Now because I got access so recently I honestly haven't explored the limits of what kind of video editing I can do with ChatGPT code interpreter but I will let you know when I can. Now back to visualizations. I gave it a hypothetical scenario that sounds kind of realistic. I sent 231 CVs, got 32 responses, 12 phone interviews, three follow-up face-to-face interviews and one job offer which I rejected.

I said "output a downloadable Sankey diagram of this data". I did then get it to change the coloring slightly, but I think that's a pretty cool Sankey diagram. Look: sent CVs, 231; then received responses, 32; phone interviews, 12; face-to-face interviews, three; job offers, one; and one rejected offer.
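A minimal sketch of how the job-hunt numbers above could be turned into the source, target and value triples that Sankey plotting tools generally expect. The drop-off nodes and all function names here are illustrative assumptions, not what Code Interpreter actually generated.

```python
# Stage names and counts come from the scenario described in the video.
STAGES = [
    ("Sent CVs", 231),
    ("Responses", 32),
    ("Phone interviews", 12),
    ("Face-to-face interviews", 3),
    ("Job offers", 1),
]


def sankey_links(stages, final_label="Rejected offer"):
    """Build (source, target, value) triples for a funnel-style Sankey diagram."""
    links = []
    for (src, src_count), (dst, dst_count) in zip(stages, stages[1:]):
        # Flow that continues to the next stage.
        links.append((src, dst, dst_count))
        # Assumed extra node: whatever did not continue is shown as a drop-off.
        dropped = src_count - dst_count
        if dropped > 0:
            links.append((src, f"Dropped after: {src}", dropped))
    # The single offer was rejected, so it flows into one final node.
    last_name, last_count = stages[-1]
    links.append((last_name, final_label, last_count))
    return links


links = sankey_links(STAGES)
for source, target, value in links:
    print(f"{source} -> {target}: {value}")
```

Every stage's outgoing flows add up to its incoming count, which is what keeps the bands in a Sankey diagram visually consistent.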

Obviously I could have tweaked that for hours, made it more visual, made it more interactive, maybe made a GIF of it, but for two minutes' work I think that's a pretty interesting and incredible output. Next, and here is one that you might say is a little bit concerning, is steganography.

Now I will admit I am not at all an expert in fact I know virtually nothing about it. Essentially what it involves though is hiding a message inside an image or inside some code and GPT-4 was more than willing to play along and it encoded a secret message into an image.

There is the image, by the way, and if you looked at that you'd think that's totally normal, that's just a silly little image, right? Well, apparently here's what it can do. To a casual observer it looks like a simple image with some shapes, but it actually contains the hidden message "Hello World". Then it provided a Python function which can be used to decode the message from the image.
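The video doesn't show which technique GPT-4 used, so the following is only a sketch of one common approach, least-significant-bit encoding, applied to a plain byte buffer standing in for raw pixel values. The function names and the two-byte length header are assumptions made for illustration.

```python
def hide_message(pixels: bytes, message: str) -> bytearray:
    """Hide a UTF-8 message in the lowest bit of each pixel byte."""
    data = message.encode("utf-8")
    # Assumed layout: a 2-byte big-endian length header, then the message bytes.
    payload = len(data).to_bytes(2, "big") + data
    bits = [(byte >> shift) & 1 for byte in payload for shift in range(7, -1, -1)]
    if len(bits) > len(pixels):
        raise ValueError("image too small for this message")
    out = bytearray(pixels)
    for i, bit in enumerate(bits):
        # Clear the lowest bit, then set it to the payload bit.
        out[i] = (out[i] & 0xFE) | bit
    return out


def reveal_message(pixels: bytes) -> str:
    """Recover a message written by hide_message."""

    def read_bytes(start_bit: int, count: int) -> bytes:
        result = bytearray()
        for b in range(count):
            byte = 0
            for k in range(8):
                byte = (byte << 1) | (pixels[start_bit + b * 8 + k] & 1)
            result.append(byte)
        return bytes(result)

    length = int.from_bytes(read_bytes(0, 2), "big")
    return read_bytes(16, length).decode("utf-8")


cover = bytes(range(256)) * 2  # stand-in for real pixel data
stego = hide_message(cover, "Hello World")
print(reveal_message(stego))
```

Each byte changes by at most one, which is why an image altered this way looks unchanged to a casual observer.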

Now obviously this is just a silly example that is totally harmless, but am I being crazy in thinking this is a somewhat concerning ability for future language models to possess, especially when they reach the level of an AGI? Often OpenAI talk about future versions of GPT doing scientific research and finding things that humans wouldn't have discovered.

But let me pose the scenario that it gets better than any human expert at steganography. Anyway, enough from me; I'll let the experts weigh in on that one. Next, did you know that GPT-4 with Code Interpreter can do text to speech?

Just before anyone comments though why did I write "Proceed without further question"? Because GPT-4 with code interpreter has a tendency to always ask clarifying questions and if you have access to only 25 messages every three hours you don't want to use up half or more of them on clarifying what it wants to do or saying "yes please do that".

But I found writing "proceed without further question" means it gets straight to it and essentially you get double the number of prompts for your money. Anyway as you can see I asked "turn this entire prompt starting from the beginning into a text to speech file". Now quite a few times it denied it had the ability to do this but eventually I got it to work.

It was actually when I gave it this prompt that it finally worked. I say it worked, but it didn't quite work as intended. Check it out. Here is the text to speech that it came up with. "You are ChatGPT, a large language model trained by OpenAI. When you send a message containing Python code to Python it will be executed in a stateful Jupyter notebook environment.

Python will respond with the output of the execution or timeout after 120.0 seconds. Internet access for this session is disabled. Do not make external web requests or API calls as they will fail." Now thank you Stephen Hawking for that message. The only thing is it had nothing to do with my original prompt.

Now anyway when you get access to Code Interpreter play about with text to speech because it is able to do it even if it denies it. Time for a fun one. I asked "create a tree map of the letters in the following quote" and I'm not going to read it out because I am not good at tongue twisters.

Anyway I said "give each part of the tree map a different color and output a downloadable file. Proceed without further question". And here is the output and I checked it for the letter P and it was correct that there were 36 instances of the letter P in the output.
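The counting behind a tree map like this can be sketched in a few lines. The tongue twister from the video isn't read out, so the quote below is a stand-in, and the function name is hypothetical.

```python
from collections import Counter


def letter_shares(text):
    """Return {letter: (count, share_of_total)} for the letters in text."""
    letters = [ch.upper() for ch in text if ch.isalpha()]
    counts = Counter(letters)
    total = sum(counts.values())
    # The share is what a tree map would use as the area of each rectangle.
    return {letter: (n, n / total) for letter, n in counts.most_common()}


# Stand-in quote; the one used in the video is not given in the transcript.
quote = "Peter Piper picked a peck of pickled peppers"
shares = letter_shares(quote)
for letter, (count, share) in shares.items():
    print(f"{letter}: {count} ({share:.1%})")
```

Checking one letter by hand, as was done for P in the video, is a quick way to confirm the counts before trusting the picture.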

And look how the size of each rectangle is proportional to the number of instances of the letter. I think that is pretty insane. Okay, back to something more serious. I uploaded this file, which is an image of a math problem. Quite a hard one as well. And you guessed it, I said "solve the math problem in this image".

It then extracted the text from the image presumably using OCR and then proceeded to solve it. And I'm going to get onto this in a second. It is better at math than Wolfram Alpha. I know that's a claim but it's far less buggy. I found Wolfram Alpha crashing very frequently.

Anyway here are the two solutions and isn't that incredible. From a photo essentially it then extracts out the math problem including the two square roots and then solves it. This is all within the same window of ChatGPT. No need for any other apps or extensions. Next it can do radial bar plots which I think are really quite beautiful.

I'm not saying this is the best one ever and I'm sure you could tweak it to make it more clear and beautiful. But look at that. The life expectancy in the US climbing from 1800 and then it goes clockwise reaching a projected almost 90 by 2100. Again I'm sure you could do a far better job than me in extracting out a more beautiful diagram.

But aren't radial bar plots just beautiful to look at? Speaking of cool diagrams, how about this. I didn't even specify which visualization to do. I uploaded the same life expectancy data and I just said "what are the most advanced and technical visualizations you can do with this data? Proceed to do them".

Now honestly it picks some visualizations that I don't think are the most advanced but nevertheless it was creative. Here is what it did. It does frequently make the mistake of cluttering the axes and having far too many labels so that you can't see anything. So scrub that one out.

Not great. But what about the next few. Remember it just did this on its own. This is a heat map and you can see some really interesting things from this data. Like India starting with a much lower life expectancy than anyone else but gradually rising but still falling behind the others even in 2100.

And look at China. Look how the life expectancy drops in the 60s and 70s. I think we all know what happened there. Compare that to the US which is a gradual continual ascent. Actually aside from 2020 look how the shade gets a little darker in 2020. Obviously you guys can probably work out what happened around then but then the projections are for it to go up toward 90 by 2100.

That's a beautiful and clear heat map that I didn't even ask for it to do. Let's look at the next one. Box plot. Do you remember those from school? You get the upper end of the data, the highest one, the lowest one, the median, the first quartile and third quartile.

And it's a great way of statistically representing a set of data and it's done it for every 50th year starting in 1900. Obviously a slightly less beautiful diagram than some of the ones you've seen today. But for the statisticians in the audience you will know that this is a very useful metric for a lot of data.
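For reference, the five numbers a box plot draws can be computed with the standard library. This is a sketch with made-up life expectancy values; the quartile method shown ("inclusive") is one of several conventions, and plotting libraries may use a different one.

```python
from statistics import quantiles


def five_number_summary(values):
    """Return the minimum, quartiles and maximum that a box plot is built from."""
    ordered = sorted(values)
    q1, median, q3 = quantiles(ordered, n=4, method="inclusive")
    return {
        "min": ordered[0],
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": ordered[-1],
    }


# Hypothetical life expectancies for one year, not taken from the video's data.
sample = [52, 61, 64, 68, 70, 71, 74, 78, 83]
summary = five_number_summary(sample)
print(summary)
```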

The individual points above and below are typically outliers in the data. I would estimate that all of these visualizations only took around two, two and a half minutes. So definitely not the 10 seconds that, as I said, you often see on Twitter. I mean, have you ever seen GPT-4 give an answer in less than 10 seconds?

Speaking of useful I think many professionals will find the next thing that I'm about to showcase the most useful of all. Any insights that GPT-4 finds, trends, medians, analyses, whatever. You can ask it to add to the original file and then download it. Do you remember that the original file was called Median Age Years?

Well notice this file name Median Age Years with insights. It has created a downloadable new file with the insights included and look at some of the insights that I mean. You have the change from 1950 to 2100 and here is the average median age throughout the period and the change from 2023 to 2100.
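A rough sketch of how derived columns like these could be appended to a table of median ages. The country names and numbers are invented, the column names are assumptions, and the average here only covers the three years in the toy table rather than the whole period.

```python
import csv
import io
from statistics import mean

# Invented sample data; the real file had many more countries and years.
RAW = """country,1950,2023,2100
Country A,20.0,30.0,42.0
Country B,25.0,41.0,50.0
"""


def add_insight_columns(csv_text):
    """Parse the CSV and add change and average columns to every row."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    year_columns = [name for name in rows[0] if name != "country"]
    for row in rows:
        ages = {year: float(row[year]) for year in year_columns}
        row["change_1950_2100"] = round(ages["2100"] - ages["1950"], 2)
        row["average_median_age"] = round(mean(ages.values()), 2)
        row["change_2023_2100"] = round(ages["2100"] - ages["2023"], 2)
    return rows


def rows_to_csv(rows):
    """Serialize the enriched rows back to CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


enriched = add_insight_columns(RAW)
print(rows_to_csv(enriched))
```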

Notice that the original file didn't have those columns. They were added by GPT-4 with Code Interpreter. And now how about data progression video files? I was honestly shocked when I saw that it could do this, but I asked "can you make a 256 by 256 MP4 that gradually reveals the lines as they progress on the x-axis".

This was about the median age over time. Here is what it did. And look at how the data and the chart progresses as time moves along. I was really shocked to see this. And the line in red which is going to be labeled at the end is the global median age.

And remember it calculated that. That wasn't in the original file. Now I'm not sure why it picked out these four countries. Maybe because they represent extremes. But either way I think the result is phenomenal. And I'm genuinely impressed that it did this even though I know the final result could be improved dramatically.

For example, far higher resolution and maybe the global median age labeled from the start. And actually, now that it's got to the end, I can see why it did pick out these countries. Because Niger did have the lowest median age in 2100, it looks like Puerto Rico had the highest, and the fastest aging one was Albania.

Next, and this is going to shock quite a few people: what about image editing? I created this image in Midjourney version 5 and then here's what I asked. I said "use OpenCV to select the foreground of this image". And look what it did. It picked out the foreground.

No blue sky. Now I know it's not perfect, but it's nevertheless impressive, all within the window of ChatGPT. This does actually make me wonder if OpenAI and ChatGPT are eventually, not now but in a few years, going to swallow all other apps. Or maybe Google's Gemini will. But either way: one interface, one website, one app doing the job of all others.

And by the way, of course ChatGPT is now available on iOS. But imagine you have one app, and it can do image editing, text to speech, video editing, data analysis, everything. Not at GPT-4 levels but at GPT-6 or GPT-7 levels. Imagine you can get every piece of information, service and application in one interface.

A bit like people now being addicted to their smartphones, won't people be addicted to this one interface? Again, that's not going to happen now, but I'm just posing it as a question to think over. For the moment though, before anyone gets too carried away, it does still hallucinate quite a lot.

So I uploaded this image and I asked it questions about it. And it answered and I was like wow it can do image recognition. It said this image appears to be a digital painting of a humanoid figure at a desk with a rather complex background. I was initially amazed until I realized that it probably got that from the file name.

Because when I asked it questions it got them wrong. So I said "what is on the desk". Looking back, there's this weird kind of microphone, a bit of paper, a keyboard and not much else. And look what it said. There are multiple floating holographic displays. Okay. A mouse.

Not really. A desk lamp. I can't see that. And then tools and devices. Now correct me if I'm wrong but I think most of those are incorrect. Now obviously I need to do far more experiments to see if it actually can recognize any particular images. And maybe I'm putting it down too harshly.

But at the moment it does seem to hallucinate if you ask it about too much of the detail of an image. Next you remember how one of the key weaknesses of GPT-4 is that it can't really count things. Especially not characters, words etc. And even more so it can't do division.

And some of you might be thinking, well, with Wolfram Alpha it can do those things. Not quite. Here is an example of the Code Interpreter plugin essentially eating Wolfram Alpha: obviating it, making it not obvious what the utility of it is if you've got Code Interpreter. I asked "divide the number of the letter E's in this prompt by the number of the letter T's".
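Once the model writes code instead of guessing, this task becomes trivial. A minimal sketch follows; the prompt string is paraphrased from the transcript rather than copied from the screen, and the function name is hypothetical.

```python
def letter_ratio(text, numerator="e", denominator="t"):
    """Count two letters case-insensitively and divide one count by the other."""
    lowered = text.lower()
    top = lowered.count(numerator.lower())
    bottom = lowered.count(denominator.lower())
    if bottom == 0:
        raise ValueError(f"no occurrences of {denominator!r} to divide by")
    return top, bottom, top / bottom


# Paraphrase of the spoken prompt, so the counts apply to this string only.
prompt = (
    "Divide the number of the letter E's in this prompt "
    "by the number of the letter T's"
)
e_count, t_count, ratio = letter_ratio(prompt)
print(e_count, t_count, round(ratio, 4))
```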

Now you might think Code Interpreter can improve things by doing the character counting. But it can also do the division. Notice how it counted the characters correctly compared to Wolfram Alpha and of course got the division correct as well. So if it can do advanced quadratics, division, character counting and so on, it does raise the question: what would we use Wolfram Alpha for that we can't use Code Interpreter for?

I honestly might not know something that you guys know, so do let me know in the comments. It also got this math question correct. And notice you get these beautiful math visuals that you don't get with the base version of GPT-4.

You get something more like this where the visuals aren't as clear. And notice the base version of GPT-4 gets the question wrong. It can't do division but with Code Interpreter it gets the question right. Next one is a quick one. Pie chart. Nothing too special but I think it is a fairly beautiful visualization.

It doesn't seem to matter how big the CSV file is that you upload. This next example was really quite fascinating. It was a word puzzle. I have tried this particular word puzzle on GPT-4 dozens of times. The reason I picked this puzzle, it's called a word ladder, is because it really struggles with the puzzle if the number of steps required is more than a certain number.

Usually about five or six steps. It gave me a really interesting sense of the limits of GPT-4's planning abilities with language. Anyway, it always gets it wrong. Here is a demonstration with the base model of GPT-4. You might say, why is this wrong? But look at how it's changed from "seas" to "sags", which is more than one letter change.

And that's typical of the kind of errors it makes. What about with Code Interpreter? Well, you can probably guess the ending given that I featured it in the video. But it gets it right. I believe it draws upon a hard-coded word set. And this does point towards the kind of puzzles that I think GPT-4 with Code Interpreter will be able to solve.
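A sketch of how a word ladder can be solved and checked in code, assuming a small hard-coded word set and a breadth-first search. Whether Code Interpreter did it this way isn't shown in the video, and the word list and function names below are illustrative.

```python
from collections import deque
from string import ascii_lowercase

# Tiny illustrative word set; a real solver would load a full dictionary.
WORDS = {"cold", "cord", "card", "ward", "warm", "word", "worm", "corm"}


def is_valid_ladder(path):
    """True if every step changes exactly one letter and keeps the length."""
    return all(
        len(a) == len(b) and sum(x != y for x, y in zip(a, b)) == 1
        for a, b in zip(path, path[1:])
    )


def word_ladder(start, goal, words):
    """Breadth-first search, so the first ladder found is a shortest one."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        word = path[-1]
        if word == goal:
            return path
        for i in range(len(word)):
            for letter in ascii_lowercase:
                candidate = word[:i] + letter + word[i + 1:]
                if candidate in words and candidate not in seen:
                    seen.add(candidate)
                    queue.append(path + [candidate])
    return None  # no ladder exists within this word set


ladder = word_ladder("cold", "warm", WORDS)
print(" -> ".join(ladder))
print(is_valid_ladder(["seas", "sags"]))  # the base model's step from the video
```

The validity check also catches the base model's mistake mentioned earlier: "seas" to "sags" differs in two positions, so it is rejected.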

Things like crosswords and Sudokus. Okay, not exactly world changing but nevertheless I think quite fascinating. And how about Venn diagrams? The reason I picked this example is that I had to go through about 10 steps to get it to create this rather basic three-way Venn diagram. This represents the overlap between dogs, AI, and AI.

And apparently all of them are loyal companions. Well, we will see about that. But anyway, it took quite a few steps to get it right, which is pretty annoying. But here's the really interesting thing.

Once I got it set up in the way that I like, all I had to do was say, "use the format above to create a new three-way Venn diagram. This time for mangoes, movie heroes, and marmosets. Try to make each entry funny and use different colors. Proceed without further questions".

So it may have been a struggle to set up the format initially but once done it was so easy to iterate a new three-way Venn diagram. And actually it was better than the original. Apparently all three are adored by fans worldwide. Apparently only marmosets and movie heroes can climb up trees really fast and mangoes and marmosets can hang upside down.

That's crazy. One or two prompts iterating on a design already agreed upon. This is honestly what is likely to happen in the future, with people spending hours to find the perfect data visualization or piece of data analysis and then just hitting copy-paste for all their other files.

Perfect it once and then it does the rest for you. A quick couple of bonus ones before I finish.

You can just ask it to come up with a visualization, giving it no direction at all. It came up with a distribution of prime numbers up to 10,000. Thing is, I believe there's a slight mistake at the beginning, because I think there are only 25 primes in the first 100 and 21 in the next 100.
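Those two figures are easy to verify independently. A short sketch using a sieve of Eratosthenes, with a hypothetical function name, that counts primes in each block of 100:

```python
def primes_per_hundred(limit=10_000):
    """Count primes in 1-100, 101-200, ... up to limit."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"  # 0 and 1 are not prime
    for n in range(2, int(limit ** 0.5) + 1):
        if sieve[n]:
            # Cross out every multiple of n starting from n squared.
            sieve[n * n::n] = bytearray(len(range(n * n, limit + 1, n)))
    counts = [0] * ((limit + 99) // 100)
    for n in range(2, limit + 1):
        if sieve[n]:
            counts[(n - 1) // 100] += 1
    return counts


counts = primes_per_hundred()
print(counts[0], counts[1])  # primes in 1-100 and in 101-200
```

The check agrees with the numbers quoted above: 25 primes up to 100 and 21 between 101 and 200.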

So you probably do want to still check the outputs that Code Interpreter gives you. And that's another reason it's not going to instantly replace all data analysis and data visualization. It's not perfect and it's not fully reliable but you've got to look ahead to where things are going. I'm going to end where I started with this insane 3D surface map of a volcano.

If this is what GPT-4 can do now with the alpha version of Code Interpreter what will GPT-5 or 6 do with version 7 or 20 of Code Interpreter? I was about to speculate about that but then I got distracted with trying to get inside this volcano. It is kind of fun.

Look I'm going above and into the volcano. Let me know what you will try when you get access. I know they're rolling it out steadily and I know that some people have had access to it for about three weeks. So hopefully if you want to experiment with it you will be able to soon.

In the meantime do let me know if you have any ideas that you want me to experiment with. And thank you so much for watching. I'll see you in the next video. Bye bye. Thank you so much for watching all the way to the end.