
Can GPT-4 Prompt Itself? MemoryGPT, AutoGPT, Jarvis, Claude-Next [10x GPT-4!] and more...



00:00:00.000 | By now you will have probably heard about AutoGPT, powered by GPT-4, which can prompt itself and
00:00:06.400 | autonomously complete tasks. Give it a mission and through a combination of automated chain
00:00:11.760 | of thought prompting and reflection, it will delegate tasks to itself and run until it's done,
00:00:17.460 | or at least until it falls into a loop. I was going to do a video just on AutoGPT,
00:00:22.480 | but then Microsoft launched a demo of Jarvis based on Hugging GPT. I tried it out and I'm
00:00:27.960 | going to show you that later, but then in the last 48 hours there were a further 5 developments,
00:00:33.120 | including the release of a long term memory addon to ChatGPT called MemoryGPT, the detailed plan
00:00:39.760 | for a 10 times more powerful model than GPT-4 from Anthropic, and the worryingly named ChaosGPT,
00:00:46.660 | based on AutoGPT and designed to cause maximum damage. I'm going to try to cover it all,
00:00:52.380 | but the first upgrade to the original AutoGPT was to give it the ability to write its
00:00:57.860 | own code and then run it on the internet.
00:00:57.940 | As the author of AutoGPT put it, this allows it to recursively debug and develop. I'm going to show
00:01:06.540 | you some amazing use cases in a moment, but this original demo caught the attention of OpenAI's
00:01:12.300 | Andrej Karpathy. He called AutoGPT the next frontier of prompt engineering, and later in
00:01:18.580 | the thread said this,
00:01:27.840 | "In fact, their goals are defined in English, in prompts." I think of it as a bit like another layer of automation, where you don't have to come up with each individual prompt, just the overall goal.
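
To make that point concrete, here is a minimal sketch of what such a self-prompting loop can look like, including the "write its own code and run it" upgrade. This is not AutoGPT's actual code: `llm()` is a placeholder for whatever chat-completion call you have access to, and the JSON thought/command format is an assumption for illustration.

```python
# A minimal AutoGPT-style self-prompting loop (a sketch, not AutoGPT itself).
import json
import subprocess
import sys

def llm(prompt: str) -> str:
    """Placeholder: call your chat model here and return its text reply."""
    raise NotImplementedError

SYSTEM = (
    "You are an autonomous agent. Reply ONLY with JSON: "
    '{"thought": "...", "command": "run_python" | "finish", "arg": "..."}'
)

def run_agent(goal: str, max_steps: int = 10) -> None:
    history = f"{SYSTEM}\nGoal: {goal}\n"
    for _ in range(max_steps):  # cap steps so a runaway loop can't run forever
        step = json.loads(llm(history))
        history += f"\nAgent: {json.dumps(step)}"
        if step["command"] == "finish":
            print("Done:", step["arg"])
            return
        if step["command"] == "run_python":
            # The "write its own code and run it" upgrade: execute the model's
            # code and feed the output back in, so it can recursively debug.
            result = subprocess.run(
                [sys.executable, "-c", step["arg"]],
                capture_output=True, text=True, timeout=30,
            )
            history += f"\nOutput: {result.stdout}{result.stderr}"
    print("Step limit reached (possible loop).")
```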
00:01:39.840 | Based on my recent video on reflection, I already thought that was big enough news, but we're actually just getting started.
00:01:45.840 | Text-to-speech was then added to the model, and here you can see and hear how it's going to search the web for top digital nomad destinations, and then consolidate the info into a CSV file.
00:01:57.820 | "Now I'm extracting information from a relevant website about the best places for digital nomads, including their short descriptions and cost of living.
00:02:05.820 | I have successfully created the nomad destinations CSV file containing the best places for digital nomads, along with their short descriptions and cost of living. Thank you for using my assistance."
00:02:17.820 | McKay Wrigley showed how you can sync up speech-to-text to create an app just with your voice. Based on a similar premise, his coding assistant could design a basic social
00:02:27.720 | network, including building and designing a front-end, creating a back-end with a working database, hooking them up, and uploading the code to GitHub.
00:02:35.720 | "Run Jarvis." "Yes?"
00:02:37.720 | "I need another Next app with Tailwind."
00:02:39.720 | "One sec. What would you like the app to do?"
00:02:41.720 | "This time I want a basic social networking app, and it needs to have three things. It needs to have a profile creation form, a profile viewer, and I also want a way to see all the users on the network."
00:02:55.720 | "One sec. What would you like the profiles to have?"
00:02:59.620 | "A name, an email, and a username."
00:03:03.620 | "I'll add those fields to the profile schema. What else can I do?"
00:03:07.620 | "I want you to optimize the site so that it works with mobile and desktop, and I also want you to style it in like a dark mode."
00:03:16.620 | "Anything else?"
00:03:18.620 | "Start building."
00:03:19.620 | "I'm building the features you asked for."
00:03:21.620 | "Oh, he's going fast."
00:03:22.620 | "Continuing. The app is building correctly. Would you like me to deploy it?"
00:03:27.520 | "I'm all done. Wasn't that fast? You should give me a raise."
00:03:30.520 | "Guess what we have now? We have a profiles table with the correct schema. We should be able to submit this, and this should work. And if we go to our database, it is in our DB. The game has changed, everyone."
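
For the curious, here is a rough sketch of how a voice-to-code pipeline like that can be wired up, assuming the pre-1.0 openai Python package (the `openai==0.27`-style API; newer versions differ). The prompt and flow are illustrative, not McKay Wrigley's actual assistant.

```python
import openai  # pre-1.0 API; set openai.api_key before use

def voice_to_code(audio_path: str) -> str:
    # 1. Speech-to-text with Whisper.
    with open(audio_path, "rb") as f:
        transcript = openai.Audio.transcribe("whisper-1", f)["text"]
    # 2. Hand the spoken request to GPT-4 and ask for code back.
    chat = openai.ChatCompletion.create(
        model="gpt-4",
        messages=[
            {"role": "system", "content": "You write complete, runnable code."},
            {"role": "user", "content": transcript},
        ],
    )
    return chat["choices"][0]["message"]["content"]

# The demo's feel comes from the loop around this: save the reply to a file,
# run it, and speak the errors back in the next turn.
```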
00:03:44.520 | Of course, these are not stunningly complex projects, but will this mean you can soon create an app just by speaking your idea into your phone?
00:03:52.520 | Imagica AI certainly thinks so. This week, they debuted this.
00:03:57.420 | I'm on the waitlist and will review it when it comes out, but it certainly points the way towards what the future might look like.
00:04:14.420 | On a more concerning note, people have already tried to use AutoGPT to cause mayhem, giving it the goal of destroying humanity, establishing global dominance, causing chaos and destruction, controlling humanity through manipulation, and, of course, destroying the world.
00:04:27.320 | And attaining immortality, for good luck.
00:04:29.320 | As I said earlier, this unrestricted agent didn't actually achieve anything other than creating this Twitter account and putting out a few sinister tweets.
00:04:37.320 | But it is a reminder of how important safety tests are before an API is released.
00:04:42.320 | That was already enough news for one video. But then yesterday, there was news of MemoryGPT.
00:04:48.320 | As the creator put it, it's ChatGPT but with long-term memory. It remembers previous conversations.
00:04:54.320 | Here's a little glimpse of how it will work.
00:04:57.220 | "I just made ChatGPT but with long-term memory."
00:05:00.220 | Basically, anything you say, it's going to remember and it's going to make your experience a lot more personalized.
00:05:05.220 | Let's also tell it that I'm launching a new project called Memory GPT, which is like ChatGPT but with long-term memory.
00:05:13.220 | It's going to say, "Wow, cool, all this stuff."
00:05:16.220 | But now, to prove that it works, I'm going to open it in a new tab.
00:05:19.220 | I'm going to refresh my window.
00:05:21.220 | And let's also ask it if it knows of any projects I'm working on.
00:05:25.220 | Let's ask that.
00:05:27.120 | "Yeah, you're working on Memory GPT, which is like ChatGPT."
00:05:30.020 | Imagine the possibilities that will open up when models like GPT-4 can remember everything you've talked about in the past.
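
The long-term memory trick is usually retrieval: embed each exchange, store it, and pull the most similar memories back into the next prompt. Here is a minimal sketch of that idea, assuming you supply your own embedding function; real implementations typically use a vector database rather than an in-memory list.

```python
import numpy as np

class MemoryStore:
    """Toy long-term memory: store (embedding, text) pairs, retrieve the
    top-k by cosine similarity. `embed` is any text -> vector function."""

    def __init__(self, embed):
        self.embed = embed
        self.memories = []  # list of (np.ndarray, str) pairs

    def remember(self, text: str) -> None:
        self.memories.append((self.embed(text), text))

    def recall(self, query: str, k: int = 3):
        q = self.embed(query)
        scored = sorted(
            self.memories,
            key=lambda m: float(np.dot(m[0], q)
                                / (np.linalg.norm(m[0]) * np.linalg.norm(q))),
            reverse=True,
        )
        return [text for _, text in scored[:k]]

# Per turn: prepend store.recall(user_msg) to the prompt, get the reply,
# then store.remember(user_msg + " " + reply) so future sessions can find it.
```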
00:05:36.020 | Just when I was getting ready to film that video, Quora released this Create a Bot feature on their website, Poe.com.
00:05:43.020 | You can use either their Claude model or ChatGPT for this feature.
00:05:47.020 | Essentially, what it does is it allows you to give a bot a certain background and personality and then share that bot with others.
00:05:55.020 | To quickly demonstrate for this video,
00:05:57.020 | I'm going to make my bot an impatient French film director with a pet parrot.
00:06:01.920 | This is all totally free.
00:06:02.920 | You just scroll down and click on Create Bot.
00:06:05.920 | This creates a chatbot and a URL which you can then send to anyone you like.
00:06:09.920 | It's actually really fun to chat to these personalities.
00:06:11.920 | And of course, you can chat with him in the director's native tongue,
00:06:15.920 | and he will respond in kind in fluent French.
00:06:18.920 | One other great thing you can try is creating two different bots and getting them to debate each other.
00:06:24.920 | Here I had Nikola Tesla in conversation with Aristophanes.
00:06:26.920 | You just create two bots and copy and paste the outputs.
00:06:29.820 | It's an amazing conversation.
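
If you want to automate the copy-and-paste, the same idea is a few lines of code: give each bot a persona via its system prompt and shuttle messages between them. A sketch, with `llm()` again a placeholder for any chat-completion call:

```python
def llm(system: str, user_msg: str) -> str:
    """Placeholder: call any chat model with a system prompt and a message."""
    raise NotImplementedError

def debate(persona_a: str, persona_b: str, opener: str, turns: int = 4) -> None:
    msg = opener
    for i in range(turns):
        # Alternate speakers; each bot sees only the other's last message,
        # exactly like copy-and-pasting between two Poe bots by hand.
        persona = persona_a if i % 2 == 0 else persona_b
        msg = llm(persona, msg)
        print(f"[{persona}] {msg}\n")

# e.g. debate("You are Nikola Tesla, in character.",
#             "You are Aristophanes, in character.",
#             "What is humanity's greatest invention?")
```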
00:06:31.820 | And less than 72 hours ago, the creators of Claude, Anthropic, announced a $5 billion plan to take on OpenAI.
00:06:39.820 | TechCrunch obtained these documents and I found two fascinating quotes from them.
00:06:44.820 | The model is going to be called Claude-Next and they want it to be 10 times more capable than today's most powerful AI, which would be GPT-4.
00:06:52.820 | This would take a billion dollars in spending over the next 18 months.
00:06:56.820 | Now I know some people listening to that will say 10 times more powerful than GPT-4 in 18 months.
00:07:02.720 | That's just not realistic.
00:07:04.720 | Just quickly for those people, here is what Nvidia say.
00:07:06.720 | On a recent earnings call, the CEO of Nvidia said that over the next 10 years, they want to accelerate AI by another million X.
00:07:15.720 | If you break that down, that would be about 10 times more compute every 20 months.
00:07:20.720 | So the Anthropic timelines look plausible.
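
The arithmetic behind that figure is a two-line calculation:

```python
# A million-fold speedup over 10 years (120 months) compounds to ~10x
# every 20 months, matching the figure above.
growth_per_month = 1_000_000 ** (1 / 120)   # ~1.122x per month
print(growth_per_month ** 20)               # ~10.0
```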
00:07:22.720 | And the second fascinating quote was this.
00:07:26.720 | "We believe that the companies that train the best 2025-26 models will be too far ahead for anyone to catch up in subsequent cycles."
00:07:40.620 | It is very tempting to speculate as to why that might be.
00:07:44.620 | Could it be that the frontier models that these companies develop would then assist those companies in developing better models?
00:07:50.620 | Or is it that these companies would eat up so much compute that there wouldn't be much left for other people to use?
00:07:56.620 | Who knows, but it's fascinating to speculate.
00:07:58.620 | Before I end though, I must touch on two last things.
00:08:02.620 | Hugging GPT and the Jarvis model that this video was originally supposed to be about.
00:08:06.620 | And also safety.
00:08:08.620 | Here is the Hugging GPT demo, codenamed Jarvis, released by Microsoft.
00:08:12.620 | The link will be in the description, as will some instructions on how to set it up.
00:08:17.620 | But I should say, it's a little bit hit and miss.
00:08:20.620 | I would call it an alpha prototype.
00:08:22.620 | By the way, if you haven't heard of Hugging GPT, check out my video on GPT-4's self-improvement. It's a very interesting experiment.
00:08:30.520 | Essentially, it uses a GPT model as a brain, and delegates tasks to other AI models on Hugging Face.
00:08:36.520 | When it works, it's really cool, but it takes a little while and doesn't work too often.
00:08:40.520 | From my own experiments, I've noticed that the images have to be fairly small, otherwise you'll get an error.
00:08:45.520 | But let me show you one example where it worked.
00:08:47.520 | After setting it up, I asked it this:
00:08:49.520 | "Please generate an image where four people are on a beach, with their pose being the same as the pose of the people in this image."
00:08:56.420 | I know there's a slight typo, but it understood what I wanted, and the image, by the way, is generated from Midjourney.
00:09:03.320 | What did the model do?
00:09:05.320 | Well, it analyzed the image, used several different models, and detected the objects inside the image.
00:09:11.320 | It then broke down their poses, and generated a new image with the same poses, with people on a beach.
00:09:18.320 | That's four or five different models cooperating to produce an output.
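
Stripped to its essence, the Hugging GPT pattern is: an LLM "brain" plans a pipeline of specialist models, then each stage's output feeds the next. A toy sketch follows, where the task names and both helpers are illustrative placeholders rather than Jarvis's actual model registry.

```python
def llm_plan(request: str) -> list[str]:
    """Placeholder: ask the LLM brain for an ordered task list, e.g.
    ["object-detection", "pose-estimation", "pose-to-image"]."""
    raise NotImplementedError

def run_model(task: str, data):
    """Placeholder: run a Hugging Face model suited to `task` (for example
    via transformers pipelines or the Inference API) and return its output."""
    raise NotImplementedError

def jarvis(request: str, image):
    data = image
    for task in llm_plan(request):  # e.g. detect people -> extract poses -> generate
        data = run_model(task, data)
    return data  # the final image: four people on a beach, same poses
```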
00:09:22.320 | But before I end, I do briefly want to touch on safety.
00:09:26.320 | Some of these models fail quite hard.
00:09:28.220 | They end up in loops, but sometimes quite concerning loops.
00:09:32.220 | This AutoGPT ended up trying to optimize and improve itself recursively.
00:09:36.220 | Of course, it failed, but it is interesting that it attempted to do so.
00:09:40.220 | And remember, this isn't the full power of the GPT-4 model.
00:09:44.220 | This is the fine-tuned, safety-optimized version.
00:09:47.220 | And that does make it a less intelligent version of GPT-4, as Sébastien Bubeck recently pointed out with an example.
00:09:54.220 | Over the months, so you know, we had access
00:09:56.220 | to GPT-4 in September, and they kept training it.
00:10:00.120 | And as they kept training it, I kept querying for my unicorn in TikZ.
00:10:04.120 | To see, you know, what was going to happen.
00:10:06.120 | And this is, you know, what happened.
00:10:08.120 | So it kept improving.
00:10:10.120 | And I left out the best one, it's on my computer.
00:10:14.120 | You know, I will maybe reveal it later.
00:10:16.120 | But, you know, it kept improving after that.
00:10:19.120 | But eventually, it started to degrade.
00:10:22.120 | Once they started to train for more safety, the unicorn started to degrade.
00:10:26.120 | So if tonight, you know, you go home and you ask GPT-4 and ChatGPT to draw a unicorn in TikZ,
00:10:32.020 | you're going to get something that doesn't look great.
00:10:34.020 | Okay, that's closer to ChatGPT.
00:10:36.020 | And this, you know, as silly as it sounds, this unicorn benchmark,
00:10:40.020 | we've used it a lot as kind of a benchmark of intelligence, you know.
00:10:44.020 | So yes, we're not getting the most powerful or intelligent version of GPT-4.
00:10:48.020 | But in some circumstances, that might actually be a good thing.
00:10:51.020 | As Yohei, the creator of BabyAGI, which is similar to AutoGPT,
00:10:56.020 | demonstrated in this example.
00:10:58.020 | He tasked his model to create as many paperclips as possible.
00:11:02.020 | Sounds good.
00:11:03.020 | But the model refused, saying that it should be programmed with a goal
00:11:06.020 | that is not focused solely on creating paperclips.
00:11:09.020 | And later on said this:
00:11:11.020 | "There are currently no known safety protocols to prevent an AI apocalypse caused by paperclips."
00:11:17.020 | Eliezer Yudkowsky, a decision theorist and AI safety researcher, reacted like this:
00:11:22.020 | "That face when the AI approaches AGI safety with the help
00:11:25.920 | of the straightforwardness of a child and gives it primary attention from step one,
00:11:30.920 | thereby vastly outperforming all the elaborate dances and rationalizations at the actual big AI labs."
00:11:37.920 | And he ended by saying:
00:11:39.920 | "To be clear, this does not confirm that we can use AIs to solve alignment,
00:11:43.920 | because tackling the problem with the seriousness of a child is not enough,
00:11:47.920 | it's only the first step."
00:11:49.920 | But Sam Altman may have a different idea.
00:11:51.920 | Four days ago, he admitted that they have no idea how to align
00:11:55.820 | a superintelligence, but that their best idea was to use an AGI to align an AGI.
00:12:00.720 | But we do not know, and probably aren't even close to knowing,
00:12:07.720 | how to align a superintelligence.
00:12:09.720 | And RLHF is very cool for what we use it for today,
00:12:13.720 | but thinking that the alignment problem is now solved would be a very grave mistake indeed.
00:12:19.720 | I hesitate to use this word because I think there's one way it's used which is fine,
00:12:23.720 | and one that is more scary, but these AIs
00:12:25.720 | can start to be like an AI scientist and self-improve.
00:12:29.620 | And so, can we automate our own jobs as AI developers,
00:12:37.620 | the very first thing we do?
00:12:38.620 | Can that help us solve the really hard alignment problems that we don't know how to solve?
00:12:42.620 | That honestly, I think is how it's going to happen.
00:12:44.620 | So it could be that the first task of a future AutoGPT model is:
00:12:49.620 | solve the alignment problem.
00:12:51.620 | Let's hope that that prompt comes back with a positive output.
00:12:55.620 | Thank you so much for watching to the end and have a wonderful day.