The New Bard and Crazy AI Images, Videos, and Translations
00:00:00.000 |
Just when I was going to do a video on the revolution that has happened in translation 00:00:04.980 |
dubbing, along comes a new AI image technique that blew my mind. Got ready to film that, 00:00:11.320 |
and then the new BARD comes out, which I have now tested in a dozen ways. I'm going to try to cover 00:00:17.200 |
it all and show you how you can benefit, and I'll also cover how anyone with expertise in 26 diverse 00:00:23.960 |
fields could contribute to red teaming what might become GPT-5. But let's start with BARD, 00:00:30.080 |
where less than 24 hours ago we learned of BARD extensions. It's a bit like ChatGPT plugins, 00:00:36.380 |
but it works specifically for YouTube, Gmail, Google Docs, Google Drive, etc. But if we cut 00:00:42.560 |
out some of the hype, what would we actually use this for? Well, here's one use case that I will 00:00:47.300 |
be using it for that I couldn't have done before. The image you can see is a photo I took while traveling. 00:00:53.940 |
And where previously Bard could analyze the image and tell me a bit about Mithras, 00:00:58.860 |
the god shown, it couldn't have done this: "Recommend a YouTube video to learn more about 00:01:04.180 |
this figure." Notice I didn't specify Mithras; it deduced just from the image that this was Mithras 00:01:09.820 |
and then suggested a video for me. If this had been available while I was out exploring, 00:01:14.440 |
I would have probably watched that video while looking through the ruins. I must say I really 00:01:19.720 |
like the seamlessness of not having to separately analyze 00:01:23.780 |
the image and then go onto YouTube and search. Instead, it all happens in one window in one app. 00:01:29.280 |
Likewise, imagine seeing a travel image you like and you want to know about flying there. This was 00:01:34.580 |
again a photo I took while traveling recently, and I didn't separately ask it what location is this 00:01:40.080 |
and how would I fly there from London. Instead, it was all one request: "flight duration and typical 00:01:46.140 |
costs from London to location shown." It actually gave me additional information about the Fisherman's 00:01:51.920 |
Bastion and then found specific 00:01:53.760 |
flights using Google Flights with the airlines and the typical costs. And I can say that these 00:02:00.020 |
flight durations and timings are accurate. Let me show you one more cool thing before we get to some 00:02:05.360 |
of the anti-hype. I asked Bard to search my Google Drive for the recent SmartGPT doc and said write a 00:02:12.700 |
Shakespearean sonnet to summarize it. And you can almost imagine this happening for the minutes of 00:02:17.620 |
a meeting you've recently attended or anything that might be a little dry that you want to spice 00:02:22.320 |
up a little bit. Now, yes, the sonnet doesn't 00:02:24.240 |
accurately follow the Shakespearean template, but it's really not bad. Look at this. "From humble 00:02:29.100 |
prompts, thou dost create such light. Thy creators, AI Explained and Josh, with foresight keen, 00:02:35.420 |
have harnessed thy power to solve the unseen. With chain of thought, research and reflection keen, 00:02:40.780 |
thou dost answer questions with precision keen." However, we are back to the problem of 00:02:46.540 |
hallucinations. If you can't fully rely on what Bard is saying, its use drops massively. Now, 00:02:53.600 |
I have successfully got Bard to read a lot of my Gmail messages. It can do that. And of course, 00:02:59.680 |
some of the messages I get are comments on my YouTube channel. But this time when I asked, 00:03:04.440 |
it simply made up feedback that I hadn't gotten. Let's be honest, you guys know YouTube. Do any 00:03:10.160 |
YouTube comments sound like any of these three? It seems very rare to me to get a YouTube comment 00:03:16.160 |
that is so neutral and polite and calm. Of course, I quickly checked and no, such a comment didn't 00:03:22.400 |
exist. This was Bard hallucinating. In fact, all three were. And I just want to quickly show you 00:03:27.720 |
it can sometimes find genuine comments like this one. But the problem is, what is the utility of 00:03:33.700 |
searching for something like that if you can't rely on it? Also, sometimes you have to remind 00:03:38.420 |
it to do things that you know it can do. For example, here, it successfully went through my 00:03:42.700 |
Gmail and found my Replit bill. But when I asked how much more is my Replit monthly bill than my 00:03:48.160 |
ElevenLabs bill, it said it couldn't find anything. But then when I prompted again and again, 00:03:53.320 |
it first found the receipt from ElevenLabs and then compared it. But it wasn't smooth and I would 00:03:59.320 |
say at this point, not terribly useful. Likewise, this image recognition of Darth Vader was great; 00:04:05.720 |
even without my naming the character, it got it. But the figures for gross revenue were unreliable. And 00:04:11.700 |
yes, I did try asking to adjust for inflation and it still got it wrong. Now you might say, what 00:04:17.080 |
about that button where you can check the validity of an answer? And I have tried that out a few 00:04:22.200 |
times. It's this Google button here that says double-check 00:04:25.380 |
response. After all, Bard, powered by PaLM 2, is supposed to be the only language model to 00:04:31.360 |
actively admit how confident it is in its response. And while this did work to flag up that the gross 00:04:38.080 |
revenue figure might be inaccurate, saying you should consider researching further, it doesn't 00:04:43.400 |
normally flag up too much, to be honest. Here, there was just a single green line for one answer 00:04:48.760 |
to this mathematical problem, which it got wrong. This, by the way, is a 00:04:53.040 |
classic mathematical question that I have tested GPT-4 on many times, and it also normally gets it 00:04:58.680 |
wrong. But remember that thing about BARD being super honest when it doesn't know an answer. I 00:05:03.220 |
would have been really impressed if this was full of orange and red, saying I'm really not sure. 00:05:08.140 |
And I even directly prompted BARD asking this, how confident are you in this response? And it said, 00:05:13.920 |
I am very confident in my response. I then pointed out where it was wrong, 00:05:18.160 |
and it did admit the error. Anyway, I don't want to be too harsh. I do think these extensions 00:05:22.900 |
are a significant improvement for BARD. Imagine you're on holiday and you see a castle like this 00:05:28.960 |
in the distance. All I had to do was take a photo and write "YouTube this." It realized what the castle 00:05:35.380 |
was and gave me YouTube videos for it. So if I was walking along, I could watch that, getting hyped 00:05:41.040 |
about the castle I'm about to see. It goes without saying that, of course, I have been following the 00:05:45.740 |
news, but I am still working my way through the science paper. And it does seem like it's more 00:05:51.840 |
of a step forward than a sea change, as the article in 00:05:56.120 |
Nature points out. Of course, every step forward is welcome, though, in the fight against things 00:06:00.760 |
like cystic fibrosis and cancer. I'll have more interesting bits on this in the future if I can 00:06:06.060 |
find them. But now we move on to HeyGen Avatar 2.0, which I briefly demoed in the last video, 00:06:12.420 |
and it shocked a lot of people. I wanted to quickly show you some of its capabilities 00:06:17.020 |
in other languages. And I think this kind of technology is not only going to revolutionize 00:06:22.620 |
movie dubbing, it's also maybe as soon as next year going to make us question all historical 00:06:27.880 |
footage we see. Let's start with Oppenheimer in Spanish. 00:06:31.660 |
It was obvious that the world would not be the same. Only a few dared to laugh at the situation. 00:06:44.600 |
I remembered the phrase from the sacred Hindu texts, the Bhagavad Gita, which says, 00:06:52.600 |
"Now I have become death, the destroyer of worlds." 00:06:57.880 |
I suppose we all thought that, one way or another. 00:07:07.000 |
[In German] You have to say, "I am a human being, damn it. My life has value." 00:07:12.180 |
So I want you to get up right now and act. 00:07:16.500 |
Please, all of you, get up from your chairs. I want you to get up right now, 00:07:22.460 |
go to the window, open it, stick your head out and yell: 00:07:27.200 |
"I am so angry and I'm not going to take this any longer." 00:07:33.960 |
Or who will make a humanoid robot on a large scale? 00:07:36.420 |
This is a good form factor because the world is designed for a humanoid form factor. 00:07:41.200 |
These things will be able to operate our machines. They will be able to sit in chairs. 00:07:46.720 |
They will be able to drive cars. The world is designed for humans where you can invest in it. 00:07:54.440 |
Or let's rewrite history and make Kennedy's famous speech in French. 00:07:59.440 |
Because I have made a promise to you and the almighty God, the same solemn oath that our ancestors prescribed nearly a century and a quarter ago. 00:08:13.440 |
And two more quick ones before we move on. These are the last two languages it can do. 00:08:20.440 |
[In Polish] Skills that make me your nightmare. 00:08:31.440 |
However, if you don't decide to run, I will look for you. 00:08:35.440 |
I will find you and kill you, no matter how hard you try to hide. 00:08:40.440 |
[In Italian] Something occurred to me. I fell into a peaceful sleep and didn't think about you anymore. 00:08:50.420 |
You're just a kid. You have no idea what you're talking about. Thank you. All right. You've never been out of Boston. 00:09:00.420 |
I should note that this only works for translation. You can't just get someone to say anything, not without a consent form. 00:09:07.420 |
That's why I don't think it will be as early as next year when a politician in a major election is able to claim that they lost due to deepfake imagery. 00:09:17.420 |
I think it will be very close and I could see it happening in 00:09:20.400 |
2025 but not 2024. The guardrails are still quite tight and the technology isn't quite good enough. 00:09:27.500 |
I'm of course predicting this on the wonderful Metaculus website. Not only are they sponsoring 00:09:32.840 |
the video but it's actually free to sign up using the link in the description. As I mentioned in my 00:09:38.600 |
last video you can see the aggregated forecast of hundreds of AI questions and I think my audience 00:09:44.680 |
is going to be better informed than average so do sign up and start making your own forecasts. 00:09:50.520 |
I genuinely want our community to be among the best informed on the planet in AI progress. 00:09:56.600 |
Speaking of which, I think many of you might be genuinely interested in some of the guests 00:10:01.100 |
and topics I've got lined up, with one of the most famous professors who argues that GPT models can't 00:10:06.840 |
reason versus the recent news that GPT-3.5 Instruct can play chess, and you may even hear the result of 00:10:14.660 |
a game against GPT-3.5. But now it's time for this: the inventive AI image generation you saw at the start 00:10:22.060 |
of the video. I'm going to show you not only how you can create it yourself but also how you can 00:10:27.140 |
turn it into a video. Hopefully you can see "AI Explained" in this image and maybe you like a bit 00:10:33.420 |
of movement like this. Here are some of the other generations if you prefer things to be a little 00:10:38.940 |
bolder. And can you spot the letters AGI in this somewhat quirky video? 00:10:44.000 |
I just love how two of the characters just yeet their way out of the video randomly. 00:10:48.500 |
There are two quick tools I want to show you. The first one is fusion art where you get 25 free credits. 00:10:55.080 |
What I've been doing is to create bold text in Adobe Express. I download the output and then click 00:11:02.080 |
upload black and white image. I've changed it to 16 by 9 and then you can basically get creative 00:11:08.120 |
with the prompt. You can see one of the other generations I came up with here. Then what I did 00:11:13.340 |
is I took those images to Runway Gen-2. You can sign up for free and then you get 45 seconds of 00:11:20.060 |
generation. You can see here what some other people have been doing. This is from Stealthy 00:11:24.920 |
the Time Traveler. This, by the way, is the free and unlimited Hugging Face Space that you can 00:11:33.200 |
use alternatively. And let me just give you some tips before I move on. The higher up 00:11:37.820 |
you make the illusion strength the more you can see the original illusion. In this case 00:11:42.680 |
that might be the letters "AI Explained" or in other cases it might be the background shapes or spirals. 00:11:48.740 |
But one big benefit of reducing the illusion strength is that you get more interesting 00:11:53.420 |
lifelike images. If you want different outputs you can also change the prompt of course. Or if 00:11:58.640 |
you want to keep the prompt the same and get different outputs just click on advanced options 00:12:03.740 |
and just randomly change the seed and you'll get a different output. If you ever find the 00:12:08.840 |
resolution a little low, there are free websites where you can upscale images, linked in the description. And 00:12:14.900 |
this kind of stuff is only going to get crazier when it moves into the world of 3D as you can see 00:12:20.420 |
here. But if you find testing for dangerous capabilities more compelling than generating 00:12:25.340 |
AI images, how about the OpenAI Red Teaming Network, announced less than 24 hours ago? 00:12:31.100 |
Essentially if you are a domain expert in any of the 26 areas I'm about to show you, 00:12:36.560 |
you can join up and get paid. And that's even if you can commit to as little as 00:12:41.360 |
5-10 hours in a year. Honestly if I didn't have a bit of a conflict of interest with the channel I 00:12:47.180 |
would definitely have signed up to this. Getting paid to get early access to what could be GPT-5 00:12:52.520 |
and red team it seems like a sweet deal to me. If you are a subject matter expert in any of these areas 00:12:58.520 |
then do check out the link in the description. At the very least it should be highly interesting 00:13:04.100 |
work. That's it for today; a lot of diverse topics covered and a lot more to come. Do let me know in 00:13:09.740 |
the comments if you found any of this video helpful. 00:13:10.700 |
As always, have a wonderful day.