The AI News You Might Have Missed This Week - Zuckerberg to Falcon w/ SPQR
00:00:00.000 |
Here are seven developments in AI that you might have missed this week, from ChatGPT avatars to 00:00:06.040 |
open source models on an iPhone, and AlphaDev to Zuckerberg's projections of superintelligence. 00:00:12.800 |
But first, something a little unconventional with a modicum of wackiness: embodied VR chess. 00:00:20.480 |
This robot on my left is being controlled by a human in a suit over there, and this robot on my 00:00:24.960 |
right is being controlled by a human over there. They both have feedback gloves, they have VR 00:00:30.360 |
headsets, and they're seeing everything the robot sees. Now specifically, today we're looking at 00:00:35.520 |
avatars, robot avatars to be precise. They can play chess, but they can do much more: they can 00:00:40.720 |
perform maintenance, rescue operations, and do anything that a human can do with its hands and 00:00:45.280 |
eyes. Could this be the future of sports and things like MMA, where you fight using robotic 00:00:50.800 |
embodied avatars? But for something a little less intense, we have a robot that can do a lot more 00:00:54.940 |
than just play chess. We have this robot chef, who learned by watching videos. 00:00:58.560 |
It does make me wonder how long before we see something like this at a McDonald's near you. 00:01:14.780 |
But now it's time to talk about something that is already available, which is the HeyGen plugin 00:01:20.500 |
in ChatGPT. It allows you to fairly quickly create an avatar video 00:01:24.920 |
from the text produced by ChatGPT, and I immediately thought of one use case that I think could take 00:01:31.140 |
off in the near future. By combining the Wolfram plugin with HeyGen, I asked ChatGPT to solve this 00:01:37.700 |
problem and then output an explainer video using an avatar. A quick tip here is to tell ChatGPT 00:01:44.480 |
the plugins that you want it to use, otherwise it's kind of reluctant to do so. As you can see, 00:01:49.800 |
ChatGPT using Wolfram was able to get the question right, but for some people it's a little bit... 00:01:58.360 |
The retail price of a certain kettlebell is $70. This price represents a 25% profit over the wholesale cost. 00:02:06.520 |
To find the profit per kettlebell sold at retail price we first need to find the wholesale cost. 00:02:12.320 |
We know that $70 is 125% of the wholesale cost. 00:02:17.460 |
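For reference, the arithmetic the avatar is walking through checks out; here is a quick sketch in Python (my own restatement of the numbers from the clip, not the Wolfram plugin's actual output):

```python
# Retail price is 125% of the wholesale cost, so divide to recover wholesale.
retail_price = 70.00
wholesale_cost = retail_price / 1.25                # $56.00
profit_per_kettlebell = retail_price - wholesale_cost

print(f"Wholesale cost: ${wholesale_cost:.2f}")               # $56.00
print(f"Profit per kettlebell: ${profit_per_kettlebell:.2f}") # $14.00
```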
Next we have Runway Gen 2, which I think gives us a glimpse of what the future of text-to-video will be like. 00:02:24.880 |
A long long time ago at Lady Winterbottom's lovely tea party which is in the smoking ruins and ashes 00:02:31.920 |
of New York City. A fierce woman ain't playing no games and is out to kick some butts against the 00:02:37.520 |
unimaginable brutal merciless and scary lobby boy of the delightful Grand Budapest Hotel. 00:02:42.740 |
And everything seems doomed and lost until a super handsome man arises the true hero and 00:02:50.100 |
great mastermind behind all of this. Now of course that's not perfect, and as you can see 00:02:54.860 |
from my brief attempt here, there is lots to work on. But just remember where Midjourney was a year 00:03:00.380 |
ago to help you imagine where Runway will be in a year's time. And speaking of a year's time: if AI 00:03:06.140 |
generated fake images are already being used politically, imagine how they, or videos, are going to be 00:03:11.140 |
used in a year's time. But now it's time for the paper that I had to read two or three times 00:03:16.340 |
to grasp and it will be of interest to anyone who is following developments in open source models. 00:03:22.340 |
I'm going to try to skip the jargon as much as possible 00:03:24.840 |
and just give you the most interesting details. Essentially they found a way to compress large 00:03:30.180 |
language models like Llama or Falcon across model scales. And even though other people had done this 00:03:35.740 |
they were able to achieve it in a near-lossless way. This has at least two significant implications. 00:03:41.140 |
One, that bigger models can be used on smaller devices, even ones as small as an iPhone. And second, 00:03:47.660 |
the inference speed gets sped up, as you can see, by 15 to 20 percent. In translation, that means the 00:03:53.880 |
output from the language model comes out more quickly. To the best of my understanding, the way they did this 00:03:59.080 |
is that they identified and isolated outlier weights. In translation, those are the parts of 00:04:04.360 |
the model that are most significant to its performance. They stored those with more bits, 00:04:08.800 |
that is to say with higher precision, while compressing all other weights to three to four 00:04:14.320 |
bits. That reduces the amount of RAM or memory required to operate the model. 00:04:20.020 |
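To make the idea concrete, here is a rough sketch of outlier-aware quantization in Python. This is a toy illustration of the general technique, not the SPQR code; the outlier threshold and the 3-bit width are my own illustrative assumptions:

```python
import numpy as np

def quantize_with_outliers(weights, bits=3, outlier_threshold=3.0):
    """Toy outlier-aware quantization: keep extreme weights in full precision,
    round everything else to a small set of levels. Illustrative only."""
    w = np.asarray(weights, dtype=np.float32)
    # Flag weights more than `outlier_threshold` standard deviations from the mean.
    z = np.abs(w - w.mean()) / (w.std() + 1e-8)
    outlier_mask = z > outlier_threshold

    # Uniformly quantize the remaining weights to 2**bits levels.
    inliers = w[~outlier_mask]
    lo, hi = float(inliers.min()), float(inliers.max())
    scale = (hi - lo) / (2 ** bits - 1)
    quantized = np.round((inliers - lo) / scale) * scale + lo

    result = w.copy()
    result[~outlier_mask] = quantized   # bulk of the weights: ~3-bit precision
    # result[outlier_mask] is untouched: outliers stay in float32
    return result, outlier_mask

rng = np.random.default_rng(0)
w = rng.normal(size=1000).astype(np.float32)
w[:5] *= 10                             # inject a few large "outlier" weights
w_hat, mask = quantize_with_outliers(w)
print(mask.sum(), "weights kept in full precision")
```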
There were existing methods of achieving this shrinking, or quantization, like round-to-nearest or 00:04:24.800 |
GPTQ, but those ended up with more errors and generally less accuracy in text generation, as 00:04:30.360 |
we'll see in a moment. SPQR did best across the model scales. To cut a long story short, they 00:04:36.320 |
envisage models like Llama, or indeed Orca, which I just did a video on, existing on devices such as 00:04:42.320 |
an iPhone 14. If you haven't watched my last video on the Orca model, do check it out, because it shows 00:04:47.540 |
that in some tests that 13 billion parameter model is competitive with ChatGPT, or GPT-3.5. 00:04:54.440 |
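To put rough numbers on why that is plausible, here is a back-of-the-envelope estimate (my own assumption of roughly 2 bytes per parameter at 16-bit and roughly 0.5 bytes at around 4 bits, weights only, ignoring activations and other overhead; not a figure from the paper):

```python
params = 13e9                        # a 13-billion-parameter model such as Orca

fp16_gb = params * 2.0 / 1e9         # 16-bit weights: ~2 bytes per parameter
int4_gb = params * 0.5 / 1e9         # ~4-bit weights: ~0.5 bytes per parameter

print(f"16-bit: ~{fp16_gb:.0f} GB")  # ~26 GB - far too big for a phone
print(f"~4-bit: ~{int4_gb:.1f} GB")  # ~6.5 GB - could fit in 12 GB of RAM
```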
Imagining that on my phone which has 12 gigs of RAM is quite something. Here are a few examples 00:05:00.600 |
comparing the original models with the outputs using SPQR and the older form of quantization. 00:05:06.780 |
And when you notice how similar the SPQR outputs are to the original model's, just remember 00:05:12.080 |
that it's about four times smaller in size. And yes, they did compare Llama and Falcon at 00:05:18.440 |
40 billion parameters across a range of tests using SPQR. Remember that this is the base Llama 00:05:24.760 |
model accidentally leaked from Meta, not an enhanced version like Orca. And you can see the results for 00:05:30.900 |
Llama and Falcon are comparable. And here's what they say at the end. SPQR might have a wide-reaching 00:05:36.360 |
effect on how large language models are used by the general population to complete useful tasks. 00:05:42.160 |
But they admit that LLMs are inherently a dual-use technology that can bring both significant benefits 00:05:48.420 |
and serious harm. And it is interesting, the disclaimer that they give: however, we believe that the marginal impact of SPQR will 00:05:54.080 |
be positive or neutral. In other words our algorithm does not create models with new 00:06:01.200 |
capabilities and risks. It only makes existing models more accessible. Speaking of accessible 00:06:06.560 |
it was of course Meta's Llama that originally leaked. And they are not only working on a rival 00:06:12.340 |
to Twitter, apparently called Project 92, but also on bringing AI assistants to things like WhatsApp 00:06:19.360 |
and Instagram. But Mark Zuckerberg, the head of Meta, who does seem to be rather influenced by 00:06:24.720 |
Yann LeCun's thinking, does have some questions about autonomous AI. 00:06:29.780 |
My own view is that where we really need to be careful is on the development of autonomy and how we think about that. 00:06:39.240 |
Because it's actually the case that relatively simple and unintelligent things that have runaway autonomy 00:06:45.240 |
and just spread themselves (or, you know, we have a word for that: it's a virus) can be simple 00:06:50.340 |
computer code that is not particularly intelligent but just spreads itself and does a lot of harm. A 00:06:54.700 |
lot of what I think we need to develop when people talk about safety and responsibility is really the 00:06:59.740 |
governance on the autonomy that can be given to systems. It does seem to me though that any model 00:07:05.580 |
release will be fairly quickly made autonomous. Look at the just two-week gap between the release of GPT-4 00:07:11.820 |
and the release of AutoGPT. So anyone releasing a model needs to assume that it's going to be made 00:07:17.660 |
autonomous fairly quickly. Next, Zuckerberg talked about superintelligence and compared it to 00:07:23.800 |
a corporate model. 00:07:24.680 |
You still didn't answer the question of what year we're going to have superintelligence. I'd like to hold you to that. No, I'm just kidding. But is there something you could say about the timeline as you think about the development of AGI, superintelligence systems? Sure. So I still don't think I have any particular insight on when, like, a singular AI system that is a general intelligence will get created. But I think the one thing that most people in the discourse 00:07:54.660 |
that I've seen about this haven't really grappled with is that we do seem to have organizations and 00:08:01.120 |
structures in the world that exhibit greater than human intelligence already. So one example is a 00:08:08.920 |
company. But I certainly hope that Meta with tens of thousands of people makes smarter decisions than 00:08:15.480 |
one person. But I think that that would be pretty bad if it didn't. I think he's underestimating a 00:08:20.000 |
superintelligence, which would be far faster and more impressive, I believe, 00:08:24.640 |
than any company. Here's one quick example from DeepMind, where their AlphaDev system sped up 00:08:30.080 |
sorting small sequences by 70 percent. Because operations like this are performed trillions of 00:08:35.360 |
times a day, this made headlines. 00:08:41.840 |
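For a flavour of what "sorting small sequences" means here: AlphaDev worked at the level of assembly instructions, but the routines in question are tiny fixed-length sorts like the sketch below (a generic three-element sorting network in Python, not the actual instruction sequence AlphaDev discovered):

```python
def sort3(a, b, c):
    # A fixed sequence of compare-and-swap steps: the kind of tiny routine
    # AlphaDev optimized at the assembly level. Illustrative only.
    if a > b:
        a, b = b, a
    if b > c:
        b, c = c, b
    if a > b:
        a, b = b, a
    return a, b, c

print(sort3(3, 1, 2))  # (1, 2, 3)
```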
But then I saw this. Apparently GPT-4 discovered the same trick as AlphaDev, and the author sarcastically asked, can I publish this in Nature? And to be honest, 00:08:47.320 |
when you see the prompts that he used, it strikes me that he was using GPT-3.5, the original ChatGPT, 00:08:54.620 |
not GPT-4. Anyway, back to superintelligence and science at digital speed. When you hear the 00:09:00.060 |
following anecdote from Demis Hassabis you might question the analogy between a corporation and a 00:09:05.820 |
superintelligence. AlphaFold is a sort of science at digital speed in two ways. One is that it can 00:09:11.580 |
fold the proteins in, you know, milliseconds instead of taking years of experimental work, right. So 200 00:09:17.340 |
million proteins, he times that by a PhD time of five years, that's like a billion years of PhD time, 00:09:22.900 |
right, by some measure, that has been done. So it's a superintelligence and science at digital speed: 00:09:25.880 |
billions of years of PhD time in the course of a single year of computation. 00:09:32.560 |
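The back-of-the-envelope arithmetic he is gesturing at, restated (the five-years-per-structure figure is the assumption from the clip, not a precise measure):

```python
proteins = 200_000_000        # structures predicted by AlphaFold
phd_years_each = 5            # assumed experimental effort per structure
print(f"{proteins * phd_years_each:,} PhD-years")   # 1,000,000,000 PhD-years
```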
Honestly, AI is going to accelerate absolutely everything, and it's not going to be like anything we have seen before. 00:09:37.640 |
Thank you so much for watching and have a wonderful day.