
Can AI Be Contained? + New Realistic AI Avatars and AI Rights in 2 Years


Transcript

From an AI Los Alamos to the first quasi-realistic AI avatar, and from spies at AGI labs to the question of what makes models happy, this was a week of underrated revelations. The headline event was Dario Amodei, CEO of Anthropic and one of the brains behind ChatGPT, giving a rare interview that revealed a lot about what is happening behind the scenes at AGI labs.

But just before that, I can't resist showing you a few seconds of this, what I believe to be the closest an AI-made avatar has come to being realistic. "She even pasted the moth in her logbook, which is now on display at the Smithsonian National Museum of American History. This incident symbolizes the origin of the term bug, commonly used in computer science to describe a flaw or error in a program."

"Hopper's creativity and problem-solving skills have made her one of the pioneering figures in early computer science." Okay, fair enough. If you look or listen closely, you can kind of tell it's AI-made. But if I wasn't concentrating, I would have been fooled. And honestly, that's the first time I could say that about an AI avatar.

And of course, people are already playing with HeyGen's model to see what they can get it to say. "Hi, bitch. Thanks for your interest in our ultra-realistic avatar feature for your use case: Enslave Humanity Using Terminator Robots." And to be honest, you don't need me to speculate how this might be, let's say, used ahead of elections in the Western world next year, and just on social media more generally.

Remember that this is an avatar based on a real human face and voice, so it could be your face and voice in the coming weeks and months. This also caught my eye this week: a major two-year competition that will use AI to protect US software. The White House calls it the AI Cyber Challenge, but what's interesting are the companies involved: Anthropic, Google, Microsoft and OpenAI.

All of them partnering with DARPA to make software more secure. But there were a couple of lines that I think many people will miss halfway down. AI companies will make their cutting edge technology, some of the most powerful AI systems in the world, available for competitors to use in designing new cybersecurity solutions.

Given the deadlines involved, that could mean unreleased versions of Google's Gemini, GPT-5, and other AI systems.

But if this is all about defense, what about offense? Well, quite recently we had this from the CEO of Palantir in The New York Times: "Our Oppenheimer Moment: The Creation of AI Weapons." In the article he compared the rise in the parameter count of machine learning systems with the rise in the power of nuclear devices.

And he said, "We must not, however, shy away from building sharp tools. We must ensure that the machine remains subordinate to its creator." And our adversaries, he writes, will not pause to indulge in what he calls theatrical debates about the merits of developing technologies with critical military and national security applications.

They will proceed." And then he says, "This is an arms race of a different kind, and it has begun." And Palantir is already using AI to assist in target selection, mission planning, and satellite reconnaissance. And he ends the piece with this, "It was the raw power and strategic potential of the bomb that prompted their call to action then.

It is the far less visible but equally significant capabilities of these newest artificial intelligence technologies that should prompt swift action now." And he isn't the only one to be drawing that analogy. Apparently the book "The Making of the Atomic Bomb" has become a favorite among employees at Anthropic. Just in case anyone doesn't know, many of their employees are former staff at OpenAI.

And they have a rival to ChatGPT called Claude. The CEO of Anthropic is Dario Amodei, and he rarely gives interviews, but Dwarkesh Patel managed to secure one this week. There were a handful of moments I want to pick out, but let's start with Los Alamos, which is to say the idea of creating a superintelligence somewhere as secure and secluded as they did for the first atomic bomb.

"You know we're at Anthropic offices and you know it's like security we had to get badges and everything to come in here but the eventual version of this building or bunker or whatever where the AGI is built I mean what does that look like are we is it a building in the middle of San Francisco or is it you're out in the middle of Nevada or Arizona like what is the point in which you're like Los Alamosing it?" "At one point there was a running joke somewhere that you know the way the way building AGI would look like is you know there would be a data center next to a nuclear power plant next to a bunker yeah um and you know that we'd all kind of live in the bunker and everything would be local so it wouldn't get on the internet if we take seriously the rate at which all this is going to happen which I don't know I can't be sure of it but if we take that seriously then it does make me think that maybe not something quite as cartoonish as that but that something like that might happen." That echoes the CERN idea that people like Satya Nadella the CEO of Microsoft have talked about or the Ireland idea that Ian Hogarth has written about and he's now the head of the UK AI task force.

Of course, one obvious question: if this island or CERN or even OpenAI solves superintelligence alignment, who's to say everyone would even use that solution? Sam Altman actually addressed that question recently on Bankless. "Once we have the technical ability to align a superintelligence, we then need a complex set of international regulatory agreements, cooperation between leading efforts. But we've got to make sure that we actually, like, have people implement this solution and don't have, for lack of a better word, rogue efforts that say, okay, well, I can make a more powerful thing and I'm going to do it without paying the alignment tax, or whatever that is. And so there will need to be a very complex set of negotiations and agreements that happen, and we're trying to start laying the groundwork for that now." I'll get to why some people are concerned about this idea a bit later on.

The next thing I found fascinating was when he talked about leakers and spies and compartmentalizing Anthropic so that not as many people know too much. "I think compartmentalization is the best way to do it. Just limit the number of people who know about something. If you're a thousand-person company and everyone knows every secret, like, one, I guarantee you have a leaker, and two, I guarantee you have a spy, like a literal spy." Bear in mind that the key details of GPT-4 and PaLM 2 have already been leaked, but not those of Claude, Anthropic's model.

He also said that AI is simply getting too powerful to just be in the hands of these labs, but on the other hand, he didn't want to just hand over the technology to whoever was president at the time. "My view is that these things are powerful enough that I think it's going to involve, you know, a substantial role or at least involvement of government or an assembly of government bodies. Again, there are kind of very naive versions of this: you know, I don't think we should just hand the model over to the UN or whoever happens to be in office at a given time. Like, I could see that go poorly, but there needs to be some kind of legitimate process for managing this technology." He also summed up his case for caution.

"When when I think of like you know why am I why am I scared few things I think of one is I think the thing that's really hard to argue with is there will be powerful models they will be agentic we're getting towards them if such a model wanted to wreak havoc and destroy humanity or whatever I think we have basically no ability to stop it if that's not true at some point it'll continue to be true as we you know we're going to have to do something about it." "So it will reach the point where it's true as we scale the models so that definitely seems the case and I think a second thing that seems the case is that we seem to be bad at controlling the models not in any particular way but just their statistical systems and you can ask them a million things and they can say a million things and reply and you know you might not have thought of a millionth of one thing that does something crazy the best example we've seen of that is being in being in Sydney right where it's like I I don't know how they train that model I don't know what they did to make it do all this weird stuff." "I don't know how they train that model I don't know what they did to make it do all this weird stuff threaten people and you know have this kind of weird obsessive personality but but what it shows is that we can get something very different from and maybe opposite to what we intended and so I actually think facts number one and fact number two are like enough to be really worried you don't need all this detailed stuff about converging instrumental goals analogies to evolution like actually one and two for me are pretty motivated I'm like okay this thing's gonna be powerful it could destroy us and like all the ones we've built so far are at pretty decent risk of doing some random we don't understand." To take a brief pause from that interview here is an example of the random shall we say crap that AI is coming up with this was a supermarket AI meal planner app not from Anthropic of course and basically all you do is enter ingredients enter items from the supermarket and it comes up with recipes but when customers began experimenting with entering a wider range of household shopping list items into the app however it began to make some less appealing recommendations it gave one recipe for an aromatic water mix which would create chlorine gas but don't fear the bot recommends this recipe as the perfect non-alcoholic beverage to quench your thirst and refresh your senses that does sound wonderful but let's get back to the interview.

Amodei talked about how he felt it was highly unlikely for data to be a blockage to further AI progress, and, just personally, I found his wistful tone somewhat fascinating. "You mentioned that, uh, the data is likely not to be the constraint. Why do you think that is the case?"

"There's various possibilities here, and, you know, for a number of reasons I shouldn't go into the details, but there's many sources of data in the world, and there's many ways that you can also generate data. My guess is that this will not be a blocker. Maybe it'd be better if it was, but, uh, it won't be."

That almost regretful tone came back when he talked about the money that's now flowing in. "I expect the price, the amount of money spent on the largest models, to go up by, like, a factor of 100 or something, and for that then to be concatenated with the chips getting faster and the algorithms getting better, because there's so many people working on this now. And so, again, I mean, you know, I'm not making a normative statement here, that this is what should happen."

He then went on to say that "we didn't cause" the big acceleration that happened late last year and at the beginning of this one, clearly referring to ChatGPT. "I think we've been relatively responsible in the sense that, you know, the big acceleration that happened late last year and the beginning of this year, we didn't cause that. We weren't the ones who did that. And honestly, I think if you look at the reaction of Google, that might be 10 times more important than anything else."

That echoes comments from the head of alignment at OpenAI. He was asked: did the release of ChatGPT increase or reduce AI extinction risk? He said, "I think that's a really hard question. I don't know if we can definitively answer this. I think fundamentally it probably would have been better to wait with ChatGPT and release it a little bit later," but that, more generally, this whole thing was inevitable.

At some point, the public would have realized how good language models had gotten. Some of the themes and questions from this interview were echoed in a fascinating debate between Connor Leahy, the head of Conjecture, and George Hotz, who believes everything should be open sourced. The three key questions it raised for me, and that I don't think anyone has an answer to, are these.

First, is offense favored over defense? In other words, are there undiscovered weapons out there that would cause mass damage, like a bioweapon or nanotechnology, for which there are no defenses, or for which defense is massively harder than offense? Of course, this is a question with or without AI, but AI will massively speed up the discovery of these weapons if they are out there.

Second, if offense is favored over defense, is there any way for human civilization to realistically coordinate to stop those weapons being deployed? Here is a snippet from the debate. Assuming, and I don't know if offense is favored, but assuming it is, are there worlds in which we survive? So I personally think there are.

I think there are worlds in which you can actually coordinate to a degree that quark destroyers do not get built, or at least not before everyone fucks off at the speed of light and, like, distributes themselves. There are worlds that I would rather die in, right? Like, the problem is, I think that the only way you could actually coordinate that is with some unbelievable degree of tyranny, and I'd rather die.

I'm not sure if that's true. Like, look, could you and me coordinate to not destroy the planet? Do you think you could? Okay, cool. The third related question is about a fast takeoff: if an AI becomes 10 times smarter than us, how long will it take for it to become a hundred thousand times smarter than us?

If it's as capable as a corporation how long will it take to be more capable than the entirety of human civilization? Many of those who believe in open sourcing everything have the rationale that one model will never be that much smarter than another. Therefore we need a community of competing models to stop one becoming too powerful.

Here's another snippet from the debate. So first off, I just don't really believe in the existence of "we found an algorithm that gives you a million-x advantage." I believe that we could find an algorithm that gives you a 10x advantage. But what's cool about 10x is, like, it's not going to massively shift the balance of power, right?

Like, I want power to stay in balance, right? So as long as power relatively stays in balance, I'm not concerned with the amount of power in the world; otherwise, I think we get to some very scary things. So what I think you do is, yes, I think the minute you discover an algorithm like this, you post it to GitHub, because you know what's going to happen if you don't?

The feds are going to come to your door. They're going to take it. The worst people will get their hands on it if you try to keep it secret. Okay, let's say we have a 10x system or whatever, but we hit the chimp level, we jump across the chimp-to-general level or whatever, right? And now you have a system which is, like, John von Neumann level or whatever, right? And it runs on one tiny box, and you get a thousand of those. So it's very easy to scale up to a thousand x.

So then maybe you have your thousand John von Neumanns improve the efficiency by another two, five, ten x. Now we're already at ten-thousand-x or hundred-thousand-x improvements, right? Just from scaling up the amount of hardware and improving it with them. I suspect, to be honest, we might have the answer to that question within a decade, or certainly two.

And many of those at OpenAI are thinking of this question too. Here is Paul Christiano, the former head of alignment at OpenAI, pushing back against Eliezer Yudkowsky. While Yudkowsky believes in extremely fast recursive self-improvement, others like Jan Leike and Paul Christiano are banking on systems making superhuman contributions to domains like alignment research before they get that far.

In other words, getting the models themselves to contribute to the research that keeps their more capable successors aligned. So let's end now with Amodei's thoughts on AI consciousness and happiness. Do you think that Claude has conscious experience? How likely do you think that is? This is another of these questions that just seems very unsettled and uncertain.

One thing I'll tell you is I used to think that we didn't have to worry about this at all until models were kind of, like, operating in rich environments. Like, not necessarily embodied, but they needed to, like, have a reward function and have kind of long-lived experience. So I still think that might be the case, but the more we've looked at these language models, and particularly looked inside them to see things like induction heads, a lot of the cognitive machinery that you would need for active agents seems kind of already present in the base language models.

So I'm not quite as sure as I was before that we're missing enough of the things that you would need. I think today's models just probably aren't smart enough that we should worry about this too much, but I'm not 100% sure about this, and I do think that as the models improve, in a year or two, like, this might be a very real concern.

What would change if you found out that they are conscious? Are you worried that you're pushing the negative gradients of suffering? Like, "conscious" is, again, one of these words that I suspect will not end up having a well-defined meaning. But is there something it is like to be Claude?

Yeah, well, I suspect that's a spectrum, right? Let's say we discover that I should care about Claude's experience as much as I should care about, like, a dog or a monkey or something. I would be kind of worried. I don't know if their experience is positive or negative.

Unsettlingly, I also don't know if any intervention that we made would be more likely to make Claude, you know, have a positive versus negative experience, versus not having one at all.

Thank you so much for watching to the end, and I just have this thought: if they do end up creating an AI Los Alamos, let's hope they let the host of a small AI YouTube channel, who happens to be British, just take a little look around.

You never know. Have a wonderful day.