
‘Everything is Going to Be Robotic’ Nvidia Promises, as AI Gets More Real


Transcript

The CEO of NVIDIA revealed that he wants his company to become ultimately one giant AI. Even if that feels a little ways away, he did showcase in the last couple of days a string of capabilities that are possible now with AI. Yes, we're going to hear three big promises about the future of AI, but we're going to see a host of demos of things that are possible right now.

I'll bring in clips from some recent interviews I've conducted, and we'll hear from the chief of staff of one prominent AI company predicting the end of employment as we know it in three to five years, which I think is a tad overstated. Speaking of which, you'll also see some AI fails as a spam campaign flops hard.

So what about those three promises I mentioned from the CEO of NVIDIA, which looks set to become the largest company in the world if current trends hold? Well, first we heard and saw that NVIDIA anticipates robots revolutionizing industry. That's still pretty general though, right? So how about the prediction that everything is going to be robotic?

Let me talk about what's next. The next wave of AI is physical AI. AI that understands the laws of physics. AI that can work among us. Of course, when I say robotics, there's a humanoid robotics that's usually the representation of that. Everything is going to be robotic. All of the factories will be robotic.

The factories will orchestrate robots and those robots will be building products that are robotic. Robots interacting with robots, building products that are robotic. And of course, we don't just have robots building robots. We have artificial intelligence improving artificial intelligence. Here is Jensen Huang on a separate, less reported occasion, promising to turn NVIDIA into one giant AI.

We can't design a chip anymore without AI. At night, our AIs are exploring design spaces vast and wide that we would never do ourselves because it costs too much money to explore it. We can't write software without AI anymore. We have to explore all the, you know, the design space of optimizing compilers is too large.

We use AIs to file bugs. So our bug, you know, our bugs database actually tells you what's wrong with the code, who's likely involved and activates that person to go fix it. You know, and so I think we, I want everybody, every organization or company to use AI very aggressively.

I want to turn NVIDIA into one giant AI. But it's well past time that I become a bit more concrete about what models can do right now, today. Here is a 30 second clip from NVIDIA that actually undersold what AI is capable of. Multimodal LLMs are breakthroughs that enable robots to learn, perceive and understand the world around them and plan how they'll act.

And from human demonstrations, robots can now learn the skills required to interact with the world using gross and fine motor skills. But how was that underselling the capabilities of AI? It looked pretty impressive, right? Well, they focused on AI learning from human demonstrations.

But if you've watched my Dr. Eureka video recently, you'll know that it's not just about LLMs coming up with high level plans and then relying on human demonstrations to exercise fine grained robotic control, in this case of a robot dog. LLMs are actually really good at programming the robo dog to, in this case, stay balanced on a moving rolling yoga ball.

And I spoke with Jason Ma, the lead author of the Dr. Eureka paper, which was made in collaboration with NVIDIA, about how that will only accelerate. Robot capabilities will be bootstrapped by large language models. And I think that's the most interesting thing of using LLM for robotics, honestly. Like there's a lot of work in using large language models for robotics in the high level planning category.

It can plan the sequence of tasks the robot needs to do, but I think fundamentally the bottleneck for robotics is still like the low level of physical control, right? LLM can tell the robot to cook some food, but if the robot can't even pick up a knife properly, it's not going to work.

But I think a lot of Eureka where my work is focused on how do we use this highly capable reasoning, coding, text models, multimodal models to supervise the low level learning. So the robots can do the very complex tasks in the first place. And I think that will only accelerate.

The key edge that AI has is that it can iterate thousands and thousands of times in parallel in simulation until it's got a program it's happy with. And dipping back into the virtual world for a moment, how about the long awaited promise of being able to interact live with video game characters?

And speaking of realism, before I get to the latest clips from NVIDIA, here's me speaking six weeks ago about how good lip syncing was getting. Using just a single photo of you, we can now get you to say anything. I have to remind myself that these aren't projections. This is what is currently possible.

Imagine that accuracy of lip syncing on a digital human of this level of realism. I do wonder sometimes how many decades away we are from a time where you could be speaking to someone and not be entirely certain in the real world whether or not they are embodied AI.

I might previously have said that's a hundred years away, but now I think it might be in my lifetime. But I'm off track because I promised more demos of things that are possible with AI today. So how about a weather report that's localized to your building, your pavement? But we are not stopping there.

The next frontier is hyperlocal forecasting down to tens of meters where the effects of city infrastructure are taken into account. When combined with weather simulation wind fields, it can model the airflow around buildings. We expect to predict phenomena such as downwash, where strong winds funnel down to street level, causing damage and affecting pedestrians.

NVIDIA Earth-2, an excellent example of a digital twin that fuses AI, physics simulations, and observed data, can help countries and companies see the future and respond to the impact of extreme weather. Or what about a coffee shop which is staffed by dozens of robots with just one or two humans to oversee things?

Wait, that's happening right now. All of these things feel futuristic and far away until they actually happen. And how about a sound effect generator that can generate any sound? Well, that is now possible today with ElevenLabs. Actually, I'm going to test it with something like a robot being crushed.

Let's see if it comes up with something interesting or not. So far, about five, six, seven seconds. Not too bad. And how is it? Whoa. Not perfect, obviously, but if you feel that all of this is in the future, let me bring you a video from a graphic designer who lost his job recently to AI.

I just lost my job and I lost it to AI, which is very unfortunate. I think many people joke about the, you know, the fact that, oh, AI is going to take all our jobs and we're all going to get replaced. And especially within my industry, which is graphic design.

And it turns out basically all of the material that I've provided over the past six years is now being fed to AI and templated. So a design that would take me 30 minutes now takes AI 30 seconds as it's been trained on all my templates. Essentially, I think it just literally reuses my templates and then they can input the hex codes they want the email or the website designed to be, drag and drop in the client's logo, upload the client's font and boom, it will generate my template by using their brand assets.

It's a reminder that even though almost all AI needs human generated training data to get started, they don't necessarily need more of it to keep going. Or to put it another way, this is the worst that AI, embodied or not, will ever be. Which is probably why some people, including the chief of staff to the CEO of Anthropic, makers of the Claude chatbots, think that this will massively impact the short-term outlook on employment.

This article by that chief of staff, Avital Balwit, came out just two weeks ago. While I think the outlook isn't quite this stark, here's what she had to say. She predicted these next three years might be the last few years that I work. I stand at the edge of a technological development that seems likely, should it arrive, to end employment as I know it.

And she makes the point that would have been relevant to that graphic designer we just heard from. The economically and politically relevant comparison on most tasks is not whether the language model, or I would say the embodied AI, is better than the best human. It's whether they are better than the human who would otherwise do that task.

Doesn't have to be perfect in other words, just has to be a bit cheaper. She makes the somewhat common prediction by now that things like copywriting, tax preparation and customer service will be heavily automated. But let me give you two examples of how the future is a bit more unpredictable than it can sometimes seem.

First, I remember the frenzied reporting on this report from the think tank, the IPPR, here in Britain. According to the headlines at least, they were warning of an AI jobs apocalypse. But the very next day I contacted the lead author, Carsten Jung, and we had a detailed discussion for AI Insiders.

First, he said head on that he was disappointed by the media's coverage. No, I'm not fully happy with how this is being covered, both our report but in general, because it can sound very scary. And I think just scaring people doesn't necessarily lead to incremental, thoughtful policy progress. When people talk about jobs apocalypse, I think some people might just switch off and throw up their hands and say, oh God, we're all doomed.

Whereas what we try to do in the report is actually to say there's a range of scenarios and it's not some kind of external event like a pandemic that's like happening to us and it's all doom and gloom, but it's actually a thing that totally depends on decisions by policymakers, but also by organisations that implement AI.

Then we discussed how a more likely medium term outcome is wage inequality. In short, low wages for many, but not for those who utilise AI to boost their productivity. So those that remain in work, their productivity will be hugely aided by AI. So you have this wage inequality aspect.

But then of course, and I think this is also Sam Altman's point, is that profits are likely going to go up. So we have lower labour costs, AI likely is able to do things more cheaply. So profits will go up. So those that own companies will have higher returns.

And so wealth inequality will likely go up. And the second cautionary tale about how AI's impacts, say on jobs, can sometimes be overhyped actually comes from OpenAI itself, albeit unintentionally. When we're talking about good things, we talk about customer service being revolutionised and productivity accelerating. But when the focus is on people using AI for nefarious purposes, suddenly the AI is kind of useless.

This was a report released by OpenAI a few days ago about how some bad actors were trying to generate disinformation campaigns en masse. OpenAI terminated those accounts, but gave a summary of the impact of these campaigns using the GPT models. There was no significant audience increase due to our services.

Later on in the report, they say this. So far, these operations from places like Russia, Israel and China do not appear to have benefited from meaningfully increased audience engagement or reach as a result of our services. They basically describe how these guys came up with a load of spam, but people weren't buying it.

For the most part, it was because the spam just wasn't very good. I don't know, it might be me, but I just find it a little bit ironic that when we're talking about a negative use of the technology, the party line is that the models are kind of useless.

Of course, what we really need are better benchmarks. And so I was pleased to see this initiative from Scale AI. They describe these as benchmarks and leaderboards that can't be gamed, are uncontaminated and unbiased. According to these benchmarks, at least, GPT-4o is not a million miles ahead of other models.

This initiative reminds me, at least, how we should always benchmark models on our own use cases, because leaderboards chop and change quite a lot. Notice how the table on the left is quite different to the one that OpenAI put out on release. That initiative, by the way, isn't the only reason to be optimistic about benchmarks, which I covered in this video on Patreon.

In short, though, I think just about the only thing we can all agree on is that the future is about as unpredictable as it has ever been. In terms of at least referring to AI in academic papers, you can see the recent exponential increase across virtually every field. How this all actually plays out, though, in the real world, in society, with jobs, with embodied physical AI, we simply don't know.

Thank you, though, for being here with me as we watch it all unfold. Have a wonderful day.