
Sora is Out, But is it a Distraction?


Chapters

0:00
0:36 Availability, Pricing, Credits
1:39 Video Reviews + Storyboard
3:1 Refusals
4:44 Video Extensions and Featured
5:33 Demo Fail + Conclusion
6:13 System Card
8:54 3 Distractions - 1) Ads
9:37 2) AGI Clause
12:53 3) Military Use

Transcript

After a 10-month wait, OpenAI have released Sora to paying users. With just a prompt, it can generate videos of up to 20 seconds at lower resolutions, and 10 seconds at 1080p if you can fork out $200 a month. I've tested it and read the system card. The user interface is really quite beautiful, even if the videos themselves operate under entirely new rules of physics.

But I just can't help wondering if OpenAI want us to focus on releases like Sora rather than some quietly broken promises. First things first, Sora is available in almost every country aside from those of the EU and the UK. And for those of you who've noticed my accent, you might be wondering how I'm using it: I'm using a VPN.

No, don't worry, I'm not going to do an awkward segue to a VPN sponsorship. As mentioned at the start, you do need to be a paying user of ChatGPT to get Sora, and the standard $20 tier only gets you 1,000 credits; we'll see what that means in just a second.

ChatGPT Plus is also capped at 720p resolution, not 1080p, and just 5 seconds of duration. Pro at $200 a month gets you 10,000 credits, but you also get to download without the watermark. Now you can see how quickly you would use up those 1,000 credits from the $20 tier. Look, 720p, 5 seconds, that's 60 credits.

Dare I say it, even on the $200 tier, you can eat through your credits quite quickly. In fact, just prepping for this video, I've probably used up like 80% of my allowance. The allowances, by the way, don't roll over to the next month. By now, I'm sure that almost everyone watching this video has seen at least a dozen other videos from other creators, so I'm going to keep this one short.
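To make those numbers concrete, here is a rough back-of-the-envelope sketch using only the figures quoted above (1,000 credits on Plus, 10,000 on Pro, 60 credits for a 5-second 720p clip); actual per-clip costs vary with resolution and duration, so treat it as an illustration rather than official pricing.

# Rough Sora credit arithmetic, based on the figures quoted in this video.
# Per-clip costs in practice vary with resolution and duration.
PLUS_CREDITS = 1_000        # ChatGPT Plus, $20/month
PRO_CREDITS = 10_000        # ChatGPT Pro, $200/month
CLIP_COST_720P_5S = 60      # credits for one 5-second, 720p generation

print(PLUS_CREDITS // CLIP_COST_720P_5S)   # ~16 such clips a month on Plus
print(PRO_CREDITS // CLIP_COST_720P_5S)    # ~166 on Pro, before pricier 1080p clips eat more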

The first video is just a standard generation, and it has remembered to pick out the Shard, but I wouldn't say this shadow is being cast over the Shard. Next, for the budding content creators out there, how about a futuristic YouTube intro? I'm not going to use it for the channel, but I think that's pretty cool.

How about the storyboard feature? And honestly, I think the user interface is pretty amazing. It's very Apple-like, very sleek and clean. But of course, the central problem with all Sora generations, indeed all of generative AI, including ChatGPT, is that it doesn't really follow physics. It hallucinates. So I prompted Sora with an image I featured in a previous video and then had the ending be that the sign falls off the turtle.

The turtle doesn't notice and walks off screen. The sign, by the way, is supposed to stay on the ground. Here was the resulting video. Now when you're watching this video, remember that if you buy the $200 tier just for Sora, this video alone used up more than 5% of your allowance.

Translated, it costs you, or in this case me, $10. For those listening to the podcast version of this video, the sign definitively does not fall off the turtle. The next topic that I just have to get to if any of you are considering getting Sora is the refusals. Any prompt mentioning something proprietary, in this case an Arsenal shirt, just gets blocked.

In this case, as with the VPN, there is one way of slightly getting around that. What you can do is generate the relevant image in a different image generator, something like Ideogram or Midjourney. Then you feed it in as an image prompt and, lo and behold, Sora will generate the video based on that image.

Notice again, though, that this hedgehog is supposed to be scoring a goal rather than just staring at the potato. Also, did you notice that massive mangling at the beginning? God knows what that is. Oh yeah, and the potato levitates. So far in my use of image prompts, it has been pretty hit or miss.

I tried to make the logo image for my AI Insiders on Patreon pop out. So I fed in the image and the prompt as part of a storyboard. The prompt, which kicks in around four seconds in, was: the robot, holding the AI Insiders book, looks at the book, then brings it closer to the screen.

You can move those prompts earlier or later in the timeline to adjust when they kick in. The only problem is, as you can see, the robot instantly transforms into a different robot, albeit holding the right book. So that's really cool, but not quite what I was intending. Big plug for my $9 a month Patreon, where I release videos like this one on the media's misreporting over o1's, quote, "escape".

My favorite creation had to be this one, a hedgehog showing off its vegetables. I did prompt it with an image, but I think the resulting 1080p video is crisp and clear. You might know that your prompt can be a video, not just an image or some text. And so I got Sora to extend this scene both backwards in time and forwards in time.

For Kling AI, I used the motion brush tool to get the turtle to move, but Sora quite literally took the turtle in a different direction. Obviously, I'm not the best text-to-video prompter, but you can see some hand-selected videos in the featured section on the product page. Some of these are pretty incredible.

This is 1950s Suburban Bliss, generated just 40 minutes ago, 1080p of course. This is a drone shot of a container ship at the docks, loading containers. Not bad at all. If you were just looking for a bit of cuteness in a quick, short video at, say, 720p, there are competitors though, like Runway and Pika.

But yes, overall, Sora is the best, and especially at higher resolutions, I wouldn't say it's that close. Just don't bank on it being reliable, as Sam Altman found out tonight when they were doing the live demo. In neither of the videos did the crane catch a fish. Then there are rivals like Google DeepMind, who've produced Veo, their most capable generative video model.

The only problem is, it's hardly accessible to anyone, and definitely not to me. So that, if you will, is my initial review of Sora: the best currently available, although very pricey, with a limited number of generations and hardly any adherence to physics. The system card, by the way, even though I read it in full, doesn't tell us that much, so I've just picked out five highlights.

After covering these highlights, I'll touch on why I think there is an element of distraction in play. First, the team keeps repeating "we made Sora to understand and simulate the real world", a capability they believe will be an important milestone for achieving AGI. Here's my cynical theory on that.

OpenAI was funded by people like Elon Musk as a non-profit to create AGI that benefits humanity. It is really quite hard to link the creation of an entertaining video generator to creating AGI. It's great for signing up new subscribers, and indeed new signups are currently blocked because it's so popular.

And great, of course, for building a ton of revenue, but really quite hard to link to the creation of AGI. Call me cynical, but Sora is miles away from understanding the real world, if it ever could, and certainly further from it than a model like o1 in ChatGPT. It just seems like a justification for why they're doing Sora.

For all the lawyers watching: they again don't go into where they get the data from; I covered this back in February of this year. They just say it was "mostly collected from industry standard machine learning datasets". They are very well aware that they are a big target for lawsuits, which is why they're not going to be any more specific.

One interesting bit of technical detail is that they customise their own GPT to achieve high precision on the moderation of certain topics. And the reason they can afford that latency is that we're all waiting for the video generation anyway, so they use that window of time to run this precision-targeted GPT.
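As a rough illustration of that "free latency" idea, and not OpenAI's actual pipeline (the function names below are hypothetical), a slower but precision-tuned moderation model can run concurrently with the far longer video generation job, so it adds no user-visible delay:

import asyncio

async def generate_video(prompt: str) -> bytes:
    # Stand-in for the long-running video generation job.
    await asyncio.sleep(60)
    return b"...video bytes..."

async def moderate_with_custom_gpt(prompt: str) -> bool:
    # Stand-in for a slower, precision-tuned moderation model; it still
    # finishes comfortably inside the generation window.
    await asyncio.sleep(2)
    return "arsenal shirt" not in prompt.lower()  # toy proprietary-content check

async def handle_request(prompt: str) -> bytes | None:
    video_task = asyncio.create_task(generate_video(prompt))
    if not await moderate_with_custom_gpt(prompt):
        video_task.cancel()  # refuse before the video is ever returned
        return None
    return await video_task

# Example: asyncio.run(handle_request("a hedgehog in an Arsenal shirt")) returns None.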

What that model can do, by the way, is identify third-party content as well as deceptive content. I tried clipping out a section of Lord of the Rings and feeding it in as a video prompt, and it was flagged immediately. Next, in case you were wondering or intending to do this, you can't actually ask for a video generation in the style of a living artist.

They thought about allowing this, but then opted for the conservative approach. Also, you can't use a photo or video of a real person as an image prompt. They say that, given the potential for abuse, they are not initially making that capability available to all users. I know that's not many highlights from a long report, but that is as much as I got.

And I think you've got to admit that that was at least a fairly decent segue to this video's sponsor, 80,000 Hours. One of their recent podcasts, from the 27th of November, covered what one critic calls OpenAI's theft of the millennium. As you can see, they are super highly rated, but they also cover other topics too.

I'm also subscribed to their YouTube channel and tend to listen when I'm doing long drives or long walks. They also produce a career guide, so do check them out with the links in the description. Just as I end though, what's that distraction element that I mentioned at the beginning?

For some of you, this will be the most important part of the video. Well, we have all of these so-called 12 days of shipmas, OpenAI releasing product after product, one after another. And all of us are starting to think about Christmas and breaks to come, so that would be a good time to bury bad news.

Just one example would be a coincidence, but three? Let me know what you think. First, on December 2nd, OpenAI opened the door to doing ads. That doesn't sound like too big of a deal though, right? Their CFO just said that they're looking into ads. But remember, Sam Altman has said in the past, "I think that ads plus AI is sort of uniquely unsettling to me." On Lex Fridman's podcast, he also called them a last resort.

That's just the warm-up though, because on December 6th, in the Financial Times, we got this. This story for me, if it turns out to be true, is far more, quote, "unsettling". OpenAI have promised that when they reach AGI, their commercial contract with Microsoft would be void. All the profits thereafter would go to their non-profit to distribute to everyone.

That was key because, as Sam Altman himself has said, AGI would break capitalism. It's even in their very definition of AGI that it can do half of the world's jobs. They made this commitment repeatedly; I've covered it many times before on the channel, and it's in their bloody charter.

Well, apparently, they are now discussing removing that provision. AGI, according to this article, might well then be misused for commercial purposes. Just for a moment, I want you to picture Microsoft having a monopoly over AGI. You might not believe they would do so, and you might not even believe AGI is coming, but just imagine if those two things happen.

Why are they trying to ditch this provision that shuts Microsoft out of its most advanced models? A provision they've promised repeatedly to uphold? Well, they are seeking to unlock billions of dollars of future investment. Just in case you think it's me making up all of these clauses, the FT directly quotes them many times.

According to OpenAI's own website, AGI is explicitly carved out of all commercial and IP licensing agreements. But there's a problem with that, dear viewer: that would limit the potential profit and value for Microsoft. After all, they've pumped $13 billion into OpenAI. This would disincentivise the big tech group from further investment.

Oh no. By the way, this isn't just rumour, this is according to multiple people familiar with the conversations. Altman said, apparently, "We've left ourselves some flexibility because we don't know what will happen." Again though, the FT quotes from OpenAI's own history. They told anyone investing in them to consider their investments in the spirit of a donation, with the understanding, as I've quoted before on this channel, that it may be difficult to know what role money will play in a post-AGI world.

All that seems out of the window though, and there's only one thing they're promising. OpenAI's non-profit, which does still exist, they say, will continue to exist and thrive and receive full value for its current stake. We don't know the exact figure, but I think their current stake is in the region of 30% of the current value of OpenAI.

In other words, that non-profit will be bunged $50, maybe $80 billion, and receive, quote, full value. Yes, that will surely give it an enhanced ability to pursue its mission, but that's not the same as Microsoft not owning AGI. That's the same Microsoft, by the way, that was reported just today in The Information to be boasting about how much labor cost you could save if you adopt AI.

They're being explicit about making sales for their Copilot service by showing customers the example of how they laid off 10,000 people last year. That's according to three current sales employees. "We've been able to improve our throughput per customer service agent by 12% using Copilot. That's real money."

"That means we don't have to hire as many people," Microsoft's Spataro said. Is that the kind of company that you want controlling AGI? Oh, and by the way, I haven't even got to the third thing that OpenAI are potentially trying to distract us from. They have pivoted to work inside the military-industrial complex, albeit with some small caveats.

As MIT Technology Review reports, at the start of this year OpenAI's rules for how armed forces might use its technology were unambiguous: no one could use their models for weapons development or for military and warfare purposes. Then that changed in January to "don't use our technology to harm yourself or others" by "developing or using weapons" or "destroying property".

That "destroying property" went out of the window quite quickly. In October, they changed the terms to "you can only use it to protect people and deter adversaries". Now, though, their technology will be deployed directly on the battlefield. To help, apparently, the US and allied forces defend against drone attacks.

Now, you can argue the rights and wrongs of this, but you've got to admit it's a shift in policy. As one analyst said, "Defensive weapons are indeed still weapons. They can often be positioned offensively, subject to the locale and aim of a mission." OpenAI, MIT Technology Review says, has long pontificated about how to steward AI responsibly, and they will now work in a defence tech industry that plays by an entirely different set of rules.

In that system, when your customer is the US military, tech companies do not get to decide how their products are used. And according to the Washington Post, even their own employees are pushing back on the deal and asking for more transparency from leaders; where have we heard that before? Those OpenAI employees apparently want assurances that their technology won't be directed against human-piloted aircraft.

It's a great point, right? Because a defensive weapon could still be used against humans if those humans are piloting offensive aircraft. One other OpenAI employee said that "Defensive use cases still represented militarization of AI" and noted that the fictional AI system Skynet, which turns on humanity in the Terminator movies, was also originally designed to defend against aerial attacks on North America.

OpenAI executives quickly acknowledged the concerns. It should immediately be pointed out that they are not the only ones, with companies like Anthropic and Meta changing their policies to allow military use of their technology. If you were looking for a purely positive review of Sora, I'm sorry to have disappointed you.

The videos can be amazing if you ignore the physics and the user interface is exemplary. But it's quite pricey for the number of credits you get. And I just wonder if these 12 days of releases might be distracting us a little bit from other news about OpenAI. As always though, let me know what you think.

Thank you so much for watching to the end, and have a wonderful day.