back to index

The State of Generative Media - Gorkem Yurtseven, FAL


Chapters

0:0 Introduction to Generative Media
3:49 Evolution of Generative Image Models
5:11 Impact on Industries
10:50 The Rise of Generative Video
14:23 Future of Generative Media
16:33 Conclusion and Call to Action

Whisper Transcript | Transcript Only Page

00:00:00.000 | It's so nice to see a generative media track in the AI conference, AI engineering conference this year.
00:00:22.000 | My company, I work at this company called file.ai.
00:00:27.000 | We call ourselves a generative media platform.
00:00:30.000 | And this is a term that's been around for a while.
00:00:34.000 | But we kind of owned it and we called it the name of our company, generative media platform.
00:00:42.000 | And the way we define it at least is it's a generative video, audio or image.
00:00:50.000 | And our company is seeing all these kinds of models using our inference engine.
00:00:57.000 | And we are partnering up with some closed source model providers as well.
00:01:01.000 | So I've been doing this for a couple of years, but it is a really, really new market.
00:01:07.000 | And throughout the talk, I'm going to walk you through how we got here and a little bit of the history and what's next.
00:01:18.000 | I remember in 2022 when Sam Altman started tweeting about Delhi too.
00:01:26.000 | I was working at home.
00:01:28.000 | It was end of COVID.
00:01:30.000 | I know COVID took a little longer in San Francisco, but I remember sitting on the floor, could not believe my eyes.
00:01:38.000 | People were tweeting at him and he was tweeting back pretty high definition images of incredible things that people were tweeting.
00:01:48.000 | Like looking back to it, obviously it all looks kind of bad quality, but I remember at the time I was, I thought this was the most incredible technology ever.
00:02:01.000 | And I was, I was in the industry.
00:02:03.000 | I knew what was going on, not as much as today, but I thought at that time OpenAI was so far ahead of anything else.
00:02:13.000 | And it's going to be so, so hard for normal people to catch up to this technology.
00:02:19.000 | I was, I was, I remember this was one of the biggest WTF moments of my life.
00:02:26.000 | But then you can tell me, hey Gerkem, this was, this was all going to happen.
00:02:31.000 | There was other AI waves before, before this last big wave.
00:02:38.000 | There was a GAN breakthrough that people did similar things using GANs.
00:02:43.000 | Deep dream from Google went through a phase.
00:02:47.000 | And then there was even a viral consumer AI application of it called Prism.
00:02:53.000 | People uploaded their selfies and they were able to change their avatars.
00:02:57.000 | But the capabilities and the applications of the technology was not nearly as much as what generative media can be used today.
00:03:08.000 | Not only the previous AI wave, generative media or being able to create art with computers has been around.
00:03:17.000 | Kind of since the computers has been around.
00:03:20.000 | And this guy, Harold Cohen, this is a recreation of his project.
00:03:24.000 | But basically he created these massive computers to draw on these huge canvas to create art similar to how a human would drive.
00:03:36.000 | And then we have computer graphics and generative graphics, things like that.
00:03:41.000 | Throughout the years, people tried to generate visuals and art using different computing technologies all along.
00:03:53.000 | Right after Sam Altman's tweet, playing field evened out really, really quickly.
00:03:59.000 | So Deli2 was April 6.
00:04:02.000 | Right after that, Midjourney released their initial model in beta as a Discord bot.
00:04:09.000 | And then very quickly after that, Stable Diffusion open sourced their model, which was a huge, huge thing.
00:04:17.000 | People now were able to run a technology similar to Deli2 in their homes, in their home GPUs.
00:04:23.000 | People started building services around it.
00:04:27.000 | And then STXL came out.
00:04:29.000 | And then now there's many different image models, open and closed source.
00:04:35.000 | And most recently, Flux was released early in the summer last year.
00:04:44.000 | And with all this playing field evening out, the marginal cost of creation is approaching zero.
00:04:53.000 | And I'm very careful when I choose my words here.
00:04:56.000 | I'm not saying marginal cost of creativity.
00:04:59.000 | It's marginal cost of creation.
00:05:01.000 | I think the storytelling is still really important.
00:05:05.000 | Creativity is still really important.
00:05:07.000 | But once you have that set up, creating that next new thing is approaching zero.
00:05:14.000 | And we believe this is going to have huge impacts on different kinds of industries and markets.
00:05:21.000 | So anything from social media, advertising, marketing, fashion, obviously film and movies, gaming and e-commerce is going to be transformed by generative media.
00:05:35.000 | And this transformation is going to continue until all content one way or the other is impacted by AI.
00:05:45.000 | So if you've been following, software has been eating media all along.
00:05:50.000 | YouTube, just from basically ads, is generating more revenue than any other media company except Disney.
00:05:59.000 | This is pretty remarkable.
00:06:01.000 | And with Disney revenue, there is clearly non-media revenue in there.
00:06:05.000 | They have parks, they have cruise ships, they have other things.
00:06:09.000 | So it's not too hard to say YouTube right now is one of the highest revenue generating media companies in the world.
00:06:15.000 | And it is happening through ads.
00:06:18.000 | And whenever ad industry is impacted by technology, it usually grows in volume.
00:06:25.000 | So we believe the same thing is going to happen with generative media and ads.
00:06:31.000 | We believe ad industry is going to be the first industries to be impacted at a large scale by generative media.
00:06:39.000 | We believe the size of the industry is going to increase.
00:06:43.000 | So it's really funny.
00:06:44.000 | Since 2000, every ad spend has been increased.
00:06:50.000 | So the industry grew three times since 2000.
00:06:54.000 | But all that growth happened in software ads.
00:06:58.000 | So we believe something similar is going to happen with AI-driven ads.
00:07:03.000 | And the ad industry is going to grow.
00:07:06.000 | And most of that growth is going to come from AI.
00:07:09.000 | And there are several different ways how this can happen.
00:07:14.000 | We believe ads themselves are hyper-personalized.
00:07:18.000 | So this might mean you are generating many different versions of the same ad,
00:07:23.000 | but maybe 10,000 different demographics really quickly.
00:07:27.000 | Or it can also mean it's targeted towards a certain individual.
00:07:32.000 | If you are coming from a certain website, then the ad can be generated on the fly.
00:07:36.000 | And then it could also be interactive in ways that I just mentioned.
00:07:43.000 | Things generated on the fly.
00:07:45.000 | And that might mean many different things in the industry.
00:07:49.000 | One other thing why I think generative media fits the ad industry very well is the abundance of content.
00:07:56.000 | For example, I probably won't watch a blockbuster movie every single day.
00:08:03.000 | So even if we have a thousand more movies this year, I probably have to sit down and watch a movie a day to go through all of them.
00:08:11.000 | But I probably won't be able to do that.
00:08:13.000 | But ads, there can be kind of unlimited content.
00:08:17.000 | Every time I'm grabbing my phone, I'm seeing ads.
00:08:20.000 | On TV, there are ads all the time.
00:08:22.000 | And it doesn't matter if the ad is different.
00:08:25.000 | Like maybe there needs to be some consistency.
00:08:28.000 | But ad industry can actually survive with a lot more content and things can get a lot creative.
00:08:38.000 | So we were ahead of the time a little bit.
00:08:42.000 | Last year, we did an ad promo with A24's Civil War movie.
00:08:48.000 | And it was one of those interactive ideas that I was talking about.
00:08:53.000 | So if you've seen the movie, it's about an imaginary civil war in the US.
00:09:00.000 | And they had this campaign of these little green toy soldiers.
00:09:04.000 | And they created a live marketing site where you could put a selfie.
00:09:09.000 | And then we created a little toy soldier with your selfie and your description.
00:09:16.000 | And they put this on Times Square.
00:09:18.000 | People were able to display their own faces on these little green toy soldiers.
00:09:23.000 | So AI is going to help us create experiences like this that are interactive and personalized.
00:09:30.000 | The other trend we are watching really closely is e-commerce.
00:09:36.000 | If you've been paying attention to it, e-commerce is growing about 1% every year.
00:09:43.000 | Getting a percentage of the US retail industry.
00:09:47.000 | So this is a trend that's happening with or without AI.
00:09:51.000 | And we believe generative media is going to play a big, big part on e-commerce's growth as well.
00:09:58.000 | And it's already there are many companies trying to define how people shop online.
00:10:07.000 | And because online shopping is very visual, AI can add a lot of interactivity to the experience.
00:10:14.000 | In fact, it's one of the earliest product market fits I've seen in generative media.
00:10:20.000 | This has been happening for a couple months, maybe a year.
00:10:23.000 | That virtual try-on is one of the clearest product market fits that I see in the AI industry.
00:10:31.000 | Many different retailers, e-commerce websites are adapting this technology.
00:10:36.000 | Many different startups are being built on it.
00:10:39.000 | So I believe this is going to be everywhere.
00:10:42.000 | Every retailer, every e-commerce website is a potential generative media user.
00:10:49.000 | And then there is video.
00:10:52.000 | So when Samatman tweeted Dali 2, I thought OpenAI was so far ahead and no one was able to catch up.
00:10:59.000 | People caught up incredibly fast.
00:11:02.000 | So this time, he did the same trick with Sora when Sora was released a year and a half ago, basically.
00:11:09.000 | And this time around, maybe Sora was even more impressive than Dali 2 in terms of how far ahead things look like.
00:11:18.000 | But this time around, I was incredibly excited that researchers at OpenAI was able to actually do things like this.
00:11:28.000 | And from the past experience, I know that if this is possible in one place, people are going to be able to do similar things in others.
00:11:38.000 | So I was incredibly excited when Samatman started tweeting about Sora because I knew very soon a technology like this was going to be everywhere.
00:11:48.000 | In fact, it started happening.
00:11:50.000 | So this is a little snapshot of our company's revenue, which I think is a good proxy of the entire market.
00:11:59.000 | Early this year in October, we barely had any video model usage in the platform.
00:12:06.000 | And in February, this this went all the way up to 18 percent.
00:12:10.000 | And I didn't get time to update it.
00:12:13.000 | But I looked yesterday.
00:12:15.000 | It's around 30 percent today.
00:12:17.000 | So it is growing really fast, even though it's expensive, even though it still doesn't work as well.
00:12:23.000 | Video models are going to completely take over the generative media market.
00:12:29.000 | And I have some predictions about how much bigger the video market is going to be compared to the image market.
00:12:37.000 | So rough math, but we believe video models are 20x more compute intensive.
00:12:44.000 | And let's say if it's 5x more engaging and it's going to impact more industries because it's going to be more useful to the industry.
00:12:53.000 | We believe all said and done, the video market is going to be generated.
00:12:58.000 | The video market is going to be 100x to 250x bigger than the image generation market.
00:13:05.000 | And we are just scratching the surface here.
00:13:09.000 | I believe the image generation market has a ton of growth.
00:13:13.000 | That's going to happen in the next couple of years as well.
00:13:16.000 | But video is growing much, much faster than that.
00:13:19.000 | And when all said and done, it's going to be a much bigger market.
00:13:24.000 | And yeah, video models are leveling up as well.
00:13:29.000 | You probably have all seen the newest model from DeepMind, from Google, VO3.
00:13:37.000 | We keep adding new capabilities into the video models.
00:13:42.000 | First, it was consistency.
00:13:44.000 | And then now with sound, really the things people are generating with it is incredible.
00:13:52.000 | And every time a new capability is added, it unlocks a different use case in the industry.
00:13:58.000 | So it's not on our platform yet, but I'm very curious to see how people are going to start creating using VO3
00:14:08.000 | and what different use cases it's going to unlock in the ad industry or the e-commerce industry.
00:14:15.000 | So that is very interesting to see.
00:14:18.000 | So where is the video market going?
00:14:23.000 | We believe there is so, so much to improve.
00:14:27.000 | We are going to have faster and cheaper video generation until video generation basically becomes real time.
00:14:36.000 | So generating one second of video in one second.
00:14:39.000 | So you'll be able to stream generated content to the user.
00:14:45.000 | And this is going to have very different implications on how people interact with this technology.
00:14:52.000 | Everything potentially becomes interactive.
00:14:56.000 | The line between games and movies gets blurred.
00:15:02.000 | So how is this going to impact social apps?
00:15:06.000 | How it's going to impact live events?
00:15:08.000 | People, like if you play Fortnite or similar games, people are already having live events there.
00:15:15.000 | Is it going to become more lifelike?
00:15:18.000 | Our parents, like you know, people who are not used to playing video games are going to be part of this experience.
00:15:27.000 | I'm really curious about the future of this technology.
00:15:31.000 | And then image models are not done yet as well.
00:15:37.000 | There's been a lot of different improvements in the past couple of months on the image models as well.
00:15:46.000 | Flux Context and GPT 4.0 introduced new editing capabilities, better text rendering capabilities.
00:15:55.000 | At one point, people thought, okay, maybe this is as good as image models are going to get.
00:16:01.000 | But with these new releases and new capabilities, it is opening up to more use cases in the industry.
00:16:10.000 | Whenever we see a technological shift like this happening, we see a lot of different, more mature players in the industry picking up these technologies.
00:16:22.000 | So we believe something similar is going to happen with Flux Context and GPT 4.0.
00:16:28.000 | And it's going to blend into more of the enterprise use cases people are trying to do.
00:16:37.000 | And then this is pretty much it.
00:16:41.000 | We are hiring.
00:16:43.000 | So please visit our website, fall.aii/careers.
00:16:48.000 | We are hiring machine learning engineers, inference engineers, product engineers, all sorts of positions.
00:16:55.000 | And I'll be hanging around the rest of the day today.
00:16:58.000 | So find me, talk to me.
00:17:00.000 | Would love to discuss whatever related to generative media or about the industry in general.
00:17:07.000 | Thank you so much.
00:17:08.000 | Thank you so much.
00:17:11.060 | We'll see you next time.