Using AI to Build an Infinite Game: Jeff Schomay

Hi, everyone. I'm Jeff Shoumei, and I want to share with you an interesting generative AI project that I recently did. Not too long ago, I made a game with 100% AI-generated content. It's a simple game where you're wandering around lost in the forest, and you go from scene to scene having encounters that impact your vigor and your courage.

And the idea is that you want to find your home before you run out of courage. There's 16 scenes in a 4x4 grid, and so if you play a few times, you will have seen them all. Now, my favorite part of making this game was generating each scene and just seeing what AI would come up with.

And I thought, wouldn't it be cool to share that experience with the player? What if every time they went to a new scene, it was generated fresh for them, and every game would be unique and different this way? It would be a game of infinite exploration. That sounded so cool that I wanted to try to do it.

Now, the first thing that I would need to do is to generate each scene and have a consistent way of doing that. My scene definitions are JSON objects that describe what the scene is when you first find it, as well as when you come back to it later, and how that impacts your stats.

So I started out by using OpenAI's completion endpoint and doing some prompt engineering. This is the prompt that I used. This is a very detailed prompt. It's rather long, but it worked really well. Most of the time, I would get scenes that had the right JSON format and the content was good.

It was fitting. It was varied. It was interesting. So I was happy with this. But I wanted to make it even more reliable. And I decided to fine-tune a model. I used OpenAI's fine-tuning endpoint, and they recommend 50 to 100 examples. I generated 50 examples, just like these, and used them to fine-tune.

Now, the key is I shortened the prompt. I simplified it. I took out any of the JSON and just generally described what I wanted, hoping that that information would be embedded in the training data. And I tried this out. I wasn't sure if it would work. And I tried it.

It only cost about a dollar or two. That includes generating all the examples and doing the fine-tuning. And when I tried it, I was very happy to find that it worked perfectly. Even though I didn't mention the JSON at all, it came out perfect because of what was in the examples.

And that meant I had less tokens in the prompt, which is faster and cheaper and just easier to work with. So I was really pleased with how this worked. The next step was to make the images. Now, I used a tool called Leonardo. Leonardo not only lets you generate images.

They also let you create your own image models. And this is great for a game because it means that you can have stylistically consistent images, which is exactly what I needed. So I spent a while using all the different parameters that Leonardo offers and working with the prompt to try and find an image that looked right and that I liked.

It turned out that using the description directly from the scene as the prompt made nice pictures, which I was surprised about since it had like second person and said things other than what was in there, but it worked out great. Now, the tricky part with fine-tuning an image model is that you need consistent images that have like the parts that should be the same are the same in all of your training data, but the parts that you want to vary need to be varied.

Otherwise, it will overfit and all of your images will look the same. But if you don't have that consistency between them, then it won't really know what you want and you won't get that good stylistic consistency. This was really tricky, especially in my case. I needed the perspective and the scale to be consistent from scene to scene.

Obviously, I needed them all to be set in the forest and I wanted to have this overall tone and texture that looked the same. Some of my scenes have people in them, some have animals, some have buildings, some have nothing, and so it was hard to get that variety.

I ended up having to train a couple of models with different parameters, different sets of images, but I eventually found one that worked out. And to test it out, I generated a lot of images. I mean, a whole bunch. And you can see they all have similar features like the zigzag path down the middle.

Obviously, the trees and look and everything looks the same. And yet, there's plenty of variety. Each one is unique and different, but still feels cohesive, which I am very pleased about. So now I had everything I needed to put it together and make the game. I made a simple asset server that had an AI pipeline, starting by requesting a new scene from OpenAI's endpoint using my custom model.

Once I get that, I validate the JSON to make sure that it's got all the keys it needs. If it's good, I take the description and I send that to Leonardo. Leonardo makes an image from my custom model, gives it back to me. I put it all together and send it off.

Now, did this work? Well, let me show you. Here is an example scene that was created. And I'm very happy with it. I made a simple preview server so that I could scroll through a bunch of these scenes that I generated to make sure they worked. And it looked good.

So I made some changes to the game to request images each time the player went to a new scene. Now there was a problem here. It takes 10, 20, sometimes 30 seconds to do this, and that wouldn't be good for the play experience. So what I did is I added some caching.

I pre-fill a bunch of these scenes, and then as scenes are taken out of it, I fill it back up again once it gets below a certain threshold. And that way, there's always a scene that's ready to go. With that, the game was ready, and I'm going to share it with you right now.

Now keep in mind, everything that we see has never been seen before and will never be seen again. So this is the game. You always start out at this lamppost, and you have to wander around and find your way home. Your stats are in the bottom left corner. As your vigor goes down, your speed goes down as well.

And as the courage goes down, the viewport will get smaller and smaller. Let's look around and explore. We're going to move down. Here's the first generated scene. This looks really cool. This is like you encounter a soft blue pulsating light coming from the organic formation scattered around the glade.

Your fear and tiredness lift, and you feel rejuvenated, and the vigor goes up, but I'm already at full. So that's really cool. Let's head off in this direction now. I won't read all of these, but this looks like a cool campfire scene, which is really neat. And I'm going to head down.

And what have we got here? There's a large dark cave over here at the end of the path somewhere, and it's daunting, so my courage is going down. Let's head this way instead. And now we've gotten into some fog, foggy trees, and... hard to see. Let's go back. This is like a really windy road that we're going through.

Let's head down. Oh, I'm back where I started. Well, this is the game, and it would continue on and on and on until you find your way home, and then you can just play again, and it would be different every time. That's great. I just have a few closing thoughts.

One thing is that these images are low resolution. They're 512 pixels, and I could make them a higher resolution by adding an AI upscaler to my pipeline. It would add more time, so it's a trade-off. Also, I could get more creative with adding something to the prompt to make a scene.

For example, I could let the user select a theme, or maybe even get the time of day or the current weather at the location of where the user is set, and then the scenes could be generated to match where they are for a very immersive experience. And of course, I can use this same process on other projects.

That's all. I hope that you found this interesting and enjoyed watching it as much as I enjoyed putting it all together. Thank you so much.

Using AI to Build an Infinite Game: Jeff Schomay

Chapters

Transcript