
Robots as Professional Chefs - Nikhil Abraham, CloudChef


Transcript

NIKHIL ABRAHAM: Hey, everyone, I'm Nikhil. I'm the co-founder and CEO of CloudChef. Today I'm going to tell you how we took a general-purpose robot that was not meant for cooking, just a robot with two hands, and put it through culinary school.

And it's now a professional chef that's working in various kitchens, doing real work like a chef does. Before we get into that, a quick thing about CloudChef. Our mission is to make high-quality, nutritious food affordable to everyone. And the only way we know how to do that at this point is by automating all commercial kitchen labor with what we call culinarily intelligent robots: robots that can sense, reason, and act in the real world like a chef.

You've probably all seen the Tesla Optimus dancing. I've seen it, too. And the immediate question that comes to mind is, maybe this is what a robot chef will be like: it's in the kitchen, working the equipment and whatnot, right? But it turns out those are a little too expensive.

They're not really there yet; there are lots of problems with humanoids. But on the other hand, if you look at form factors like these, they're also general purpose. They're basically just two hands on a mobile base that can move around and actually do all the work a regular chef would be able to do. And compared to humanoids, these are now way cheaper than human labor.

Humanoids, if you plot them on this curve, probably wouldn't even show up at this point, because of how unreliable they are and how much maintenance they require. But these wheeled robots with two hands? No problem. Way cheaper than a human, way cheaper than any human chef. What's missing is the software.

And what we did was take this robot, like I said, put it through culinary school, and now we have chef-like robot labor. So commercial facilities can hire this robot and pay it hourly wages, like $12 an hour. It always shows up. No overtime, no turnover, no calling in sick.

And, even better than a human, it plugs into any arbitrary novel kitchen. It learns new recipes from one expert demonstration, it's robust to ingredient variation and appliance variation, and it can cook arbitrary portion sizes. That's a task that's actually harder for humans to do, too.

When I say we put these robots through culinary school, what is culinary school for a robot? It needs to learn all the motion primitives that come naturally to human beings. How do you pick something up? How do you stir a pot? To do that, we have robot foundation models that we fine-tune.

We have teleoperation to fall back on for all the edge cases. But that's not enough. You still need the robot to understand food. Are the onions brown enough? If you're cooking steak, is it seared well enough? If you're cooking shrimp, is it shrinking enough, and can you sense when it's done?

And these are ingredients that vary seasonally and daily. Onions today might take seven minutes to saute; tomorrow they might take nine. We have thermal and visual embeddings that are specific to cooking, which help us reason through these unseen environments. And we've modeled recipes as state machines, with these embedding models at the core.
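The talk doesn't go into implementation detail, but a minimal sketch of that idea, with hypothetical names and a toy embedding function standing in for the actual cooking-specific models, could look something like this:

```python
from dataclasses import dataclass
from typing import List

import numpy as np


def embed(rgb: np.ndarray, thermal: np.ndarray) -> np.ndarray:
    """Toy stand-in for the cooking-specific embedding models (mean-pooled channels)."""
    return np.concatenate([rgb.mean(axis=(0, 1)), np.atleast_1d(thermal.mean())])


@dataclass
class RecipeStep:
    name: str                      # e.g. "saute onions until brown"
    target_embedding: np.ndarray   # taken from the expert demonstration
    done_threshold: float = 0.95   # similarity that counts as "this step is done"

    def is_done(self, emb: np.ndarray) -> bool:
        cos = emb @ self.target_embedding / (
            np.linalg.norm(emb) * np.linalg.norm(self.target_embedding))
        return float(cos) >= self.done_threshold


class RecipeStateMachine:
    """A recipe as a chain of perception-gated states rather than fixed timers."""

    def __init__(self, steps: List[RecipeStep]):
        self.steps, self.idx = steps, 0

    def update(self, rgb: np.ndarray, thermal: np.ndarray) -> str:
        emb = embed(rgb, thermal)
        if self.idx < len(self.steps) and self.steps[self.idx].is_done(emb):
            self.idx += 1          # advance only when the food looks and reads right
        return "done" if self.idx == len(self.steps) else self.steps[self.idx].name
```

The point of the state-machine framing is that transitions are gated by perception rather than by fixed timers, which is what lets the same recipe absorb the seven-minutes-today, nine-minutes-tomorrow variation mentioned above.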

And even after you have this, the next thing you need is for it to adapt to any new kitchen it has never seen before. It needs to be able to see a recipe once, understand what to do, interact with real humans in a workflow, and actually do the work.

So we put our culinary understanding to the test, and at this point we do better than even expert chefs in their cuisine of training. We evaluated on more than 1,000 recipes across a mix of cuisines. The task was: given live cooking data, an expert demonstration, and a text recipe, can you estimate where you are in the cooking process?

Can you track progress like a human being? Expert human chefs who get paid more than $150,000 a year still perform worse than our tiny perception model on this task. And, in fact, when we put state-of-the-art models like Gemini 2.5 or o3 on it, they perform way worse than our own models.

And that's partly because they don't have any thermal modality, and thermal data doesn't exist at internet scale. So we went and installed sensors in active commercial kitchens and collected data from hundreds of thousands of live cooked meals, in various kitchen environments, across different recipes, cuisines, and seasons.

So we collected this private data and trained a model on it, and we scraped a bunch of public data and trained some self-supervised models on that. A combination of the two is what our culinary system banks on, and it's what, like I said, is now way better than human chefs at decision-making during cooking.
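The talk doesn't say which self-supervised objective those models use; one common choice for this kind of pretraining, shown here purely as an illustrative assumption, is a contrastive loss over two augmented views of the same cooking frame:

```python
import torch
import torch.nn.functional as F


def contrastive_loss(z1: torch.Tensor, z2: torch.Tensor, temperature: float = 0.1) -> torch.Tensor:
    """SimCLR-style loss: z1[i] and z2[i] are embeddings of two augmentations of frame i."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    z = torch.cat([z1, z2], dim=0)              # (2B, dim)
    sim = z @ z.t() / temperature               # pairwise cosine similarities
    sim.fill_diagonal_(float("-inf"))           # a view is never its own positive
    n = z.shape[0]
    targets = torch.arange(n, device=z.device).roll(n // 2)  # positive = the other view
    return F.cross_entropy(sim, targets)
```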

Motor skills, on the other hand: it's not as good as a human yet, but it's getting there. We again put it through all these different evals. Sauteing, it's almost as fast as a human cook; picking and pouring, slightly slower; grilling, stirring, and so on. A whole bunch of evals on top of motor skills.

In fact, our system is right now about 95% autonomous and 5% teleoperated, and it's way faster and way more reliable than pure teleop or pure foundation models. The robot comes into a kitchen, like I said, watches a recipe once from a chef, and it's just able to do it.

So for example, here it's cooking a recipe from a two-Michelin-star chef based out of San Francisco. While it's cooking, it's looking at how the onions are browning, comparing that to how brown the onions were getting when the chef cooked it, and taking them to the right amount of brownness.
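A rough sketch of what "compare it to how brown the onions were when the chef cooked it" could look like mechanically, again with hypothetical names rather than the actual stack:

```python
import numpy as np


def doneness_vs_demo(current_emb: np.ndarray, demo_embs: np.ndarray) -> float:
    """How far along the live pan is relative to the chef's demonstration.

    demo_embs: (T, d) embeddings of the demonstration frames for this step, in time order.
    Returns a fraction in [0, 1]; 1.0 means the onions look the way the chef's did
    at the end of the browning step.
    """
    demo_norm = demo_embs / np.linalg.norm(demo_embs, axis=1, keepdims=True)
    cur_norm = current_emb / np.linalg.norm(current_emb)
    sims = demo_norm @ cur_norm                  # similarity to every demo frame
    best = int(np.argmax(sims))                  # the demo frame that looks most alike
    return best / (len(demo_embs) - 1)           # its position as a fraction of the step


# e.g. keep sauteing until doneness_vs_demo(...) >= 0.95, then move on
```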

It knows exactly what to do for the next recipe and where the ingredients are kept. It's not preprogrammed to know where the ingredients are or what kind of variation it will find; it does all that reasoning within the system itself. If we go further into the recipe, we'll see how it's cooking the chicken.

It's getting clean readings every few minutes, and at the end of it, I'll show you what happens: you have the actual dish. These are recipes that go into the stomachs of actual, real customers. The robot's cooking at various different facilities at this point.

It's deployed in the real world, being used in all these sorts of kitchens. On the right, you can see it cooking recipes. On the left, in our in-house kitchen, there's CCTV footage of the robot doing some operation.

I'm not even sure what it's doing; I just pulled it off the CCTV before getting on stage and put it up here. And this is video from a couple of months ago of the robot doing regular cooking like a human being. And outside of our own facilities, this is the robot working at one of our customers' facilities, doing chicken wings.

It fetches the chicken wings from a spot kept to the side and waits for the cook to be done. Then it collects the cooked chicken, puts it inside a bowl, sauces it, and mixes it like a human being. And while doing this, the robot is practically a weighing scale itself.

So it knows exactly how much of each ingredient it has put in, and it knows how much it has stirred. And yeah, so that's CloudChef. Like I said, we're hiring; we're a very small team, we're growing super fast, and we're looking for people in software, ML, and robotics.

If you know anyone, please reach out to me. My email address is nikio@cloudchef.co. Thank you. If any of you have questions, I'm happy to take them. So, for us, success means two things. One is, how good is the robot at understanding what's happening in the cooking process?

A very simple intuition for that is: if you give the entire cooking feed to a human being, and you give the entire cooking video and infrared feed to our system, which one estimates state better? Because once you have a cooked recipe, you can use it as labeled data: if the system predicts that this is 40% done, was it actually 40% done, or was it actually 50% done?
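A minimal sketch of that supervision signal, with made-up helper names: once a recreation has finished, every frame can be labeled in hindsight with the fraction of the cook that had elapsed, and a deliberately tiny, linear model is fit to those labels as a stand-in for the real one.

```python
import numpy as np


def progress_labels(frame_times_s: np.ndarray) -> np.ndarray:
    """Hindsight labels: fraction of the (now finished) cook elapsed at each frame."""
    return frame_times_s / frame_times_s[-1]


def fit_progress_model(frame_embeddings: np.ndarray, labels: np.ndarray) -> np.ndarray:
    """Least-squares regression of progress onto frame embeddings."""
    X = np.hstack([frame_embeddings, np.ones((len(frame_embeddings), 1))])  # add bias
    w, *_ = np.linalg.lstsq(X, labels, rcond=None)
    return w


def predict_progress(w: np.ndarray, embedding: np.ndarray) -> float:
    return float(np.append(embedding, 1.0) @ w)


# If the model said "40% done" at some frame, the hindsight label tells you whether
# it was really 40% or 50%, and that error is the training signal.
```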

That's a supervised learning signal we can get once we have data from recreations; any food recreation from any chef with thermal and RGB footage gives us that. The other part is motion: how fast the robot can do physical motions compared to a human being, which, like I said, is not as good as a human yet.

It's basically a data problem: the more data we get, the better and faster we get at doing any individual task inside the kitchen. Does that answer your question? Yeah, so for the end taste, the thing we realized is that no professional chef is cooking to a consistency that can be measured in any chemical way.

So the bar is not chemical-level consistency every single time; it's about getting consistency to a degree that's better than what a chef can do on a second attempt. A common benchmark we run is to have a chef cook a recipe once, have our robot recreate that recipe a couple of times, and then do blind taste tests.

Those are less scalable evals that we do in-house, which act as a higher signal that the end product we get is actually better than what chefs are able to do. No, it's basically just two hands on a mobile base, with some cameras and stuff on it.

It shows up at the kitchen, you interact with it like a human being, and that's the form factor. There are no additional screens, et cetera; those are just for the video's sake. It depends. Ideally humans don't need to, but today, in some deployments, humans do end up doing it.

But our idea is that, because we have data from all the different joint motors on the robot, the robot itself is a weighing scale. So when it picks something up, it already knows how heavy it is.
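A toy version of the robot-as-weighing-scale idea, under quasi-static assumptions and with the dynamics model treated as given (the talk doesn't describe the actual method):

```python
import numpy as np

G = 9.81  # m/s^2


def payload_mass_kg(tau_measured: np.ndarray,
                    tau_model_no_load: np.ndarray,
                    jacobian: np.ndarray) -> float:
    """Estimate the mass in the gripper from joint-torque residuals while holding still.

    tau_measured:       torques reported by the joint motors
    tau_model_no_load:  torques the dynamics model predicts with an empty gripper
    jacobian:           (6, n_joints) end-effector Jacobian at the current pose

    Quasi-static assumption: the residual torques are explained by an external wrench
    at the end effector, tau_ext = J^T @ F, so F = pinv(J^T) @ tau_ext and the vertical
    force component divided by g gives the payload mass.
    """
    tau_ext = tau_measured - tau_model_no_load
    wrench = np.linalg.pinv(jacobian.T) @ tau_ext   # [Fx, Fy, Fz, Mx, My, Mz]
    return float(-wrench[2] / G)                    # gravity pulls the payload down (-z)
```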

That's one thing we've worked on a lot. We're also able to work with arbitrary unseen appliances, because our sensing stack is so good. And almost all appliances inside kitchens are controlled with knobs, so the motion primitive the robot needs is knowing how to turn a knob, and our control systems take care of it from there.

Yes, that's how it's able to move automatically. So, right now, for most motions we're anywhere between 80% and 95% of the speed of a human being, and ideally there's nothing stopping robots from being even faster than human beings. It's mostly just a data problem.

Right now, the reason it's not as fast as a human is that the data we collect on these robots comes from human beings teleoperating the robot, and teleoperating a robot isn't as intuitive as moving your own body, so the teleoperators aren't as fast and the data is kind of slow.

Over time, with RL and so on, we expect it to get faster. Yeah. So there's nothing stopping the robot from doing that either; we just don't have the data. We haven't gone out and collected data for those tasks yet, so it's just something on our roadmap, and we're very open to it.

Where can we eat this? Oh, you can eat this in Palo Alto. If you're in San Francisco and you order from Wingstar, that's a customer of ours who uses it, so you'll get it from there. If you're in Palo Alto, you can order from India's top 20 and eat it from there as well.

Or if you're in Menlo Park, you can go to the high-end Indian restaurant called Elan, and some of the food there is also cooked by it. We've asked questions around chopping and food preparation and the speed of the robot, but in terms of throughput in the actual process, how much of that even matters?

How much of the energy throughout the day already goes into prep versus the other 80 or 90 percent? Does that matter? This is not a manufacturing facility. And when it comes to servicing, how much of the economic value is already taken care of because you have the teleoperator in the back making sure things are handled?

Have you found that meaningful, or is it not a big deal, essentially trivial at this point? Great question. The quick answer to that is that about 50% of the labor cost inside any kitchen is line-cooking labor.

And that's where we're going first. The advantage there, I mean, speed does play a factor, but there's another variable we have in our control, which is that we're able to speed recipes up more than any human being can, because we know exactly what's happening. We've had several instances where we've recorded recipes.

We've observed the chef in motion and realized, oh, this process that takes them 20 minutes can actually be done in 14 minutes. So even if the robot is, like, 10% slower, it doesn't really matter. That's how it works. Yeah. And unlike a human being who, after working 40 hours a week, goes into overtime territory, the robot can work all

168 hours. There's nothing stopping the robot from working 24/7. The practical constraint is that most facilities don't operate 24 hours, so the robot will operate as long as the facility is operating. And there are some tasks you can do overnight; once we get into cutting, chopping, et cetera, the robot will just be doing that overnight before the actual orders come in.

Sorry, mine is kind of related to the one before. Did you find new bottlenecks, things like dishwashing or cross-contamination, that you maybe weren't expecting to deal with in this process? So dishwashing, et cetera, not that much. And for things like cross-contamination, we just put small gloves on the robot, and our customers switch those out every day.

They're washable, small silicone covers; I can pull up a video of that. But basically that's how we take care of it. And things like dishwashing are not tasks we're envisioning doing in the short term; we want to do more of the tasks that actually add to the quality of the food that's being put out.

So that's why we're mostly focused on line cooking for now, and maybe sometime later on prepping, chopping, et cetera. Last question. Oh, yeah. Sorry. So it learns from chefs, right? The recipes come from chefs. Is it able to modify the steps of a recipe to cook things faster?

So that's still in an experimental phase. There are cuisines where we can do this really well, but we aren't yet able to do it across all cuisines. For cuisines where the thermodynamic modeling of what's happening in the process is straightforward, it's much easier to speed recipes up, do minor variations, et cetera.
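As a purely illustrative example of why a thermodynamic model makes speed-ups possible (not the actual model), a lumped Newton-heating approximation lets you compute how much a hotter pan shortens time-to-doneness:

```python
import math


def time_to_temp_min(t_pan_c: float, t_food0_c: float = 20.0,
                     t_target_c: float = 74.0, k_per_min: float = 0.15) -> float:
    """Minutes for the food to reach t_target_c under Newton's law of heating.

    T(t) = T_pan + (T0 - T_pan) * exp(-k * t); k_per_min is a made-up heat-transfer
    constant, and real values depend on the pan, the food, the portion size, etc.
    """
    return -math.log((t_target_c - t_pan_c) / (t_food0_c - t_pan_c)) / k_per_min


print(round(time_to_temp_min(t_pan_c=120.0), 1))  # ~5.2 min at a gentler setting
print(round(time_to_temp_min(t_pan_c=160.0), 1))  # ~3.2 min at a hotter setting
```

If the model says the same internal-temperature target is reached sooner at a higher setting, the recipe step can be shortened; where the thermodynamics are messier, that inference is harder, which matches the caveat that follows.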

And there are some cases where it's not that easy. It's still experimental territory; we're working on it. In the current version, it alerts somebody in the facility that the robot needs ingredients to work, and they take care of it.

Hopefully, once there are enough robots in the facility, they'll just talk to each other. Thank you so much.