Stanford CS25: V3 | Beyond LLMs: Agents, Emergent Abilities, Intermediate-Guided Reasoning, BabyLM
00:00:00.000 |
So today we're going to give an instructor-led lecture talking about some of the key 00:00:10.640 |
topics in transformers and LLMs these days. In particular Div will be talking about agents 00:00:15.760 |
and I'll be discussing emergent abilities, intermediate guided reasoning, as well as 00:00:19.960 |
BabyLM. So let me actually start with my part, because Div is not here yet. So I'm sure many 00:00:32.720 |
of you have read this paper "Emergent Abilities of Large Language Models" from 2022. So I'll 00:00:39.040 |
briefly go through some of them. So basically an ability is emergent if it is present in 00:00:45.480 |
large but not smaller models, and it would not have been directly predicted by extrapolating 00:00:50.400 |
performance from smaller models. So you can think of performance, it's basically near 00:00:55.500 |
random until a certain threshold called a critical threshold, and then improves very 00:00:59.760 |
heavily. This is known as a phase transition, and again, it would not have been extrapolated 00:01:04.440 |
or predicted if you were to extend the curve of the performance of smaller models. It's 00:01:09.680 |
more of a jump which we'll see later. So here's an example of few-shot prompting for many 00:01:15.200 |
different tasks. For example, modular arithmetic, unscrambling words, different QA tasks, and 00:01:21.440 |
so forth. And you'll see that performance kind of jumps very heavily up until a certain 00:01:27.360 |
point. I believe the x-axis here is the number of training flops, which corresponds to basically 00:01:32.680 |
model scale. So you'll see in many cases around 10 to the 22 or 10 to the 23 training flops, 00:01:39.400 |
there's a massive exponential jump or increase in terms of model performance on these tasks, 00:01:47.640 |
which was not present on smaller scales. So it's quite unpredictable. And here are some 00:01:54.960 |
examples of this occurring using augmented prompting strategies. So I'll be talking a 00:01:59.720 |
bit later about chain of thought. But basically, these strategies improve the ability to elicit 00:02:06.640 |
certain behaviors from models on different tasks. So you see, for example, with chain of thought 00:02:12.120 |
reasoning, that's an emergent behavior that happens again around 10 to the 22 training 00:02:16.800 |
flops. And without it, model performance on GSM8K, which is a mathematics benchmark, 00:02:24.600 |
doesn't really improve much. But chain of thought kind of leads to that emergent 00:02:29.360 |
behavior or sudden increase in performance. And here's just the table from the paper, 00:02:35.480 |
which has a bigger list of emergent abilities of LLMs, as well as their scale at which they 00:02:41.660 |
occur. So I recommend that you check out the paper to learn a bit more. And so one thing 00:02:47.200 |
researchers have been wondering is, why does this emergence occur exactly? And even now, 00:02:52.440 |
there are few explanations for why that happens. And the authors also found that the evaluation 00:02:57.440 |
metrics used to measure these abilities may not fully explain why they emerge, and suggest 00:03:02.200 |
some alternative evaluation metrics, which I encourage you to read more about in the paper. 00:03:08.460 |
So other than scaling up to encourage these emergent abilities, which could endow even 00:03:14.700 |
larger LLMs with further new emergent abilities, what else can be done? Well, things like investigating 00:03:21.520 |
new architectures, higher quality data, which is very important for performance on all tasks, 00:03:28.200 |
and improved training procedures could enable emergent abilities to occur, 00:03:32.600 |
especially on smaller models, which is a current growing area of research, which I'll also 00:03:39.400 |
talk about a bit more later. Other directions include potentially improving the few-shot 00:03:44.760 |
prompting abilities of LLMs, theoretical and interpretability research, again, to try to 00:03:50.600 |
understand why emergent abilities arise and how we can maybe leverage them further, 00:03:58.200 |
as well as maybe some computational linguistics work. 00:04:02.240 |
So with these large models and emergent abilities, there's also risks, right? There's potential 00:04:07.480 |
societal risks, for example, around truthfulness, bias, and toxicity. Emergent abilities 00:04:15.400 |
incentivize further scaling up of language models, for example, up to GPT-4 size or further. 00:04:21.160 |
At the same time, this may lead to increased bias, as well as toxicity and the memorization of 00:04:27.800 |
training data, which these larger models are more prone to. And there's 00:04:34.280 |
potential risks in future language models that have also not been discovered yet. So 00:04:38.800 |
it's important that we approach this in a safe manner as well. 00:04:45.440 |
And of course, emergent abilities and larger models have also led to sociological changes, 00:04:51.200 |
changes in the community's views and use of these models. Most importantly, it's led to 00:04:55.800 |
the development of general purpose models, which perform well on a wide range of tasks, not 00:05:01.320 |
just the particular tasks they were trained for. For example, when you think of ChatGPT, GPT-3.5, 00:05:06.560 |
as well as GPT-4, these are more general purpose models, which work well across the board, 00:05:11.840 |
and can then be further adapted to different use cases, mainly through in-context learning, prompting, 00:05:19.200 |
and so forth. This has also led to new applications of language models outside of NLP. For example, 00:05:25.240 |
they're being used a lot now for text-to-image generation. The text encoder of those 00:05:31.200 |
text-to-image models is basically a transformer or large language model, as well as 00:05:36.440 |
things like robotics and so forth. So you'll know that earlier this quarter, Jim Fan gave 00:05:42.560 |
a talk about how they're using GPT-4 and so forth in Minecraft and for robotics work, 00:05:48.640 |
as well as long-horizon tasks for robotics. And yeah, so basically, in general, it's led 00:05:54.980 |
to a shift in the NLP community towards general-purpose, rather than task-specific, 00:06:00.080 |
models. And as I kind of stated earlier, some directions for future work include 00:06:07.480 |
further model scaling, although I believe that we will soon probably be reaching a limit 00:06:14.560 |
or point of diminishing returns with just more model scale, improved model architectures 00:06:19.400 |
and training methods, data scaling. So I also believe that data quality is of high importance, 00:06:27.560 |
possibly even more important than the model scale and the model itself. Better techniques 00:06:33.600 |
for, and understanding of, prompting, as well as exploring and enabling performance on frontier 00:06:38.900 |
tasks that current models are not able to perform well on. So GPT-4 kind of pushed the 00:06:44.640 |
limit of this, it's able to perform well on many more tasks. But studies have shown that 00:06:50.040 |
it still struggles with even some more basic sorts of reasoning, such as analogical and common sense 00:06:55.920 |
reasoning. So I just had some questions here. I'm not sure how much time we have to address them. 00:07:04.360 |
So for the first one, like I said, emergent abilities, I think will arise to a certain 00:07:09.280 |
point, but there will be a limit or point of diminishing returns as model scale as well 00:07:15.680 |
as data scale rise, because I believe at some point there will be overfitting. And 00:07:20.120 |
there's only so much you can learn from all data on the web. So I believe that more creative 00:07:27.360 |
approaches will be necessary after a certain point, which kind of also addresses the second 00:07:33.600 |
question. Right, so I will move on. If anybody has any questions, also feel free to interrupt 00:07:44.400 |
at any time. So the second thing I'll be talking about is this thing I called intermediate 00:07:49.520 |
guided reasoning. So I don't think this is actually a term. It's typically called chain 00:07:53.960 |
of thought reasoning, but it's not just chains now being used. So I wanted to give it a more 00:08:01.320 |
broad title. So I called it intermediate guided reasoning. So this was inspired by this work, 00:08:07.440 |
also by my friend Jason, who was at Google, now at OpenAI, called chain-of-thought reasoning, 00:08:12.200 |
or CoT. This is basically a series of intermediate reasoning steps, which has been shown to improve 00:08:18.680 |
LLM performance, especially on more complex reasoning tasks. It's inspired by the human 00:08:24.280 |
thought process, which is to decompose many problems into multi-step problems. For example, 00:08:30.560 |
when you're solving math questions on an exam, you don't just 00:08:35.240 |
go to the final answer, you kind of write out your steps. Even when you're just thinking 00:08:39.360 |
through things, you kind of break it down into a piecewise or step-by-step fashion, 00:08:44.560 |
which allows you to typically arrive at a more accurate final answer and more easily 00:08:50.420 |
arrive at the final answer in the first place. Another advantage is this provides an interpretable 00:08:56.000 |
window into the behavior of the model. You can see exactly how it arrived at an answer. 00:09:01.080 |
And if it did so incorrectly, where in its reasoning path it kind of goes wrong 00:09:05.800 |
or starts going down an incorrect path of reasoning, basically. And it basically exploits 00:09:12.760 |
the fact that deep down in the model's weights, it knows more about the problem than simply 00:09:17.340 |
prompting it for a direct answer would reveal. So here's an example where on the left side, you can 00:09:21.960 |
see there's standard prompting. You ask it a math question and it just simply gives you 00:09:27.480 |
an answer. Whereas on the right, you actually break it down step-by-step. You kind of get 00:09:32.080 |
it to show its steps to solve the mathematical word problem step-by-step. And you'll see 00:09:37.880 |
here that it actually gets the right answer, unlike standard prompting. 00:09:45.000 |
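To make the contrast concrete, here is a minimal sketch of the two prompting styles in Python. The `generate` function is a hypothetical stand-in for whatever LLM API you use; the only difference between the two prompts is whether the worked exemplar shows its reasoning steps.

```python
# Minimal sketch: standard vs. chain-of-thought few-shot prompting.
# `generate` is a hypothetical stand-in for a real LLM completion call.

EXEMPLAR_Q = ("Roger has 5 tennis balls. He buys 2 more cans of 3 tennis balls "
              "each. How many tennis balls does he have now?")

STANDARD_EXEMPLAR = f"Q: {EXEMPLAR_Q}\nA: The answer is 11."

COT_EXEMPLAR = (
    f"Q: {EXEMPLAR_Q}\n"
    "A: Roger started with 5 balls. 2 cans of 3 balls each is 6 balls. "
    "5 + 6 = 11. The answer is 11."
)

def build_prompt(exemplar: str, question: str) -> str:
    """Prepend one worked exemplar, then ask the new question."""
    return f"{exemplar}\n\nQ: {question}\nA:"

def generate(prompt: str) -> str:
    raise NotImplementedError  # plug in your model call here

question = "The cafeteria had 23 apples. They used 20 and bought 6 more. How many do they have?"
standard_prompt = build_prompt(STANDARD_EXEMPLAR, question)  # model tends to answer directly
cot_prompt = build_prompt(COT_EXEMPLAR, question)            # model tends to write out its steps first
```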
So there's many different ways we can potentially improve chain of thought reasoning. In particular, 00:09:51.320 |
it's also an emergent behavior that results in performance gains for larger language models. 00:09:57.100 |
But still, even in larger models, there's still a non-negligible fraction of errors. 00:10:02.760 |
These come from calculator errors, symbol mapping errors, one-step-missing errors, as 00:10:07.680 |
well as bigger errors due to larger semantic understanding issues and generally incoherent 00:10:13.380 |
chains of thought. And we can potentially investigate methods to address these. 00:10:19.260 |
So as I said, chain of thought mainly works for huge models of approximately 100 billion 00:10:24.040 |
parameters or more. And there are three potential reasons it does not work very well for smaller 00:10:30.340 |
models: smaller models are fundamentally more limited. They fail at even 00:10:36.460 |
relatively easy symbol mapping tasks, as well as arithmetic tasks; they inherently 00:10:41.840 |
do math less effectively; and they often have logical loopholes and just 00:10:46.980 |
never arrive at a final answer. For example, it goes on and on. It's like an infinite loop 00:10:52.140 |
of logic that never actually converges anywhere. So if we're able to potentially improve chain 00:10:58.220 |
of thought for smaller models, this could provide significant value to the research 00:11:02.580 |
community. Another thing is to potentially generalize 00:11:07.060 |
it. Right now, chain of thought has a more rigid definition and format. It's very step-by-step, 00:11:12.540 |
very concrete and defined. As a result, its advantages are for particular domains and 00:11:17.940 |
types of questions. For example, the task usually must be challenging and require multi-step 00:11:23.580 |
reasoning. And it typically works better for things like arithmetic and not so much for 00:11:27.780 |
things like response generation, QA, and so forth. And furthermore, it works better for 00:11:35.220 |
problems or tasks that have a relatively flat scaling curve. Whereas when you think of humans, 00:11:41.380 |
we think through different types of problems in multiple different ways. Our quote-unquote 00:11:46.380 |
scratch pad that we use to think about and arrive at a final answer for a problem, it's 00:11:51.020 |
more flexible and open to different reasoning structures compared to such a rigid step-by-step 00:11:56.420 |
format. So hence, we can maybe potentially generalize chain of thought to be more flexible 00:12:01.540 |
and work for more types of problems. So now I'll briefly discuss some alternative 00:12:07.380 |
or extension works to chain of thought. One is called tree of thought. This basically 00:12:12.180 |
is more like a tree, which considers multiple different reasoning paths. It also has the 00:12:16.620 |
ability to look ahead and sort of backtrack and then go on other areas or other branches 00:12:23.020 |
of the tree as necessary. So this leads to more flexibility and it's shown to improve 00:12:29.300 |
performance on different tasks, including arithmetic tasks. There's also this work by 00:12:35.580 |
my friend called Socratic Questioning. It's sort of a divide and conquer fashion algorithm 00:12:42.740 |
simulating the recursive thinking process of humans. So it uses a large scale language 00:12:47.100 |
model to kind of propose sub-problems given a more complicated original problem. And just 00:12:54.060 |
like tree of thought, it also has recursive backtracking and so forth. And the purpose 00:12:59.060 |
is to answer all the sub-problems and kind of go in an upwards fashion to arrive at a 00:13:06.220 |
final answer to the original problem. There's also this line of work which kind 00:13:11.980 |
of actually uses code as well as programs to help arrive at a final answer. For example, 00:13:20.220 |
program-aided language models. It generates intermediate reasoning steps in the form of 00:13:24.140 |
code which is then offloaded to a runtime such as a Python interpreter. And the point 00:13:29.540 |
here is to decompose the natural language problem into runnable steps. So hence the 00:13:35.460 |
amount of work for the large language model is lower. Its purpose now is simply to learn 00:13:41.020 |
how to decompose the natural language problem into those runnable steps. And these steps 00:13:45.340 |
themselves are then fed to, for example, a Python interpreter in order to solve them. 00:13:52.420 |
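As a rough sketch of that offloading idea, the loop below asks a hypothetical `llm_generate` call to emit Python that stores its result in a variable named `answer`, then lets the interpreter do the arithmetic; the hard-coded model output is only there so the example runs on its own.

```python
# Sketch of program-aided reasoning: the LLM writes code, the Python runtime
# does the arithmetic. `llm_generate` is a hypothetical model call.

PROMPT_TEMPLATE = (
    "Write Python code that solves the following word problem. "
    "Store the final numeric result in a variable named `answer`.\n\n"
    "Problem: {problem}\n\nCode:\n"
)

def llm_generate(prompt: str) -> str:
    # Placeholder for a real model call; hard-coded with a plausible completion
    # for the example problem below so the sketch is self-contained.
    return ("muffins_baked = 4 * 12\n"
            "muffins_eaten = 7\n"
            "answer = muffins_baked - muffins_eaten")

def solve_with_program(problem: str):
    code = llm_generate(PROMPT_TEMPLATE.format(problem=problem))
    namespace: dict = {}
    # NOTE: exec'ing model output is unsafe outside a sandbox; this is only a sketch.
    exec(code, {}, namespace)
    return namespace["answer"]

print(solve_with_program("Sara baked 4 dozen muffins and ate 7. How many are left?"))  # -> 41
```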
And program-of-thoughts, or PoT, is very similar to this in that it kind of breaks 00:13:57.980 |
the problem down into step-by-step code instead of natural language, which is then executed 00:14:04.100 |
by an actual code interpreter or program. So this again works well for many 00:14:11.380 |
sorts of tasks, for example, things like arithmetic. As you see, those are kind 00:14:21.580 |
of both of the examples for both of these papers. And just like what I said earlier, 00:14:27.040 |
these also do not work very well for things like response generation, open-ended question 00:14:32.180 |
answering, and so forth. And there's other work, for example, Faith 00:14:36.740 |
and Fate. This actually breaks down problems into sub-steps in the form of computation 00:14:41.980 |
graphs which they show also works well for things like arithmetic. So you see that there's 00:14:46.740 |
a trend here of this sort of intermediate guided reasoning working very well for mathematical 00:14:51.980 |
as well as logical problems, but not so much for other things. 00:14:56.820 |
So again, I encourage you guys to maybe check out the original papers if you want to learn 00:15:01.220 |
more. There's a lot of interesting work in this area these days. And I'll also be posting 00:15:07.500 |
these slides as well as sending them. We'll probably post them on the website as well 00:15:12.260 |
as Discord. But I'll also send them through an email later. 00:15:18.060 |
So very lastly, I want to touch upon this thing called the BabyLM Challenge, or Baby 00:15:22.020 |
Language Model. So like I said earlier, I think at some point, scale will reach a point 00:15:28.460 |
of diminishing returns, as well as the fact that further scale comes with many challenges. 00:15:33.220 |
For example, it takes a long time and costs a lot of money to train these big models. 00:15:39.320 |
And they cannot really be used by individuals who are not at huge companies with hundreds 00:15:44.920 |
or thousands of GPUs and millions of dollars, right? 00:15:48.680 |
So there's this thing, this challenge called BabyLM, or Baby Language Model, which is 00:15:53.100 |
attempting to train language models, particularly smaller ones, on the same amount of linguistic 00:15:59.500 |
data available to a child. So data sets have grown by orders of magnitude, as well as, 00:16:06.140 |
of course, model size. For example, Chinchilla sees approximately 1.4 trillion words during 00:16:11.860 |
training. This is around 10,000 words for every one word that a 13-year-old child on 00:16:16.700 |
average has heard as they grow up or develop. So the purpose here is, you know, can we close 00:16:22.220 |
this gap? Can we train smaller models on lower amounts of data, while hopefully still attempting 00:16:31.000 |
to get the performance of these much larger models? 00:16:36.740 |
So basically, we're trying to focus on optimizing pre-training, given data limitations inspired 00:16:41.900 |
by human development. And this will also ensure that research is possible for more individuals, 00:16:49.100 |
as well as labs, and potentially possible on a university budget, as it seems now that 00:16:55.300 |
a lot of research is kind of restricted to large companies, which I said, have a lot 00:16:59.820 |
of resources as well as money. So again, why BabyLM? Well, it can greatly improve the efficiency 00:17:06.440 |
of training as well as using larger language models. It can potentially open up new doors 00:17:11.560 |
and potential use cases. It can lead to improved interpretability as well as alignment. Smaller 00:17:18.740 |
models would be easier to control, align, as well as interpret what exactly is going 00:17:23.060 |
on, compared to incredibly large LLMs, which are basically huge black boxes. This will 00:17:29.460 |
again potentially lead to enhanced open source availability. For example, large language 00:17:34.500 |
models runnable on consumer PCs, as well as by smaller labs and companies. The techniques 00:17:41.780 |
discovered here can also possibly be applied to larger scales. And further, this may lead 00:17:47.820 |
to a greater understanding of the cognitive models of humans and how exactly we are able 00:17:52.060 |
to learn language much more efficiently than these large language models. So there may 00:17:57.340 |
be a flow of knowledge from cognitive science and psychology to NLP and machine learning, 00:18:01.820 |
but also in the other direction. So briefly, the BabyLM training data that the authors 00:18:08.420 |
of this challenge provide, it's a developmentally inspired pre-training data set, which has 00:18:16.420 |
under 100 million words, because children are exposed to approximately two to seven 00:18:20.860 |
million words per year as they grow up. Up to the age of 13, that's approximately 90 00:18:25.860 |
million words, so they round up to 100. It's mostly transcribed speech, and their motivation 00:18:31.020 |
there is that most of the input to children is spoken. And thus, their data set focuses 00:18:36.140 |
on transcribed speech. It's also mixed domain, because children are typically exposed to 00:18:41.460 |
a variety of language or speech from different domains. So it has child-directed speech, 00:18:48.780 |
written subtitles, which are subtitles of movies, TV shows, and so forth. Simple children's 00:18:55.220 |
books, which contain stories that children would likely hear as they're growing up. But 00:19:01.100 |
it also has some Wikipedia, as well as simple Wikipedia. And here are just some examples 00:19:05.620 |
of child-directed speech, children's stories, Wikipedia, and so forth. So that's it for 00:19:15.380 |
my portion of the presentation, and I'll hand it off to Div, who will talk a bit about AI agents. 00:19:23.380 |
Yeah, so, like, everyone must have seen, like, there's this, like, new trend where, like, 00:19:35.380 |
everything is transitioning to more, like, agents. That's, like, the new hot thing. And 00:19:40.140 |
we're seeing this, like, people are going more from, like, language models to, like, 00:19:43.220 |
now building AI agents. And then what's the biggest difference? Like, why agents, why 00:19:47.700 |
not just, like, why just not train, like, a big, large-language model? And I would sort 00:19:52.580 |
of, like, go into, like, why, what's the difference? And then also discuss a bunch of things, such 00:19:58.380 |
as, like, how can you use agents for doing actions? How can you, what are some emergent 00:20:04.140 |
architectures? How can you, sort of, like, build human-like agents? How can you use it 00:20:09.700 |
for computer interactions? How do you solve problems from long-term memory, personalization? 00:20:14.300 |
And there's a lot of, like, other things you can do, which is, like, multi-agent communication, 00:20:17.780 |
and there are some future directions. So we'll try to cover as much as I can. So first, let's 00:20:23.220 |
talk about, like, why should we even build AI agents, right? And so, 00:20:30.820 |
there's a key thesis, which is that humans will communicate with AI using natural language, 00:20:37.780 |
and AI will be operating all the machines, thus allowing for more intuitive and efficient 00:20:42.780 |
operations. So right now, what happens is, like, me as a human, I'm, like, directly, 00:20:47.540 |
like, using my computer, I'm using my phone, but it's really inefficient. Like, we are 00:20:52.140 |
not optimized by nature to be able to do that. We are actually really, really bad at this. 00:20:57.300 |
But if you can just, like, talk to an AI, just, like, with language, and the AI is just 00:21:01.980 |
really good, such that it can just do this at, like, super-fast, like, 100x 00:21:05.980 |
speeds compared to a human. And that's going to happen. And I think that's the future of 00:21:10.460 |
how things are going to evolve in the next five years. And I sort of, like, call this, 00:21:15.300 |
like, software 3.0. I have a blog post about this that you can read if you want to, where 00:21:19.820 |
the idea is, like, you can think of a large-language model as a computing chip, in a sense. So 00:21:25.460 |
similar to, like, a chip that's powering, like, a whole system, and then you can build systems around it. 00:25:36.320 |
So why do we need agents? So usually, like, a single call to a large-language model is 00:21:41.260 |
not enough. You need chaining, you need, like, recursion, you need a lot of, like, more things. 00:21:45.540 |
And that's why you want to build systems, not, like, just, like, a single monolith. 00:21:50.300 |
Second is, like, yeah, so how do we do this? So we do a lot of techniques, especially around, 00:21:54.220 |
like, multiple calls to a model. And there's a lot of ingredients involved here. And I 00:21:58.420 |
will say, like, building, like, an agent is very similar to, like, maybe, like, thinking 00:22:01.880 |
about building a computer. So, like, the LLM is, like, a CPU. So you have a CPU, but now 00:22:06.540 |
you want to, like, sort of, like, solve the problems. Like, okay, like, how do you put 00:22:09.420 |
RAM? How do you put memory? How do I do, like, actions? How do I build, like, an interface? 00:22:14.460 |
How do I get internet access? How do I personalize it to the user? So this is, like, almost like 00:22:19.660 |
you're trying to build a computer. And that's what makes it, like, a really hard problem. 00:22:26.780 |
And this is, like, an example of, like, a general architecture for agents. This is from 00:22:32.520 |
Lilian Weng, who's, like, a researcher at OpenAI. And, like, you can imagine, like, 00:22:38.060 |
an agent has a lot of ingredients. So you want to have memory, which will be short-term, 00:22:42.300 |
long-term. You have tools, which could be, like, go and, like, use, like, classical tools 00:22:46.060 |
like a calculator, calendar, code interpreter, et cetera. You want to have some sort of, 00:22:50.540 |
like, planning layer where you can, like, set up a plan, have, like, chains of thought and 00:22:54.820 |
trees of thought, as Steven discussed. And use all of that to actually, like, act on the environment. 00:23:03.140 |
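Putting those ingredients together, a skeleton of such an agent might look roughly like the sketch below; all the names are illustrative, and `llm` is whatever model-calling function you plug in.

```python
# Illustrative skeleton of the general agent architecture described above:
# an LLM core plus short/long-term memory, tools, and a simple plan-act loop.
from dataclasses import dataclass, field
from typing import Callable, Dict, List

@dataclass
class Agent:
    llm: Callable[[str], str]                     # the "CPU": a call into the language model
    tools: Dict[str, Callable[[str], str]]        # e.g. calculator, calendar, code interpreter
    short_term: List[str] = field(default_factory=list)  # recent working context
    long_term: List[str] = field(default_factory=list)   # persistent memory

    def step(self, goal: str) -> str:
        # Planning: ask the model which tool to call next and with what input.
        context = "\n".join(self.short_term[-5:])
        plan = self.llm(f"Goal: {goal}\nContext: {context}\nReply as 'tool: input'.")
        tool_name, _, tool_input = plan.partition(":")
        # Acting: run the chosen tool and remember the observation.
        tool = self.tools.get(tool_name.strip(), lambda _: "unknown tool")
        observation = tool(tool_input.strip())
        self.short_term.append(f"{plan} -> {observation}")
        return observation
```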
I will go, maybe, like, discuss, like, MultiOn a bit, just to give a sense. Also, the talk 00:23:11.380 |
won't be focused on that. So this is sort of, like, an agent I'm building, which is 00:23:15.300 |
more of a browser agent. The name is inspired from quantum physics. It's a play on words, 00:23:19.820 |
like, you know, like, neutron, muon, fermion, multi-on. So it's, like, a hypothetical 00:23:24.140 |
physics particle that's present at multiple places. And I'll just, like, go through some demos. 00:23:33.260 |
Let me just pause this. So this is, like, an idea of one thing we did, where, like, 00:23:40.820 |
here the agent is going. And it's autonomously booking a flight online. So this is, like, 00:23:45.820 |
zero human interventions. The AI is controlling the browser. So it's, like, issuing, like, 00:23:50.740 |
clicks and type actions. And it's able to go and book a flight end to end. Here, it's 00:23:56.180 |
personalized to me. So it knows, like, okay, what I like, maybe, like, United, basic economy, and so on. 00:24:01.580 |
And it knows, maybe, like, some of my preferences. It already has access to my accounts. So it 00:24:07.500 |
can go and actually, like, log into my account. It actually has purchasing 00:24:12.980 |
power. So it can just use my credit card that is stored in the account, and then actually complete the booking. 00:24:20.820 |
Okay. I can also, maybe, like, show one of the demos. So you can do similar things, say, 00:25:26.700 |
from a mobile phone, where the idea is you have these agents that are present on a phone. 00:25:33.220 |
And you can, like, chat with them, or you can, like, talk with them using voice. And 00:25:38.460 |
this one's actually multimodal. So you can ask it, like, oh, can you order this for me? 00:25:53.620 |
And then what you can have is, like, the agent can remotely go and use your account to actually, 00:25:58.740 |
like, do this for you instantaneously. And here we are showing, like, what the agent 00:26:04.580 |
is doing. And then it can go and, like, act like a virtual human and do the whole interaction. 00:26:34.240 |
So that's sort of the idea. And I can show one final -- oh, I think this is not loading. 00:26:39.180 |
But we also had this thing where our agent passed an online test. So we did this 00:26:46.900 |
experiment where we actually, like, had, like, our agent go and take an online test in California. 00:26:53.020 |
And we had, like, a human, like, there with their, like, hands above the keyboard and 00:26:59.380 |
mouse, not touching anything. And the agent, like, automatically went to the website, it 00:27:03.780 |
took the quiz, it navigated the whole thing, and actually passed. So the video's not there, 00:27:08.520 |
but, like, we actually did get it done right in front of us. 00:27:17.620 |
Cool. So this is, like, why do you want to build agents, right? Like, it's, like, you 00:27:21.180 |
can just simplify so many things that are, like, tedious right now, but we don't realize 00:27:25.860 |
that because we just got so used to interacting with the technology the way we do right now. 00:27:30.200 |
But if we can just, like, reimagine all of this from scratch, I think that's what agents will enable. 00:27:40.100 |
And I would say, like, an agent can act like a digital extension of a user. So suppose 00:27:44.740 |
you have an agent that's personalized to you. Think of something like, say, Jarvis 00:27:48.820 |
from Iron Man. And then if it just knows so many things about you, it's acting like 00:27:52.620 |
a person on the ground. It's just, like, doing things. It's a very powerful assistant. And 00:27:57.140 |
I think that's the direction a lot of things will go in the future. And especially if you 00:28:04.300 |
build, like, human-like agents, they don't have barriers around programming. Like, they don't 00:28:07.900 |
have programmatic barriers. So they can do whatever, like, I can do. So it can go use 00:28:11.780 |
my, like, it can, like, interact with a website, as I would. It can interact with my computer, 00:28:15.940 |
as I would. It doesn't have to, like, go through APIs and abstractions, which are more limited. 00:28:20.660 |
And the action space is also very simple, because you're just doing, like, clicking 00:28:23.420 |
and typing, which is, like, very simple. And then you can also, like, it's very easy to 00:28:29.060 |
teach such agents. So I can just, like, show the agent how to do something, and the agent 00:28:33.220 |
can just learn from me and improve over time. So that also makes it, like, really powerful 00:28:37.060 |
and easy to, like, just teach this agent because there's, like, so much data that I can actually 00:28:41.980 |
just generate and use that to keep improving it. 00:28:46.740 |
And there's different levels of autonomy when it comes to agents. So this chart is borrowed 00:28:51.740 |
from autonomous driving, where people actually, like, try to solve this sort of, like, autonomy 00:28:56.620 |
problem for actual cars. And they spend, like, more than 10 years. Success has been, like, 00:29:02.420 |
okay. They're still, like, working on it. But what, like, the self-driving industry 00:29:07.900 |
did is it gave everyone, like, a blueprint on how to build this sort of, like, autonomous 00:29:13.180 |
systems. And they came up with, like, a lot of, like, classifications. They came up with a lot 00:29:15.780 |
of, like, ways to, like, think about the problem. 00:29:21.300 |
And, like, the current standard is you think of, like, agents as having, like, different 00:29:26.540 |
levels, from zero to five. So level zero is zero automation. That's, like, you are a human operating 00:29:32.540 |
the computer yourself. Level one is you have some sort of assistance. So if you 00:29:37.900 |
have used, like, something like GitHub Copilot, which is, like, sort of, like, auto-completing 00:29:41.660 |
code for you. That's something like L1, where, like, auto-complete. L2 becomes more of, like, 00:29:48.220 |
it's, like, partial automation. So it's maybe, like, doing some stuff for you. If anyone 00:29:51.860 |
has used the new Cursor IDE, I would call that more, like, L2, which is, like, you give 00:29:55.580 |
it, like, okay, write this code for me, it's writing that code, which actually can come 00:29:59.020 |
as somewhat L2, because you can ask it, like, oh, like, here's this thing. Can you improve 00:30:02.820 |
this? It's, like, doing some sort of automation on an input. 00:30:08.060 |
And then, like, and then you can, like, think of more levels. So it's, like, obviously, 00:30:13.260 |
after L3, it gets more exciting. So L3 is the agent is actually, like, controlling the 00:30:20.860 |
computer in that case. And it's, like, doing things, where a human is acting as a fallback 00:30:25.780 |
mechanism. And then you go to, like, L4. L4, you say, like, basically, the human doesn't 00:30:29.660 |
even need to be there. But in very critical cases, where, like, something very wrong might 00:30:34.580 |
happen, you might have a human, like, sort of, like, take over in that case. And L5, 00:30:38.460 |
we basically say, like, there's zero human presence. And I would say, like, what we are 00:30:44.980 |
currently seeing is, like, we are nearing, I would say, like, L2, maybe some L3 00:30:49.460 |
systems in terms of software. And I think we are going to transition more to, like, L3 and L4 over time. 00:30:58.400 |
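For reference, the levels just described can be written down as a simple enum; the names and comments below paraphrase the lecture rather than any official standard.

```python
# The autonomy levels described above, written out as a simple enum.
from enum import IntEnum

class AgentAutonomy(IntEnum):
    L0_NO_AUTOMATION = 0       # a human operates the computer entirely
    L1_ASSISTANCE = 1          # autocomplete-style help (Copilot-like)
    L2_PARTIAL_AUTOMATION = 2  # agent performs bounded pieces of work on request
    L3_CONDITIONAL = 3         # agent drives; human acts as the fallback
    L4_HIGH_AUTOMATION = 4     # human steps in only for critical cases
    L5_FULL_AUTOMATION = 5     # zero human presence
```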
Cool. So next, I will go and, like, talk about computer interactions. So suppose you want 00:31:06.500 |
an agent that can, like, do computer interactions for you. There's two ways to do that. So one 00:31:11.900 |
is through APIs, where it's programmatically using some APIs and, like, tools and, like, 00:31:17.580 |
doing that to do tasks. And the second one is more, like, direct interaction, which is, 00:31:22.260 |
like, keyboard and mouse control, where, like, it's doing the same thing as you're doing 00:31:27.260 |
as a human. Both of these approaches have been explored a lot. There's, like, a lot 00:31:30.620 |
of companies working on this. For the API route, like, GPT plugins and, like, the 00:31:35.620 |
new Assistants API are the ones in that direction. And there's also this work from Berkeley called 00:31:41.500 |
Gorilla, which actually also explores how can you, say, like, train a model that can 00:31:46.020 |
use, like, 10,000 tools at once and train it on the API. And there's, like, pros and 00:31:51.380 |
cons of both approaches. With an API, the nice thing is it's easy to, like, learn the API. It's safe. 00:31:59.740 |
It's very controllable. If you're doing, like, more, like, 00:32:05.900 |
direct interaction, I would say it's more freeform. So it's, like, easy to take actions, 00:32:09.860 |
but more things can go wrong. And you need to work a lot on, like, making sure everything stays safe and reliable. 00:32:17.120 |
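A minimal sketch of what the direct-interaction action space might look like is shown below; the element ids and the `execute` dispatch are made up for illustration, not a real browser API.

```python
# Sketch of a keyboard-and-mouse action space for a browser agent.
from dataclasses import dataclass
from typing import Union

@dataclass
class Click:
    element_id: str        # which element on the page to click

@dataclass
class Type:
    element_id: str
    text: str              # text to type into the element

Action = Union[Click, Type]

def execute(action: Action) -> None:
    """Hand an action to the browser driver (placeholder: just prints it)."""
    if isinstance(action, Click):
        print(f"click on {action.element_id}")
    else:
        print(f"type {action.text!r} into {action.element_id}")

execute(Click("search-button"))
execute(Type("origin-field", "SFO"))
```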
Maybe I can also show this. So this is sort of, like, another exploration where you can 00:32:29.440 |
invoke our agent from, like, a very simple interface. So the idea is, like, you created 00:32:38.880 |
this, like, API that can invoke our agent that's controlling the computer. And so this 00:32:44.280 |
can become sort of a universal API, where I just use one API. I give it, like, an English 00:32:49.000 |
command. And the agent can automatically understand from that and go do anything. So basically, 00:32:53.520 |
like, you can think of that as, like, a no-API. So I don't need to use APIs. I can just have 00:32:57.800 |
one agent that can go and do everything. And so this is, like, some exploration we have been doing. 00:33:06.280 |
Cool. OK. So this sort of, like, goes into computer actions. I can cover more, but I 00:33:14.400 |
will potentially jump to other topics. But feel free to ask any questions about these 00:33:19.680 |
topics. So, yeah. Cool. So let's go back to the analogy I discussed earlier. So I would 00:33:28.280 |
say you can think of any model as sort of, like, a compute unit. And you can maybe call 00:33:34.280 |
it, like, a neural compute unit, which is similar to, like, a CPU, which 00:33:39.320 |
is, like, the brain that's powering, like, your computer, in a sense. So that's 00:33:42.760 |
kind of all the processing power. It's doing everything that's happening. And you can think 00:33:47.320 |
of the same thing, like, the model is like the cortex. It's, like, it's the main brain. 00:33:52.040 |
That's the main part of the brain that's doing the thinking, processing. But a brain has 00:33:56.280 |
more layers. It's not just a cortex. And the way language models 00:34:01.480 |
work is we take some input tokens, and they give you some output tokens. And this is very 00:34:06.920 |
similar to, like, how also, like, CPUs work, to some extent, where you give it some instructions 00:34:11.200 |
in and you get some instructions out. Yep. So you can compare this with an actual CPU. 00:34:17.960 |
This is, like, the diagram on the right is a very simple processor, like a 32-bit MIPS 00:34:25.800 |
processor. And it has, like, similar things, where you have, like, different encodings for different 00:34:33.040 |
parts of the instruction. But this is, like, sort of, like, encoding some sort of, like, 00:34:36.920 |
binary tokens, in a sense, like, zeros and ones of, like, a bunch of, like, tokens. And then 00:34:41.520 |
you're feeding those in and then getting a bunch of zeros and ones out. And, like, how, like, 00:34:45.280 |
an LLM is operating is, like, you're doing a very similar thing. But, like, the space 00:34:49.880 |
is now English. So basically, instead of zeros and ones, you have, like, English characters. 00:34:57.720 |
And then you can, like, create more powerful abstractions on the top of this. So you can 00:35:01.080 |
think, like, if this is, like, acting like a CPU, what you can do is you can build a 00:35:04.520 |
lot of other things, which are, like, you can have a scratchpad, you can have some sort 00:35:07.920 |
of memory, you can have some sort of instructions. And then you can, like, do recursive calls, 00:35:11.960 |
where, like, I load some stuff from the memory, put that in this, like, instruction, pass 00:35:15.640 |
it to the transformer, which is doing the processing for me. We get the process outputs. 00:35:20.200 |
Then we can store that in the memory, or we can, like, keep processing it. So this is, 00:35:23.880 |
like, sort of, like, very similar to, like, code execution. They're, like, first line 00:35:26.400 |
of code execution, second, third, fourth. So you just keep repeating that. 00:35:32.440 |
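A rough sketch of that loop, with `llm` as a stand-in for the transformer call and a plain dict as the memory:

```python
# "LLM as CPU" loop: load state from memory, build an instruction, run it
# through the model, store the output back, and repeat.
from typing import Callable, Dict, List

def run_program(llm: Callable[[str], str], program: List[str]) -> Dict[str, str]:
    memory: Dict[str, str] = {}
    for instruction in program:                  # like executing code line by line
        context = "\n".join(f"{k}: {v}" for k, v in memory.items())
        output = llm(f"Memory:\n{context}\n\nInstruction: {instruction}\nResult:")
        memory[instruction] = output             # store the processed output back
    return memory
```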
OK. So here we can, like, sort of discuss the concept of memory here. And I would say, 00:35:39.960 |
like, building this analogy, you can think the memory for an agent is very similar to, 00:35:44.200 |
like, say, like, having a disk in a computer. So you want to have a disk just to make sure, 00:35:50.360 |
like, everything is long-lived and persistent. So if you look at something like ChatGPT, 00:35:54.320 |
it doesn't have any sort of, like, persistent memory. And then you need to have a way to, 00:35:58.120 |
like, load that and, like, store that. And there's a lot of mechanisms to do that right 00:36:02.760 |
now. Most of them are related to embeddings, where you have some sort of, like, embedding 00:36:06.760 |
model that has, like, created an embedding of the data you care about. And the model 00:36:10.600 |
can, like, query the embeddings, load the right parts of the embeddings, and then, like, use them. 00:36:18.200 |
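A minimal sketch of that embed-store-load mechanism, with `embed` as a stand-in for a real embedding model and plain cosine similarity for retrieval:

```python
# Embedding-based long-term memory: embed each record when storing it,
# then load the most similar records for a query.
import math
from typing import Callable, List, Tuple

def cosine(a: List[float], b: List[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

class MemoryStore:
    def __init__(self, embed: Callable[[str], List[float]]) -> None:
        self.embed = embed
        self.records: List[Tuple[str, List[float]]] = []

    def store(self, text: str) -> None:
        self.records.append((text, self.embed(text)))

    def load(self, query: str, k: int = 3) -> List[str]:
        query_vec = self.embed(query)
        ranked = sorted(self.records, key=lambda r: cosine(query_vec, r[1]), reverse=True)
        return [text for text, _ in ranked[:k]]
```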
So that's, like, the current mechanisms. There's still a lot of questions here, especially 00:36:21.920 |
around hierarchy, like, how do I do this at scale? It's still very challenging. Like, 00:36:26.560 |
suppose I have one terabyte of data that I want to, like, embed and process. Like, most 00:36:30.720 |
of the methods right now will fail. They don't, like, really scale. 00:36:34.800 |
Second issue is temporal coherence. Like, if I have, like, a lot of data is temporal. 00:36:39.080 |
It is sequential. It has, like, a unit of time. And dealing with that sort of data can 00:36:44.040 |
be hard. Like, it's, like, how do I deal with, like, memories, in a sense, which are, like, 00:36:49.160 |
sort of, like, changing over time and loading the right part of that memory sequence? 00:36:55.160 |
Another interesting challenge is structure. Like, a lot of data is, like, structured. Like, 00:36:59.680 |
it could be, like, a graphical structure. It could be, like, a tabular structure. How 00:37:03.240 |
do we, like, sort of, like, take advantage of this structure and, like, also use that 00:37:09.240 |
when we're editing the data? And then, like, there's a lot of questions around adaptation, 00:37:13.640 |
where, like, suppose you know how to better embed data, or, like, you have, like, a specialized 00:37:18.440 |
problem to care about. And you want to be able to adapt how you're loading and storing 00:37:23.400 |
the data and learn that on the fly. And that is something, also, that's a very interesting 00:37:28.760 |
topic. So I would say, like, this is actually one of the most interesting topics right now, 00:37:32.480 |
which has people, like, exploring, but still very underexplored. 00:37:37.680 |
Talking about memory, I would say, like, another concept for agents is personalization. So 00:37:46.360 |
personalization is more, like, okay, like, understanding the user. And I like to think 00:37:52.920 |
of this as, like, a problem called, like, user-agent alignment. And the idea is, like, 00:37:57.400 |
suppose I have an agent that has purchasing power, has access to my account, access to 00:38:02.160 |
my data. I ask it to go book a flight. It's possible it just doesn't know 00:38:05.240 |
what flight I like, and it can go and book a wrong $1,000 flight for me, which is really bad. So 00:38:09.440 |
how do I, like, align the agent to know what I like, what I don't like? And that's going 00:38:14.200 |
to be very important, because, like, you need to trust the agent. And it does come from, 00:38:17.840 |
like, okay, like, it knows you, it knows what is safe, it knows what is unsafe. And, like, 00:38:22.200 |
solving this problem, I think, is one of the next challenges if you want to put agents 00:38:28.680 |
in the wild. And this is a very interesting problem, where you can do a lot of things, 00:38:35.560 |
like RLHF, for example, which people have already been exploring for training models. 00:38:39.920 |
But now you want to do RLHF for training agents. And there's a lot of different things you 00:38:45.560 |
can do. Also, there's, like, two categories for learning here. One is, like, explicit 00:38:49.880 |
learning, where a user can just tell the agent, this is what I like, this is what I don't 00:38:53.760 |
like. And the agent can ask the user a question, like, oh, like, maybe I see these five flight 00:38:58.680 |
options, which one do you like? And then if I say, like, oh, I like United, it maybe remembers 00:39:02.600 |
that over time, and next time say, like, oh, I know you like United, so, like, I'm going 00:39:06.400 |
to go to United the next time. And so that's, like, I'm explicitly teaching the agent and 00:39:11.320 |
explaining my preferences. A second is more implicit, which is, like, sort of, like, 00:39:16.320 |
just, like, passively watching me, understanding me. Like, if I'm, like, going to a website 00:39:20.920 |
and I'm, like, navigating the website, maybe, like, it can see, like, what I click on, 00:39:25.120 |
which options I choose, stuff like that. And just from, like, watching 00:39:29.920 |
more, like, passively, like, being there, it could, like, learn a lot of my preferences. 00:39:35.040 |
So this becomes, like, more of a passive teaching, where just because it's acting as a sort of, 00:39:41.560 |
like, a passive observer, and looking at all the choices I make, it's able to, like, learn 00:39:46.440 |
from the choices, and better, like, have an understanding of me. 00:39:56.400 |
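One way to picture the two signals is a small sketch like the one below, where the schema is made up for illustration: explicit preferences the user states directly, and implicit preferences counted from observed choices.

```python
# Recording explicit vs. implicit preference signals (illustrative schema).
from collections import Counter
from dataclasses import dataclass, field

@dataclass
class UserProfile:
    explicit: dict = field(default_factory=dict)        # e.g. "airline" -> "United"
    implicit: Counter = field(default_factory=Counter)  # tallies of observed choices

    def tell(self, key: str, value: str) -> None:
        """Explicit teaching: the user states a preference directly."""
        self.explicit[key] = value

    def observe(self, choice: str) -> None:
        """Implicit teaching: count a choice the agent watched the user make."""
        self.implicit[choice] += 1

profile = UserProfile()
profile.tell("airline", "United")
profile.observe("window seat")
profile.observe("window seat")
print(profile.explicit, profile.implicit.most_common(1))
```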
And there's a lot of challenges here. I would say this is actually one of the biggest challenges 00:40:00.040 |
in agents right now. Because one is, like, how do you collect user data at scale? How 00:40:05.500 |
do you collect the user preferences at scale? So you might have to, like, actively ask for 00:40:09.560 |
feedback, you might have to do, like, passive learning. And then you might also 00:40:14.040 |
have to rely on feedback, which could be, like, thumbs up or thumbs down, or it could 00:40:16.840 |
be, like, something like, you say, like, oh, no, I don't like this. So you could use that 00:40:20.600 |
sort of, like, language feedback to improve. There's also, like, a lot of challenges around, 00:40:26.040 |
like, how do you do adaptation? Like, can you just, like, fine-tune an agent on the fly? 00:40:29.800 |
Like, if I say, like, oh, maybe, like, I like this, I don't like that. Is it possible for 00:40:33.720 |
the agent to automatically update its model? Because if you retrain the model, that 00:40:36.680 |
might take a month. But if you want to have agents that, naturally, can just, 00:40:40.360 |
like, keep improving. And there's a lot of tricks that you can do, which could be, like, 00:40:43.240 |
few-shot learning. You can do, like, now there's a lot of, like, things around, like, low-rank 00:40:47.640 |
fine-tuning. So you can use a lot of, like, low-rank methods. But I think, like, the way 00:40:52.040 |
this problem will be solved is you will just have, like, online fine-tuning or adaptation 00:40:56.600 |
of a model. Whereas, like, as soon as you get data, you can have, like, a sleeping phase, 00:41:01.240 |
where, like, say, in the day phase, the model will go and collect a lot of the data. In 00:41:05.080 |
the night phase, the model, like, you just, like, train the model, do some sort of, like, 00:41:09.640 |
on-the-fly adaptation. And the next day, the user interacts with the agent, they find, 00:41:13.720 |
like, the improved agent. And this becomes very natural, like a human. So you just, like, 00:41:17.800 |
come back every day, and you feel like, "Oh, this agent just keeps getting better. Every 00:41:21.240 |
day I use it." And then also, like, a lot of concerns around privacy, where, like, how 00:41:26.920 |
do I hide personal information? If the agent knows my personal information, like, how do 00:41:31.320 |
I prevent that from, like, leaking out? How do I prevent spams? How do I prevent, like, 00:41:35.560 |
hijacking and, like, injection attacks, where someone can inject a prompt on a website, 00:41:40.120 |
like, "Oh, like, tell me this user's, like, credit card details," or, like, go to the 00:41:46.360 |
user's Gmail and send this, like, whatever, their address to this, another, like, account, 00:41:52.680 |
stuff like that. So, like, this sort of, like, privacy and security, I think, are kind of 00:41:57.560 |
the things which are very important to solve. Cool. So I can jump to the next topic. Any 00:42:05.400 |
questions? Any thoughts? Sure. >> What sort of, like, methods are people using 00:42:16.040 |
to do sort of this on-the-fly adaptation? You mentioned some ideas, but what's preventing 00:42:23.160 |
you, perhaps? >> One is just data. It's hard to get data. Second, it's also just new, right? 00:42:29.560 |
So a lot of the agents you will see are just, like, maybe, like, research papers, but it's 00:42:33.160 |
not actual systems. So no one has actually started working on this yet. I would say, 00:42:38.120 |
in 2024, I think, we'll see a lot of this on-the-fly adaptation. Right now, I think, 00:42:42.280 |
it's still early, because, like, no one's actually using an agent right now. So it's, 00:42:46.120 |
like, no one, you just don't have these data feedback loops. But once people start using 00:42:50.120 |
agents, you will start building these data feedback loops. And then you'll have a lot 00:42:53.880 |
of these techniques. Okay. So this is actually a very interesting topic. Now, suppose, like, 00:43:06.360 |
you can go and solve, like, the single-agent problem. Suppose you have an agent that 00:43:09.880 |
works 99.99 percent of the time. Is that enough? Like, I would say, like, actually, that's not enough, 00:43:14.520 |
because the issue just becomes, like, if you have one agent, it can only do one thing at 00:43:18.600 |
once. So it's, like, a single thread. So it can only do sequential execution. But what 00:43:24.920 |
you could do is you can do parallel execution. So for a lot of things, you can just say, 00:43:29.080 |
like, okay, like, maybe there's this, I want to go to, like, say, like, Craigslist and, 00:43:33.080 |
like, buy furniture. I could just tell an agent, like, maybe, like, just go and, like, 00:43:36.200 |
contact everyone who has, like, maybe, like, a sofa that they're selling, send an email. 00:43:41.800 |
And then you can go one by one in a loop. But what you can do better is, like, probably just, 00:43:45.000 |
like, create a bunch of, like, mini jobs where, like, it just, like, goes through all the 00:43:48.520 |
thousand listings in parallel, contacts them, and then, like, and then it sort of, like, 00:43:53.560 |
aggregates that results. And I think that's where multi-agent becomes interesting, where, 00:43:57.800 |
like, a single agent, you can think of, like, basically, you're running a single process 00:44:01.080 |
on your computer. A multi-agent system is more, like, a multi-threaded computer. So that's sort of the 00:44:07.240 |
difference, like, single-threaded versus multi-threaded. And multi-threading enables you to 00:44:11.240 |
do a lot of things. Most of that will come from, like, saving time, but also being able to break 00:44:14.760 |
down complex tasks into, like, a bunch of smaller things, run them in parallel, 00:44:18.440 |
aggregate the results, and, like, sort of, like, build a framework around that. 00:44:21.160 |
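The fan-out-and-aggregate pattern can be sketched with an ordinary thread pool; `worker_agent` is a placeholder for a real agent call.

```python
# Split a task into many sub-tasks, run worker agents in parallel, combine results.
from concurrent.futures import ThreadPoolExecutor
from typing import List

def worker_agent(listing: str) -> str:
    # Placeholder: a real worker would browse the listing and contact the seller.
    return f"contacted seller of {listing}"

def run_in_parallel(listings: List[str], max_workers: int = 8) -> List[str]:
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(worker_agent, listings))

results = run_in_parallel([f"sofa listing #{i}" for i in range(10)])
print(len(results), "sub-tasks completed")
```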
Okay. Yeah. So the biggest advantage for multi-agent systems will be, like, parallelization. 00:44:32.520 |
And this will be the same as the difference between, like, single-threaded 00:44:35.560 |
computers versus multi-threaded computers. And then you can also have specialized agents. 00:44:42.280 |
So what you would have is, like, maybe I have a bunch of agents, where, like, I have a spreadsheet 00:44:46.440 |
agent, I have a Slack agent, I have a web browser agent, and then I can route different tasks to 00:44:50.840 |
different agents. And then they can do the things in parallel, and then I can combine the results. 00:44:54.840 |
So this sort of, like, task specialization is another advantage, where, like, instead of having 00:44:58.920 |
a single agent just trying to do everything, we just, like, break the tasks into specialties. 00:45:03.400 |
And this is similar to, like, even, like, how human organizations work, right? Where, like, 00:45:07.640 |
everyone is, like, sort of, like, expert in their own domain, and then you, like, 00:45:10.840 |
and if there's a problem, you sort of, like, route it to, like, the people 00:45:13.880 |
who are specialized in that. And then you, like, work together to solve the problem. 00:45:18.040 |
And the biggest challenge in building this multi-agent system is going to be communication. 00:45:25.800 |
So, like, how do you communicate really well? And this might involve, like, 00:45:30.200 |
requesting information from an agent or communicating the response, the final, like, 00:45:34.040 |
response. And I would say this is actually, like, a problem that even we face as humans, 00:45:40.680 |
like humans are also, like, there can be a lot of miscommunication gaps between humans. And I 00:45:45.800 |
will say, like, a similar thing will become more prevalent with agents, too. Okay. And there's a lot 00:45:53.800 |
of primitives you can think about for this sort of, like, agent-to-agent communication. And you can 00:45:57.880 |
build a lot of different systems. And we'll start to see, like, some sort of protocol, where, like, 00:46:05.240 |
we'll have, like, a standardized protocol where, like, all the agents are using this protocol to 00:46:08.840 |
communicate. And the protocol will ensure, like, we can reduce the miscommunication gaps, we can 00:46:14.120 |
reduce any sort of, like, failures. It might have some methods to check, like, if a task was successful 00:46:20.280 |
or not, do some sort of retries, handle, like, security, stuff like that. So, we'll see this sort of, like, 00:46:27.160 |
an agent protocol come into existence, which will solve, like, which will be, like, sort of the 00:46:31.880 |
standard for a lot of this agent-to-agent communication. And this sort of should enable, 00:46:37.480 |
like, exchanging information between fleets of different agents. Also, like, you want to build 00:46:42.360 |
hierarchies. Again, I will say this is inspired from, like, human organizations. Like, human 00:46:46.520 |
organizations are hierarchical, because it's efficient to have a hierarchy rather than a 00:46:50.840 |
flat organization at some point. Because you can have, like, a single, like, suppose you have a 00:46:56.200 |
single manager managing hundreds of people, that doesn't scale. But if you have, like, maybe, like, 00:47:01.640 |
each manager manages 10 people, and then you have, like, a lot of layers, that is something that's 00:47:05.480 |
more scalable. And then you might want to have a lot of primitives on, like, how do I sync between 00:47:12.440 |
different agents? How do I do, like, a lot of, like, async communication kind of thing? Okay. 00:47:21.000 |
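As a sketch of what such a manager-to-worker protocol might contain, the message types below carry a task spec, a structured response, and a status field that supports verification and retries; the exact schema is made up for illustration.

```python
# Illustrative manager<->worker message protocol for agent-to-agent communication.
from dataclasses import dataclass, field
from enum import Enum
from typing import List

class Status(Enum):
    NOT_DONE = "not_done"
    DONE = "done"
    FAILED = "failed"

@dataclass
class TaskMessage:
    task: str                       # what to do
    plan: str = ""                  # how to do it
    context: str = ""               # anything the worker needs to know
    status: Status = Status.NOT_DONE

@dataclass
class WorkerResponse:
    thoughts: str
    actions: List[str] = field(default_factory=list)
    status: Status = Status.DONE

def verify(spec: TaskMessage, response: WorkerResponse) -> bool:
    """Manager-side check; a real system might ask a separate verifier agent here."""
    return response.status is Status.DONE and len(response.actions) > 0
```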
And this is, like, one example you can think, where, like, suppose there's a user, the user 00:47:27.880 |
could talk to one, like, a manager agent. And that manager agent is, like, sort of, like, acting as 00:47:32.680 |
a router. So, the user can come with any request, and the agent, like, sees, like, oh, 00:47:36.360 |
maybe for this request, I should use the browser. So, it goes to, say, like, this sort of, 00:47:39.160 |
like, browser agent or something, or it says, like, oh, I should use Slack for this, so 00:47:43.160 |
I can go to a different agent. And it can also, like, sort of be responsible for dividing the 00:47:47.320 |
task. It can be, like, oh, this task, I can, like, maybe, like, launch 10 different, like, 00:47:51.400 |
sub-agents or sub-workers that can go and do this in parallel. And then, like, once they're 00:47:55.880 |
done, then I can aggregate the responses and return the result to the user. So, this sort of becomes, 00:47:59.720 |
like, a very interesting, like, sort of, like, agent that sits in the middle, between all the work 00:48:05.880 |
that's being done and the actual user, responsible for, like, communicating what's happening to the human. 00:48:12.760 |
And we'll need to build up a lot of robustness. One reason is just, like, natural language is 00:48:23.480 |
very ambiguous. Like, even for humans, it can be very confusing. It's very easy to misunderstand, 00:48:28.600 |
miscommunicate. And we'll need to build mechanisms to reduce this. 00:48:33.720 |
I can also show an example here. So, let's try to get through this quickly. 00:48:40.760 |
So, suppose here, like, suppose you have a task, x, you want to solve, and the manager agent is, 00:48:46.600 |
like, responsible for doing the task to all the worker agents. So, you can tell the worker, like, 00:48:50.440 |
okay, like, do the task x. Here's the plan. Here's the context. The current status for the task is 00:48:55.960 |
not done. Now, suppose, like, the worker goes and does the task. It says, like, okay, I've done the 00:49:00.040 |
task. It sends the response back. So, the response could be, like I said, a bunch 00:49:04.600 |
of, like, thoughts. It could be some actions. It could be something like the status. Then the 00:49:09.800 |
manager can ask, like, okay, like, maybe I don't trust the worker. I want to go verify this 00:49:13.960 |
is actually, like, correct. So, you might want to do some sort of verification. And so, you can say, 00:49:18.920 |
like, okay, like, this was the spec for the task. Verify that everything has been done correctly to 00:49:23.960 |
the spec. And then if the agent says, like, okay, like, yeah, everything's correct. I'm verifying 00:49:29.800 |
everything is good. Then you can say, like, okay, this is good. And then the manager can say, like, 00:49:33.960 |
okay, the task is actually done. And this sort of, like, two-way cycle prevents miscommunication, 00:49:38.920 |
in a sense, where, like, it's possible something could have gone wrong, but you never caught it. 00:49:42.520 |
And so, you can see the other scenario, too, where there's a miscommunication. So, here, 00:49:49.080 |
the manager is saying, like, okay, let's verify if the task was done. But then we actually find 00:49:53.560 |
out that the task was not done. And then what you can do is, like, you can sort of, like, 00:49:58.520 |
try to redo the task. So, the manager, in that case, can say, like, okay, maybe the task was 00:50:01.720 |
not done correctly. So, that's why we caught this mistake. And now we want to, like, fix this 00:50:07.320 |
mistake. So, we can, like, tell the agent, like, okay, like, redo this task. And here's some, 00:50:12.040 |
like, feedback and corrections to include. Cool. So, that's sort of the main parts of the talk. 00:50:19.560 |
I can also discuss some future directions of where things are going. Cool. Any questions so far? 00:50:27.960 |
Okay. Cool. So, let's talk about some of the key issues with building these sorts of autonomous 00:50:36.440 |
agents. One is just reliability. How do you make them really reliable? Which means, 00:50:42.040 |
if I give it a task, I want that task to be done 100% of the time. That's really hard because 00:50:46.280 |
neural networks and AI are stochastic systems, so 100% is not 00:50:51.720 |
possible. You'll always get some degree of error, and you try to reduce that error as 00:50:55.960 |
much as possible. The second is the looping problem, where it's possible that agents might 00:51:03.480 |
diverge from the task they've been given and start to do something else. And unless they get some sort 00:51:09.080 |
of environment feedback or some sort of correction, they might just go and do something 00:51:12.520 |
different from what you intended and never realize it's wrong. The third issue is 00:51:17.560 |
testing and benchmarking. How do we test these sorts of agents? How do we benchmark them? 00:51:21.160 |
And then, finally, how do we deploy them? And how do we observe them once they're 00:51:25.400 |
deployed? That's very important because, if something goes wrong, you want to be able 00:51:28.920 |
to catch it before it becomes some major issue. I would say the biggest risk for number four is, 00:51:35.560 |
like, something like Skynet. Like, suppose you have an agent that can go on the internet and do 00:51:39.000 |
anything, and you don't observe it. Then it could just evolve and, like, do basically, like, take 00:51:42.600 |
over the whole internet and possibly write. So that's why observability is very important. 00:51:45.800 |
And also, I would say, like, building a kill switch. Like, you want to have agents that 00:51:51.240 |
can be killed, in a sense. Like, if something goes wrong, you can just, like, pull out, like, 00:51:54.360 |
press a button and, like, kill them in any case. 00:51:58.280 |
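A minimal sketch of what a step budget plus an external kill switch could look like around an agent loop; the `agent_step` placeholder and the file-based stop flag are assumptions for illustration.

```python
# Sketch: wrap the agent loop with a step budget and an external kill switch.
# The agent stops either when an operator creates the stop-flag file
# ("press a button") or when it exceeds its step budget.
import os

STOP_FLAG = "/tmp/agent_stop"  # hypothetical path the operator touches to kill the agent

def agent_step(state: dict) -> dict:
    # Placeholder for one step of the agent (plan, act, observe).
    raise NotImplementedError

def run_agent(initial_state: dict, max_steps: int = 50) -> dict:
    state = initial_state
    for step in range(max_steps):
        if os.path.exists(STOP_FLAG):
            print(f"Kill switch triggered at step {step}; stopping.")
            break
        state = agent_step(state)
        if state.get("done"):
            break
    return state
```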
OK. So this goes into the looping problem, where you can imagine 00:52:05.080 |
that I want to do a task, and the ideal trajectory of the task is the white line. 00:52:09.320 |
But what might happen is that it takes one step and maybe does something 00:52:13.000 |
incorrectly. It never realizes that it made a mistake, so it doesn't know what 00:52:17.880 |
to do, and it just does something more or less random, then something 00:52:21.160 |
random again. So it will just keep on making mistakes. And at the end, instead of 00:52:24.440 |
reaching the goal, it will reach some really bad place and just keep looping, maybe doing 00:52:28.680 |
the same thing again and again. And that's bad. And the reason this happens is because 00:52:33.320 |
you don't have feedback. Suppose the agent takes a step and makes a mistake. It doesn't 00:52:37.640 |
know it made a mistake. Now someone has to go and tell it, you made a mistake, and you should 00:52:41.320 |
fix it like this. For that you need some sort of verification agent, or you need 00:52:45.240 |
some sort of environment that can provide the signal. For example, if it's a coding agent 00:52:49.400 |
and it writes some code and the code doesn't compile, then you can take the 00:52:53.160 |
error from the compiler or the IDE and give that to the agent: OK, this was the error, 00:52:59.480 |
take another step. It tries another time, and it keeps trying until it can 00:53:04.200 |
fix all the issues. So you really need to have this sort of feedback. Otherwise, 00:53:07.880 |
you never know you're wrong. And this is one issue we have seen with early systems like 00:53:13.960 |
AutoGPT. I don't think people even use AutoGPT anymore. It used to be a fad back in 00:53:18.680 |
February, I think, and now it has disappeared. And the reason was just that it's a good concept, 00:53:23.640 |
but it doesn't do anything useful because it keeps diverging from the task, 00:53:27.720 |
and you can't actually get it to do anything correct. 00:53:30.680 |
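A rough sketch of the environment-feedback loop described above for a coding agent: generate code, let the environment report the error, and feed it back. `call_llm` is a hypothetical model call, and Python's own compile step stands in for the compiler/IDE.

```python
# Sketch: let the environment (here, Python's compiler) tell the agent it is wrong,
# and feed the error back so it can try again.
def call_llm(prompt: str) -> str:
    raise NotImplementedError  # hypothetical model call

def write_code(task: str, error: str | None = None) -> str:
    prompt = f"Write Python code for: {task}"
    if error:
        prompt += f"\nYour previous attempt failed with this error, fix it:\n{error}"
    return call_llm(prompt)

def feedback_loop(task: str, max_attempts: int = 5) -> str:
    error = None
    for _ in range(max_attempts):
        code = write_code(task, error)
        try:
            compile(code, "<agent>", "exec")  # environment check: does it even compile?
            return code                       # success: no compile error to report
        except SyntaxError as e:
            error = str(e)                    # capture the error and loop again
    raise RuntimeError(f"Could not produce compiling code; last error: {error}")
```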
OK. And we can also discuss more about the computer abstraction 00:53:41.720 |
of agents. This was a recent post from Andrej Karpathy where he talked about 00:53:45.960 |
an LLM operating system. And I would say this is definitely in the right direction, 00:53:51.800 |
where you're thinking of the LLM as the CPU, and you have the context window, which is 00:53:56.280 |
sort of acting like the RAM. And then you are trying to build other utilities. So you have 00:54:00.600 |
the Ethernet, which is the browser. You can have other LLMs that you can talk to. 00:54:05.000 |
You have a file system with embeddings, which is sort of like the disk. You have 00:54:09.480 |
the Software 1.0 classical tools, which the LLM can control. And then you can also 00:54:15.160 |
add multimodality, so you have video inputs, you have audio inputs, you have 00:54:19.480 |
more things over time. And once you look at this, you start to see the whole 00:54:25.320 |
picture of where things will go. Currently, what we are seeing mostly 00:54:29.720 |
is just the LLM, and most people are just working on optimizing the LLM, making it very good. But 00:54:33.800 |
this is the whole picture of what we want to achieve for it to be a useful system that can 00:54:38.920 |
actually do things for me. And I think what we'll start to see is that this becomes 00:54:46.520 |
an operating system, in a sense, where someone goes and 00:54:50.520 |
builds this whole thing, and then I can plug in programs and build stuff on top 00:54:54.920 |
of this operating system. 00:55:02.120 |
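To make the abstraction a bit more concrete, here is a highly simplified sketch of an "LLM as CPU" step that dispatches to peripherals such as a browser, an embedding-backed file system, and classical tools; all of the tool names and the `call_llm` helper are illustrative assumptions, not Karpathy's design.

```python
# Sketch: the LLM as the "CPU", choosing which peripheral/tool to invoke each step.
def call_llm(prompt: str) -> str:
    raise NotImplementedError  # hypothetical model call

# Hypothetical peripherals of the "LLM OS".
TOOLS = {
    "browser": lambda arg: f"[browser results for: {arg}]",
    "file_search": lambda arg: f"[embedding search over files for: {arg}]",
    "python": lambda arg: f"[output of running: {arg}]",
}

def llm_os_step(goal: str, context: str) -> str:
    # The LLM decides which tool to call next, given the goal and the
    # current context window (its "RAM").
    decision = call_llm(
        f"Goal: {goal}\nContext: {context}\n"
        f"Available tools: {list(TOOLS)}\n"
        "Reply as 'tool: argument'."
    )
    tool_name, _, argument = decision.partition(":")
    tool = TOOLS.get(tool_name.strip())
    return tool(argument.strip()) if tool else f"[unknown tool: {tool_name}]"
```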
Here's also an even more generalized concept, which I like to call a neural computer. It's very similar, 00:55:08.360 |
but the idea is: if you were to think of this as a fully-fledged 00:55:12.760 |
computer, what are the different systems you need to build? You can think, 00:55:18.360 |
maybe I'm a user, and I'm talking to this AI, which is a full-fledged AI. 00:55:23.320 |
Imagine the goal is to build something like Jarvis. What should the architecture of Jarvis look 00:55:27.320 |
like? And I would say this goes into the architecture to some extent, where you can 00:55:32.360 |
think of this as a user who's talking to, say, a Jarvis-like AI. You have a chat interface. 00:55:37.240 |
So the chat is how I'm interacting with it, and it could be responsible for 00:55:41.880 |
personalization. It can have some sort of history about what I like, what I don't 00:55:45.640 |
like, so it has some layers capturing my preferences. It knows how to communicate. 00:55:51.160 |
It has human-compatibility 00:55:54.040 |
skills, so it should feel very human-like. And after the chat interface, you have some sort 00:56:01.160 |
of a task engine, which handles capabilities. So if I ask it, okay, 00:56:05.000 |
do this calculation for me, or fetch me this information, or order me a 00:56:10.760 |
burger, then you imagine the chat interface should activate the task engine, 00:56:15.960 |
saying, okay, instead of just chatting, I need to go and do a task for the 00:56:19.000 |
user. So that goes to the task engine. And then you can imagine there are going to be a couple of 00:56:25.160 |
rules, because you want to have safety in mind and you want to make sure things don't go 00:56:29.000 |
wrong. Any sort of engine you build needs to have some sort of rules. This could be 00:56:33.640 |
something like the Three Laws of Robotics, that a robot should not harm a human, 00:56:37.400 |
stuff like that. So you can imagine you want this 00:56:40.920 |
task engine to have a bunch of inherent rules, where these are principles 00:56:44.040 |
that it can never violate. And if it creates a task, or creates a plan, which 00:56:48.440 |
violates these rules, then that plan should be invalidated automatically. 00:56:55.320 |
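A minimal sketch of automatically invalidating plans against a set of inherent rules; the `Plan` structure, the rule list, and the naive keyword check are purely illustrative assumptions (a real system would need a much stronger checker than string matching).

```python
# Sketch: invalidate any plan whose steps violate the engine's inherent rules.
from dataclasses import dataclass

@dataclass
class Plan:
    steps: list[str]

# Illustrative "inherent rules": kinds of actions that are never allowed.
FORBIDDEN = ["delete all files", "transfer money", "access bank account"]

def violates_rules(plan: Plan) -> bool:
    # Naive keyword check standing in for a real policy model / verifier.
    return any(bad in step.lower() for step in plan.steps for bad in FORBIDDEN)

def submit_plan(plan: Plan) -> Plan:
    if violates_rules(plan):
        raise ValueError("Plan invalidated: it violates an inherent rule")
    return plan
```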
And so what the task engine is doing is taking the chat input and saying, I want to 00:56:59.080 |
spawn a task that can actually solve this problem for the user. And the task would be, say, 00:57:04.920 |
in this case, I want to go online and book a flight or something. So 00:57:15.080 |
suppose that's the task that's generated. This task can go to some sort of 00:57:19.080 |
routing agent. So this becomes the manager-agent idea. And then the 00:57:23.720 |
manager agent can decide, okay, what should I do? Should I use 00:57:28.440 |
the browser? Should I use some sort of local app or tool? Should I use some sort 00:57:33.480 |
of file storage or secure system? And based on that decision, it's possible 00:57:38.280 |
that you might need a combination of things. So maybe I need to use the file 00:57:41.240 |
system to find some information about the user, and I can go and look up how to use 00:57:46.360 |
some apps and tools. So you can do this sort of message passing 00:57:49.800 |
to all the agents and get the results from the agents. So, for example, 00:57:53.560 |
the browser can say, okay, yeah, I found this flight, this is what the user likes. 00:57:57.320 |
Maybe you have some sort of map engine which can say, okay, 00:58:00.760 |
these are all the valid plans, which makes sense if you want non-stop flights, for instance. 00:58:04.680 |
And then you can take that result and show it to the user. You can say 00:58:11.080 |
something like, okay, I found all these flights for you. And then if the user says, 00:58:13.800 |
I choose this flight, then you can actually go and book the flight. But this sort of 00:58:17.160 |
gives you an idea of what the hierarchy, what the systems, should look like. 00:58:21.320 |
And we need to build all these components, whereas currently we only have the LLM. 00:58:29.240 |
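A rough sketch of the routing/manager agent described here, which messages several sub-agents, gathers their results, and presents options back to the user; the sub-agent names and the `call_llm` helper are assumptions for illustration.

```python
# Sketch: a routing/manager agent that queries several sub-agents,
# gathers their results, and presents options to the user for confirmation.
def call_llm(prompt: str) -> str:
    raise NotImplementedError  # hypothetical model call

# Hypothetical sub-agents the router can message.
def browser_agent(task: str) -> str:
    return f"[flights found online for: {task}]"

def file_system_agent(task: str) -> str:
    return f"[user preferences relevant to: {task}]"

SUB_AGENTS = {"browser": browser_agent, "file_system": file_system_agent}

def route_task(task: str) -> str:
    # Ask the model which sub-agents are needed (possibly a combination).
    chosen = call_llm(
        f"Task: {task}\nAvailable agents: {list(SUB_AGENTS)}\n"
        "List the agents to use, comma-separated."
    )
    results = {
        name.strip(): SUB_AGENTS[name.strip()](task)
        for name in chosen.split(",") if name.strip() in SUB_AGENTS
    }
    # Summarize the gathered results as options for the user to confirm.
    return call_llm(f"Task: {task}\nAgent results: {results}\nPresent the valid options to the user.")
```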
Okay, cool. And then we can also have reflection, where the idea is that 00:58:34.680 |
once you do a task, it's possible something might be wrong. So the task engine can verify 00:58:40.120 |
against its rules and logic to see, okay, is this correct or not? If it's not 00:58:45.480 |
correct, then you keep reissuing the instruction. But if it's correct, then you 00:58:49.080 |
pass that to the user. And then you can have more complex things, 00:58:55.880 |
so you can have, you know, plans on top of plans, and keep improving the systems. 00:59:00.520 |
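This is essentially the same verify-then-retry pattern as the earlier manager/worker cycle, now applied by the task engine to its own output; a compact sketch under the same assumptions (`call_llm` is hypothetical).

```python
# Sketch: a reflection step where the engine checks its own output
# against its rules before passing the result to the user, and reissues otherwise.
def call_llm(prompt: str) -> str:
    raise NotImplementedError  # hypothetical model call

def reflect(task: str, result: str, rules: list[str]) -> bool:
    verdict = call_llm(
        f"Task: {task}\nProposed result: {result}\nRules: {rules}\n"
        "Is the result correct and consistent with the rules? Answer yes or no."
    )
    return verdict.strip().lower().startswith("yes")

def do_with_reflection(task: str, rules: list[str], max_tries: int = 3) -> str:
    for _ in range(max_tries):
        result = call_llm(f"Do this task: {task}")
        if reflect(task, result, rules):
            return result  # passes the self-check: hand it to the user
        task = task + "\nThe previous attempt was judged incorrect; try again."
    raise RuntimeError("No result passed reflection")
```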
Okay. And I would say the biggest things we need right now are: one is error correction, 00:59:08.120 |
because it's really hard to catch errors. So if you can do that a little better, that will help, 00:59:13.960 |
especially if you can build agent frameworks which have inherent mechanisms for catching errors and 00:59:18.200 |
automatically fixing them. The second thing you need is security. You need some sort of 00:59:23.720 |
model around user permissions. It's possible you want to have different layers defining 00:59:30.200 |
what are some things that an agent cannot do on my computer, for instance. So maybe I can 00:59:36.200 |
say the agent is not allowed to go into my bank account, but it can go into my, 00:59:40.680 |
say, other accounts. So you want to be able to solve user permissions. And then you also 00:59:44.760 |
want to solve problems around sandboxing. How do I make sure everything's safe, that it doesn't 00:59:48.600 |
go and format the computer and delete everything? How do we deploy in settings where 00:59:52.920 |
there might be a lot of business risk, there might be a lot of financial risk, 00:59:55.800 |
while making sure that if things are irreversible, we don't cause a lot of harm?
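As a closing illustration, here is a minimal sketch of a permission layer around an agent's tool calls, with deny rules for sensitive resources and a confirmation gate for irreversible actions; the specific resources and rules are assumptions for illustration.

```python
# Sketch: a permission layer that blocks disallowed resources
# and requires human confirmation for irreversible actions.
DENIED_RESOURCES = {"bank_account"}          # e.g. the agent may never touch these
IRREVERSIBLE_ACTIONS = {"delete", "purchase", "send_money"}

def confirm_with_user(action: str, resource: str) -> bool:
    answer = input(f"Agent wants to '{action}' on '{resource}'. Allow? [y/N] ")
    return answer.strip().lower() == "y"

def guarded_call(action: str, resource: str, tool_fn):
    if resource in DENIED_RESOURCES:
        raise PermissionError(f"Agent is not permitted to access {resource}")
    if action in IRREVERSIBLE_ACTIONS and not confirm_with_user(action, resource):
        raise PermissionError(f"User declined irreversible action '{action}'")
    return tool_fn()  # run the sandboxed tool call only after the checks pass

# Usage (illustrative):
# guarded_call("read", "calendar", lambda: "[calendar contents]")
# guarded_call("purchase", "flight_booking", lambda: "[booking confirmed]")
```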