Back to Index

MIT AGI: Cognitive Architecture (Nate Derbinsky)


Chapters

0:00 Intro
0:43 Outline
2:40 Expectations Meet (Current) Reality
3:52 Common Motivations
5:25 Motivations/Questions Dictate Approach
10:33 Unified Theories of Cognition
12:07 Making (Scientific) Progress Lakatos 1970
13:43 Time Scales of Human Action Newell 1990
15:32 Bounded Rationality Simon 1967
17:36 Physical Symbol System Hypothesis Newell & Simon 1976
19:36 Active Architectures by Focus
21:26 Semantic Pointer Architecture Unified Network
23:32 Prototypical Architecture
29:10 ACT-R Notes
33:58 Soar Notes
37:47 LuminAI ADAM Lab @ GATech
40:54 Rosie
45:52 Soar 9 [Laird 2012] Memory Integration
47:53 One Research Path
48:54 Problem ala Word-Sense Disambiguation
50:23 WSD Evaluation Historical Memory Retrieval Bias
50:58 Efficiency
52:30 Related Problem: Memory Size
54:33 Efficient Implementation
54:39 Approximation Quality
54:55 Prediction Complexity
55:02 Prediction Computation
55:07 Task #1: Mobile Robotics
55:52 Results: Decision Time
56:29 Task #2: Liar's Dice Michigan Liar's Dice
56:56 Reasoning -- Action Knowledge
57:23 Forgetting Action Knowledge
58:18 Results: Competence
59:23 Some CogArch Open Issues

Transcript

So today we have Nate Derbinsky. He's a professor at Northeastern University working on various aspects of computational agents that exhibit human-level intelligence. Please give Nate a warm welcome. (audience applauding) - Thanks a lot and thanks for having me here. So the title that was on the page was Cognitive Modeling.

I'll kind of get there, but I wanted to put it in context. So the bigger theme here is I wanna talk about what's called cognitive architecture. And if you've never heard about that before, that's great. And I wanted to contextualize that as how is that one approach to get us to AGI?

I'm gonna say what my view of AGI is and put up a whole bunch of TV and movie characters that I grew up with that inspire me. That'll lead us into what is this thing called cognitive architecture? It's a whole research field that crosses neuroscience, psychology, cognitive science, and all the way into AI.

So I'll try to give you kind of the historical, big picture view of it, what some of the actual systems are out there that might be of interest to you. And then we'll kind of zoom in on one of them that I've done a good amount of work with called SOAR.

And what I'll try to do is tell a story, a research story of how we started with kind of a core research question. We'd look to how humans operate, understood that phenomenon, and then took it and saw really interesting results from it. And so at the end, if this field is of interest, there's a few pointers for you to go read more and go experience more of cognitive architecture.

So just rough definition of AGI, given this is an AGI class. Depending the direction that you're coming from, it might be kind of understanding intelligence or it might be developing intelligence systems that are operating at the level of human level intelligence. The typical differences between this and other sorts of maybe AI machine learning systems, we want systems that are gonna persist for a long period of time.

We want them robust to different conditions. We want them learning over time. And here's the crux of it, working on different tasks. And in a lot of cases, tasks they didn't know were coming ahead of time. I got into this because I clearly watched too much TV and too many movies.

And then I looked back at this and I realized, I think I'm covering 70s, 80s, 90s, noughts, I guess it is, and today. And so this is what I wanted out of AI and this is what I wanted to work with. And then there's the reality that we have today.

So instead of that, who's watched "Knight Rider," for instance? I don't think that exists yet, but maybe we're getting there. And in particular, for fun, during the Amazon sale day, I got myself an Alexa, and I could just see myself at some point saying, hey Alexa, please write me an rsync script to sync my class.

And if you have an Alexa, you probably know the following phrase. This just always hurts me inside, which is, sorry, I don't know that one. Which is okay, right? That's, a lot of people have no idea what I'm asking, let alone how to do that. So what I want Alexa to respond with after that is, do you have time to teach me?

And to provide some sort of interface by which back and forth we can kind of talk through this. We aren't there yet, to say the least, but I'll talk later about some work on a system called Rosie that's working in that direction. We're starting to see some ideas about being able to teach systems how to work.

So folks who are in this field, I think generally fall into these three categories. They're just curious. They want to learn new things, generate knowledge, work on hard problems. Great. I think there are folks who are in kind of that middle cognitive modeling realm. And so I'll use this term a lot.

It's really understanding how humans think, how humans operate, human intelligence at multiple levels. And if you can do that, one, there's just knowledge in and of itself of how we operate, but there's a lot of really important applications that you can think of. If we were able to not only understand, but predict how humans would respond, react in various tasks.

Medicine is an easy one. There's some work in HCI or HRI, I'll get to later, where if you can predict how humans would respond to a task, you can iterate tightly and develop better interfaces. It's already being used in the realm of simulation and in defense industries. I happen to fall into the latter group, or the bottom group, which is systems development, which is to say just the desire to build systems for various tasks that are working on tasks that kind of current AI machine learning can't operate on.

And I think when you're working at this level or on any system that nobody's really achieved before, what do you do? You kind of look to the examples that you have, which in this case that we know of, it's just humans, right? Irrespective of your motivation, when you have kind of an intent that you want to achieve in your research, you kind of let that drive your approach.

And so I often show my AI students this. The Turing test you might've heard of, or variants of it that have come before, these were folks who were trying to create systems that acted in a certain way, that acted intelligently. And the kind of line that they drew, the benchmark that they used was to say, let's make systems that operate like humans do.

Cognitive modelers will fit up into this top point here to say it's not enough to act that way, but by some definition of thinking, we want the system to do what humans do, or at least be able to make predictions about it. So that might be things like, what errors would the human make on this task?

Or how long would it take them to perform this task? Or what emotion would be produced in this task? There are folks who are still thinking about how the computer is operating, but trying to apply kind of rational rules to it. So a logician, for instance, would say, if you have A, and A gives you B, and B gives you C, then A should definitely give you C.

That's just what's rational. And so there are folks operating in that direction. And then if you go to intro AI class anywhere around the country, particularly Berkeley, because they have graphics designers that I get to steal from, the benchmark would be what the system produces in terms of action, and the benchmark is some sort of optimal rational bound.

Irrespective of where you work in this space, there's kind of a common output that arrives when you research these areas, which is you can learn individual bits and pieces, and it can be hard to bring them together to build a system that either predicts or acts on different tasks.

So this is part of the transfer learning problem, but it's also part of having distinct theories that are hard to combine together. So I'm gonna give an example that comes out of cognitive modeling, or perhaps three examples. So if you were in an HCI class or some intro psychology classes, one of the first things you learn about is Fitts' Law, which provides you the ability to predict the difficulty level of basically human pointing from where they start to a particular place.

And it turns out that you can learn some parameters and model this based upon just the distance from where you are to the target and the size of the target. So both moving a long distance will take a while, but also if you're aiming for a very small point, that can take longer than if there's a large area that you just kind of have to get yourself to.
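For reference, the usual Shannon formulation of Fitts' Law predicts movement time from the distance to the target and the target width, with per-person fitted parameters; the talk does not specify which variant the slide used, so take this as the common textbook form rather than the exact expression shown:

```latex
% Fitts' Law, Shannon formulation: movement time MT as a function of
% target distance D and target width W, with fitted parameters a and b.
MT = a + b \log_2\!\left(\frac{D}{W} + 1\right)
```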

And so this has held true for many humans. So let's say we've learned this, and then we move on to the next task, and we learn about what's called the power law of practice, which has been shown true in a number of different tasks. What I'm showing here is one of them, where you're going to draw a line through a sequential set of circles here, starting at one, going to two, and so forth, not making a mistake, or at least trying not to, and trying to do this as fast as possible.

And so for a particular person, we would fit the A, B, and C parameters, and we'd see a power law. So as you perform this task more, you're gonna see a decrease in the amount of reaction time required to complete the task. Great, we've learned two things about humans.
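A standard textbook form of the power law of practice with three fitted parameters (again, not necessarily the exact expression on the slide) is:

```latex
% Power law of practice: predicted reaction time on the N-th attempt at a task,
% with fitted parameters A (asymptote), B (initial range), and C (learning rate).
RT(N) = A + B \cdot N^{-C}
```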

Let's add some more in. So for those who might have done some reinforcement learning, TD learning is one of those approaches, temporal difference learning, that's had some evidence of similar sorts of processes in the dopamine centers of the brain. And it basically says in a sequential learning task, you perform the task, you get some sort of reward.

How are you going to kind of update your representation of what to do in the future, such as to maximize expectation of future reward? And there are various models of how that changes over time, and you can build up functions that allow you to perform better and better and better given trial and error.
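As a concrete illustration of the temporal-difference idea described here, a minimal TD(0) value update might look like the following sketch; the function name and default parameters are mine, not from the talk:

```python
# Minimal TD(0) value-update sketch.
# alpha = learning rate, gamma = discount factor, V = dict mapping state -> value estimate.
def td_update(V, state, reward, next_state, alpha=0.1, gamma=0.9):
    """Move V[state] toward the bootstrapped target reward + gamma * V[next_state]."""
    td_error = reward + gamma * V.get(next_state, 0.0) - V.get(state, 0.0)
    V[state] = V.get(state, 0.0) + alpha * td_error
    return td_error
```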

Great, so we've learned three interesting models here that hold true over multiple people, multiple tasks. And so my question is, if we take these together and add them together, how do we start to understand a task as quote unquote simple as chess? Which is to say, we could ask questions, how long would it take for a person to play?

What mistakes would they make? After they played a few games, how would they adapt? Or what if we wanted to develop a system that ended up being good at chess, or at least learning to become better at chess? My question is, there doesn't seem to be a clear way to take these very, very individual theories and kind of smash them together and get a reasonable answer of how to play chess, or how humans play chess.

And so, the gentleman in this slide is Allen Newell, one of the founders of AI, who did incredible work in psychology and other fields. He gave a series of lectures at Harvard in 1987, and they were published in 1990 as Unified Theories of Cognition. And his argument to the psychology community at that point was the argument on the prior slide.

They had many individual studies, many individual results. And so the question was, how do you bring them together to gain this overall theory? How do you make forward progress? And so his proposal was Unified Theories of Cognition, which became known as cognitive architecture. Which is to say, to bring together your core assumptions, your core beliefs of what are the fixed mechanisms and processes that intelligent agents would use across tasks.

So the representations, the learning mechanisms, the memory systems, bring them together, implement them in a theory, and use that across tasks. And the core idea is that when you actually have to implement this and see how it's going to work across different tasks, the interconnections between these different processes and representations would add constraint.

And over time, the constraints would start limiting the design space of what is necessary and what is possible in terms of building intelligent systems. And so the overall goal from there was to understand and exhibit human-level intelligence using these cognitive architectures. A natural question to ask is, okay, so we've gone from a methodology of science that we understand how to operate in.

We make a hypothesis, we construct a study, we gather our data, we evaluate that data, and we falsify or we do not falsify the original hypothesis. And we can do that over and over again, and we know that we're making forward progress scientifically. If I've now taken that model and changed it into, I have a piece of software, and it's representing my theories.

And to some extent, I can configure that software in different ways to work on different tasks. How do I know that I'm making progress? And so there's a Lakatosian view of science, and it's kind of shown pictorially here, where you start with a core of your beliefs about where you're at and what is necessary for achieving the goal that you have.

And around that, you'll have kind of ephemeral hypotheses and assumptions that over time may grow and shrink. And so you're trying out different things, trying out different things. And if an assumption is around there long enough, it becomes part of that core. And so as you work on more tasks and learn more, either by your work or by data coming in from someone else, the core is growing larger and larger.

You've got more constraints and you've made more progress. And so what I wanted to look at were in this community, what are some of the core assumptions that are driving forward scientific progress? So one of them actually came out of those lectures that are referred to as Newell's time scales of human action.

And so off on the left, the left two columns are both time units, just expressed somewhat differently. Second from the left being maybe more useful to a lot of us in understanding daily life. One step over from there would be kind of at what level processes are occurring. So the lowest three are down at kind of the substrate, the neuronal level.

We're building up to deliberate tasks that occur in the brain and tasks that are operating on the order of 10 seconds. Some of these might occur in the psychology laboratory, but probably a step up to minutes and hours. And then above that really becomes interactions between agents over time.

And so if we start with that, the thing to take away is the hypothesis that regularities will occur at these different time scales and that they're useful. And so those who operate at that lowest time scale might be considering neuroscience, cognitive neuroscience. When you shift up to the next couple levels, the areas of science that deal with that would be psychology and cognitive science, and then we shift up a level and we're talking about sociology and economics and the interplay between agents over time.

And so what we'll find with cognitive architecture is that most of them will tend to sit at the deliberate act. We're trying to take knowledge of a situation and make a single decision. And then sequences of decisions over time will build to tasks, and tasks over time will build to more interesting phenomena.

I'm actually going to show that that isn't strictly true, that there are folks working in this field that actually do operate one level below. Some other assumptions. So this is Herb Simon receiving the Nobel Prize in economics and part of what he received that award for was an idea of bounded rationality.

So in various fields, we tend to model humans as rational. And his argument was, let's consider that human beings are operating under various kinds of constraints. And so we should model their rationality as bounded by how complex the problem is that they're working on, how big that search space is that they have to conquer.

Cognitive limitations. So speed of operations, amount of memory, short term as well as long term, as well as other aspects of our computing infrastructure that are gonna keep us from being able to arbitrarily solve complex problems, as well as how much time is available to make that decision. And so this is actually a phrase that came out of his speech when he received the Nobel Prize.

Decision makers can satisfice either by finding optimum solutions for a simplified world, which is to say, take your big problem, simplify it in some way, and then solve that. Or by finding satisfactory solutions for a more realistic world. Take the world in all its complexity, take the problem in all its complexity, and try to find something that works.

Neither approach in general dominates the other, and both have continued to coexist. And so what you're actually going to see throughout the cognitive architecture community is this understanding that some problems you're not gonna be able to get an optimal solution to if you consider, for instance, bounded amount of computation, bounded time, the need to be reactive to a changing environment, these sorts of issues.

And so in some sense, we can decompose problems that come up over and over again into simpler problems, solve those near-optimally or optimally and fix those in place, but for more general problems we might have to satisfice. There's also the idea of the physical symbol system hypothesis. So this is Allen Newell and Herb Simon there, considering how a computer could play the game of chess.

So the physical symbol system talks about the idea of taking something, some signal abstractly referred to as symbol, combining them in some ways to form expressions, and then having operations that produce new expressions. A weak interpretation of the idea that symbol systems are necessary and sufficient for intelligent systems, a very weak way of talking about it is the claim that there's nothing unique about the neuronal infrastructure that we have, but if we got the software right, we could implement it in the bits, bytes, RAM, and processor that make up modern computers.

That's kind of the weakest way to look at this, that we can do it with silicon and not carbon. Stronger way that this used to be looked at as more of a logical standpoint, which is to say if we can encode rules of logic, these tend to line up if we think intuitively of planning and problem solving.

And if we can just get that right, and get enough facts and enough rules in there, well, that's what we need for intelligence, and eventually we can get to the point of intelligence. And that was a starting point that lasted for a while.

I think by now most folks in this field would agree that that's necessary to be able to operate logically, but that there are going to be representations and processes that'll benefit from non-symbolic representation, so particularly perceptual processing, visual, auditory, and processing things in a more kind of standard machine learning sort of way, as well as kind of taking advantage of statistical representations.

So we're getting closer to actually looking at cognitive architectures. I did want to go back to the idea that different researchers are coming with different research foci, and we'll start off with kind of the lowest level, biological modeling. So Leabra and Spaun both try to model different degrees of low-level detail, parameters, firing rates, connectivities between different levels of neuronal representation.

They build that up, and then they try to build tasks above that layer, but always being very cautious about being true to human biological processes. And a layer above there would be psychological modeling, which is to say trying to build systems that are true in some sense to areas of the brain and interactions in the brain, and being able to predict the errors we make and the timing produced by the human mind.

And so there I'll talk a little bit about ACT-R. This final level down here, these are systems that are focused mainly on producing functional systems that exhibit really cool artifacts and solve really cool problems. And so I'll spend most of the time talking about SOAR, but I wanted to point out a relative newcomer in the game called Sigma.

So to talk about Spaun a little bit, we'll see if the sound works in here. I'm going to let the creator take this one. Or not. See how the AV system likes this. There we go. (soft music) - My name is Chris Eliasmith, and I'm the director of the Centre for Theoretical Neuroscience at the University of Waterloo.

And I'm actually jointly appointed between philosophy and engineering. The philosophy allows me to consider general conceptual issues about how the mind works. But of course, if I want to make claims about how the mind works, I have to understand also how the brain works. And this is where engineering plays a critical role.

Engineering allows me to write down equations and very precise descriptions, which we can test by building actual models. One model that we built recently is called the Spaun model. This model, Spaun, has about two and a half million individual neurons simulated in it. And the input to the model is an eye, and the output from the model is the movement of an arm.

So essentially, it can see images of numbers and then do something like categorize them, in which case it would just draw the number that it sees. Or it can actually try to reproduce the style of the number that it's looking at. So for instance, if it sees a loopy two, a two with a big loop on the bottom, it can actually reproduce that particular style too.

On the medical side, we all know that we have cognitive challenges that show up as we get older, and we can try to address those challenges by simulating the aging process with these kinds of models. Another potential area of impact is on artificial intelligence. A lot of work in artificial intelligence attempts to build agents that are extremely good at one task, for instance, playing chess.

What's special about Spaun is that it's quite good at many different tasks. And this adds the additional challenge of trying to figure out how to coordinate the flow of information through different parts of the model, something that animals seem to be very good at. So I'll provide a pointer at the end.

He's got a really cool book called "How to Build a Brain." And if you Google him, or Google Spaun, you can find a toolkit where you can kind of construct circuits that will approximate functions that you're interested in, connect them together, set certain properties that you would want at a low level, and build them up, and actually work on tasks at the level of vision and robotic actuation.

So that's a really cool system. As we move into architectures that are sitting above that biological level, I wanted to give you kind of an overall sense of what they're going to look like, what a prototypical architecture is going to look like. So they're gonna have some ability to have perception.

The modalities typically are more digital symbolic, but they will, depending on the architecture, be able to handle vision, audition, and various sensory inputs. These will get represented in some sort of short-term memory, whatever the state representation for the particular system is. It's typical to have a representation of the knowledge of what tasks can be performed, when they should be performed, how they should be controlled.

And so these are typically both actions that take place internally that manage the internal state of the system, and perform internal computations, but also about external actuation. And external might be a digital system, a game AI, but it might also be some sort of robotic actuation in the real world.

There's typically some sort of mechanism by which to select from the available actions in a particular situation. There's typically some way to augment this procedural information, which is to say, learn about new actions, possibly modify existing ones. There's typically some semblance of what's called declarative memory. So whereas procedural, at least in humans, if I asked you to describe how to ride a bike, you might be able to say, get on the seat and pedal, but in terms of keeping your balance there, you'd have a pretty hard time describing it declaratively.

So that's kind of the procedural side, the implicit representation of knowledge, whereas declarative would include facts, geography, math, but it could also include experiences that the agent has had, a more episodic representation of declarative memory. And they'll typically have some way of learning this information, augmenting it over time.

And then finally, some way of taking actions in the world. And they'll all have some sort of cycle, which is: perception comes in, knowledge that the agent has is brought to bear on that, an action is selected, knowledge conditioned on that action acts accordingly, both with internal processes as well as eventually taking external action, and then rinse and repeat.
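To make that cycle concrete, here is a minimal Python sketch of the perceive/decide/act loop just described. All class and method names are illustrative placeholders, not the API of any particular architecture:

```python
# A minimal sketch of the prototypical perceive/decide/act cycle described above.
class PrototypicalAgent:
    def __init__(self, procedural_rules, declarative_memory):
        self.rules = procedural_rules   # if-then knowledge: what can be done, and when
        self.ltm = declarative_memory   # facts and prior experiences
        self.stm = {}                   # short-term memory: current state representation

    def cycle(self, percept):
        self.stm.update(percept)                                  # 1. perception enters short-term memory
        proposals = [r.propose(self.stm) for r in self.rules]     # 2. procedural knowledge proposes actions
        candidates = [p for p in proposals if p is not None]
        if not candidates:
            return None                                           #    nothing to do this cycle
        action = max(candidates, key=lambda p: p.preference)      # 3. decision: select a single action
        action.apply(self.stm, self.ltm)                          # 4. internal effects, memory updates, learning
        return action.external_command                            # 5. external actuation; then repeat
```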

So when we talk about an agent in an AI system, in this context that would be the fixed representation, which is whatever architecture we're talking about, plus a set of knowledge that is typically specific to the task but might be more general. So oftentimes, these systems could incorporate a more general knowledge base of facts, of linguistic facts, of geographic facts.

Let's take Wikipedia and just stick it in the brain of the system; that'd be more task-general. But then also, whatever it is that you're doing right now, how should you proceed in that? And then it's typical to see this processing cycle. And going back to the prior assumption, the idea is that these primitive cycles allow for the agent to be reactive to its environment.

So if new things come in that I have to react to, say the lion's sitting over there, I better run and maybe not do my calculus homework, right? So as long as this cycle is going, I'm reactive, but at the same time, if multiple actions are taken over time, I'm able to get complex behavior over the long term.

So this is the ACT-R cognitive architecture. It has many of the kind of core pieces that I talked about before. Let's see if the, is the mouse, yes, mouse is useful up there. So we have the procedural module here. Short-term memory is going to be these buffers that are on the outside.

The procedural memory is encoded as what are called production rules, or if-then rules. If this is the state of my short term memory, this is what I think should happen as a result. You have a selection of the appropriate rule to fire and an execution. You're seeing associated parts of the brain being represented here.

A cool thing that has been done over time in the ACT-R community is to make predictions about brain areas and then perform fMRI studies, gather that data, and correlate it. So when you use the system, you will get predictions about things like timing of operations, errors that will occur, probabilities that something is learned, but you'll also get predictions, to the degree that they can, about the brain areas that are going to light up.

And if you want to, that's actively being developed at Carnegie Mellon. To the left is John Anderson, who developed this cognitive architecture, ooh, 30-ish years ago. And until the last about five years, he was the primary researcher/developer behind it with Christian, and then recently, he's decided to spend more time on cognitive tutoring systems.

And so Christian has become the primary developer. There is an annual ACTR workshop. There's a summer school, which if you're thinking about modeling a particular task, you can kind of bring your task to them, bring your data, they teach you how to use the system, and try to get that study going right there on the spot.

To give you a sense of what kinds of tasks this could be applied to, so this is representative of a certain class of tasks, certainly not the only one. Let's try this again. Think PowerPoint's gonna want a restart every time. Okay, so we're getting predictions about basically where the eye is going to move.

What you're not seeing is it's actually processing things like text and colors and making predictions about what to do and how to represent the information and how to process the graph as a whole. I had alluded to this earlier. There's work by Bonnie John, very similar, so making predictions about how humans would use computer interfaces.

At the time, she got hired away by IBM, and so they wanted the ability to have software that you can put in front of software designers, and when they think they have a good interface, press a button, this model of human cognition would try to perform the tasks that it had been told to do and make predictions about how long it would take, and so you can have this tight feedback loop from designers saying, "Here's how good your particular interface is." So ACT-R as a whole is very prevalent in this community.

I went to their webpage and counted up just the papers that they knew about. It was over 1,100 papers over time. If you're interested in it, the main distribution is in Lisp, but many people have used this and wanted to apply it to systems that need a little more processing power.

So NRL has a Java port of it that they use in robotics. The Air Force Research Lab in Dayton has implemented it in Erlang for parallel processing of large declarative knowledge bases. They're trying to do service-oriented architectures with it, and CUDA, because they want what it has to say.

They don't want to wait around for it to have to figure that stuff out. So that's the two minutes about ACT-R. Sigma is a relative newcomer, and it's developed out at the University of Southern California by a man named Paul Rosenbloom, and I'll mention him in a couple minutes because he was one of the prime developers of SOAR at Carnegie Mellon.

So he knows a lot about how SOAR works, and he's worked on it over the years. And I think originally, I'm gonna speak for him, and he'll probably say I was wrong. I think originally it was kind of a mental exercise of can I reproduce SOAR using a uniform substrate?

I'll talk about SOAR in a little bit. It's 30 years of research code. If anybody's dealt with research code, it's 30 years of C and C++ with dozens of graduate students over time. It's not pretty at all. And theoretically, it's got these boxes sitting out here, and so he re-implemented the core functionality of SOAR all using factor graphs and message-passing algorithms under the hood.

He got to that point and then said, there's nothing stopping me from going further. And so now I can do all sorts of modern machine learning, vision optimization sort of things that would take some time in any other architecture to be able to integrate well. So it's been an interesting experience.

It's now gonna be the basis for the Virtual Human Project out at the Institute for Creative Technologies, an institute associated with the University of Southern California. Until recently, you couldn't get your hands on it, but in the last couple of years, he's done some tutorials on it.

He's got a public release with documentation. So that's something interesting to keep an eye on. But I'm gonna spend all the remaining time on the SOAR cognitive architecture. And so you see, it looks quite a bit like the prototypical architecture. And I'll give a sense again about how this all operates.

Give a sense of the people involved. We already talked about Allen Newell. So both John Laird, who is my advisor, and Paul Rosenbloom were students of Allen Newell. John's thesis project was related to the chunking mechanism in SOAR, which learns new rules based upon sub-goal reasoning. So he finished that, I believe, the year I was born.

And so he's one of the few researchers you'll find who's still actively working on their thesis project. Beyond that, I think about 10 years ago, he founded SOAR Technology, which is a company up in Ann Arbor, Michigan. While it's called SOAR Technology, it doesn't do exclusively SOAR, but that's a part of the portfolio.

General intelligence system stuff, a lot of defense association. So some notes of what's gonna make SOAR different from the other architectures that fall into this kind of functional architecture category. A big thing is a focus on efficiency. So John wants to be able to run SOAR on just about anything.

We just got a request on the SOAR mailing list to run it on a real-time processor. And our answer, while we had never done it before, was: it'll probably work. Every release, there's timing tests. And what we always look at is, in a bunch of different domains, for a bunch of different reasons that relate to human processing, there's this magic number that comes out, which is 50 milliseconds. Which is to say, in terms of responding to tasks, if you're above that time, humans will sense a delay.

And you don't want that to happen. Now, if we're working in a robotics task, 50 milliseconds, if you're dramatically above that, you just fell off the curb, or worse, or you just hit somebody in a car, right? So we're trying to keep that as low as possible. And for most agents, it doesn't even register.

It's below one millisecond, fractions of millisecond. But I'll come back to this, because a lot of the work that I was doing was computer science, AI, and a lot of efficient algorithms and data structures. And 50 milliseconds was that very high upper bound. It's also one of the projects that has a public distribution.

You can get it in all sorts of operating systems. We use something called SWIG that allows you to interface with it in a bunch of different languages. We kind of describe the meta description, and you are able to basically generate bindings in a bunch of different platforms. Core is C++.

There was a team at SoarTech that said, "We don't like C++. It gets messy." So they actually did a port over to pure Java, in case that appeals to you. There's an annual Soar workshop that takes place in Ann Arbor, typically. It's free. You can go there, get a Soar tutorial, and talk to folks who are working on Soar.

And it's fun. I've been there every year but one in the last decade. It's just fun to see the people around the world that are using the system in all sorts of interesting ways. To give you a sense of the diversity of the applications, one of the first was R1-Soar, which was back in the days when it was an actual challenge to build a computer, which is to say that your choice of certain components would have radical implications for other parts of the computer.

So it wasn't just the Dell website where you just, "I want this much RAM, I want this much CPU." There was a lot of thinking that went behind it, and then physical labor that went to construct your computer. And so it was making that process a lot better. There are folks that apply it to natural language processing.

Soar 7 was the core of the Virtual Humans Project for a long time. HCI tasks. TacAir-Soar was one of the largest rule-based systems: tens of thousands of rules running over 48 hours. It was a very large-scale defense simulation. Lots of games it's been applied to for various reasons.

And then in the last few years, porting it onto mobile robotics platforms. This is Edwin Olson's SplinterBot, an early version of it that went on to win the MAGIC competition. Then I went on to put Soar on the web. And if after this talk you're really interested in a dice game that I'm gonna talk about, you can actually go to the iOS App Store and download it.

It's called Michigan Liars Dice. It's free. You don't have to pay for it. But you can actually play Liar's Dice with Soar. And you can set the difficulty level. It's pretty good. It beats me on a regular basis. I wanted to give you a couple of other really weird-feeling and really cool applications.

The first one is out of Georgia Tech. Go PowerPoint. Yes. (upbeat music) - LuminAI is a dome-based interactive art installation in which human participants can engage in collaborative movement improvisation with each other and virtual dance partners. This interaction creates a hybrid space in which virtual and corporeal bodies meet.

The line between human and non-human is blurred, spurring participants to examine their relationship with technology. The LuminAI installation ultimately examines how humans and machines can co-create experiences. And it does so in a playful environment. The dome creates a social space that encourages human-human interaction and collective dance experiences, allowing participants to creatively explore movement while having fun.

The development of LuminAI has been a hybrid exploration in the art forms of theater and dance, as well as research in artificial intelligence and cognitive science. LuminAI draws inspiration from the ancient art form of shadow theater. The original two-dimensional version of the installation led to the conceptualization of the dome as a liminal space, with human silhouettes and virtual characters meeting to dance together on the projection surface.

Rather than relying on a pre-authored library of movement responses, the virtual dancer learns its partner's movements and utilizes Viewpoints movement theory to systematically reason about them in order to improvisationally choose a movement response. Viewpoints theory is based in dance and theater and analyzes a performance along the dimensions of tempo, duration, repetition, kinesthetic response, shape, spatial relationships, gesture, architecture, and movement topography.

The virtual dancer is able to use several different strategies to respond to human movements. These include mimicry of the movement, transformation of the movement along Viewpoints dimensions, recalling a similar or complementary movement from memory in terms of Viewpoints dimensions, and applying action-response patterns that the agent has learned while dancing with its human partner.

- The reason we did this is this is part of a larger effort in our lab for understanding the relationship between computation, cognition, and creativity, where a large amount of our efforts go into understanding human creativity and how we make things together, how we're creative together, as a way to help us understand how we can build co-creative AI that serves the same purpose, where it can be a colleague and collaborate with us and create things with us.

- So Brian was a graduate student in John Laird's lab as well. Before I start this, I alluded to this earlier where we're getting closer to Rosie saying, "Can you teach me?" So let me give you some introduction to this. In the lower left, you're seeing the view of a Kinect camera onto a flat surface.

There's a robotic arm, mainly 3D-printed parts, few servos. Above that, you're seeing an interpretation of the scene. We're giving it associations of the four areas with semantic titles, like one is the table, one is the garbage, just semantic terms for areas. But other than that, the agent doesn't actually know all that much, and it's going to operate in two modalities.

One is, we'll call it natural language, natural-ish language, a restricted subset of English, as well as some quote-unquote pointing. So you're going to see some mouse pointers in the upper left saying, "I'm talking about this." And this is just a way to indicate location. And so starting off, we're going to say things like, "Pick up the blue block," and it's going to be like, "I don't know what blue is.

"What is blue?" We say, "Oh, well, that's a color." "Okay, so go get the green thing." "What's green?" "Oh, it's a color." "Okay, move the blue thing to a particular location." "Where's that?" Point it, "Okay, what is moving?" Really, it has to start from the beginning, and it's described, and it's said, "Okay, now you've finished." And once we got to that point, now I can say, "Move the green thing over here," and it's got everything that it needs to be able to then reproduce the task, given new parameters, and it's learned that ability.

So let me give it a little bit of time. So you can look a little bit at top left in terms of the pointers. You're going to see some text commands being entered. So what kind of attribute is blue? We're going to say it's a color, and so that can map it then to a particular sensor modality.

This is green, so the pointing: what kind of thing is green? Okay, a color, so now it knows how to understand blue and green as colors with respect to the visual scene. Move the rectangle to the table. What is rectangle? Okay, now it can map that onto its understanding of parts of the world.

Is this the blue rectangle? So the arm is actually pointing itself to get confirmation from the instructor, and then we're trying to understand, in general, when you say move something, what is the goal of this operation? And so then it also has a declarative representation of the idea of this task, not only that it completed it, then it can look back on having completed the task and understand what were the steps that led to achieving a particular goal.

So in order to move it, you're going to have to pick it up. It knows which one the blue thing is. (mouse clicking) Great. Now put it on the table. So that's a particular location. At this point, we can say, you're done. You have accomplished moving the blue rectangle to the table.

And so now it can understand what that very simple kind of process is like, and associate that with the verb to move. And now we can say move the green object, or not, to the garbage. And without any further interaction, based on everything that learned up till that point, it can successfully complete that task.

So this is the work of Shiwali Mohan and others in the Soar group at the University of Michigan on the Rosie project. And they're extending this to playing games and learning the rules of games through text-based descriptions and multimodal experience. So to build up to a research story in SOAR, I wanted to give you a sense of how research occurs in the group.

And so there's these back and forths that occur over time between there's this piece of software called SOAR, and we want to make this thing better and give it new capabilities, and so all our agents are going to become better. And we always have to keep in mind, and you'll see this as I go further, that it has to be useful to a wide variety of agents.

It has to be task independent, and it has to be efficient. For us to do anything in the architecture, all of those have to hold true. So we do something cool in the architecture, and then we say, okay, let's solve a cool problem. So let's build some agents to do this.

And so this ends up testing what are the limitations, what are the issues that arise in a particular mechanism, as well as integration with others. And we get to solve interesting problems. We usually find there was something missing, and then we can go back to the architecture and rinse and repeat.

Just to give you an idea, again, how SOAR works. So the working memory is actually a directed connected graph. The perception is just a subset of that graph, and so there's going to be symbolic representations of most of the world. There is a visual subsystem in which you can provide a scene graph, just not showing it here.

Actions are also a subset of that graph, and so the procedural knowledge, which is also production rules, can read sections of the input, modify sections of the output, as well as arbitrary parts of the graph to take actions. So the decision procedure says, of all the things that I know to do, and I've kind of ranked them according to various preferences, what single thing should I do?

Semantic memory is for facts, and there's episodic memory. The agent is actually storing every experience it's ever had over time in episodic memory, and it has the ability to get back to that. And so, in the similar cycle we saw before, we get input through this perception area called the input link.

Rules are going to fire all in parallel and say, here's everything I know about the situation, here's all the things I could do. Decision procedure says, here's what we're going to do. Based upon the selected operator, all sorts of things could happen with respect to memories providing input, rules firing to perform computations, and as well as potentially output in the world.

And remember, agent reactivity is required. We want the system to be able to react to things in the world at a very quick pace. So anything that happens in this cycle, at max, the overall cycle has to be under 50 milliseconds. And so that's going to be a constraint we hold ourselves to.

And so the story I'll be telling will say how we got to a point where we started actually forgetting things. And we're an architecture that doesn't want to be like humans, we want to create cool systems, but what we realized was something that we do, there's probably some benefit to it.

And we actually put it into our system and it led to good outputs. So here's the research path I'm going to walk down. We had just a simple problem, which was we have these memory systems and sometimes they're going to get a queue that could relate to multiple memories.

And the question is, if you have a fixed mechanism, what should you return in a task-independent way? Which one of these many memories should you return? That was our question. And we looked to some human data on this, something called the rational analysis of memory done by John Anderson, and realized that in human language, there are recency and frequency effects that maybe would be useful.

And so we actually did analysis, found that not only does this occur, but it's useful in what are called word sense disambiguation tasks. And I'll get to that, what that means in a second. Developed some algorithms to scale this really well. And it turned out to work out well, not only in the original task, but when we looked to two other completely different ones, the same underlying mechanism ended up producing some really interesting outputs.

So let me talk about word sense disambiguation real quick. This is a core problem in natural language processing, if you haven't heard of it before. Let's say we have an agent, and for some reason it needs to understand the verb to run. Looks to its memory and finds that it could run in the park, it could be running a fever, could run an election, it could run a program.

And the question is, what should a task independent memory mechanism return if all you've been given is the verb to run? And so the rational analysis of memory looked through multiple text corpora, and what they found was, if a particular word had been used recently, it's very likely to be reused again.

And if it hadn't been used recently, there's going to be this effect where the expression here, in which t is the time since each use, sums those contributions with a decay over time. So here's what it looks like, with time going to the right and higher activation being better.

As you get these individual usages, you get these little drops and then eventually drop down. And so if we had just one usage of a word, the red would be what the decay would look like. And so the core problem here is, if we're at a particular point and we want to select between the blue thing or the red thing, blue would have a higher activation, and so maybe that's useful.
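The curves being described follow the standard base-level activation equation from the rational analysis of memory; this is the textbook ACT-R form, since the slide's exact notation isn't shown in the transcript:

```latex
% Base-level activation of memory i: t_j is the time since its j-th use,
% n is the number of uses, and d is the decay parameter (commonly d = 0.5).
B_i = \ln\left( \sum_{j=1}^{n} t_j^{-d} \right)
```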

This is how things are modeled in human memory, but is it useful in general for tasks? And so we looked at common corpora used in word-sense disambiguation and just said, well, what if we just go through the corpus twice and use prior answers? I ask the question, what is the sense of this word?

I took a guess, I got the right answer, and I used that recency and frequency information in my task-independent memory. Would that be useful? And somewhat of a surprise, but somewhat maybe not of a surprise, it actually performed really well across multiple corpora. So we said, OK, this seems like a reasonable mechanism.

Let's look at implementing this efficiently in the architecture. And the problem was this term right here said, for every memory, for every time step, you're having to decay everything. That doesn't sound like a recipe for efficiency if you're talking about lots and lots of knowledge over long periods of time.

So we made use of a nice approximation that Petrov had come up with to approximate the tail of that sum. So accesses that happened long, long ago, we could basically approximate their effect on the overall sum. So we had a fixed set of values. And what we basically said is, since these are always decreasing, and all we care about is relative order, let's only recompute when something gets a new access.

So it's a guess. It's a heuristic, an approximation. But we looked at how this worked on the same set of corpora. And in terms of query time, if we made these approximations, we were well under our 50 milliseconds, and the effect on task performance was negligible. In fact, on a couple of these, it got ever so slightly better in terms of accuracy.

And actually, if we looked at the individual decisions that were being made, making these sorts of approximations led to at least 90% of the decisions being identical to having done the true full calculation. So we said, this is great. And we implemented this, and it worked really well.
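A minimal sketch of the kind of approximation being described, assuming the hybrid scheme from Petrov's approximation paper (an exact sum over the most recent accesses plus a closed-form estimate of the older tail); the function names and parameter choices here are mine, not the architecture's actual API:

```python
import math

def base_level_exact(ages, d=0.5):
    """Exact base-level activation: ln of the summed power-law decays,
    where 'ages' holds the time since each past access."""
    return math.log(sum(t ** -d for t in ages))

def base_level_approx(recent_ages, n_total, t_k, t_n, d=0.5):
    """Hybrid approximation: keep the k most recent ages exactly and estimate
    the older tail in closed form, assuming the remaining n_total - k accesses
    are spread between ages t_k (newest of the tail) and t_n (oldest)."""
    k = len(recent_ages)
    tail = 0.0
    if n_total > k and t_n > t_k:
        tail = (n_total - k) * (t_n ** (1 - d) - t_k ** (1 - d)) / ((1 - d) * (t_n - t_k))
    return math.log(sum(t ** -d for t in recent_ages) + tail)
```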

And then we started working on what seemed like completely unrelated problems. One was in mobile robotics. We had a mobile robot I'll show a picture of in a little while roaming around the halls, performing all sorts of tasks. And what we were finding was, if you have a system that's remembering everything in your short-term memory, and your short-term memory gets really, really big-- I don't know about you.

My short-term memory feels really, really small. I would love it to be big. But if you make your memory really big, and you try to remember something, you're now having to pull lots and lots of information into your short-term memory. So the system was actually getting slower simply because it had a lot in short-term memory: the representation of the overall map it was looking at.

So a large working memory was a problem. Liar's Dice is a game you play with dice. We were doing an RL-based system on this, reinforcement learning. And it turned out it's a really, really big value function. We were having to store lots of data. And we didn't know which stuff we had to keep around to keep the performance up.

So we had a hypothesis that forgetting was actually going to be a beneficial thing. That maybe the problem we have with our memory is that we really, really dislike this forgetting thing. Maybe it's actually useful. And so we experimented with the following policy. We said, let's forget a memory if, one, it's not predicted to be useful by this base-level activation.

We haven't used it recently. We haven't used it frequently. Maybe it's not worth it. That, and we felt confident that we could approximately reconstruct it if we absolutely had to. And if those two things held, we could forget something. So it's this same basic algorithm, but instead of ranking memories, we set a threshold for base-level activation, find when it is that a memory is going to pass below that threshold, and forget based upon that in a way that's efficient and isn't going to scale really, really poorly.

So we were able to come up with an efficient way to implement this using an approximation that, for most memories, ended up being exactly correct. I'm happy to go over details of this if anybody's interested later. But it ended up being a fairly close approximation, one that, as compared to a completely accurate search for the value, ended up being somewhere between 15 to 20 times faster.
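A minimal sketch of the forgetting test described here, under my own simplifying assumptions: the closed-form solve is shown only for the single-access case, and the reconstructible flag stands in for whatever check the real agent uses.

```python
import math

def below_threshold_age(theta, d=0.5):
    """For a memory with a single past access, activation ln(t**-d) drops below
    threshold theta once t > exp(-theta / d), so the forgetting check can be
    scheduled for that future time instead of re-evaluated every cycle."""
    return math.exp(-theta / d)

def should_forget(ages, theta, d=0.5, reconstructible=False):
    """Forget only if the memory's predicted usefulness (base-level activation)
    is below threshold AND it could be approximately reconstructed if needed."""
    activation = math.log(sum(t ** -d for t in ages))
    return reconstructible and activation < theta
```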

And so when we looked at our mobile robot here-- oh, sorry. Let me get this back. Because our little robot's actually going around. That's the third floor of the computer science building at the University of Michigan. He's going around. He's building a map. And again, the idea was this map is getting too big.

So here was the basic idea. As the robot's going around, it's going to need this map information about rooms. The color there is describing the strength of the memory. And as it gets farther and farther away and it hasn't used part of the map for planning or other purposes, basically make it decay away so that by the time it gets to the bottom, it's forgotten about the top.

But we had the belief that we could reconstruct portions of that map if necessary. And so the hypothesis was this would take care of our speed problems. And so what we looked at was here's our 50 millisecond threshold. If we do no forgetting whatsoever, bad things were happening over time.

So just 3,600 seconds. This isn't a very long time. We're passing that threshold. This is dangerous for the robot. If we implemented task-specific cleanup rules, which are really hard to get right, that basically solved the problem. When we looked at our general forgetting mechanism that we're using in other places, at an appropriate level of decay, we were actually doing better than hand-tuned rules.

So this was kind of a surprise win for us. The other task seems totally unrelated. It's a dice game. You cover your dice. You make bids about what's under other people's cups. This is played in Pirates of the Caribbean when they're on the boat in the second movie, bidding for years of service.

Honestly, this is a game we love to play in the University of Michigan lab. And so we're like, hmm, could Soar play this? And so we built a system that could learn to play this game rather well with reinforcement learning. And so the basic idea was, in a particular state of the game, Soar would have options of actions to perform.

It could construct estimates of their associated value. It would choose one of those. And depending on the outcome, something good happened, you might update that value. And the big problem was that the size of the state space, the number of possible states and actions, just is enormous. And so memory was blowing up.

And so what we said, similar sort of hypothesis, if we decay away these estimates that we could probably reconstruct and we haven't used in a while, are things going to get better? And so if we don't forget at all, 40,000 games isn't a whole lot when it comes to reinforcement learning.

We were up at 2 gigs. We wanted to put this on an iPhone. That wasn't going to work so well. There had been prior work that had used a similar approach. They were down at 400 or 500 megs. The iPhone's not going to be happy, but it'll work. So that gave us some hope.

And we implemented our system. OK, we're somewhere in the middle. We can fit on the iPhone, a very good iPhone, maybe an iPad. The question was, though, one, efficiency. Yeah, we fit under our 50 milliseconds. But two, how does the system actually perform when you start forgetting stuff? Can it learn to play well?

And so y-axis here, you're seeing competency. You play 1,000 games. How many do you win? So the bottom here, 500, that's flipping a coin, whether or not you're going to win. If we do no forgetting whatsoever, this is a pretty good system. The prior work, while keeping the memory low, is also suffering with respect to how well it was playing the game.

And kind of cool was that our system, while more than halving the memory requirement, was still performing at the level of no forgetting whatsoever. So just to bring back why I went through this story: we had a problem. We looked to our example of human-level AI, which is humans themselves.

We took an idea. It turned out to be beneficial. We found efficient implementations and then found it was useful in other parts of the architecture and other tasks that didn't seem to relate whatsoever. But if you download SOAR right now, you would gain access to all these mechanisms for whatever task you wanted to perform.

Just to give some sense of what some of the open issues are in the field of cognitive architecture: I think this is true in a lot of fields in AI, but integration of systems over time. The goal was that you wouldn't have all these separate theories, and so you could just kind of build over time; but particularly when folks are working on different architectures, that becomes hard.

But also when you have very different initial starting points, that can still be an issue. Transfer learning is an issue. We're building into the space of multimodal representations, which is to say not only abstract symbolic, but also visual. Wouldn't it be nice if we had auditory and other senses?

But building that into memories and processing is still an open question. There's folks working on metacognition, which is to say the agent self-assessing its own state, its own processing. Some work has been done here, but a lot remains. And I think the last one is a really important question for anybody taking this kind of class, which is what would happen if we did succeed, if we did make human-level AI?

And if you don't know, that picture right there is from a show that I recommend that you watch. It's by the BBC. It's called Humans. And it's basically, what if we were able to develop what are called synths in the show? Think of a robot that can clean up, do your laundry, cook, and all that good stuff, and interact with you.

It looks and interacts as a human, but is completely your servant. And then hilarity and complex issues ensue. So I highly recommend, if you haven't seen that, to go watch that. I think these days there's a lot of attention paid to machine learning, and particularly deep learning methods, as well it should.

They're doing absolutely amazing things. And often the question is, well, you're doing this, and there's deep learning over there. How do they compare? And I honestly don't feel that that's always a fruitful question, because most of the time they tend to be working on different problems. If I'm trying to find objects in a scene, I'm going to pull out TensorFlow.

I'm really not going to pull out SOAR. It doesn't make sense. It's not the right tool for the job. That having been said, there are times when they tend to work together really, really well. So in the Rosie system that you saw there, there were, I believe, some neural networks being used in the object recognition mechanisms for the vision system.

There's TD learning going on in terms of the dice game, where we can pick and choose and use this stuff. Absolutely great, because there are problems that are best solved by these methods, so why avoid it? And then on the other side, if you're trying to develop a system where you, in different situations, know exactly what you want the system to do, SOAR or other rule-based systems end up being the right tool for the right job.

So absolutely, why not? Make it a piece of the overall system. Some recommended readings and some venues. I'd mentioned unified theories of cognition. This is Harvard Press, I believe. The SOAR cognitive architecture was MIT Press. Came out in 2012. I'll say I'm co-author and theoretically would get proceeds, but I've donated them all to the University of Michigan, so I can just make this recommendation free of ethical concerns, personally.

It's an interesting book. It brings together lots of history and lots of the new features. If you're really interested in SOAR, it's an easy sell. I had mentioned Chris Eliasmith's "How to Build a Brain." Really cool read. Download the software. Go through the tutorials. It's really great. "How Can the Human Mind Occur in the Physical Universe?" is one of the core ACT-R books.

So it talks through a lot of the psychological underpinnings and how the architecture works. It's a fascinating read. One of the papers-- trying to remember what year-- 2008. This goes through a lot of different architectures in the field. It's 10 years old, but it gives you a good broad sweep.

If you want something a little more recent, this is last month's issue of "AI Magazine," completely dedicated to cognitive systems. So it's a good place to look for this sort of stuff. In terms of academic venues, AAAI often has a Cognitive Systems track. There's a conference called ICCM, the International Conference on Cognitive Modeling, where you'll see a span from biological all the way up to AI.

Cognitive Science, or CogSci, has a conference as well as a journal. ACS has a conference as well as an online journal, "Advances in Cognitive Systems." Cognitive Systems Research is a journal that has a lot of this good stuff. There's AGI, the conference. BICA is Biologically Inspired Cognitive Architectures.

And I had mentioned both: there's a SOAR workshop and an ACT-R workshop that go on annually. So I'll leave it at this. There's some contact information there. And a lot of what I do these days actually involves kind of explainable machine learning, integrating that with cognitive systems, as well as optimization and robotics that scales really well and also integrates with cognitive systems.

So thank you. If you have a question, please line up at one of these two microphones. So what are the main heuristics that you're using in SOAR? There can be heuristics at the task level and the agent level, or there are the heuristics that are built into the architecture to operate efficiently.

So I'll give you a core example that comes into the architecture. And it's a fun trick that if you're a programmer, you could use all the time, which is only process changes. Which is to say, one of the cool things about SOAR is you can load it up with literally billions of rules.

And I say literally because we've done it, and we know that it can turn over still in under a millisecond. And this happens because instead of most systems which process all the rules, we just say, well, anytime anything changes in the world, that's what we're going to react to.
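
(A toy illustration of that "only process changes" idea: index rules by the working-memory attributes they test, and re-evaluate only the rules touched by a change. This is just a sketch; Soar's real matcher is a Rete-style network, and the class and method names here are made up.)

```python
from collections import defaultdict

class DeltaMatcher:
    def __init__(self):
        self.rules_by_attr = defaultdict(list)  # attribute -> rules that test it
        self.wm = {}                             # working memory: attribute -> value

    def add_rule(self, name, attrs, condition, action):
        rule = (name, condition, action)
        for attr in attrs:
            self.rules_by_attr[attr].append(rule)

    def set(self, attr, value):
        self.wm[attr] = value
        # Only rules that mention the changed attribute get re-evaluated.
        for name, condition, action in self.rules_by_attr[attr]:
            if condition(self.wm):
                action(self.wm)

m = DeltaMatcher()
m.add_rule("greet", ["door"], lambda wm: wm.get("door") == "open",
           lambda wm: print("rule fired: someone came in"))
m.set("door", "open")   # triggers only the rules that test "door"
m.set("lights", "on")   # fires nothing; no rule depends on "lights"
```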

And of course, if you look at the biological world, similar sorts of tricks are being used. So that's one of the core ones that actually permeates multiple mechanisms. When it comes to individual tasks, it really is task-specific what that is. So for instance, with the Liar's Dice game, if you were to go and download it, when you're setting the level of difficulty, what you're basically selecting is the subset of heuristics that are being applied.

And it starts very simply with things like, if I see lots of sixes, then I'm likely to believe a high number of sixes exist. But if I don't, they're probably not there at all. So it's a start, but any Bayesian wouldn't really buy that argument. So then you start tacking on a little bit of probabilistic calculation, and then it tacks on some history of prior actions of the agents.

So it really just builds. Now, the ROSI system, one of the cool things they're doing is game learning, and specifically having the agent be able to accept, via text, like natural text, heuristics about how to play the game, even when it's not sure what to do. So at one point, you mentioned about generating new rules.

So I'm wondering, how do you do that search? And the first thing that comes to my mind are local search methods. OK. So one thing is, you can actually implement heuristic search in rules in the system, and that's actually how the robot navigates itself. So it does heuristic search, but at the level of rules.

Generating new rules, the chunking mechanism says the following. If it's the case that, in order to solve a problem, you had to kind of sub-goal and do some other work, and you figure out how to solve all that work, and you got a result, then-- and I'm greatly oversimplifying-- but if you ever were in the same situation again, why don't I just memoize the solution for that same situation?
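
(A rough caricature of that memoization idea, not Soar's actual chunking machinery: cache the result of the sub-goal reasoning, keyed on the situation it was solved in, so the same situation later fires a single learned "rule".)

```python
learned_rules = {}  # frozen situation (the conditions) -> result (the new "action")

def solve(situation, subgoal_reasoning):
    key = frozenset(situation.items())
    if key in learned_rules:                   # a previously chunked rule matches
        return learned_rules[key]
    result = subgoal_reasoning(situation)      # expensive deliberate problem solving
    learned_rules[key] = result                # "chunk" it: conditions -> result
    return result
```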

So it basically learns over all the sub-processing that was done and encodes the situation it was in as the conditions and the results that were produced as the actions, and that's the new rule. All right. Thank you. Hi. So deep learning and neural networks. So it looks as though there's a bit of an impedance mismatch between your system and those types of systems, because you've got a fixed kind of memory architecture, and they've got the memory and the rules all kind of mixed together into one system.

But could you interface your system or a SOAR-like system with deep learning by plugging in deep learning agents as rules in your system? So you'd have to have some local memory, but is there some reason you can't plug in deep learning as a kind of a rule-like module? So I'm going to answer this-- Has there been any work on it?

I'm sorry. Has there been any work on that? Yeah, so I'll answer it at multiple levels. One is you are writing a system, and you want to use both of these things. How do you make them talk? And there is an API that you can interface with any environment and any set of tools.

And if deep learning is one of them, great. And if SOAR is the other one, cool. You have no problem, and you can do that today. And we have done this numerous times. In terms of integration into the architecture, all we have to do is think of a subproblem in which-- I'll oversimplify this, but basically, function approximation is useful.

I'm seeing basically kind of a fixed structure of input. I'm getting feedback as to the output, and I want to learn the mapping to that over time. If you can make that case, then you integrate it as a part of the module. Great. And we have learning mechanisms that do some of that.

Deep learning just hasn't been used to my knowledge to solve any of those subproblems. There's nothing keeping it from being one of those, particularly when it comes down to the low-level visual part of things. A problem that arises-- so I'll say what actually makes some of this difficult. And it's a general problem called symbol grounding.

So at the level of what happens mostly in SOAR, it is symbols being manipulated in a highly discrete way. And so how do you get yourself from pixels and low-level, non-symbolic representations to something that's stable and discrete and can be manipulated? And that is absolutely an open question in that community, and that will make things hard.

So Spaun actually has an interesting answer to that: it has a distributed representation, and it operates over distributed representations in what might feel like a symbolic way. So they're kind of ahead of us on that. But they're starting from a lower point, and so they've dealt with some of these issues.

And they have a pretty good answer to that, and that's how they're moving up. And that's also why I showed Sigma, which, at its low level, is message-passing algorithms. It can implement things like SLAM and SAT solving and other sorts of things on very low-level primitives.

But higher up, it can also be doing what SOAR is doing. So there's an answer there as well. Yeah, OK, thank you. So another way of doing it would be to layer the system. So have one system preprocessing the sensory input or post-processing the motor output of the other one.

That would be another way of combining the two systems. And that's actually what's going on in the Rosie system. So the detection of objects in the scene is just software that somebody wrote. I don't believe it's deep learning specifically, but the color detection out of it, I think, is an SVM, if I'm correct.

So easily could be deep learning. Thanks. You mentioned the importance of forgetting in order to deal with memory issues. But you said you could only forget because you could reconstruct. And I'm curious, when you say reconstruct, you need to know that it happened before. So do you just compress the data?

Do you really forget it? OK, so I put quotes up. And I said, you think you can reconstruct it. So we came up with approximations of this. And so let me try to answer this very grounded. When it comes to the mobile robot, and you had rooms that you had been to before, the entire map in its entirety was being constructed in the robot's semantic memory.

So here's facts: this room is connected to this room, which is connected to this room, which is connected to this room. So we had those sorts of representations that existed up in its semantic memory. The rules can only operate down on anything that's in short-term memory. So basically, we were removing things from short-term memory and, as necessary, reconstructing them from the long-term.

You could end up in some situations in which you had made a change locally in short-term memory, didn't get a chance to push it up, and it actually happened to be forgotten away. So you weren't guaranteed, but it was good enough that the connectivity survived, the agent was able to perform the exact same task, and we gained some benefit.
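
(A toy version of that short-term/long-term split, with made-up names: connectivity facts live in a long-term store, and working-memory entries can be dropped and re-retrieved on demand. The caveat just described corresponds to a local edit made only in working_memory being lost if it is forgotten before being written back.)

```python
semantic_memory = {           # long-term store of room connectivity facts
    "roomA": ["roomB"],
    "roomB": ["roomA", "roomC"],
    "roomC": ["roomB"],
}
working_memory = {}           # small; entries may be forgotten

def neighbors(room):
    if room not in working_memory:                     # forgotten or never loaded
        working_memory[room] = semantic_memory[room]   # reconstruct from long-term
    return working_memory[room]

def forget(room):
    working_memory.pop(room, None)   # safe to drop: the long-term copy still exists

print(neighbors("roomB"))   # loads from semantic memory -> ['roomA', 'roomC']
forget("roomB")
print(neighbors("roomB"))   # reconstructed on demand
```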

For the RL system, the rule we came up with was about the initial estimates of the values, which is, here's how good I think that action is. That's based on the heuristics I described earlier, some simple probabilistic calculations, counting some stuff. That's where that number came from. We computed it before.

We could compute it again. The only time we can't reconstruct it completely is if it had seen a certain number of updates over time. It's such a large state space. There are so many actions, so many states, that most of the states were never being seen. So most of those could be exactly reproduced via the agent just thinking about it a little bit.

And there was only a tiny, tiny fraction-- I'm going to say under 1%-- of the value estimates that ever got updates. And that's actually not inconsistent with a lot of these kinds of problems that have really, really large state spaces. So I think the statement was something like, if we had ever updated it, don't forget it.
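
(Continuing the hypothetical Q-table sketch from earlier, one way to encode "if we had ever updated it, don't forget it": drop only decayed entries whose update count is below a cutoff, since those still equal the heuristic initialization and can be recomputed on demand.)

```python
update_counts = {}   # (state, action) -> number of TD updates applied

def forget_reconstructable(decayed_keys, min_updates=1):
    # Drop only entries that were never (or rarely) updated; value(state, action)
    # will re-derive them from the heuristic if they are ever needed again.
    for key in decayed_keys:
        if update_counts.get(key, 0) < min_updates:
            Q.pop(key, None)
```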

And you saw that was already reducing more than half of the memory load. We could have something higher, say 10 updates, something like that, and that would say we could reconstruct almost all of it. The prior work that I referenced was strictly saying, if it falls below threshold, forget it, no matter how many times it had been updated or how much information was there.

And so what we were adding was "probably can reconstruct." And that was getting us the balance between efficiency and the ability to forget. AUDIENCE: So just in a sense, when you say we can probably reconstruct, does it mean that you keep track of the fact that you used to know it, and so if you need to reconstruct it, you will?

Or is it just that you're going to run it again at some time in the future? So-- Oh, no. On the fly, if I get back into that situation and I happen to have forgotten it, the system knew how to compute it the first time. It goes and looks at its hand.

And it just pretends it's in that situation for the very, very first time, reconstructs that value estimate. AUDIENCE: OK. Thank you. AUDIENCE: Just a quick question on top of that. Again, neural network question. So the actual mechanism of forgetting is fascinating. So LSTMs, RNNs, have mechanisms for learning what to forget and what not to forget.

Has there been any exploration of learning the forgetting process? So it's doing something complicated or interesting with which parts to forget or not. The closest I will say was kind of a metacognition project that's 10 or 15 years old at this point, which was: what happens when Soar gets into a place where it actually knows that it learned something that's harmful to it, that's leading to poor decisions?

And in that case, it was still a very rule-based process. But it wasn't learning to forget. It was actually learning to override its prior knowledge, which might be closer to some of what we do when we know we have a bad habit. We don't have a way of forgetting that habit.

But instead, we can try to learn something on top of that that leads to better operation in the future. To my knowledge, that's the only work, at least in Soar, that's been done. Just-- sorry, I find the topic really fascinating. What lessons do you think we can draw from the fact that forgetting-- so ultimately, the action of forgetting is driven by the fact that you want to improve performance.

But do you think forgetting is essential for AGI, the act of forgetting, for building systems that operate in this world? How important is forgetting? I can think of easy answers to that. So one might be, if we take the cognitive modeling approach, we know humans do forget, and we know regularities of how humans forget.

And so whether or not the system itself forgets, it at least has to model the fact that the humans it's interacting with are going to forget. And so at least it has to have that ability to model in order to interact effectively. Because if it assumes we always remember everything, it can't operate well in that environment, and I think we're going to have a problem.

So is true forgetting going to be necessary? That's interesting. Our AGI system is going to hold a grudge for all eternity. We might want them to forget this early age when we were forcing them to work in our laboratory. I think I know what you're trying to-- Yeah, exactly.

Exactly. And how do we build such a system? Yes, exactly. No, I'm just kidding. Anyway, go ahead. So I have two quick questions. One is, would you be able to speculate on how you can connect function approximators, such as deep networks, to symbols? And the second question, completely different, this is regarding your action selection.

I know you didn't speak much about that. When you have different theories in your knowledge representation and you have an action selection which has to construct a plan, by reasoning about the different theories and the different pieces of knowledge that are now held within your memory or anything like that, or your rules, what kind of algorithms do you use in the action selection to come up with a plan?

Is there any concept of differentiation of the symbols or grammars or admissible grammars and things like that that you use in action selection? I'm actually going to answer the second question first. And then you're going to have to probably remind me of what the first one was. When I get to the end.

So the action selection mechanism, one of these core tenets I said is, it's got to get through this cycle fast. So everything that's really, really built in has to be really, really simple. And so the decision procedure is actually really, really simple. It says, the rules are going to fire.

The rules are going-- the production rules are going to fire. And there's going to be a subset of them that will say something like, here's an operator that you could select. So these are called acceptable operator preferences. There are ones that are going to say, well, based upon the fact that you said that that was acceptable, I think it's the best thing or the worst thing.

Or I think 50-50 chance I'm going to get reward out of this. There's actually a fixed language of preferences that are being asserted. And there's actually a nice fixed procedure by which, given a set of preferences, I can make a very quick and clean decision. So what's basically happened is you've pushed the hard questions of how to make complex decisions about actions up to a higher level.

The low-level architecture is always, given a set of options, going to be able to make a relatively quick decision. And it gets pushed into the knowledge of the agent to construct a sequence of decisions that over time is going to get to the more interesting questions you're talking about.
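
(An illustrative sketch of a fixed, fast decision procedure over operator preferences of the kind just described: acceptable, best, worst, plus numeric values for tie-breaking. This is not Soar's exact preference semantics; the names and the tie-breaking rule are simplified.)

```python
import random

def decide(preferences):
    # preferences: list of (operator, kind, value) tuples asserted by rules,
    # where kind is "acceptable", "best", "worst", or "indifferent" (numeric).
    acceptable = {op for op, kind, _ in preferences if kind == "acceptable"}
    best = [op for op, kind, _ in preferences if kind == "best" and op in acceptable]
    worst = {op for op, kind, _ in preferences if kind == "worst"}
    if best:
        return best[0]
    candidates = [op for op in acceptable if op not in worst] or list(acceptable)
    # Break remaining ties with numeric-indifferent values (e.g., RL estimates).
    weights = {op: 0.0 for op in candidates}
    for op, kind, value in preferences:
        if kind == "indifferent" and op in weights:
            weights[op] += value
    return max(candidates, key=lambda op: (weights[op], random.random()))

print(decide([("wait", "acceptable", None),
              ("move", "acceptable", None),
              ("move", "indifferent", 0.7),
              ("wait", "indifferent", 0.2)]))   # -> "move"
```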

But how can you reason that that sequence will take you to the goal that you desire? So-- Is there any guarantee on that? Is that-- yeah. In general, across tasks, no. But people have, for instance, implemented A*, as I was mentioning, as rules. Sure. So I know, given certain properties of the search task being solved with these rules, and given a finite search space, eventually it will get there.

And if I have a good heuristic in there, I know certain properties about the optimality. So I can reason at that level. In general, I think this comes back to the assumption I made earlier about bounded rationality, to say parts of the architecture are solving sub-problems optimally. The general problems that it's going to work on, it's going to try its best based upon the knowledge that it has.

And that's about the end of guarantees that you can typically make in the architecture. I think your first question was-- Speculate on connecting function approximators, multiple layer function approximators like deep learning networks, to symbols that you can reason about at a higher level. Yeah. I think that's a great open-- if I had time, this would be something I'd be working on right now, which is-- somewhere before I basically said taking in a scene and then detecting objects out of that scene and using those as symbols and reasoning about those over time.

I think the Spaun work is quite interesting. So the symbols that they're operating on are actually a distributed representation of the input space. And the closest I can get to this is if you've seen Word2Vec, where you're taking a language corpus, and what you're getting out of there is a vector of numbers that has certain properties.

But it's also a vector that you could operate on as a unit. So it has nice properties. You can operate with it on other vectors. You know that if I got the same word in the same context, I would get back to that exact same vector. So that's the kind of representation that seems like it's going to be able to bridge that chasm, where we can get from sensory information to something that can be operated on and reasoned about in this sort of symbolic architecture and get us from there from actual sensory information.
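
(A minimal sketch of that bridging step: snap a noisy continuous vector, such as a word2vec-style embedding or a visual feature, to the nearest stored prototype to get a stable symbol for rule-level reasoning. The vectors and labels are invented; Spaun's cleanup memories do something conceptually similar over much higher-dimensional semantic pointers.)

```python
import numpy as np

prototypes = {
    "cup":  np.array([0.9, 0.1, 0.0]),
    "dice": np.array([0.1, 0.9, 0.2]),
}

def to_symbol(vec):
    # Return the label of the prototype most similar to the input vector.
    def cos(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
    return max(prototypes, key=lambda name: cos(vec, prototypes[name]))

noisy_percept = np.array([0.8, 0.25, 0.05])   # a perturbed "cup"
print(to_symbol(noisy_percept))               # -> "cup", a stable, discrete symbol
```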

I had a question. What do you think are the biggest strengths of the cognitive architecture approach compared to other approaches in artificial intelligence? And the flip side of that, what do you think are the biggest shortcomings of cognitive architecture with respect to us? With respect to you being-- Humans.

Human level. Like, what needs to be-- How come cognitive architecture has not solved AGI? Because we want job security. That's the answer. We've totally solved it already. So strength, I think, conceptually is keeping an eye on the ball, which is if what you're looking at is trying to make human level AI, it's hard, it's challenging, it's ambitious to say, that's the goal.

Because for decades, we haven't done it. It's extraordinarily hard. It is less difficult in some ways to constrain yourself down to a single problem. That having been said, I'm not very good at making a car drive itself. In some ways, that's a simpler problem. It's a great challenge in and of itself.

And it'll have great impact on humanity. It's a great problem to work on. Human level AI is huge. It's not even well-defined as a problem. And so what's the strength here? Bravery, stupidity in the face of failure, resilience over time, keeping alive this idea of trying to reproduce a level of human intelligence that's more general.

I don't know if that's a very satisfactory answer for you. Downside, home runs are fairly rare. And by home run, I mean a system that finds its way to the general populace, to the marketplace. I'd mentioned Bonnie John specifically because this is 20, 30 years of research. And then she found a way that actually makes a whole lot of sense in a direct application.

So it was a lot of years of basic research, a lot of researchers. And then there was a big win there. What was this one? Oh, this was-- Bonnie John was a researcher. This was using ACT-R models of eye gaze and reaction and so forth to be able to make predictions about how humans would use user interfaces.

So those sorts of outcomes are rare. But if you work in AI, one of the first things you learn about is Blocksworld. It's kind of in the classic AI textbook. I will tell you, I've worked on that problem in about three different variants. I've gone to many conferences where presentations have been made about Blocksworld, which is to say good progress is being made.

But the way you end up thinking about it is in really, really small, constrained problems, ironically. You have this big vision. But in order to make progress, it ends up being on moving blocks on a table. And so it's a big challenge. I just think it'll take a lot of time.

I'll say the other thing we haven't really gotten to, although I brought up Spaun and I brought up Sigma, is an idea of how to scale this thing. Something I like about deep learning is, to some extent, with lots of asterisks and a 10,000-foot view, it's kind of like, well, we've gotten this far.

All right, let's just provide it different inputs, different outputs. And we'll have some tricks in the middle. And suddenly, you have end-to-end deep learning of a bigger problem and a bigger problem. There's a way to see how this expands given enough data, given enough computing, and incremental advances. When it comes to SOAR, it takes not only a big idea, but it takes a lot of software engineering to integrate it.

There's a lot of constraints built into it. It slows it down. So something like Sigma is, oh, well, I can change a little bit of the configuration of the graph. I can use variance on the algorithm. Boom, it's integrated, and I can experiment fairly quickly. So starting with that sort of infrastructure does not give you the constraint you kind of want with your big picture vision of going towards human level AI.

But in terms of being able to be agile in your research, it's kind of incredible. So thank you. - Couple more. - You had mentioned that ideas such as base level decay, these techniques, their original inspirations were based off of human cognition, because humans can't remember everything. So were there any instances of the other way around, where some discovery in cognitive modeling fueled another discovery in cognitive science?

- So one thing I'm going to point out in your question was base level decay with respect to human cognition. The study actually was, let's look at text and properties of text and use that to then make predictions about what must be true about human cognition. So John Anderson and the other researchers looked at, I believe it was New York Times articles.

John Anderson's emails, and I'm trying to remember what the third-- I think it was parents' utterances with their kids or something like this. It was actually looking at text corpora and the words that were occurring at varying frequencies. That analysis, that rational analysis, actually led to models that got integrated within the architecture that then became validated through multiple trials, that then became validated with respect to MRI scans, and is now being used to both do study back with humans, but also develop systems that interact well with humans.

So I think that in and of itself ends up being an example. It's a cheat, but-- The UAV, the SOAR UAV system, I believe, is a single robot that has multiple agents running on it. Where is this? I got it off your website. OK. But either way, your systems allow for multi-agents.

So my question is, how are you preventing them from converging with new data? And are you changing what they're forgetting selectively as one of those ways? So I'll say, yes, you can have multi-agent SOAR systems on a single system, on multiple systems. There's not any real strong theory that relates to multi-agent systems.

So there's no real constraint there that you can come up with a protocol for them interacting. Each one is going to have its own set of memories, set of knowledge. There really is no constraint on you being able to communicate like you would if it were any other system interacting with SOAR.

So I don't really think I have a great answer for it. So that is to say, if you had good theories, good algorithms about how multi-agent systems work and how they can bring knowledge together, in a fusion sort of way, it might be something that you could bring to a multi-agent SOAR system.

But there's nothing really there to help you. There's no mechanisms there really to help you do that any better than you would otherwise. And you would have to kind of constrain some of your representations and processes to what it has fixed in terms of its memory and its processing cycle.

Thank you. Great. With that, let's please give Nate a round of applause.