Memory Masterclass: Make Your AI Agents Remember What They Do! — Mark Bain, AIUS

00:00:00.000 |
I'm super excited to be here with you. This is my first time speaking at AI Engineer. 00:00:22.560 |
We have an amazing group of guest speakers: Vasilija Markovic from Cognee. Vasilija, oh, there 00:00:34.460 |
is Vasilija. Daniel Chalef from Graphiti and Zep, and Alex Gilmore from Neo4j. The plan looks 00:00:46.360 |
like this. I will do a very quick power talk about a topic that I'm super passionate about: 00:00:52.120 |
the AI memory. Next, we'll have four live demos, and we'll move on to some new solution 00:01:01.760 |
that we are proposing, a GraphRAG chat arena, that I will be able to demonstrate, and I would 00:01:09.440 |
like you to follow along once it's being demonstrated. And at the very end, we'll have a very short 00:01:17.520 |
Q&A session. There is a Slack channel that I would like you to join. So, please scan the 00:01:28.160 |
QR code right now before we begin. And let's make sure that everyone has access to these materials. 00:01:37.160 |
There is a walkthrough sheet on the channel that we will go through closer to the end of our 00:01:46.800 |
workshop. But I would like you to start setting it up, if you may, on your laptops if you want. 00:01:54.760 |
All right. It's workshop-graphrag-chat. You can also find it on Slack. And you can join 00:02:10.760 |
the channel. So, a little bit about myself. So, hi, everyone. Again, I'm Mark Bain. And I'm very passionate about memory: what memory is, the deep 00:02:19.120 |
physics and applications of memory across different technologies. You can find me at Mark and Bain on social media or on my website. And let me tell you a little bit of a story about myself. So, when I was 16 years old, I was very good at maths and I did math olympiads with 00:02:48.760 |
many brilliant minds, including Wojciech Zaremba, the co-founder of OpenAI. And thanks to that deep understanding of maths and physics, I did have many great opportunities to be exposed to the problem of AI memory. So, first of all, I would like to recall two conversations that I had with Wojciech and Ilya in 2014 00:03:18.400 |
in September. When I came here to study at Stanford, at one party, we met with Ilya and Wojciech, who back then worked at Google. And they were kind of trying to pitch me that there will be a huge revolution in AI. I was a little bit unimpressed back then. Right now, I kind of take it with very big excitement 00:03:48.040 |
when I look back to those times. And I was really wishing good luck to the guys who were doing deep learning, because back then, I didn't really see this prospect of GPUs giving that huge edge in compute. 00:04:05.040 |
However, during that conversation, it was like 20 minutes. At the very end, I asked Ilya, all right, so there is going to be a big AI revolution. But how will these AI systems communicate with each other? 00:04:22.040 |
And the answer was very perplexing and kind of sets the stage to what's happening right now. Ilya simply answered, I don't know. I think they will invent their own language. So that was 11 years ago. Fast forward to now. 00:04:40.040 |
The last two years I have spent doing very deep research on physics of AI. And kind of, like, dove into all of these most modern AI architectures, including attention, diffusion models, VAEs, and many other ones. And I realized that there is something critical. Something missing. And this power talk is about this missing thing. 00:05:09.680 |
So over the last two years, I followed on from my earlier years of doing a lot of research in physics, computer science, information science. And I came to this conclusion that memory, AI memory, in fact, is any data in any format, and this is important, including code, 00:05:39.320 |
algorithms, and hardware, and any causal changes that affect them. That was something very mind-blowing to reach that conclusion. And that conclusion sets the tone for this whole track, the GraphRAG track. 00:06:00.320 |
In fact, I was also perplexed by how biological systems use memory and how different cosmological structures or quantum structures, they, in fact, have a memory. They kind of remember. 00:06:15.320 |
And let's get back to maths and to physics and geometry. When I was doing science olympiads, I was really focused on two or three things: geometry, trigonometry, and algebra. And I realized, in the last year, that more or less the volume of 00:06:43.320 |
laws in physics perfectly matches the volume of laws in mathematics. And also, the constants in mathematics, if you really think deeply through geometry, match the constants in physics. And if you really think even deeper, they kind of transcend all the other disciplines. 00:07:08.320 |
That made me think a lot. And I found out that the principles that govern LLMs are the exact same principles that govern neuroscience. And they are the exact same principles that govern mathematics. 00:07:27.320 |
And I studied the papers of Perelman. I don't know if you've heard who Perelman is. 00:07:36.320 |
Perelman is this mathematician who refused to take a $1 million award for proving one of the most important conjectures, the Poincaré conjecture. 00:07:54.320 |
I studied the symmetries of 3-spheres. And I realized that this deep math of spheres and circles is very much linked with how attention and diffusion models work. 00:08:18.320 |
So basically the formulas that Perelman reached link entropy with curvature. And curvature, 00:08:25.320 |
basically, if you think of it, is attention. It's gravity. So in a sense, there are multiple disciplines where the same things are appearing multiple times. 00:08:46.320 |
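For the curious: the entropy-curvature formulas being invoked here are, I believe, Perelman's functionals from his Ricci flow papers. The talk doesn't write them out, so this is quoted from the literature rather than from the speaker. The simplest one is the F-functional,

$$ \mathcal{F}(g, f) = \int_M \left( R + |\nabla f|^2 \right) e^{-f} \, dV, $$

where $R$ is the scalar curvature of the metric $g$ and $f$ is a potential function. Ricci flow is, up to diffeomorphism, the gradient flow of $\mathcal{F}$, and Perelman's related $\mathcal{W}$-entropy is monotone along the flow, which is the precise sense in which entropy and curvature are linked.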
And I will be publishing a series of papers with some amazing supervisors, co-authors of two of these methodologies: transformers and VAEs. And I came to this realization that this equation governs everything. 00:09:08.320 |
It governs math, it governs physics, it governs our AI memory, neuroscience, biology, chemistry, and so on and so forth. 00:09:20.320 |
So I came to this equation that memory times compute would like to be a squared imaginary unit circle. 00:09:34.320 |
If that ever existed, we would have perfect symmetries and we would kind of not exist. Because for us to exist, these asymmetries need to show up. And in a sense, every single LLM works through 00:09:47.320 |
weights and biases. The weights give the structure: the compute comes and transforms the data from its raw format into weights. And the weights, if you take these billions of parameters, are the sort of matrix structure of how this data looks when you really find relationships in the raw data. 00:10:16.320 |
All right. And then there are these biases, these tiny shifts that, in a robust way, adapt the model so that it doesn't break apart but still reflects reality very well. So something is missing. When we take weights and biases and we apply scaling laws and we keep adding more data, more compute, we kind of get a better and better understanding of the data. 00:10:45.320 |
In a sense, if we had infinite data, we wouldn't have any biases. And this understanding is again the principle of this GraphRAG track. The disappearance of biases is what we are looking for when we are scaling our models. So in a sense, the amount of memory and compute should be exactly the same. 00:11:14.320 |
It's just slightly expressed in a different way. But if there are some imbalances, then something important happens. And I came to another conclusion that our universe is basically a network database. It has a graph structure and it's a temporal structure. So it keeps on moving, following some certain principles and rules. 00:11:43.320 |
And these principles and rules have to be fuzzy, because otherwise everything would be completely predictable. But if it were completely predictable, it would mean that I would know everything about every single one of you, about myself from the past and myself from the future. So in a sense, it's impossible. 00:11:48.320 |
And that's why we have this sort of heat, diffusion, entropy models. 00:11:55.320 |
They allow us to exist. But something is preserved. 00:12:07.320 |
Any single tiny asymmetry that happens at the quantum level preserves causal links. 00:12:44.960 |
And these causal links are the exact thing that I would like you to have as a takeaway from this talk. 00:12:55.260 |
The difference between simple RAG, hybrid RAG, any type of RAG, and GraphRAG is that 00:13:02.980 |
with GraphRAG we have the ability to keep these causal links in our memory systems. 00:13:11.520 |
Basically the relationships are what preserves causality. 00:13:24.840 |
That's why we can optimize hypothesis generation and testing. 00:13:31.680 |
So we will be able to do amazing research in biosciences, chemical sciences, just because 00:13:40.000 |
of understanding that this causality is preserved within the relationships. 00:13:46.700 |
And these relationships, when these needed asymmetries show up, kind of create this curvature. 00:13:55.800 |
So we intuitively feel it: every single one of you chose some specific workshops and talks to attend. 00:14:04.540 |
Right now all of you are attending the talk and workshop that we are giving. 00:14:21.120 |
And this value, this information, transcends space and time. 00:14:27.960 |
So it's very subjective to you or any other object. 00:14:33.100 |
And I think we really need to understand this. 00:14:39.600 |
So LLMs are basically these weights and biases or correlations that give us this opportunity 00:14:45.800 |
to be fuzzy. You know, actually, one thing that I learned from Wojciech some 11 years ago was 00:14:55.280 |
that hallucinations are the exact necessary thing to be able to solve a problem where you have too little 00:15:02.760 |
memory or too little compute for the combinatorial space of the problem you're solving. 00:15:07.640 |
So you're basically imagining, you're taking some hypothesis based on your history and you're 00:15:15.120 |
kind of trying to project it into the future. 00:15:17.520 |
But you have too little memory, too little compute to do that, so it can only be as good as the amount of memory and compute you have. 00:15:24.320 |
So it means that the missing part is something that you kind of can curve thanks to all of these 00:15:31.760 |
causal relationships and this fuzziness, and reasoning is the reading of these asymmetries. 00:15:47.440 |
Hence, I really believe that agentic systems are sort of the next big thing right now because 00:15:59.880 |
they are following the network database principle. 00:16:05.260 |
But to be causal, to recover this causality from our fuzziness, we need graph databases. 00:16:16.780 |
And that's the major thing in this emerging trend of GraphRAG that we are here to talk about. 00:16:26.560 |
And I would like to at this moment invite on stage our three amazing guest speakers. 00:16:49.500 |
So Vasilija will show us how to load, search, and optimize memory for a certain use case. 00:17:26.560 |
And I'm Vasilija. I'm originally from Montenegro, a small country in the Balkans. 00:17:30.560 |
It is beautiful, so if you want to go there, my cousins Igor and Milos are going to welcome you. 00:17:36.900 |
So, you know, in case you're just curious about memory, I'm building a memory tool on top of graphs. 00:17:44.880 |
My background is in business, big data engineering, and clinical psychology. 00:17:48.800 |
So a lot of what Mark talked about kind of connects to that. 00:17:55.140 |
The demo is a Mexican standoff between two developers, where we are analyzing their 00:18:01.940 |
GitHub data. And this data from the GitHub repositories is in the graph, and this Mexican standoff means 00:18:07.820 |
that we will let a crew of agents go, analyze, look at their data, and try to compare them against 00:18:13.940 |
each other and give us a result that should represent who we should hire, let's say, ideally. 00:18:21.060 |
So what we're seeing here currently is how Cognify works in the background. 00:18:25.460 |
So Cognify is working by adding some data, turning that into a semantic graph, and then 00:18:31.060 |
we can search it with a wide variety of options. 00:18:33.380 |
We plugged CrewAI in on top of it, so we can pretty much do this on the fly. 00:18:37.020 |
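For those following along in the Slack walkthrough, the loop just described maps onto a few calls in Cognee's Python SDK. This is a minimal sketch based on Cognee's public docs, not the demo's actual code; exact signatures vary by version, and the CrewAI and GitHub-ingestion wiring is omitted:

```python
# Minimal add -> cognify -> search loop; assumes an LLM provider is configured
# (e.g. via environment variables) and Cognee's default local stores.
import asyncio
import cognee

async def main():
    # Add raw data (the live demo ingested GitHub data via the GitHub API).
    await cognee.add("Laszlo: PhD, contributed to repo X with 120 commits ...")
    # Turn everything added into a semantic knowledge graph.
    await cognee.cognify()
    # Search the graph with one of Cognee's search options.
    results = await cognee.search(query_text="Which developer should we hire?")
    for result in results:
        print(result)

asyncio.run(main())
```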
So here in the background, I have a client running. 00:18:43.180 |
So it's now currently searching the data sets and starting to build the graphs. 00:18:52.140 |
But in the background, we are effectively ingesting the GitHub data from the GitHub API, building 00:18:58.880 |
the semantic structure, and then letting the agents actually search it and make decisions 00:19:04.560 |
So as always with live demos, things might go wrong, so I have a video version in case. 00:19:21.040 |
And as you can see, we have an activity log where the graph is being continuously updated on the fly. 00:19:28.880 |
And then data is being enriched, and the agents are going and making decisions on top. 00:19:33.120 |
So what you can see here on the side is effectively the logic that is reading, writing, analyzing, 00:19:41.720 |
and using all of this, let's say, preconfigured set of weights and benchmarks to analyze any person. 00:19:51.320 |
You can ingest from any type of a data source, 30-plus data sources supported now. 00:19:57.380 |
You can build graphs from relational databases, semi-structured data, and we also have these 00:20:02.020 |
memory association layers inspired by the cognitive science approach. 00:20:05.920 |
And then effectively, as we kind of build and enrich this graph on the fly, 00:20:14.640 |
we're storing the data back into the graph. 00:20:17.020 |
So this is the stateful, temporal aspect of it. 00:20:20.560 |
We can build the graph in a way that we can add the data back, that we can analyze these 00:20:25.120 |
reports, that we can search them, and that we can let other agents access them on the fly. 00:20:29.460 |
The idea for us was: let's have a place where agents can write and continuously add the data. 00:20:39.760 |
So if we click on any node, we can see the details about the commits, about the information 00:20:45.920 |
from the developers, the PRs, whatever they did in the past, and which repos they contributed to. 00:20:52.820 |
And then at the end, as the graph is pretty much filled, we will see the final report come together. 00:21:02.000 |
So it's preparing now the final output for the hiring decision task. 00:21:07.960 |
So let's have a look at that when it gets loaded. 00:21:13.420 |
I hoped to have a hosted version for you all today, but it didn't work out. 00:21:30.980 |
So I will just show you the video with the end so we don't wait for it. 00:21:39.520 |
So here you can see that towards the end, we can see the graph and we can see the final result. 00:21:53.840 |
And in the green node, we can see that we decided to hire Laszlo, our developer who has a PhD. 00:22:01.280 |
So it's not really difficult to make that call. 00:22:03.200 |
And we see why and we see the numbers and the benchmarks. 00:22:07.520 |
Again, a very fast three-minute demo, so I hope you enjoyed it. 00:22:10.560 |
And if you have some questions, I'm here afterwards. 00:22:13.680 |
So happy to see new users and if you're interested, try it. 00:22:24.960 |
So Vasilija showed us something I call semantic memory. 00:22:29.560 |
So basically you take your data, you load it and cognify it, as they like to say. 00:22:45.480 |
And next up, Alex will show us the Neo4j MCP server. 00:22:53.400 |
Test, test, test, test, test, test, test, five, four, three, two, one, we're good. 00:23:14.320 |
I'm going to demo the memory MCP server that we have available. 00:23:22.440 |
So there is this walkthrough document that I have. 00:23:26.200 |
We'll make this available in the Slack or by some means so that you can do this on your own. 00:23:32.600 |
And what we're going to showcase today is really like the foundational functionality that 00:23:35.880 |
we would like to see in an agentic memory sort of application. 00:23:40.520 |
Primarily, we're going to take a look at semantic memory in this MCP server, but we are currently expanding it. 00:23:46.980 |
And we're going to add additional memory types as well, which we'll discuss probably later. 00:23:53.380 |
So in order to do this, we will need a Neo4j database. 00:23:57.280 |
Neo4j is a graph-native database that we'll be using to store the knowledge graph that we're going to build. 00:24:02.460 |
They have an Aura option, which is hosted in the cloud, or we can just do this locally. 00:24:09.620 |
Additionally, we're going to do this via Claude Desktop. 00:24:16.300 |
And then we can just add this config to the MCP configuration file in Claude. 00:24:22.180 |
And this will just connect to the Neo4j instance that you create. 00:24:26.300 |
And what's happening here is we're going to -- Claude will pull down the memory server from 00:24:32.100 |
PyPI and it will host it in the back end for us. 00:24:34.340 |
And then it will be able to use the tools that are accessible via the MCP server. 00:24:39.980 |
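If you're setting this up on your laptop, the config entry looks roughly like the following, sketched here as Python that prints the JSON. The server key, package name, and environment variable names are my assumptions based on Neo4j's published mcp-neo4j-memory package, so defer to the walkthrough doc for the exact values:

```python
# Prints a Claude Desktop MCP config entry for the Neo4j memory server.
# All values are illustrative; substitute your own Neo4j URI and credentials.
import json

config = {
    "mcpServers": {
        "neo4j-memory": {
            "command": "uvx",                 # fetches and runs the server from PyPI
            "args": ["mcp-neo4j-memory"],
            "env": {
                "NEO4J_URL": "bolt://localhost:7687",
                "NEO4J_USERNAME": "neo4j",
                "NEO4J_PASSWORD": "<your-password>",
            },
        }
    }
}
print(json.dumps(config, indent=2))
```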
And the final thing that we're going to do before we can actually have the conversation 00:24:42.500 |
is we're just going to use this brief system prompt. 00:24:45.580 |
And what this does is just ensure that we are properly recalling and then logging memories throughout the conversation. 00:24:52.540 |
So with that, we can take a look at a conversation that I had in Claude Desktop using this memory server. 00:25:00.680 |
And so this is a conversation about starting an agentic AI memory company. 00:25:09.000 |
And so initially, we have nothing in our memory store, which is as expected. 00:25:13.980 |
Now, as we kind of progress through this conversation, we can see that at each interaction, it tries 00:25:20.080 |
to recall memories that are related to the user prompt. 00:25:24.180 |
And then at the end of this interaction, it will create new entities in our knowledge graph. 00:25:31.660 |
And so in this case, an entity is going to have a name, a type, and then a list of observations. 00:25:37.820 |
And these are just facts that we know about this entity. 00:25:40.600 |
And this is what is going to be updated as we learn more. 00:25:43.920 |
In terms of the relationships, these are just identifying how these entities relate to one another. 00:25:51.020 |
And this is really the core piece of why using a graph database as sort of the context layer 00:25:57.020 |
is so important because we can identify how these entities are actually related to each other. 00:26:06.420 |
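As a rough illustration, the two shapes being described look something like this. The field names follow the common knowledge-graph memory-server convention (entities with observations, typed relations between them) and are not necessarily this server's exact schema:

```python
# Hypothetical payload shapes for the two memory primitives described above.
entity = {
    "name": "AgenticMemoryCo",        # unique identifier for the node
    "type": "Company",                # entity label
    "observations": [                 # facts learned so far; appended over time
        "Founded to build an agentic AI memory product",
    ],
}

relation = {
    "source": "AgenticMemoryCo",      # how one entity relates to another
    "target": "Neo4j",
    "relationType": "BUILDS_ON",
}
```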
And so as this goes on, we can see that we have quite a few interactions. 00:26:10.480 |
We are adding observations, creating more entities. 00:26:14.580 |
And at the very end here, we can see we have quite a lengthy conversation. 00:26:17.620 |
We can say, let's review what we have so far. 00:26:21.180 |
And so we can read the entire knowledge graph back as context, and Claude can then summarize it. 00:26:27.460 |
And so we have all of the entities we found, all the relationships that we've identified, 00:26:31.480 |
and all the facts that we know about these entities based on our conversation. 00:26:35.660 |
And so this provides a nice review of what we discussed about this company and our ideas. 00:26:45.560 |
This is available both in Aura and locally, and we can actually visualize this knowledge graph. 00:26:49.720 |
We can see that we discussed Neo4j, we discussed MCP, and LangGraph. 00:26:54.380 |
And if we click on one of these nodes, we can see that there is a list of observations on it. 00:26:59.680 |
And this is all of the information that we've tracked throughout that conversation. 00:27:02.880 |
And so it's important to know that even though this knowledge graph was created with 00:27:06.180 |
a single conversation, we can also take this and use it in additional conversations. 00:27:10.400 |
We can use this knowledge graph with other clients such as Cursor IDE or Windsurf. 00:27:15.560 |
And so this is really a powerful way to create a memory layer for all of your applications. 00:27:38.380 |
I will just share my personal beliefs about MCPs. 00:27:43.420 |
I was testing the MCPs of Neo4j, Graphiti, Cognee, and Mem0 just before the workshop. 00:27:50.380 |
And I'm a strong believer that this is our future. 00:27:56.140 |
And in a second, I will be showing a mini GraphRAG chat arena. 00:28:01.320 |
And next up, something very, very important that Daniel does is temporal graphs. 00:28:09.720 |
They have 10,000 stars on GitHub and growing very fast. 00:28:26.980 |
So I'm here today to tell you that there's no one-size-fits-all memory. 00:28:36.820 |
And why you need to model your memory after your business domain. 00:28:42.580 |
So if you saw me a little bit earlier and I was talking about Graphiti, Zep's open source 00:28:48.020 |
temporal graph framework, you might have seen me speak to how you can build custom entities 00:28:56.020 |
and edges in the Graphiti graph for your particular business domain. 00:29:02.700 |
So business objects from your business domain. 00:29:05.640 |
What I'm going to demo today is actually how Zep implements that and how easy it is to use. 00:29:13.740 |
And what we've done here is we've solved a fundamental problem plaguing memory. 00:29:18.660 |
And we're enabling developers to build out memory that is far more cogent and capable for many use cases. 00:29:31.580 |
So I'm going to just show you a quick example of where things go really wrong. 00:29:38.940 |
So many of you might have used ChatGPT before. 00:29:44.780 |
And you might have noticed that it really struggles with relevance. 00:29:49.060 |
Sometimes it just pulls out all sorts of arbitrary facts about you. 00:29:53.060 |
And unfortunately, when you store arbitrary facts and retrieve them as memory, you get inaccurate responses. 00:30:02.420 |
And the same problem happens when you're building your own agents. 00:30:10.460 |
Say I'm building a media player agent. It should remember things about jazz music, NPR podcasts, The Daily, et cetera, all the things I listen to. 00:30:18.420 |
But unfortunately, because I'm in conversation with the agent or it's picking up my voice when 00:30:22.700 |
I'm, you know, it's a voice agent, it's learning all sorts of irrelevant things, like I wake 00:30:28.380 |
up at 7:00 a.m., my dog's name is Melody, et cetera. 00:30:33.220 |
And the point here is that irrelevant facts pollute memory. 00:30:39.260 |
They're not specific to the media player business domain. 00:30:43.100 |
And so the technical reality here is as well that many frameworks take this really simplistic approach. 00:30:52.340 |
If you're using a framework that has memory capabilities, agent framework, it's generating 00:30:56.860 |
facts and throwing it into a vector database. 00:30:59.020 |
And unfortunately, the facts dumped into the vector database or Redis mean that when you're 00:31:04.240 |
recalling that memory, it's difficult to differentiate what should be returned. 00:31:08.100 |
We're going to return what is semantically similar. 00:31:12.020 |
And here we have a bunch of facts that are semantically similar to my request for my favorite tunes. 00:31:22.200 |
And unfortunately, Melody is there as well, because Melody is a dog named Melody. 00:31:27.720 |
And that might be something to do with tunes. 00:31:37.880 |
So basically, semantic similarity is not business relevance. 00:31:45.960 |
I was speaking a little bit earlier about how vectors are just basically projections into a semantic space. 00:31:52.220 |
There are no causal or relational links between them. 00:32:00.720 |
We need domain-aware memory, not better semantic search. 00:32:06.200 |
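To make that failure mode concrete, here is a small self-contained sketch with an off-the-shelf embedding model (sentence-transformers with a common default model; the facts are the ones from the example above). It shows how pure cosine similarity can rank the dog's name near a music query:

```python
# Semantic-similarity retrieval pulling in a domain-irrelevant fact.
# Requires: pip install sentence-transformers
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

facts = [
    "User likes jazz music and NPR podcasts.",
    "User's dog is named Melody.",
    "User wakes up at 7:00 a.m.",
]
query = "Play my favorite tunes."

fact_embeddings = model.encode(facts, convert_to_tensor=True)
query_embedding = model.encode(query, convert_to_tensor=True)
scores = util.cos_sim(query_embedding, fact_embeddings)[0]

# "Melody" can score close to a music query on semantics alone,
# even though it has nothing to do with the media-player domain.
for fact, score in sorted(zip(facts, scores.tolist()), key=lambda p: -p[1]):
    print(f"{score:.3f}  {fact}")
```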
So with that, I am going to, unfortunately, be showing you a video, because the Wi-Fi has been unreliable. 00:32:37.860 |
In the video, a financial assistant agent is asking me, well, how much do I earn a year? 00:32:42.760 |
It's asking me about what student loan debt I might have. 00:32:46.500 |
And you'll see that on the right-hand side, what is stored in Zep's memory are some very 00:32:59.260 |
specific business objects: we have financial goals, debts, income sources, et cetera. 00:33:08.780 |
And they're defined in a way which is really simple to understand. 00:33:21.020 |
So let's go take a look at some of the code here. 00:33:23.920 |
We have a TypeScript financial goal schema using Zep's underlying SDK. 00:33:32.160 |
We can give a description to the entity type. 00:33:35.120 |
We can even define fields, the business rules for those fields, the values that they take on. 00:33:41.320 |
And then we can build tools for our agent to retrieve a financial snapshot which runs multiple 00:33:47.780 |
Zep searches concurrently and filters by specific node types. 00:33:58.100 |
And when we start our Zep application, what we're going to do is we're going to register 00:34:02.600 |
these particular objects with Zep so it knows to build this ontology in the graph. 00:34:15.880 |
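The demo's schema is in TypeScript; to keep one language across this writeup, here is the same idea sketched in Python with Pydantic models, which is how Graphiti (the framework underneath Zep) accepts custom entity types. Field names are illustrative, not Zep's actual schema:

```python
# Illustrative custom entity types for a financial-coaching domain.
# Registering an ontology like this is what keeps extraction domain-aware:
# only business objects get captured, not arbitrary facts.
from pydantic import BaseModel, Field

class FinancialGoal(BaseModel):
    """A goal the user is saving or planning toward."""
    goal_type: str | None = Field(None, description="e.g. 'retirement', 'house deposit'")
    target_amount: float | None = Field(None, description="Target amount in USD")
    target_date: str | None = Field(None, description="ISO date the user is aiming for")

class DebtAccount(BaseModel):
    """A debt or recurring obligation the user holds."""
    debt_kind: str | None = Field(None, description="e.g. 'student loan', 'rent'")
    monthly_payment: float | None = Field(None, description="Payment per month in USD")

# With Graphiti, models like these are passed at ingestion time, e.g.
# entity_types={"FinancialGoal": FinancialGoal, "DebtAccount": DebtAccount}.
```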
I'm going to say that I have $5,000 a month rent. 00:34:25.720 |
And in a few seconds, we see that Zep has already parsed that new message and has captured that fact. 00:34:38.780 |
And we can see the knowledge graph for this user has got a debt account entity. 00:34:45.240 |
It's got fields on it that we've defined as a developer. 00:34:50.220 |
And so again, we can get really precise about what we retrieve from Zep by filtering. 00:34:57.620 |
So just very quickly, we wrote a paper about how all of this works. 00:35:01.400 |
You can get to it by that link below and appreciate your time today. 00:35:14.720 |
So while I'm getting ready, I would appreciate it if you could confirm with me whether you have access. 00:35:22.520 |
Is the Slack working for you, the Slack channel? 00:35:27.420 |
So I'd appreciate it if you send in any questions you have for any of the speakers. 00:35:36.700 |
And we are happy to answer more of these questions just after the workshop. 00:35:41.160 |
I'll right now move on to a use case that I developed, and to this GraphRAG chat arena. 00:35:51.720 |
To be specific, before delving into agentic memory, into knowledge graphs, I led a private 00:36:04.860 |
cyber security lab and worked for defence clients. 00:36:10.180 |
A very big client with very serious problems on the security side. 00:36:16.320 |
And in one project, I had to navigate between something like 27 or 29 different terminals. 00:36:33.840 |
Like, if you think of different Linux distros, every firewall and networking device usually 00:36:42.740 |
has its own shell. So you need to know lots of languages to communicate with these machines to work with such clients. 00:36:48.060 |
And I realized that LLMs are not only amazing to translate these languages, but they are also 00:36:54.000 |
very good to kind of create a new type of shell, a human language shell. 00:37:00.600 |
But such shells, they would really be excellent if they have episodic memory, the sort of temporal 00:37:10.860 |
memory of what was happening in this shell historically. 00:37:14.440 |
And if we have access to this temporal history, the events, we kind of know what the users were doing. 00:37:23.780 |
We kind of can control every single code execution function that's running, including the ones executed by agents. 00:37:30.420 |
So, together with some investors and advisers of mine, I spotted a niche. 00:37:38.780 |
And I wanted to do a super quick demo of how it would work. 00:37:45.020 |
So basically, you would run commands and type pwd. 00:37:51.380 |
And in a sense, I suppose lots of us had computer science classes or we worked in shell. 00:37:58.980 |
And we have to remember all of these commands, like, show me running Docker containers. 00:38:08.200 |
But if you go for more advanced commands, 00:38:39.540 |
in general, I would need to know right now some command that can extract, for instance, 00:38:45.200 |
the name of the container that's running and its status. 00:38:53.720 |
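Since the live demo doesn't reproduce well on paper, here is a toy sketch of the idea: a shell loop that translates English into commands and keeps an episodic log of every request, the command it became, and the output. The translate() step is a canned placeholder where a real system would call an LLM conditioned on the history; nothing here is the actual product:

```python
# A toy "human-language shell" with episodic memory of what was run and when.
import datetime
import subprocess

HISTORY = []  # episodic log spanning the session

def translate(request: str) -> str:
    # Placeholder for an LLM call (English -> shell), ideally using HISTORY
    # as context. Two canned examples stand in for the model:
    canned = {
        "show me running docker containers": "docker ps",
        "name and status of running containers":
            'docker ps --format "{{.Names}}: {{.Status}}"',
    }
    return canned.get(request.lower().strip(), "echo 'not understood'")

def run(request: str) -> str:
    command = translate(request)
    result = subprocess.run(command, shell=True, capture_output=True, text=True)
    HISTORY.append({                      # the temporal, episodic record
        "ts": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "request": request,
        "command": command,
        "output": result.stdout,
    })
    return result.stdout

print(run("show me running docker containers"))
```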
I can make mistakes, like human language fuzzy mistakes. 00:39:14.920 |
So basically, if you plug in the agentic memory into things like 00:39:24.380 |
that -- I think it got it wrong, but you get me right. 00:39:29.000 |
So if I go through, like, different shells and terminals, and I have this textual context 00:39:36.060 |
of what was done and the context of the certain machine, of what is happening there, 00:39:43.860 |
and it kind of spans across all the machines, all the users, and all the sessions 00:39:49.200 |
in PTYs and TTYs, I think that we can really have a very good context, also for security. 00:39:57.140 |
So that space, the temporal logs, the episodic logs, is something that I see will boom. 00:40:04.500 |
So I believe that all of our agents that will be executing code in terminals will be executing 00:40:13.100 |
it through -- maybe not all, but the enterprise-grade ones -- 00:40:19.020 |
They will be going through agentic firewalls. 00:40:27.880 |
And now let's move on to the GraphRAG chat arena. 00:40:36.880 |
And this doc is allowing you to set up a repo that we've created for this workshop. 00:40:45.360 |
So about a year ago, I met with Jerry Liu from LlamaIndex and we were chatting quite a while 00:40:50.740 |
about how to evolve this conversational memory. 00:41:01.820 |
Data abstractions, I kind of quickly solved within like two months. 00:41:05.440 |
Evals, I realized that there won't be any evals in form of a benchmark. 00:41:11.020 |
All of these hot potatoes and all of that, it's fun. 00:41:13.520 |
I know that there are great papers written by our guest speakers and other folks about these hot topics. 00:41:20.900 |
You can't do a benchmark for a thing that doesn't exist. 00:41:25.060 |
Basically, the agentic GraphRAG memory will be this type of memory that evolves. 00:41:33.780 |
So if you don't know how it will evolve, you will need a simulation arena. 00:41:42.920 |
So one year fast-forward, and we've created a prototype of such an agentic memory arena. 00:41:50.340 |
Think about it like WebArena, but for memory. 00:42:04.320 |
One approach will be sort of the repo, the library itself, and the other is through MCPs. 00:42:12.300 |
Because we don't really know what will work out better. 00:42:15.100 |
So whether repos or the MCPs will work out better. 00:42:17.400 |
So we need to test these different approaches. 00:42:28.420 |
So we get this nice chat where you can talk to these agents. 00:42:39.600 |
And there is a Neo4j agent running behind the scenes. 00:42:43.160 |
There is a Cypher graph agent running behind the scenes. 00:42:47.020 |
And I can kind of for now switch between these agents. 00:42:50.520 |
Maybe I'll increase the font size a little bit. 00:42:52.800 |
So the Neo agent is basically answering the questions about this amazing technology, the graphs, specifically Neo4j. 00:43:04.440 |
And then an agent that is excellent at running Cypher queries talks with me. 00:43:10.660 |
And I'm writing: add to graph that I'm Mark and I'm passionate about memory architectures. 00:43:16.520 |
And basically what it does is it runs these layers that are created by Cognee, by Mem0, by Graphiti, and all the other vendors of semantic and temporal memory solutions. 00:43:29.520 |
Or, specifically, created by the MCP server that Alex was demonstrating, the Neo4j MCP server. 00:43:40.180 |
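As a rough idea of what the Cypher agent might execute for that request, here is an equivalent write using the official Neo4j Python driver. The node labels and relationship type are my guesses; the arena's actual prompts and generated queries aren't shown in the talk:

```python
# Writes "Mark is passionate about memory architectures" into Neo4j.
# Connection details are placeholders; MERGE keeps the write idempotent.
from neo4j import GraphDatabase

driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))
with driver.session() as session:
    session.run(
        """
        MERGE (p:Person {name: $name})
        MERGE (t:Topic {name: $topic})
        MERGE (p)-[:PASSIONATE_ABOUT]->(t)
        """,
        name="Mark",
        topic="memory architectures",
    )
driver.close()
```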
So I'm really looking forward to how this technology evolves. 00:43:46.260 |
But what I quickly wanted to show you is that it already works. 00:43:50.460 |
It already has the essence of this agentic memory arena. 00:43:54.920 |
So I can ask my graph questions, and the agent goes through the connection. 00:44:07.660 |
It's just one Neo4j graph on the backend, and all of these technologies can be tested: 00:44:13.360 |
how the graphs are being created and retrieved. 00:44:16.600 |
It's like -- when I think of that, it's like the most brilliant idea that we can do with agentic memory. 00:44:29.240 |
I can basically rerun the commands to see what's happening on this graph. 00:44:38.640 |
And the next thing is, I would like to add to the graph that Vasilija will show how to integrate. 00:44:55.060 |
But then I transfer it to Graphiti, and I can repeat the exact same process. 00:45:00.460 |
So I can right now, using Graphiti, search for what I just added. 00:45:04.960 |
And I can switch between these different memory solutions. 00:45:10.800 |
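The Graphiti side of that round-trip looks roughly like this, with class and method names taken from graphiti-core's public README; exact signatures may differ by version, and LLM/embedding credentials are assumed to be configured:

```python
# Searches the temporal graph for the fact that was just added.
import asyncio
from graphiti_core import Graphiti

async def main():
    graphiti = Graphiti("bolt://localhost:7687", "neo4j", "password")
    # Hybrid semantic + graph search over edges (facts).
    results = await graphiti.search("What is Mark passionate about?")
    for edge in results:
        print(edge.fact)  # e.g. "Mark is passionate about memory architectures"
    await graphiti.close()

asyncio.run(main())
```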
And we do not have time to practice it together and do the workshop, but I'm sure we will write it up. 00:45:20.040 |
And I would appreciate, if you have any questions, pass them on to Slack. 00:45:25.800 |
I will ask Andreas whether we have time for a short Q&A. 00:45:30.480 |
Or we need to move it to the breakout or outside of the room. 00:45:41.680 |
I really would like Vasilija, Daniel, and Alex to come back to stage so you can ask any of 00:45:47.480 |
us, please direct the questions to any of us, and we'll try to answer them. 00:46:00.180 |
How do you decide what is a bad memory over time? 00:46:04.880 |
Because you could, like, as a developer and as a person, we evolve the line of thought. 00:46:10.480 |
So one thing that you thought was good, like, three years, ten years ago, may not be good now. 00:46:21.260 |
So I will answer first -- maybe you guys can help. 00:46:27.740 |
So basically, a bad memory is the one that causes a lot of noise. 00:46:33.740 |
So you decrease noise by redundancy and by relationships. 00:46:39.640 |
So the fewer the relationships and the more the noisiness -- in a sense, a not-well-connected 00:46:47.820 |
node has the potential of not being correct, but there are other ways to validate that. 00:46:57.340 |
A practical way: we let you model the data with Pydantic, so you can kind of pull the data 00:47:03.180 |
you need and add weights to the edges and nodes. 00:47:06.420 |
So you can do something like temporal weighting, you can add your custom, let's say, logic and 00:47:10.600 |
then effectively you would know how your data is kind of evolving in time and how it's becoming 00:47:15.260 |
less or more relevant and what is the set of algorithms you would need to apply. 00:47:19.600 |
So this is the idea, not solve it for you, but help you solve it with tooling. 00:47:23.880 |
But yeah, there is -- depends on the use case, I would say. 00:47:28.780 |
I think what I would add is that there are missing causal links. 00:47:33.740 |
Missing causal links are most probably a good indicator of fuzziness. 00:47:44.960 |
How would you embed in security or privacy into the network or the application layer? 00:47:51.620 |
If there's a corporate, they have top secret data, or I have personal data that is a graph -- how do you protect it? 00:48:04.380 |
So basically, you do have to have that context. 00:48:07.100 |
You do have to have these decisions, intentions of colonels, of majors, and anyone in the 00:48:13.280 |
enterprise -- like CISOs and anyone in the enterprise stack. 00:48:17.640 |
And in a sense, it also gets kind of fuzzy and complex, so I expect this to be a very challenging area. 00:48:25.140 |
But I'm sure that applying ontologies, the right ontologies, first of all, to this enterprise 00:48:30.400 |
cybersecurity stack really kind of provides these guardrails for navigating this challenging 00:48:38.040 |
problem and decreasing this fuzziness and errors. 00:48:43.040 |
I would also just add, like, all these applications are built on Neo4j. 00:48:46.740 |
And so in Neo4j, you can, like, do role-based access controls, and so you can prevent users 00:48:53.200 |
from accessing data that they're not allowed to see. 00:48:55.580 |
So it's something that you can configure with that. 00:49:10.460 |
Like, we also noticed that if you isolate a graph per user or kind of keep it, like, 00:49:11.460 |
very physically separate, for us, it really works well. 00:49:19.840 |
Mark, in your earlier presentation, you mentioned this equation that relates to gravity, entropy, and curvature. 00:49:28.220 |
Could you show those two again and explain them again? 00:49:33.720 |
Other than that, it's probably for a series of papers to properly explain that. 00:49:41.780 |
The other one is that if you take all the attention, diffusion, and VAEs, which are doing 00:49:45.840 |
the smoothing, it preserves the sort of asymmetries. 00:49:50.000 |
So very briefly speaking, let's set up the vocabulary. 00:49:53.060 |
So first of all, curvature equals attention equals gravity. 00:49:57.280 |
This is the very simple, most important principle here. 00:50:00.520 |
When writing these papers, we are really trying to define these three tightly. 00:50:11.640 |
And if it's not the exact same thing, if there are other definitions, we need to show what's different. 00:50:17.040 |
And now, if you think about attention, it kind of shows the sort of, like, pathways toward the answer. 00:50:25.360 |
If you take a sphere, if you start bending that sphere and make it like, you know, you kind 00:50:30.360 |
of try to extend it, two things happen, entropy increases and curvature increases, in a sense. 00:50:37.560 |
And what Perelman did: he proved that you can, like, bend these spheres in any way, in 3D; 00:50:43.360 |
4D and 5D and higher-dimensional spheres were already solved. 00:50:49.520 |
And these equations are proving that basically there won't be any other architectures for LLMs. 00:50:55.360 |
It will be just attention, diffusion models, and VAEs. 00:50:58.400 |
Maybe not just VAEs, but, like, kind of, something that smooths -- that leaves room for biases. 00:51:12.400 |
And we'll answer the questions outside of the room.