The Hidden Life of Embeddings: Linus Lee

00:00:14.000 |
Hey everyone, I'm Linus, and I'm here to talk about embeddings. 00:00:17.000 |
I'm grateful to be here at the inaugural AI engineer conference. 00:00:24.000 |
Before I talk about that, a little bit about myself. 00:00:28.000 |
I've been working on AI at Notion for the last year or so. 00:00:32.000 |
Before that, I did a lot of independent work prototyping, 00:00:34.000 |
experimenting with, trying out different things with language models, 00:00:37.000 |
with traditional NLP, things like TF-IDF, BM25, 00:00:40.000 |
to build interesting interfaces for reading and writing. 00:00:42.000 |
In particular, I worked a lot with embedding models 00:00:44.000 |
and latent spaces of models, which is what I'll be talking about today. 00:00:48.000 |
But before I do that, I want to take a moment to say 00:00:51.000 |
it's been almost a year since Notion launched Notion AI. 00:00:54.000 |
Our public beta was first announced in around November 2022. 00:00:58.000 |
So as we get close to a year, we've been steadily launching new and interesting features inside Notion AI. 00:01:04.000 |
Since November, we've added AI autofill inside databases, translation, and there are things coming soon, 00:01:10.000 |
though not today, so keep an eye on the space. 00:01:12.000 |
And obviously we're hiring, just like everybody else here. 00:01:14.000 |
We're looking for AI engineers, product engineers, machine learning engineers 00:01:18.000 |
to tackle the full gamut of problems that people have been talking about today. 00:01:21.000 |
Agents, tool use, evaluations, data, training, and all the interface stuff that we'll see today and tomorrow. 00:01:28.000 |
So if you're interested, please grab me, and we'll have a little chat. 00:01:32.000 |
Now, it wouldn't be a Linus talk without talking about latent spaces, so let's talk about it. 00:01:36.000 |
One of the problems that I always find myself motivated by is the problem of steering language models. 00:01:41.000 |
And I always say that prompting language models feels a lot like you're steering a car from the backseat with a pool noodle. 00:01:46.000 |
Like, yes, technically you have some control over the motion of the vehicle. 00:01:51.000 |
But you're not really in the driver's seat, the control isn't really there, it's not really direct. 00:01:55.000 |
There's like three layers of indirection between you and what the vehicle's doing. 00:01:58.000 |
And to me, trying to prompt a model, especially the smaller, more efficient models that we can use in production, 00:02:05.000 |
with just tokens, just prompts, feels like there are too many layers of indirection. 00:02:10.000 |
And even though models are getting better at understanding prompts, I think there's always going to be this fundamental barrier 00:02:14.000 |
between indirect control of models with just prompts and getting the model to do what we want them to do. 00:02:21.000 |
And so perhaps we can get a closer, more direct layer of control by looking inside the model. 00:02:28.000 |
Latent spaces arise, I think, most famously inside embedding models. 00:02:35.000 |
If you embed some piece of text, that vector of 1536 numbers or 1024 numbers is inside a high-dimensional vector space. 00:02:42.000 |
That's a latent space, but also you can look at the latent spaces inside activation spaces of models, inside token embeddings, inside image models, 00:02:49.000 |
and then obviously other model architectures like auto-encoders. 00:02:51.000 |
Today we're going to be looking at embedding models, but I think a lot of the general takeaways apply to other models, 00:02:56.000 |
and I think there's a lot of fascinating research work happening inside other models as well. 00:03:00.000 |
When you look at an embedding, you kind of see this, right? 00:03:04.000 |
If you've ever debugged an embedding pipeline and printed out an embedding, you see rows and rows of numbers. 00:03:09.000 |
You can kind of tell it has like a thousand numbers, but it's like looking at a Matrix-style screen of numbers raining down. 00:03:14.000 |
But in theory, there's a lot of information actually packed inside those embeddings. 00:03:18.000 |
If you get an embedding of a piece of text or an image, these latent spaces, these embeddings represent, in theory, the most salient features of the text or image that the model is using to lower its loss or do its task. 00:03:29.000 |
And so maybe if we can disentangle some meaningful attributes or features out of these embeddings, 00:03:34.000 |
if we can look at them a little more closely and interpret them a little better, 00:03:37.000 |
maybe we can build more expressive interfaces that let us control the model by interfering or intervening inside it. 00:03:43.000 |
Another way to say that is that embeddings show us what the model sees in a sample of input. 00:03:49.000 |
So maybe we can read out what it sees and try to understand better what the model's doing. 00:03:53.000 |
And maybe we can even control the embeddings and intermediate activations to steer what the model generates. 00:04:04.000 |
Some of this, some of you might have seen before, but I promise there's some new stuff at the end, so hang tight. 00:04:18.000 |
It's a sentence about one of my favorite novels, named Diaspora. 00:04:22.000 |
It's a science fiction novel by Greg Egan that explores evolution and existence, post-human artificial intelligences, 00:04:27.000 |
something to do with alien civilizations, and questioning the nature of reality and consciousness, 00:04:31.000 |
which you might be doing a lot of given all the things that are happening. 00:04:35.000 |
And so I have trained this model that can generate some embeddings out of this text. 00:04:41.000 |
So if I hit enter, it's going to give us an embedding. 00:04:44.000 |
But it's an embedding of length 2048, and so it's quite large. 00:04:49.000 |
But then I have a decoder half of this model that can take this embedding and try to reconstruct 00:04:54.000 |
the original input that may have produced this embedding. 00:04:57.000 |
So in this case, it took the original sentence. 00:05:00.000 |
You can tell it's not exactly the same length, maybe. 00:05:02.000 |
But it's mostly reconstructed the original sentence, including the specific details. 00:05:08.000 |
So we have an encoder that's going from text to embedding and a decoder that's going from embedding back to text. 00:05:14.000 |
And now we can start to do things with the embedding to vary it a little bit and see what the decoder might see 00:05:19.000 |
if we make some modifications to the embedding. 00:05:22.000 |
So here I've tried to kind of blur the embedding and sample some points around the embedding with this blur radius. 00:05:29.000 |
And you can see the text that's generated from those blurry embeddings. 00:05:38.000 |
It still kept the name Greg, but it's a different person. 00:05:40.000 |
And so there's kind of a semantic blur that's happened here. 00:05:47.000 |
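A minimal sketch of what that blur might look like in code, assuming hypothetical `encode` and `decode` functions standing in for the two halves of the model:

```python
import numpy as np

def blur_samples(embedding: np.ndarray, radius: float, n: int = 4) -> list:
    """Sample n points in a Gaussian ball of the given radius around an embedding."""
    rng = np.random.default_rng()
    return [embedding + rng.normal(scale=radius, size=embedding.shape) for _ in range(n)]

# Hypothetical usage with the encoder/decoder halves described in the talk:
# emb = encode("Diaspora is a science fiction novel by Greg Egan...")
# for noisy in blur_samples(emb, radius=0.1):
#     print(decode(noisy))  # each decode is a "semantically blurred" variant
```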
What's a little more useful is trying to actually manipulate things in more meaningful directions. 00:05:55.000 |
So maybe I want to find a direction in this embedding space. 00:05:59.000 |
Here I've computed a direction where if you push an embedding in that direction, 00:06:02.000 |
that's going to represent a shorter piece of text of roughly the same topic. 00:06:09.000 |
And it'll try to push the embedding of this text in that direction and decode it out. 00:06:15.000 |
And you can tell they're a little bit shorter. I can push it a little bit further, even. 00:06:19.000 |
So now I'm taking that shorter direction and moving a little farther along it 00:06:23.000 |
and sampling, generating text out of those embeddings again. 00:06:28.000 |
But they've still kept the general kind of idea, general topic. 00:06:33.000 |
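Here's a hedged sketch of one way to get such a direction, using the centroid-difference approach described later in the talk: embed contrasting example texts and subtract the group means. The helper names are assumptions, not the released code:

```python
import numpy as np

def feature_direction(embs_a: np.ndarray, embs_b: np.ndarray) -> np.ndarray:
    """Unit vector from the centroid of embedding group A toward group B."""
    direction = embs_b.mean(axis=0) - embs_a.mean(axis=0)
    return direction / np.linalg.norm(direction)

# Hypothetical usage: embed contrasting examples with the model's encoder,
# then nudge an embedding along the learned direction before decoding.
# shorter = feature_direction(encode_all(long_texts), encode_all(short_texts))
# print(decode(emb + 0.5 * shorter))   # a little shorter
# print(decode(emb + 1.0 * shorter))   # shorter still, same general topic
```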
And with that kind of building block, you can build really interesting interfaces. 00:06:36.000 |
Like, for example, I can plop this piece of text down here. 00:06:39.000 |
And maybe I want to generate a couple of sort of shorter versions. 00:06:52.000 |
And I'm going to make the sentiment of the sentence a little more negative. 00:06:57.000 |
And you can start to explore the latent space of this embedding model, this language model, 00:07:03.000 |
by actually moving around in a kind of spatial canvas interface, which is kind of interesting. 00:07:07.000 |
Another thing you can do with this kind of embedding model is now that we have a vague sense that there are specific directions in this space that mean specific things, 00:07:15.000 |
we can start to more directly look at a text and ask the model, hey, where does this piece of text lie along your length direction or along your negative sentiment direction? 00:07:26.000 |
So this is the original text that we've been playing with. 00:07:29.000 |
It's pretty objective, like a Wikipedia-style piece of text. 00:07:31.000 |
Here I've asked ChatGPT to take the original text and make it sound a lot more pessimistic. 00:07:36.000 |
So things like the futile quest for meaning and plunging deeper into the abyss of nihilism. 00:07:41.000 |
And if I embed both of these, what I'm asking the model to do here is embed both of these things in the embedding space of the model, 00:07:48.000 |
and then project those embeddings down onto each of these directions. 00:07:53.000 |
So one way to read this table is that this default piece of text is at this point in this negative direction, 00:07:59.000 |
which by itself doesn't mean anything, but it's clearly less than this. 00:08:02.000 |
So this piece of text is much further along the negative sentiment axis inside this model. 00:08:06.000 |
When you look at other properties, like how much it talks about artistic kinds of topics, it's roughly the same, 00:08:11.000 |
the length is roughly the same, and maybe the negative-sentiment text is a bit more elaborate in its vocabulary. 00:08:20.000 |
And so you can start to project these things onto these meaningful directions and ask, 00:08:23.000 |
what features, what attributes, is the model finding in the text that we're feeding it? 00:08:29.000 |
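As a sketch, the projection behind that table is just a dot product against each unit direction; the direction names and helpers here are assumptions:

```python
import numpy as np

def project(embedding: np.ndarray, directions: dict) -> dict:
    """Coordinate of an embedding along each named unit direction."""
    return {name: float(embedding @ d) for name, d in directions.items()}

# dirs = {"negative sentiment": neg_dir, "length": len_dir, "artistic": art_dir}
# print(project(encode(default_text), dirs))       # hypothetical usage
# print(project(encode(pessimistic_text), dirs))
# The absolute values mean little on their own; compare them across texts.
```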
Another way you could test out some of these ideas is by mixing embeddings. 00:08:34.000 |
And so here I'm going to embed both of these pieces of text. 00:08:37.000 |
This one's the one that we've been playing with. 00:08:38.000 |
This one is the beginning of a short story that I wrote once. 00:08:41.000 |
It's about this town on the Mediterranean coast that's calm and a little bit old. 00:08:49.000 |
And so I'm going to say, this is a 2,000-dimensional embedding. 00:08:53.000 |
I'm going to say, give me a new embedding that's just the first 1,000 or so dimensions from the one embedding, 00:08:58.000 |
and then take the last 1,000 dimensions of the second embedding, and just, like, slam them together. 00:09:04.000 |
And naively, you wouldn't really think that that would amount to much. 00:09:07.000 |
But actually, if you generate some samples from it, you can see 00:09:13.000 |
you get a sentence that's kind of a semantic mix of both. 00:09:16.000 |
You have structural similarities to both of those things. 00:09:19.000 |
Like you have this structure where there's a quoted kind of title of a book in the beginning. 00:09:22.000 |
There's topical similarities, there's punctuation similarities, tone similarities. 00:09:27.000 |
And so this is an example of interpolating in latent space. 00:09:31.000 |
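A sketch of that dimension-splicing mix, plus plain linear interpolation for comparison; the 1,024 split point assumes the 2,048-dimensional embedding from earlier:

```python
import numpy as np

def splice(a: np.ndarray, b: np.ndarray, split: int = 1024) -> np.ndarray:
    """Take the first `split` dimensions from a and the rest from b."""
    return np.concatenate([a[:split], b[split:]])

def lerp(a: np.ndarray, b: np.ndarray, t: float) -> np.ndarray:
    """Conventional linear interpolation between two embeddings."""
    return (1 - t) * a + t * b

# mixed = splice(encode(diaspora_text), encode(short_story_text))  # hypothetical
# print(decode(mixed))  # often a semantic blend of both inputs
```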
The last thing I have, which you may have seen on Twitter, is about: 00:09:36.000 |
okay, I have this embedding model, and I have kind of an un-embedding model. 00:09:40.000 |
Can I use this un-embedding model and somehow fine-tune it or otherwise adapt it so we can read out text from other kinds of embedding spaces? 00:09:48.000 |
So this is the same sentence we've been using, but now when I hit this run button, 00:09:52.000 |
it's going to embed this text not using my embedding model, but using OpenAI's text-embedding-ada-002. 00:09:58.000 |
And then there's a linear adapter that I've trained so that my decoder model can read out not from my embedding model, but from OpenAI's. 00:10:08.000 |
It's going to try to decode out the text given just the OpenAI embedding. 00:10:12.000 |
And you can see, okay, it's not perfect, but there's a surprising amount of detail that we've recovered out of just the embedding. 00:10:24.000 |
So you can see this proper noun, diaspora, it's surprisingly still in there. 00:10:28.000 |
This feature where there's a quoted title of a book is in there. 00:10:31.000 |
It's roughly about the same topic, things like the rogue AI. 00:10:34.000 |
Sometimes when I rerun this, there's also references to the author where the name is roughly correct. 00:10:40.000 |
So even surprising features like proper nouns, punctuation, things like the quotes, general structure and topic, obviously, 00:10:48.000 |
those are recoverable given just the embedding because of the amount of detail that these high-capacity embedding spaces have. 00:10:55.000 |
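A rough sketch of that cross-model readout; the shapes follow text-embedding-ada-002's 1,536 dimensions and this model's 2,048, but the adapter weights and `decode` are stand-ins, not the released code:

```python
import numpy as np

ADA_DIM, MODEL_DIM = 1536, 2048

# In practice W and b come from training the adapter (sketched near the end).
W = np.zeros((MODEL_DIM, ADA_DIM))
b = np.zeros(MODEL_DIM)

def adapt(foreign_embedding: np.ndarray) -> np.ndarray:
    """Linearly map a foreign embedding into this model's embedding space."""
    return W @ foreign_embedding + b

# ada_emb = openai_embed('"Diaspora" is a science fiction novel...')  # hypothetical
# print(decode(adapt(ada_emb)))  # recover text from just the OpenAI embedding
```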
But not only can you do this in the text space, you can also do this in image space. 00:11:07.000 |
And for dumb technical reasons, I have to put two of me in. 00:11:10.000 |
And then let's try to interpolate in this image space. 00:11:13.000 |
So this is now using CLIP's embedding space. 00:11:15.000 |
I'm going to try to generate, say, like six images in between me and the Notion avatar version of me, the cartoon version of me, 00:11:24.000 |
if the backend will warm up; cold-starting models is sometimes difficult. 00:11:33.000 |
So now it's generating six images, bridging, kind of interpolating between the photographic version of me and the cartoon version of me. 00:11:39.000 |
And again, it's not perfect, but you can see here, on the left, it's quite photographic. 00:11:45.000 |
And then as you move further down this interpolation, you're seeing more kind of cartoony features appear. 00:11:51.000 |
And it's actually quite a surprisingly smooth transition. 00:11:54.000 |
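Spherical interpolation (slerp) is the usual way to walk between points like these, since CLIP embeddings live near a hypersphere; this is a generic sketch, not necessarily the demo's exact method:

```python
import numpy as np

def slerp(a: np.ndarray, b: np.ndarray, t: float) -> np.ndarray:
    """Interpolate along the great circle between unit vectors a and b."""
    a = a / np.linalg.norm(a)
    b = b / np.linalg.norm(b)
    omega = np.arccos(np.clip(a @ b, -1.0, 1.0))
    if omega < 1e-6:  # nearly identical vectors: fall back to one endpoint
        return a
    return (np.sin((1 - t) * omega) * a + np.sin(t * omega) * b) / np.sin(omega)

# frames = [unclip_decode(slerp(clip_embed(photo), clip_embed(avatar), t))
#           for t in np.linspace(0, 1, 6)]  # hypothetical CLIP/unCLIP helpers
```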
Another thing you can do on top of this is text manipulations, because CLIP is a multimodal text-and-image model. 00:12:06.000 |
I'm going to subtract the vector for a photo of a smiling man. 00:12:12.000 |
And instead, I'm going to add the vector for a photo of a very sad, crying man. 00:12:21.000 |
And empirically, I find that for text, I have to be a little more careful. 00:12:24.000 |
So I'm going to dial down how much of those vectors I'm adding and subtracting. 00:12:55.000 |
This time, maybe just generate four for the sake of time. 00:12:58.000 |
Or maybe there's a bug and it won't let me generate. 00:13:03.000 |
So in all these images that I've done, both in the text and image domain... 00:13:16.000 |
Okay, the beach didn't quite survive the latent space arithmetic. 00:13:19.000 |
But in all these demos, the only thing I'm doing is calculating embeddings for example texts and images 00:13:25.000 |
and just adding them together with some normalization. 00:13:30.000 |
And it's surprising that just by doing that, you can try to manipulate interesting features in text and images. 00:13:35.000 |
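Put together, the whole trick might look like this sketch: embed the prompts, scale the text directions down (as noted, they need a lighter touch), add and subtract, and renormalize. All the helper names are hypothetical:

```python
import numpy as np

def edit_embedding(img_emb: np.ndarray, remove: np.ndarray, add: np.ndarray,
                   strength: float = 0.3) -> np.ndarray:
    """Subtract one direction, add another, and restore the original norm."""
    edited = img_emb - strength * remove + strength * add
    return edited * (np.linalg.norm(img_emb) / np.linalg.norm(edited))

# smiling = clip_embed_text("a photo of a smiling man")          # hypothetical
# crying = clip_embed_text("a photo of a very sad, crying man")  # hypothetical
# image = unclip_decode(edit_embedding(clip_embed(photo), smiling, crying))
```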
And with this, you can also do things like add style and subject at the same time. 00:13:42.000 |
This is a cool image that I generated when I made my first demo. 00:13:44.000 |
And then you can also do some pretty smooth transitions between landscape imagery. 00:13:53.000 |
In all these prototypes, one principle that I've tried to reiterate to myself is that oftentimes, when you're studying these very complex, sophisticated models, 00:14:04.000 |
you don't necessarily have the ability to look inside and say, "Okay, what's happening?" 00:14:08.000 |
Even getting an intuitive understanding of what is the model thinking, what is the model looking at, can be difficult. 00:14:12.000 |
And I think these are some of the ways that I've tried to render these invisible parts of the model a little bit more visible. 00:14:18.000 |
To let you a little more directly observe exactly what the model is seeing, 00:14:22.000 |
the representations the model is operating in. 00:14:26.000 |
And sometimes you can also take those and directly interact or let humans directly interact with the representations to explore what these spaces represent. 00:14:34.000 |
And I think there's a ton of interesting, pretty groundbreaking research that's happening here. 00:14:39.000 |
On the left here is the Othello world model paper, which is fascinating. 00:14:44.000 |
And then on the right is very, very recent work. I had to add this in last minute because it's super relevant. 00:14:49.000 |
In a lot of these examples, I've calculated these feature dimensions by just giving examples and calculating centroids between them. 00:14:54.000 |
But here, Anthropic's new work, along with other work from Conjecture and other labs, has found unsupervised ways to automatically discover these dimensions inside models. 00:15:03.000 |
And in general, I'm really excited to see latent spaces that appear to encode, by some definition, interpretable, controllable representations of the model's input and output. 00:15:12.000 |
I want to talk a little bit in the last few minutes about the models that I'm using. 00:15:18.000 |
I won't go into too much detail, but it's fine-tuned from a T5 checkpoint as a denoising autoencoder. 00:15:24.000 |
It's an encoder-decoder transformer with some modifications that you can see in the code. 00:15:32.000 |
I have some pooling layers to get an embedding. 00:15:34.000 |
This is like a normal T5 embedding model stack. 00:15:36.000 |
And then on the right, I have this special kind of gated layer that pulls from the embedding to decode. 00:15:47.000 |
But we take this model, and we can adapt it to other models as well, as you saw with the OpenAI embedding recovery. 00:15:52.000 |
And so on the left is the normal training regime, where you have an encoder, you get an embedding, and you try to reconstruct the text. 00:15:57.000 |
On the right, we just train this linear adapter layer to go from embedding of a different model to then reconstruct the text with a normal decoder. 00:16:04.000 |
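As a hedged sketch of that adapter training, assuming a frozen decoder and a hypothetical reconstruction-loss helper wrapping it:

```python
import torch
import torch.nn as nn

ADA_DIM, MODEL_DIM = 1536, 2048
adapter = nn.Linear(ADA_DIM, MODEL_DIM)  # the only trainable component
optimizer = torch.optim.Adam(adapter.parameters(), lr=1e-4)

def train_step(foreign_embs: torch.Tensor, target_texts: list) -> float:
    """One step: map foreign embeddings over, score the frozen decoder's reconstruction."""
    optimizer.zero_grad()
    adapted = adapter(foreign_embs)
    # reconstruction_loss is a hypothetical wrapper around the frozen decoder's
    # cross-entropy against the original text.
    loss = reconstruction_loss(adapted, target_texts)
    loss.backward()
    optimizer.step()
    return loss.item()
```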
And today, I'm excited to share that these models that I've been dealing with, that you may have asked about before, are open on HuggingFace. 00:16:14.000 |
So you can go download them and try them out now. 00:16:18.000 |
And then there's a Colab notebook that lets you get started really quickly and try to do things like interpolation and interpretation of these features. 00:16:24.000 |
And so if you find any interesting results with these, please let me know. 00:16:27.000 |
And if you have any questions, also reach out and I'll be able to help you out. 00:16:31.000 |
The image model that I was using at the end was Kakao Brain's Karlo. 00:16:38.000 |
This model is an unCLIP model, which is trained kind of like the way DALL-E 2 was trained: a diffusion model trained to invert CLIP embeddings, 00:16:45.000 |
so it goes from CLIP embeddings of images back to images. 00:16:48.000 |
And that lets us do similar things as the text model that we used. 00:16:52.000 |
In all this prototyping, I think a general principle, if you have one takeaway from this talk, 00:16:57.000 |
it's that when you're working with these really complex models and kind of inscrutable pieces of data, 00:17:02.000 |
if you can get something into a thing that feels like it can fit in your hand, that you can play with, 00:17:06.000 |
that you can concretely see and observe and interact with, that can be directly manipulated and visualized, 00:17:11.000 |
all these things, all the tools and prototypes that you can build around these things, 00:17:14.000 |
I think help us get a deeper understanding of how these models work and how we can improve them. 00:17:18.000 |
And in that way, I think models, language models and image models, generative models, 00:17:24.000 |
are a really interesting laboratory for knowledge, for studying how these different kinds of modalities can be represented. 00:17:31.000 |
And Bret Victor said, "The purpose of a thinking medium is to bring thought outside the head, 00:17:36.000 |
to represent these concepts in a form that can be seen with the senses and manipulated with the body. 00:17:41.000 |
In this way, the medium is literally an extension of the mind." 00:17:44.000 |
And I think that's a great poetic way to kind of describe the philosophy that I've approached a lot of my prototyping with. 00:17:51.000 |
So, if you follow some of these principles and try to dig deeper into what the models are actually looking at, 00:17:56.000 |
build interfaces around them, I think more humane interfaces to knowledge are possible.