
NER With Transformers and spaCy (Python)


Transcript

Okay, in this video, we're going to have a look at using named entity recognition through the spaCy library, which I've covered in a previous video. But this time, rather than using the traditional spaCy models, we're going to use a transformer model. Now, the setup for this is pretty similar to normal spaCy.

So the very first thing we want to do is obviously install spaCy transformers. So to do that, all we need to write is pip install spaCy and then in square brackets, transformers. Now, alongside this, we can also install what we need to run it with CUDA. So if you have CUDA, you can check your version number like this.
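As a rough sketch, those two steps look like this (the transcript doesn't show the exact CUDA check; nvcc --version is one common way to do it):

    pip install 'spacy[transformers]'

    # one common way to check your installed CUDA version
    nvcc --version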

Okay, and so for me, I can see that I have CUDA 11.1. And so rather than just writing pip install spacy[transformers], I would also include my CUDA version inside the square brackets. So I write cuda, and I'm on 11.1, so I put cuda111.
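So for CUDA 11.1, the full install command would look like this, using spaCy's cuda111 extra:

    pip install 'spacy[transformers,cuda111]'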

And then I'd run this to install spaCy. But I already have it installed, so I'm not going to do that again. Now, once we have spaCy transformers installed, we also need to download the transformer model from spaCy. And to do that, we write python, then -m spacy to run the spaCy module, and then download.

And the transformer model is very similar to the other spaCy models if you've used them before. We have to do this for every spaCy model, not just the transformer model. So depending on which model you want, I'm going to be using the English one. There are other models as well for other languages.

We're using the English en_core_web model. And I think there's only a web transformer model, but I'm not 100% sure on that, so don't take my word for it. The transformer part of this is the trf at the end, whereas the models we typically use end in sm for small or lg for large.
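Putting that together, the download command for the English transformer model is:

    python -m spacy download en_core_web_trf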

This time we're using the transformer version of that model. Again, I already have it installed, so I'm not going to run it again. And now to start using this, we do what we would normally do with spaCy. So all we actually need to do is import spaCy. And I'm also going to import displaCy so that I can visualize the named entity recognition.

So from spaCy import displaCy. And then what we do is almost exactly the same as what we always do with spaCy. There's really no difference here. So we do nlp = spacy.load, and then here we include our model name. Usually we would write en_core_web_sm or en_core_web_lg.

This time we're going to write en_core_web_trf for the transformer. Run that. And once we've loaded that, we can perform named entity recognition as we normally would. So what we do is doc = nlp, and then in here we pass our text. And then we want to visualize that. And we can do that in Jupyter just like this.

So displacy.render, and then we pass in the doc that we just created. And we want the style to be ent, because we're doing NER here. Okay, so there we have our trf model.
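Putting the whole pipeline together, it looks roughly like this (the example sentence is a placeholder of mine, not the text from the video):

    import spacy
    from spacy import displacy

    # load the transformer-based English pipeline
    nlp = spacy.load("en_core_web_trf")

    # run the pipeline on some text (placeholder sentence)
    doc = nlp("Fastly reported revenue growth of 35% in Q1 '21.")

    # visualize the named entities in a Jupyter notebook
    displacy.render(doc, style="ent")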

Now I'm going to swap these around, because I want to compare this to what we would normally use. Let's use the large model, because it's probably the closest match to the transformer model. And I'll just call this lg, and modify this to be lg as well. And what we'll do is just do the same thing again here. And here. And rerun that. And you can see that in this case, there's actually no difference.
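In code, that comparison step looks something like this (nlp_lg and doc_lg are names of my own choosing):

    # load the traditional large pipeline for comparison
    nlp_lg = spacy.load("en_core_web_lg")

    # run it on the same text and visualize
    doc_lg = nlp_lg("Fastly reported revenue growth of 35% in Q1 '21.")
    displacy.render(doc_lg, style="ent")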

Now, I think this is to be expected because the traditional spaCy model, especially the large model, is still really good in terms of performance. So in most cases, or at least a fair few cases, we shouldn't really see that much difference because this is still a pretty solid model.

And we can see this as well here. So this is a longer text; I got this from the investing subreddit. And let's do the same thing again. So we're going to take the transformer model and also the traditional model, and use both of those to create docs. So just do it like this.
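Roughly like this, with the Reddit post elided as a placeholder:

    # long_text holds the post from the investing subreddit (elided here)
    long_text = "..."

    displacy.render(nlp(long_text), style="ent")     # transformer model
    displacy.render(nlp_lg(long_text), style="ent")  # traditional large model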

Let's try and keep it reasonably tidy. Okay, and then we just want this. And this again. Okay. Now, again, with this one, we don't see any difference because, like I said, this traditional spaCy model is still pretty good. But what we can do is start adding more complex or at least longer pieces of text.

And then let's compare how they perform, because in this case, the transformer model will outperform the traditional model. Okay, so this text is quite a bit longer. And now let me copy these two and bring them down here. And let's do that again. Sorry, that's taking a moment; I was reloading the model without realizing it.

Okay. And here and here. Now let's compare these two. So the transformer model is correctly identifying Fastly as an organization, while the large model is identifying Fastly as a person. So you can see a little bit of difference there. Then up here, we have this Q1 '21 identified as a date by the transformer model.

It's not picked up at all by the large traditional model. Go a little bit further. Now this one, I'm not really sure if it's a good thing or a bad thing: the transformer model is identifying "a whopping 27%" as a PERCENT entity, which I don't think is really a good thing. But it depends on what you're doing.

Maybe you kind of want that exaggeration of the percentage in there, but I don't think so. I suppose that depends on what you're doing. But I would say the transformer model is probably performing worse in that single case.

The rest of these, the money and percentage entities, are all matching between the two models. Here, the large model is pulling out this one CARDINAL entity, and I'm not sure we really want that one in there, to be honest. And then if we continue, the transformer model is getting the Q1 date and Fastly as an organization down here.

The usual model is missing Q1, and we have Fastly as a person again. And then down here, the large model is seeing CFO as an organization, which is obviously not the case; it's a job title. So, yeah, I mean, they're not far from each other for sure. But I think in this case, the transformer model is definitely outperforming, at least by a little bit.
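If you'd rather compare the two outputs programmatically instead of eyeballing the visualizations, one option (not shown in the video) is to print the entity spans and labels directly from each doc:

    # list (text, label) pairs from each model for a quick side-by-side check
    trf_ents = [(ent.text, ent.label_) for ent in nlp(long_text).ents]
    lg_ents = [(ent.text, ent.label_) for ent in nlp_lg(long_text).ents]

    print(trf_ents)
    print(lg_ents)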

And I think the more complex the language gets, the better the transformer model will perform. So, I mean, that's it for this video. I just wanted to give you a quick demonstration of spaCy with transformers, which I think is really cool: that you can actually use both of those packages together.

So, thank you very much for watching, and I'll see you again in the next one.