How to learn NLP for free for beginners


Chapters

0:00 Intro
1:53 ML 101 + Prerequisites
4:58 Sentdex + Neural Nets from Scratch
7:32 ML Coursera
9:31 100 Page ML Book
11:14 Applied ML + Daniel Bourke
13:17 Origin of Modern NLP
13:41 CS224N
14:44 NLP Specialization Coursera
15:57 Modern NLP + Transformers Intro
16:54 Transformer Courses
18:14 Doing Projects
19:18 Semantic + Vector Search
19:54 NLP for Semantic Search
20:44 Mining of Massive Datasets
22:27 Final Points

Transcript

Today I want to talk about something a little bit different. Instead of focusing on a specific topic, I want to talk about how I would learn machine learning, and in particular NLP, if I was starting from scratch. I've tried to keep this guide as simple as possible, because I think a lot of learning in machine learning, data science, NLP, whatever else, really needs to come from your own interests. A very strict order of things to do can almost ruin that learning experience, so it's very important to just try and figure things out as you go. At the same time, I do think it's important to be aware of a few particular courses, resources, or books that have been very helpful for others in the past and could therefore also help you.

So I've broken it down into four parts: machine learning 101, applied ML, the origin of what I would call modern NLP, and modern NLP itself. After that I've also added another bit, which is very specific, so we'll just mention it at the end and won't go into too much depth on it.

Each of these parts acts as almost a stage in your learning journey. Of course, at the start you need to learn about machine learning, and this is almost not even the start. Yes, you would start here, but you also need to learn Python, for example.

So in reality it might actually look something like this. You would start with ML 101 and, in parallel, you might want to be learning Python. The reason I say in parallel, rather than just starting with Python, is more to do with whether you're interested in that or not. For me, I started Python with a specific interest in algorithmic trading, and that very quickly moved into machine learning.

But learning Python for the sake of Python wasn't something I was necessarily that interested in, although I will say that once I got started I very quickly became obsessed with just coding random things. So in my personal opinion, and that's not necessarily going to be right for everyone, I would learn those two things in parallel. You have to learn the Python side for the machine learning side, but they complement each other, and doing both in parallel will probably keep you more interested than focusing on one at a time.

So in terms of resources for just Python, I used Codecademy more than anything else. When I used it, it was I think cheaper than it is now, so that's something to be aware of. But there are so many good resources out there for learning Python that you probably aren't going to struggle to find something.

The other thing I would say is also try projects. Just building projects will help your coding skills, and every other component on here, so much, because it shows you where you're missing knowledge and helps you learn how to apply what you have learned. At the same time, I think it's very motivating to build something.

At least for me, I find it incredibly motivating to just build something cool that I find interesting, so that's another thing I would definitely recommend on the Python side. So yeah, that's almost a prerequisite, or something you do in parallel to what I have here. Machine learning 101 is really an introduction to machine learning, where you would learn about the key concepts.

So you have neural networks up here, loss, optimization, activation functions, and there is a lot more than just that in there. With all of these resources, you can mostly get them for free. Neural Networks from Scratch by Sentdex is very good, and Sentdex in general is also very good, especially when you're starting Python.

That's probably another thing to add to the Python prerequisites: you can also learn from Sentdex. That's one of the places I learned as well. Neural Networks from Scratch is, in my opinion, a really good resource, and it's one of those things that wasn't around when I was learning this stuff, but I really wish it had been.

So much so that I bought it anyway, and it is genuinely incredible. Now, you don't need to buy the book. The book is great and I like having something physical, but you can also do this stuff online. Or not all of it: I think the first parts of the book have already been covered on Sentdex's YouTube channel.

So where are we? Where do we find that? I think it's just YouTube, Sentdex. Yeah, so just Sentdex. And here we are, it's literally right at the top: Neural Networks from Scratch, Intro and Neuron Code. This is really nice as well because you have a lot of visuals in here, and obviously Sentdex is talking through it, which is really nice.

But of course the book is really incredible too. The reason I like it is that it goes through the code of everything, making everything very visual and breaking everything down. What you'll find later on in machine learning is that everything is very abstract, but it's so important to understand the core concepts behind all of these things.

So we can see here he's really breaking everything down. This looks horrible, but it's a really nice way to understand what is going on in a neural network. And then later on there's the actual coding, keeping things really simple, like inputs and weights. The issue is that most tutorials go too abstract too quickly.
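To give a feel for what "keeping things really simple, like inputs and weights" can look like, here is a minimal sketch of a single neuron in plain Python. It is not taken from the book or the videos; the numbers and the `neuron_output` helper are hypothetical and only there for illustration.

```python
# A single neuron: weighted sum of inputs, plus a bias, passed through
# an activation function. All values below are made up for illustration.
inputs = [1.0, 2.0, 3.0]
weights = [0.2, 0.8, -0.5]
bias = 2.0


def neuron_output(inputs, weights, bias):
    """Return ReLU(sum(input * weight) + bias) for one neuron."""
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    # ReLU activation: negative totals are clipped to zero.
    return max(0.0, total)


output = neuron_output(inputs, weights, bias)
print(output)
```

A full network is, roughly speaking, many of these neurons stacked into layers, with training adjusting the weights and biases to reduce a loss.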

In this book and in the video tutorials, Sentdex doesn't go too abstract too quickly. He builds it up from nothing, which is really nice, especially if you're just starting out. So that's one, in my opinion, very good resource for when you're starting out. Another one is a course that it seems almost everyone in machine learning or data science has done at some point.

I will be honest, it is outdated, but not too outdated; it's still very valuable. The things you learn on this course very much focus on the concepts behind much of the older stuff in machine learning, and particularly the things that make machine learning what it is today.

One of the main reasons I say it's outdated is that it uses a programming language called Octave, which is almost like an open source MATLAB, if any of you know what MATLAB is. In reality it is very similar to Python syntax, so it's not such a bad thing that you use Octave, because it translates over very easily.

One thing that is great about this course is that you can enroll for free. I don't know if it's free the entire way through, as it's been ages since I did this course, but either way you can audit almost everything on Coursera for free. It just means you don't get a certificate at the end, which is completely fine.

And if you really want a certificate at the end of it, you can obviously pay for it. But this is an incredible course, and you can see some of the things they go through. Linear algebra is so important, just a little bit of understanding of the maths behind all this to start with. Then logistic regression, regularization: they go through so many important concepts that are honestly absolutely crucial for later on.

So this is, without a doubt, one of the best courses when you're getting started. I think it's really, really good. The final one I wanted to talk about is the 100-page machine learning book. Again, obviously this is a book, but I do think you can get it online as far as I know.

So I'd have a look at that. At one point you could for sure; there was a PDF. This is really nice because it's 100 pages. Well, I think maybe there are a few more than 100. Yeah, okay, so it goes a bit over 100 pages, but because it's such a small number of pages, everything in here is very important. It's stripped down to just what you need, so you have all the important things.

So we can go ahead and maybe find something interesting here. Let's see whatever this is that I made a note of a long time ago. Okay, so I'm looking at classification versus regression. This is just a few paragraphs explaining the difference between these two incredibly important things.
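As a rough illustration of that difference (my own sketch, not an excerpt from the book): a regression model outputs a continuous number, while a classification model outputs a label, often by turning a score into a probability and thresholding it. The weight, bias, and function names below are hypothetical.

```python
import math


def linear_score(x, w, b):
    """Shared linear model: one input feature, one weight, one bias."""
    return w * x + b


def predict_regression(x, w, b):
    """Regression: the raw score is the prediction (a continuous value)."""
    return linear_score(x, w, b)


def predict_classification(x, w, b, threshold=0.5):
    """Classification: squash the score into (0, 1), then pick a label."""
    probability = 1.0 / (1.0 + math.exp(-linear_score(x, w, b)))
    label = 1 if probability >= threshold else 0
    return label, probability


# Made-up parameters, as if they had already been learned from data.
w, b = 2.0, -3.0

value = predict_regression(4.0, w, b)
label_high, prob_high = predict_classification(4.0, w, b)
label_low, prob_low = predict_classification(0.0, w, b)
print(value, label_high, label_low)
```

Here `w` and `b` play the role of parameters (learned from data), whereas something like `threshold` is chosen by you, which is closer to what a hyperparameter is.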

Then we have parameters versus hyperparameters. There's just so much important stuff, and you have charts and visuals, which I personally really like to have in everything as much as possible. So this is another really cool book that I would 100% recommend. There's code as well. Is it Python? Yeah, so that's really nice in my opinion. Once you have gone past that, you want to start looking at applied machine learning.

This is the only thing on this list I haven't personally gone through, although I have watched a few videos from it. The reason I recommend it is that Daniel Bourke is absolutely incredible at taking machine learning, stripping out everything you don't need to know, keeping just what you do need to know, and teaching it in an incredibly entertaining way.

This bootcamp course, I think you can possibly also find it on Udemy. It has an incredibly high rating; every course that Daniel Bourke has done has an incredibly high rating, people are super happy with it, and it's really friendly to beginners. That's why I recommend it even though I haven't been all the way through it myself. Either way, I'm very confident that this is a very good resource for those of you that are interested, but of course it's up to you.

Another thing I would say here is that I think it's very important to learn this sort of thing, and it's probably best that you do it. But if you are super impatient, fine, you can probably skip it and move on to the modern NLP stuff, because a lot of the code in modern NLP is actually quite simple. Just be aware that a lot of things are going to come up that will leave you quite confused, like pandas and NumPy, if you haven't seen them before. That said, you will learn a bit of NumPy in the earlier resources, especially Neural Networks from Scratch. So it's kind of possible to skip it if you're just impatient, which is fine. I'm impatient all the time as well; I just want to learn new things.

The next thing is, if you're really going for NLP, I think it's really good to learn what is current in NLP, but to really understand any of that it's very important to understand where it came from. So I would definitely recommend Stanford CS224N, which is on YouTube. It will take you through so many incredibly useful things in NLP, taught by some of the best people in the world, and it's very relevant. The only thing is that, at least when I went through this course, which to be fair was a few years ago, they didn't really have anything on the most recent stuff in NLP. That might have changed now, but if they do have anything on the more recent stuff, I think it's very little, which is why I have kept it over here in the origin section. It mainly seems to focus on the things you see here, like word vectors, recurrent neural networks, function gradient, and maybe they go into attention near the end, although I can't say for sure.

The next thing that I would recommend, and that I also did and found to be very good, is over on Coursera. Again, I do believe that if you look in the enrollment options you can audit the course for free. I'm not sure exactly how you do that; it's somewhere on here, and they keep it kind of hidden, but you can audit it for free, which is incredibly useful. This is a really good course. It's actually a specialization, so it's multiple courses in one. Again, they don't really go into the more modern stuff, although I believe they paired up with Hugging Face, which is very much modern NLP, and introduced some of that at some point. So there are attention models, and they've brought in T5 and BERT, and Reformer models as well, which is pretty cool. So even at the end of this you will get to learn some of the more recent NLP stuff.

Then the final bit is modern NLP, and in here there are a few things. How Transformers Work is an article I wrote, and of course there'll be links to it that you can access for free. In it I wanted to explain where transformers came from, going from recurrent neural networks, through to adding attention to those recurrent neural networks in encoder-decoder models, and how that attention works. Obviously I'm biased, but I think it's a relatively good summary of what a transformer model is. Transformer models are really the foundation of modern NLP; they're incredibly important. So I think it's quite good just to read through that and try to understand what is actually going on.

Then there are two courses here. There's my course, and you can always find a discount on it, but I don't necessarily think you even need to pay, because there's a Hugging Face course, and the Hugging Face course will cover everything I cover in my course and more. The only thing is that maybe my course is a little more applied and a little more concise in some parts. A lot of the stuff I cover in my course I also cover on YouTube for free, so with the course you just have a bit more of a structure, I suppose.

The Hugging Face course is, like I said, incredible. You have a nice little summary of the course here (I can't really zoom in; that's probably the closest I'm going to get). You have an introduction, diving in, and more advanced stuff, and you go through everything. You're using the Hugging Face libraries; they have multiple libraries for NLP, the Transformers library, the Datasets library, and the Tokenizers library, and they're all very good. I would definitely recommend this as the starting point, or almost the foundation, of where you learn your modern NLP techniques.

So that's the modern NLP stuff. As I have written here, I think at this point you're definitely in a good position to do some pretty interesting projects. I would expect you've probably done some projects anyway, but if you haven't, then at this point you really should. These are just projects that I've done in the past and showed you how to do on YouTube. There's building transformer models from scratch, which I think is really useful for understanding how all of this works. There are a few things in there, so this one is building an Italian BERT model from scratch, and I do have other series on that sort of thing as well. There's also another series I did here, which is a Q&A app, from when I used to have that awful GoPro camera. Not very nice, but anyway, they are at least useful, I hope, as a guide or as inspiration for what sort of projects you can do.

When it comes to NLP, that's probably as far as you want to go, but I will mention that a lot of what I do is in semantic search and information retrieval. If that's the sort of thing you're interested in, there are other resources you can go through, and these are all free if you want them to be. The only one that isn't always free is this book here, which we'll talk about in a minute. There are these two here, which are series that I've written, and if you go on here you can see there's quite a lot here, and it's all free, so I think it's a good deal. You come down and you can see we have all these different chapters that go through what we would call sentence transformers and semantic search. For me, this is a really interesting subdomain of NLP.

Then here we're looking at vector search, another important part of information retrieval, and of course you'll again be able to find a free link to this in the description. You can see there are a few parts here: we're introducing similarity search, Faiss, and a few other things. So again, a free course; I think it's a good deal.

And then this is the final one. Mining of Massive Datasets is a really cool book. It's quite big as well, with this really interesting picture on the front. This is more for when you start getting into the theory side of things, and I will be honest, it's not directly applicable to a lot of NLP. But there are so many interesting things in it; it's almost like the theory of everything that built up information retrieval, plus some more advanced stuff as well. This is really for when you're trying to understand everything on an almost deeper level. At the same time, a lot of it you might not find directly applicable to vector search; as I said, it's more everything behind vector search. It can be incredibly useful, though, and some of this stuff is really interesting and definitely useful when you're actually implementing these sorts of things.

So that's another one that's really good, and like I said, you can also get it online for free, which is really nice. If you go over to mmds.org, you have all the links here, and I'm sure there's a way to find the whole book in one. Oh yeah, it's here as well, so you can also do that. It's pretty useful.

So that's kind of what I would recommend you do if you're looking to learn NLP, machine learning, all this sort of stuff. That's the route I would go. A lot of those things I obviously went through myself, and for me it has worked out well so far, so that's a good indication that at least some of these resources will hopefully be useful to you as well. I get asked this question a lot, how do I learn these things, and hopefully this video will be some sort of guide for at least some of that. But I'm also super interested in what resources you recommend, so definitely let me know if there's anything you think is missing. Thank you very much for watching. I hope it's been useful, and I will see you in the next one. Bye.
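As a small companion to the semantic and vector search part of this guide, here is a minimal sketch of what similarity search boils down to: compare a query vector against stored vectors and return the closest ones. The vectors, document names, and the `search` helper are all hypothetical; real embeddings would come from a model such as a sentence transformer, and a library like Faiss exists to do this efficiently at scale rather than by brute force as below.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def search(query, index, top_k=2):
    """Brute-force search: score every stored vector, return the best names."""
    scored = [(cosine_similarity(query, vec), name) for name, vec in index.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [name for _, name in scored[:top_k]]


# Toy 3-dimensional "embeddings", invented for this example.
index = {
    "doc_cats": [0.9, 0.1, 0.0],
    "doc_dogs": [0.8, 0.3, 0.1],
    "doc_finance": [0.0, 0.2, 0.9],
}
query = [1.0, 0.2, 0.0]

results = search(query, index)
print(results)
```

The brute-force loop is fine for a handful of vectors; the reason dedicated vector search tooling matters is that scoring every vector becomes too slow once you have millions of them.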