
Getting Started with GPT-3 vs. Open Source LLMs - LangChain #1


Chapters

0:00 Getting LangChain
1:14 Four Components of LangChain
6:43 Using Hugging Face and OpenAI LLMs in LangChain
7:13 LangChain Hugging Face LLM
13:51 OpenAI LLMs in LangChain
18:58 Final results from GPT-3

Transcript

Today we're going to get started with what will be a series of videos, tutorials, examples, and articles on what is called LangChain. Now LangChain is a pretty new NLP framework that has become very popular very quickly. At the core of LangChain you have large language models, and the idea behind it is that we can use the framework to build very cool apps using large language models very quickly.

We can use it for chatbots, generative question answering, summarization, logic loops that combine large language models with web search — all these crazy different things that we can chain together in some sort of logical fashion. In this video, we're just going to have a quick introduction to LangChain and how we can use it.

We're going to take a look at the core components of what will make up our chains in LangChain, and we're going to look at some very simple generative language examples using both the Hugging Face endpoint in LangChain and the OpenAI endpoint in LangChain. So let's get started by having a look at the four main components that I think need explaining.

So we have prompt templates, large language models, agents, and memory. Now prompt templates are actually pretty straightforward: they're templates for different types of prompts. Let me show you a couple of examples from an app I built a while ago. In this app we have all these different styles — instructions that we can pass to a large language model. We have conservative Q&A, where we want to answer a question based on the context below, and if the question can't be answered based on that context, say "I don't know"; then you feed in the context and you feed in the questions.

We have simple instructions: given the common questions and answers below, we'd feed in the question here and these would be the answers. And there's extract key libraries and tools, which is about extracting things like code libraries: write a list of libraries and tools present in the context below, where the context would be items returned from a database, and from that you'd get the set of libraries mentioned in whatever information you've retrieved.
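To make that concrete, here's roughly what the conservative Q&A style looks like as a LangChain prompt template — a minimal sketch, with the instruction text paraphrased from the app:

```python
from langchain import PromptTemplate

# Paraphrased "conservative Q&A" instructions: answer only from the
# provided context, otherwise admit ignorance.
template = """Answer the question based on the context below. If the
question cannot be answered using the information provided, answer
with "I don't know".

Context: {context}

Question: {question}

Answer: """

prompt = PromptTemplate(
    input_variables=["context", "question"],
    template=template,
)

# Filling in the variables produces the final prompt string.
print(prompt.format(context="...", question="..."))
```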

So those are the kinds of things I mean when I say we have prompt templates. Next, of course, we have large language models. I don't think I really need to explain them; they're just big models capable of doing pretty incredible things, like GPT-3, BLOOM, and so on.

Next we have agents. Agents are processes that use large language models to decide what actions should be taken given a particular query or set of instructions. They can be paired with tools like web search or calculators, and we package it all into a logical loop of operations. That sounds pretty complicated, so it's probably best I just show you an example of what this is.

If we go over to the LangChain website, they have a really cool example under agents, getting started. Scroll down a little and we can see it: the agent executor chain. There are a few components here. The first thing that comes in is a thought from the large language model, based on this query: who is Olivia Wilde's boyfriend, and what is his current age raised to the 0.23 power?

There are a few logical steps in this process, and that's why we might need to use something like this. From the query, the large language model's thought is: I need to find out who Olivia Wilde's boyfriend is and then calculate his age raised to the 0.23 power. The action the agent decides on is search, and it decides the input for that search action must initially be "Olivia Wilde's boyfriend".

The action defines that we're going to use a web search component. The agent goes to the web search tool, types that in, and the result it gets back is "Harry Styles"; that's the observation based on what we have so far. This is part of a specific agent framework called ReAct, and at some point in the future we will definitely go into that in a lot more detail.

For now, let's continue. Based on that observation, the language model now thinks: I need to find out Harry Styles' age. So it searches again, this time for his age, and gets 28 years. The next thought is: I need to calculate 28 raised to the 0.23 power. This time it goes to the calculator action, not search, calculates it, and we get the answer. The final thought is: I know the final answer, and it gives it. So that's an example of one of these agents using multiple tools — here we have the calculator and also the search tool.
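To give a feel for how that's wired up in code, here's a minimal sketch using the agent API as it existed in early LangChain versions (it assumes a SerpAPI key for the search tool and an OpenAI key; exact import paths and agent names may have changed since):

```python
from langchain.agents import load_tools, initialize_agent
from langchain.llms import OpenAI

# Assumes OPENAI_API_KEY and SERPAPI_API_KEY are set in the environment.
llm = OpenAI(temperature=0)

# "serpapi" provides the web search tool, "llm-math" the calculator.
tools = load_tools(["serpapi", "llm-math"], llm=llm)

# A ReAct-style agent that decides which tool to use at each step.
agent = initialize_agent(
    tools, llm, agent="zero-shot-react-description", verbose=True
)

agent.run(
    "Who is Olivia Wilde's boyfriend? "
    "What is his current age raised to the 0.23 power?"
)
```

With verbose=True you see the same thought / action / observation loop printed out, just like in the docs example above.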

So I think agents are a pretty exciting and cool use of language models. The final component is memory: short-term and long-term memory for our models. Again, this is really interesting; for long-term memory, if you've watched my videos or read my articles in the past, you've probably come across it.

If we take a look at the getting started page, there's the conversation buffer memory, which you'd essentially use in a chatbot. It remembers all the previous inputs and outputs and adds them into your next set of generations, so this is what you'd use to have a conversation with a chatbot where it remembers the previous steps of that conversation. There are different versions of this, like the conversation summary memory, and all of it is essentially what I'd refer to as short-term memory.

On the other side, for long-term memory, you have the data-augmented generation stuff, which is essentially where you retrieve bits of information from an external data source to feed into your model. That allows it to answer questions in a specific domain better, keep more up-to-date information, or simply allow us to fact-check what the large language model is actually saying.
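As a quick sketch of the short-term side, the conversation buffer memory looks something like this (note this assumes the early import path for the memory class, which has moved in later versions):

```python
from langchain.chains import ConversationChain
from langchain.chains.conversation.memory import ConversationBufferMemory
from langchain.llms import OpenAI

llm = OpenAI(temperature=0)

# The buffer memory stores the raw conversation history and re-inserts
# it into the prompt on every new turn.
conversation = ConversationChain(
    llm=llm,
    memory=ConversationBufferMemory(),
)

conversation.run("Hi there!")
# The second turn can reference the first, because the full history
# is carried along in the prompt.
conversation.run("What did I just say?")
```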

Now, those are the main components of LangChain, and what we'll do now is actually get started with something really simple: using large language models in LangChain. To get started we just pip install langchain, which will obviously install the library, and then we'll go through some really basic examples of using LangChain for large language model generation with both OpenAI and Hugging Face.

Let's start with Hugging Face. If you'd like to follow along, I'll leave a link to this notebook in the top right of the video right now, or you can click the link in the video description to take you to the Colab. With Hugging Face we need to install huggingface_hub as a prerequisite. What's actually going to happen here is that we're not going to be running Hugging Face models locally; we're going to be calling their inference API and getting results directly from that. To do that we need a Hugging Face API token — and this is all free, by the way. To get one, go to huggingface.co and, if you don't have an account, sign up (I believe the sign-up is in the top right of the page). Once signed in, click your profile in the top right, go to Settings, then head over to Access Tokens. I think you can actually just use a read token, but a write token works as well; I'm going with write because I know that one definitely works. If this is your first time, click New token, choose read or write, give it a name, and click Generate token. I've already created mine, so I'm just going to copy it, and then you'd paste it in. I've already set my environment variable here, so I'm not going to do it again.

Then we can come down and start generating text using a model from the Hugging Face Hub. There are a few things we'll need for this: a prompt template, which is a template for our prompt as I mentioned before; the HuggingFaceHub class from LangChain; and the LLMChain, which is like a pipeline or a chain of steps. This one is pretty simple: you create your prompt from the prompt template and then generate text with your large language model.

Now we're going to initialize a large language model from Hugging Face, and for that we're using google/flan-t5-xl. If you go over to Hugging Face and search for "flan", you'll see a few different models; the Google Flan-T5 XL is not the biggest, but it is the biggest that will work on the free tier of the inference API, so that's what we're using. If you try to use the XXL model it will, I think, more likely than not time out — at least it did for me. With that in mind, we initialize the model and set the randomness, or temperature, of the model very low so that we get relatively stable results; if you want more creative writing, you'd increase this value. Then we create our template. It's going to be a very simple question-answering template: there's a question — this is the input variable we'll be using in the template — and then an answer, and the model will essentially continue from that point.
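Put together, the Hugging Face side looks roughly like this — a minimal sketch assuming the early LangChain API (import paths have since moved between versions) and that your token is stored in the HUGGINGFACEHUB_API_TOKEN environment variable:

```python
from langchain import PromptTemplate, HuggingFaceHub, LLMChain

# Set HUGGINGFACEHUB_API_TOKEN in your environment before running
# (the token from huggingface.co -> Settings -> Access Tokens).

template = """Question: {question}

Answer: """
prompt = PromptTemplate(template=template, input_variables=["question"])

# Very low temperature for stable, less "creative" answers.
hub_llm = HuggingFaceHub(
    repo_id="google/flan-t5-xl",
    model_kwargs={"temperature": 1e-10},
)

llm_chain = LLMChain(prompt=prompt, llm=hub_llm)

# A single question:
print(llm_chain.run("Which NFL team won the Super Bowl in the 2010 season?"))

# Multiple questions: one dictionary of input variables per question.
qs = [
    {"question": "Which NFL team won the Super Bowl in the 2010 season?"},
    {"question": "If I am 6 ft 4 inches, how tall am I in centimeters?"},
    {"question": "Who was the 12th person on the moon?"},
    {"question": "How many eyes does a blade of grass have?"},
]
print(llm_chain.generate(qs))
```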
So with that, we build our prompt template from the template string. Note that the template isn't an f-string; it's just a plain string, and we declare that the input variable within it is question, so whatever the question input is gets placed there. Then we create our chain — the prompt followed by our large language model — and ask: which NFL team won the Super Bowl in the 2010 season? I run this, and we get "Green Bay Packers".

If we'd like to ask multiple questions together, we pass a list of dictionaries, where each dictionary contains the input variables — if we had multiple input variables, we'd pass them all in — so each question here is mapped to the question variable in our template. The first one is the same question again; then I ask a bit more of a logical question, some more facts, and a common-sense question. I don't think this model does so well with these. The first answer I believe is correct. The second says 184 centimetres, which is not true; it should be about 193 centimetres. For who the 12th person on the moon was, it says John Glenn, who never went to the moon. And how many eyes does a blade of grass have? Apparently one. So this model isn't the biggest and is somewhat limited. There are other open-source models that perform much better, like BLOOM, but when using this endpoint, without running models locally, we're pretty much restricted to this size.

One other thing: obviously these haven't performed so well, so the next attempt isn't very likely to perform well either, but with a lot of large language models we can actually feed in all the questions at once, so we don't need to iteratively call the model with one question after another. Some of the better large language models, as we'll see soon, can handle them all at once. In this case it doesn't quite work. The only thing I changed is the template, which now says "answer the following questions one at a time"; we pass in those questions and try to get some answers. The model didn't really listen to me and just did its own thing, but nonetheless, that's what we got.

Now let's compare that to the OpenAI approach. For this we again need another prerequisite, the OpenAI library: just pip install openai. We'll also need to pass in our OpenAI API key, so let me quickly show you how to get that. It was at beta.openai.com, but I think they've changed it recently so it's no longer beta; it's just openai.com/api. Come over here to log in, or sign up if you don't have an account. Once you've logged in, head over to the top right, go to your account, then down to API keys, and you can create a new secret key by clicking Create new secret key. For me it doesn't actually let me create another one because I already have too many, but that's fine; you just create your new secret key and copy it. I've already added mine to my environment variables as OPENAI_API_KEY, so I don't need to rerun that.

One thing: if you're using OpenAI via Azure, you should also set a few extra values — that you're using Azure, the OpenAI API version (Azure apparently has several), and the URL of your Azure OpenAI resource.

After that, we need to decide which model we're going to use. We're using text-davinci-003, which is one of the better generation models from OpenAI right now. That is our LangChain large language model; we could actually generate with it directly, but as we did before, we're going to use the LLMChain with the same simple question-answer prompt we created earlier. Again, if you're using Azure, you'll need to follow the Azure initialization step rather than what I just did.
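The OpenAI side is a minimal variation on the same pattern — this sketch reuses the question/answer prompt built in the Hugging Face example above and assumes OPENAI_API_KEY is set in the environment:

```python
from langchain import LLMChain
from langchain.llms import OpenAI

# Assumes OPENAI_API_KEY is set; `prompt` is the same question/answer
# PromptTemplate built in the Hugging Face sketch above.
davinci = OpenAI(model_name="text-davinci-003")

llm_chain = LLMChain(prompt=prompt, llm=davinci)

print(llm_chain.run("Which NFL team won the Super Bowl in the 2010 season?"))
```

For Azure, LangChain also provides an AzureOpenAI LLM class that takes a deployment name instead; check the docs for the exact parameters.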
Running that, we get this answer: "The Green Bay Packers won the Super Bowl in the 2010 season." A little more descriptive than the answer we got from our Flan-T5 model, but that's to be expected; OpenAI's Davinci model is a lot bigger and pretty advanced.

After that, let's try again with multiple questions and see what we get. First, "The Green Bay Packers won the Super Bowl in the 2010 season" — correct. Next we get an answer that is mostly wrong: it says Eugene A. Cernan was the 12th person to walk on the moon, whereas as far as I know it was Harrison Schmitt, so not quite right. But I think the rest of it was very close: Apollo 17, and I'm pretty sure it was December 1972 as well, although not 100% sure on that, so we can assume that's correct. Cernan was actually Schmitt's crewmate, so I suppose he would have been the 11th person on the moon — it got pretty close, but not quite there. And that was actually the third question; I skipped one by accident: if I'm 6 foot 4 inches, how tall am I in centimetres? Very specific — we got 193.04 centimetres, which is probably the exact measurement, and I know for sure 193 is correct. Then the last question: how many eyes does a blade of grass have? We get "a blade of grass does not have any eyes", so a sensible answer this time.

Then I wanted to very quickly show this: if you pass the list of dictionaries to the chain's run method, that's actually incorrect — it just sees all of it as a single string. In this case our Davinci model does pretty well with it; it still got the moon question wrong, but it manages even though the input isn't in the correct format. And when asking these questions one by one, it sometimes gets the correct answer and sometimes messes other questions up, which I think is pretty interesting to see.

The final thing is what we did before with the single string: answering multiple questions in one prompt, but this time using Davinci rather than Flan-T5.
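That looks something like this sketch, reusing the davinci LLM from above (the template wording is paraphrased from the notebook):

```python
from langchain import PromptTemplate, LLMChain

multi_template = """Answer the following questions one at a time.

Questions:
{questions}

Answers:
"""

long_prompt = PromptTemplate(
    template=multi_template, input_variables=["questions"]
)

# `davinci` is the OpenAI text-davinci-003 LLM initialized earlier.
llm_chain = LLMChain(prompt=long_prompt, llm=davinci)

# All four questions go in as one string.
qs_str = (
    "Which NFL team won the Super Bowl in the 2010 season?\n"
    "If I am 6 ft 4 inches, how tall am I in centimeters?\n"
    "Who was the 12th person on the moon?\n"
    "How many eyes does a blade of grass have?"
)

print(llm_chain.run(qs_str))
```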

And we get: the Green Bay Packers won the Super Bowl in the 2010 season; I am 193 centimetres tall — yep; Edwin "Buzz" Aldrin — wrong; and a blade of grass does not have eyes. So we get some good answers there.

Okay, so that's it for this very quick introduction to LangChain. As I said, in the future we're going to be covering this library in a lot more detail, and as you've already seen at the start of the video, there are some pretty interesting things we can do with it very easily. But for now, that's it for this video. I hope all of this has been interesting and useful, so thank you very much for watching, and I will see you again in the next one. Bye.