OpenAI GPT 3.5 AI assistant with LangChain + Pinecone #1
Chapters
0:00 Building an AI assistant with GPT 3.5
1:43 AI assistant overview (LangChain etc.)
3:44 Asking some questions on arXiv
11:03 Rest of the idea (understand YT, articles)
13:30 What will the AI assistant look like?
16:10 Next video: scraping arXiv
00:00:00.000 |
Okay so I'm going to share something that I've been working on for the last few days 00:00:04.000 |
and it's going to take, I think, quite a few more days for this to progress into something 00:00:08.920 |
that is in a more showable state but I wanted to kind of share that progress and I think 00:00:16.480 |
it makes sense for us to walk through the steps of building it as they are built. 00:00:23.220 |
So the idea is something that I've actually, I think, wanted to build since I kind of came 00:00:29.060 |
into the space of NLP and machine learning and in fact I think it's one of the main reasons 00:00:34.460 |
I even got interested in NLP in the first place which is actually trying to build some 00:00:40.220 |
kind of AI assistant and I don't mean the AI assistant like Siri or Alexa because honestly 00:00:48.680 |
I don't think they're that useful like okay it's cool because you can speak and the lights 00:00:53.100 |
will turn on or a certain song will play, that's great, but it's not actually helping 00:00:59.100 |
me do things in my daily life that I, you know, personally would think of as productive. 00:01:05.300 |
So what I've always wanted to build is an AI assistant that can actually do that but 00:01:09.520 |
unfortunately it's always been kind of out of reach to do anything like this, that is 00:01:14.780 |
until very recently with the release of these crazy AI models we've seen ChatGPT, obviously 00:01:23.060 |
GPT-3 for quite a while, we have long-term memory, we have essentially all of these different 00:01:29.500 |
components are kind of coming into place where we're now at the stage that we can build an 00:01:35.860 |
AI assistant that is actually useful, and it shouldn't be too difficult to actually do. 00:01:44.020 |
So the idea I have so far is what you can see here, so I want to obviously build this 00:01:49.700 |
personal chatbot, I think it's better to call it an AI assistant than chatbot because that's 00:01:54.460 |
kind of like one component of it and the first thing that I'm going to work on is can it 00:01:59.900 |
help me with like researching papers from arXiv. 00:02:04.700 |
So this will be an incredibly helpful thing because I spend a lot of my time just kind 00:02:09.820 |
of like reading things and trying to research things and you know what I always find is 00:02:15.380 |
that I end up going down like a rabbit hole of papers, like there's this paper and to 00:02:20.220 |
understand what it's talking about I need to go and read another paper and then to understand 00:02:24.140 |
that I need to go and read another paper and that's fine but I think it would be so useful 00:02:29.940 |
to actually just be able to ask a question in like a chat or in like a search bar or 00:02:37.500 |
something and get an answer that kind of takes all of these relevant parts of the research 00:02:45.420 |
and summarizes it to me and also tells me where that's from so I can read into it more. 00:02:50.260 |
So that would be incredibly useful and that's my main focus right now actually building 00:02:56.540 |
this part of it and to some degree it's kind of there so let me just show you what we have 00:03:03.020 |
Okay so we're going to start with a couple of questions I've already asked and this is 00:03:08.220 |
just in a notebook at the moment so there's a ton of code that I've just run in order 00:03:12.260 |
to get to this point and then we'll just see a couple of questions just so you can get 00:03:16.140 |
a feel for what the idea behind this might look like when it is more kind of fleshed out. 00:03:23.740 |
So right now what we're working with is a very small portion of the dataset and of course 00:03:29.820 |
this is not really in a chatbot-like format; it only has the arXiv data in there, it can't do much more than that yet. 00:03:36.680 |
This is really like the first component of the idea of an AI assistant. 00:03:44.340 |
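As a rough sketch of what that retrieval component looks like in code, assuming an existing Pinecone index of embedded arXiv chunks with the original text kept in each vector's metadata; the index name, metadata field, and model choices below are illustrative assumptions, not necessarily what the notebook uses.

```python
import openai
import pinecone

openai.api_key = "OPENAI_API_KEY"
pinecone.init(api_key="PINECONE_API_KEY", environment="us-east1-gcp")
index = pinecone.Index("arxiv-assistant")  # hypothetical index name

def retrieve(query: str, top_k: int = 5) -> list[str]:
    # Embed the query with the same model used to embed the paper chunks.
    res = openai.Embedding.create(input=[query], engine="text-embedding-ada-002")
    query_vector = res["data"][0]["embedding"]
    # Pull back the most similar chunks from Pinecone, with their text metadata.
    matches = index.query(vector=query_vector, top_k=top_k, include_metadata=True)
    return [m["metadata"]["text"] for m in matches["matches"]]

contexts = retrieve("What is the latest research on reasoning and acting in language models?")
```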
Okay so now I'm asking something that is kind of in there: what is the latest research on reasoning and acting in language models? 00:03:53.240 |
So if I come down we get this paper here, "Towards Reasoning in Large Language Models: A Survey", 00:03:58.140 |
and straight away it seems like a very useful paper, but this is quite 00:04:04.340 |
a lot of text, maybe I don't want to read all this, and it's not in the best format either. 00:04:08.420 |
So what we can do is put them all together and feed them through an OpenAI completion 00:04:13.780 |
endpoint, text-davinci-003. So we come down here and then this is asking the 00:04:20.460 |
question now and this does take a little bit of time because I have a lot of information 00:04:25.620 |
going into the large language model and small amount coming out but it's you know all of 00:04:30.300 |
that together is considered by the large language model, and based on how much we have it's going to take more or less time. 00:04:38.140 |
So the latest research on reasoning and acting in language models includes techniques for 00:04:42.940 |
improving and eliciting reasoning in these models. 00:04:46.180 |
So this is, I think, probably referring to things like chain of thought, which is true; methods 00:04:53.620 |
and benchmarks for evaluating reasoning abilities; findings; implications; and previous research in the area. 00:05:00.820 |
So it's an okay review but it's not very specific. 00:05:04.140 |
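That "put them all together and feed them through the completion endpoint" step amounts to something like the sketch below; the prompt wording and token limits here are guesses rather than the exact ones used in the notebook.

```python
import openai

def answer(question: str, contexts: list[str]) -> str:
    # Stuff the retrieved paper chunks into the prompt, then ask the question.
    prompt = (
        "Answer the question based on the context below.\n\n"
        "Context:\n" + "\n\n---\n\n".join(contexts) + "\n\n"
        f"Question: {question}\nAnswer:"
    )
    res = openai.Completion.create(
        engine="text-davinci-003",  # completion model mentioned above
        prompt=prompt,
        temperature=0.0,
        max_tokens=400,
    )
    return res["choices"][0]["text"].strip()
```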
So I kind of figured it probably doesn't have that much specific information on this topic 00:05:08.300 |
in there at the moment because there isn't much I think I got like a month's worth of 00:05:12.740 |
papers in there which is not necessarily that much. 00:05:16.800 |
So run this and what we're going to do is ask this so what is the term that describes 00:05:22.200 |
how large language models seem to exhibit reasoning abilities when they get to a certain size? 00:05:27.940 |
All right, this is something you know the answer to and it will hopefully give it to us. 00:05:32.140 |
Okay so emergent behavior that is correct so essentially this is the idea that large 00:05:37.380 |
language models you keep increasing the size of them and once you get to a certain size 00:05:42.980 |
it's like they can all of a sudden just do this new thing right. 00:05:48.020 |
So kind of like what we're doing here once the GPT models got to a certain size all of 00:05:52.480 |
a sudden they could answer questions very easily and they could do like named entity 00:05:58.720 |
extraction just by you know telling them to within the prompt and all these like crazy 00:06:03.960 |
things that just kind of seem to emerge out of nowhere all of a sudden even though they 00:06:07.880 |
were not necessarily fine-tuned to do that in the first place. 00:06:12.600 |
So that's pretty cool and then okay let's try this. 00:06:15.600 |
Tell me about the idea behind emergent abilities in large language models. 00:06:21.760 |
Okay cool so took quite a while and that's definitely something you need to work on but 00:06:27.040 |
okay, tell me about the idea behind these emergent abilities, and it says okay, emergent abilities 00:06:32.200 |
of language models is a concept that larger models are more proficient at meta learning 00:06:37.200 |
than smaller models and can acquire abilities that are not present in smaller models. 00:06:43.160 |
This includes tasks such as few-shot prompting which is the idea that you can give a few 00:06:47.160 |
examples within your prompt and the model will learn how to do something based purely on those examples. 00:06:52.640 |
A transliteration from the international phonetic alphabet. 00:06:56.000 |
Now I don't know whether that's true or not this is the point where you'd like okay let's 00:07:00.160 |
take a look at these papers and see what they say. 00:07:02.920 |
Recovering a word from its scrambled letters and Persian question answering. 00:07:06.880 |
Now recovering word from scrambled letters I'm pretty sure I've seen that Persian question 00:07:11.240 |
answering now I have no idea but it would not surprise me it seems pretty realistic. 00:07:17.080 |
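As an aside, the few-shot prompting mentioned above is simply putting a couple of worked examples in the prompt and letting the model infer the pattern; a toy illustration (not from the notebook):

```python
# Toy few-shot prompt: two worked examples, then the new input the model completes.
few_shot_prompt = """Classify the sentiment of each review as positive or negative.

Review: The paper was clearly written and the results were convincing.
Sentiment: positive

Review: The experiments were impossible to reproduce.
Sentiment: negative

Review: The method is elegant and surprisingly simple to implement.
Sentiment:"""
```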
Furthermore large language models can be trained with more data which can potentially lead 00:07:22.920 |
to miscorrelation between different modalities so I'm not entirely sure what that means but 00:07:31.120 |
you know again we have these papers in there so we can read into it which is pretty cool. 00:07:35.800 |
To mitigate the risks associated with emergent abilities, researchers are urged to develop up-to-date 00:07:40.960 |
benchmarks to measure unforeseen behaviors in their large language models. 00:07:45.960 |
So this is something that people keep talking about it's like okay we have all these benchmarks 00:07:49.600 |
have been created pretty quickly but then the models are kind of saturating these benchmarks 00:07:55.560 |
almost as soon as they are made like they just become irrelevant because the models 00:08:00.240 |
are just so far beyond these benchmarks so quickly. 00:08:03.920 |
So that's definitely like an important thing right now. 00:08:06.760 |
And then final question, or final serious question, is what are chain of thoughts in large language 00:08:12.940 |
models, so this is kind of like a prompting technique, hopefully it will explain it a little. 00:08:18.920 |
So chain of thoughts are a technique used to enable complex reasoning and generate explanation 00:08:25.300 |
with large language models by forcing them to explicitly verbalize their reasoning steps 00:08:32.140 |
So this is kind of like you ask it a question and what it's going to do is first it's going 00:08:37.320 |
to kind of say okay this is your question this is the first step I'm going to take in 00:08:41.840 |
order to find the answer to your question and based on that first step it's going to 00:08:46.440 |
move on to the next step right and then the next step and then it will come to your answer. 00:08:51.160 |
So this is kind of a very human way of going about things when we are trying to answer 00:08:57.840 |
a question we will usually go through it step by step in our heads and this is essentially 00:09:02.540 |
what we're getting the models to do here as well and it works very well for just improving 00:09:06.880 |
the performance or the question answering abilities of these models. 00:09:11.760 |
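A toy illustration of that kind of chain-of-thought prompting (here the zero-shot "let's think step by step" variant, not the exact prompts from the papers being discussed):

```python
# Ask the model to verbalize intermediate steps before the final answer.
cot_prompt = (
    "Q: A reading list has 12 papers and I read 3 papers a week. "
    "How many weeks until I finish the list?\n"
    "A: Let's think step by step."
)
# A completion model will typically reply with the intermediate reasoning
# (12 papers / 3 papers per week = 4 weeks) before giving the final answer: 4 weeks.
```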
So it's really cool and this method has improved performance on a variety of tasks and sparked 00:09:16.760 |
the active development of further refinements which is true there are kind of further advances 00:09:22.480 |
on that, one of those is the ReAct thing that I mentioned earlier, so yeah this is kind of 00:09:28.880 |
getting there like the data set I have behind this is very small it's not even that relevant 00:09:33.680 |
to what I'm wanting to do and I'll kind of discuss that a little more in a moment and 00:09:39.920 |
yeah it's already doing relatively well which I think is pretty cool and then I wanted to 00:09:43.760 |
test so can you tell me why zebras are stripy I kind of tried to force it to not answer 00:09:49.240 |
questions that are not answerable by whatever you have within the data being given, so I 00:09:55.240 |
asked: if the question cannot be answered using the information in the context, answer "I don't 00:10:00.280 |
know". So usually, or at least some of the time, when you do this it will say "I don't know" 00:10:05.040 |
if it doesn't see the answer in the context but I've only had that work so far with smaller 00:10:12.280 |
context where we're feeding quite a lot into here so I think I need to kind of figure out 00:10:17.240 |
this prompt a bit better because when I last tried this it didn't work so come down yeah 00:10:23.640 |
so like it does actually give you the answer and I also checked in these papers to make 00:10:28.820 |
sure that they don't actually tell you why zebras are stripy inside them and they didn't 00:10:34.520 |
it is kind of just making this up, or rather not making this up, it knows this within the model weights 00:10:40.440 |
but it doesn't know it from the source knowledge which is the context that we try to provide 00:10:44.920 |
here so this is just something that I will be testing and hopefully at some point I can 00:10:49.640 |
get it so it will say I don't know even for a really obvious question because I don't 00:10:54.120 |
want it to be outputting false information. 00:11:01.000 |
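For reference, the kind of grounded prompt template described above looks roughly like this; the exact wording used in the notebook may differ:

```python
# Rough sketch of the "answer from context or say I don't know" prompt template.
grounded_prompt_template = """Answer the question as truthfully as possible using only the context below.
If the answer cannot be found in the context, answer "I don't know".

Context:
{context}

Question: {question}
Answer:"""
```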
Now that is the idea of what we have so far, the arXiv part of it. What we'll do is very quickly go through the rest of 00:11:04.880 |
it and then I'll probably leave it there and we'll go through some of the data pre-processing 00:11:09.160 |
stuff that I did in the next video. Right, so I wanted to be able to make good writing suggestions 00:11:15.280 |
so a lot of my work is writing articles so that would be insanely useful if I can get 00:11:20.920 |
good writing suggestions on these niche topics, because I've tried with like GPT and GPT-3 00:11:26.480 |
and when it comes to kind of like technical topics it doesn't really do very well it can 00:11:31.960 |
help a little bit with like ideation on kind of getting out of I want to say writer's block 00:11:37.960 |
but beyond that it's not actually that useful so it would be really good to get good suggestions 00:11:42.440 |
on these more technical topics I would really like this if we do go for like a chatbot type 00:11:47.960 |
thing, to be able to talk well, like ChatGPT can, so I can like go through a multi-step 00:11:53.600 |
process of answering questions I don't know how realistic that's going to be because the 00:11:59.040 |
research behind ChatGPT hasn't been released so you know there's no way of knowing how 00:12:05.260 |
they did that. So they will probably release a ChatGPT endpoint at some 00:12:09.880 |
point so you know maybe that will feature in this at a later date there are other data 00:12:15.240 |
sources I'd like to include like youtube videos articles so a ton of the you know when I'm 00:12:21.040 |
going through things and or building something I will end up kind of forgetting things over 00:12:26.640 |
time that I did you know like years ago and what I will do very often is either you know 00:12:31.760 |
find someone else who's explaining it or if I know that I've explained it in the past 00:12:34.840 |
I'll go back to my old videos go back to my old articles and actually use them to kind 00:12:39.320 |
of relearn because it's almost like I'm speaking to myself like it's just an older version 00:12:43.400 |
of myself so the language I'm using is super easy for me to understand so I would like 00:12:49.160 |
to be able to index all that stuff as well like just past things I've done and as well 00:12:54.400 |
as kind of including that would be my notes like in this here I'm using obsidian which 00:12:59.360 |
is like a note-taking app which I've just only just started using in the past week before 00:13:04.560 |
then my notes are just kind of everywhere but if I continue using this and I have a 00:13:09.040 |
lot of notes in here it'd be really good to index all of that code documentation so you 00:13:15.160 |
know, like LangChain; it's a new library, ChatGPT doesn't know anything about it, so you can't 00:13:20.360 |
ask it questions about it it would be really good to like be able to put that in our long-term 00:13:24.880 |
memory component and then actually we can answer questions about it like super quickly 00:13:29.400 |
which will be amazing. Now, what will it look like? Yeah, this: LangChain and Pinecone for storing 00:13:37.680 |
all the arXiv papers and everything else. So one thing I did want to mention, this is 00:13:42.400 |
right now what you just saw all the papers I got for that they've just kind of been brought 00:13:48.960 |
in bulk from scraping the arXiv website and using the arXiv API, and there was no 00:13:57.320 |
intelligent logic behind that what I have started working on and it's kind of you know 00:14:02.320 |
it's getting there is basically you give one paper and then it's going to work through 00:14:07.440 |
that paper it's going to extract the references and then it's going to find the other papers 00:14:11.400 |
that were mentioned in that paper and it's kind of like a graph or a tree that kind of 00:14:16.240 |
expands out or branches out from that that would be really cool because I think then 00:14:20.620 |
you can be a lot more specific on what you're indexing and not just kind of index the last 00:14:27.240 |
20 years of arXiv, which would be insane. So that would be really cool. Maybe you can 00:14:32.760 |
do a bit of both though; I will say, maybe some more recent papers you just kind of get them in and index them in bulk. 00:14:37.800 |
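The bulk option could look roughly like this, hitting the public arXiv API export endpoint; the category and result count are illustrative assumptions, and the actual scraping code is covered in the next video:

```python
import feedparser  # parses the Atom feed returned by the arXiv API

def fetch_recent_papers(category: str = "cs.CL", max_results: int = 100) -> list[dict]:
    # Query the public arXiv API for the most recently submitted papers in a category.
    url = (
        "http://export.arxiv.org/api/query?"
        f"search_query=cat:{category}&sortBy=submittedDate&sortOrder=descending"
        f"&start=0&max_results={max_results}"
    )
    feed = feedparser.parse(url)
    return [{"title": e.title, "summary": e.summary, "link": e.link} for e in feed.entries]
```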
Right, the ReAct framework is kind of what I mentioned 00:14:45.040 |
before. There are some notes here, I didn't actually mean to press that, so let's go back to here. 00:14:51.000 |
so this is basically okay your chatbot your AI system will be able to search the internet 00:14:57.640 |
it will be able to use Python to answer questions, and all of these kind of cool, crazy things that'll be really fun to try. 00:15:05.320 |
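In LangChain terms (as of early 2023), that kind of tool-using behaviour is roughly what the ReAct-style agents provide; a minimal sketch, assuming OpenAI and SerpAPI keys are set in the environment (the calculator tool stands in here for more general Python execution):

```python
from langchain.agents import initialize_agent, load_tools
from langchain.llms import OpenAI

# A ReAct-style agent that can search the web and do calculations.
llm = OpenAI(temperature=0)
tools = load_tools(["serpapi", "llm-math"], llm=llm)
agent = initialize_agent(tools, llm, agent="zero-shot-react-description", verbose=True)

agent.run("What is the ReAct framework for language models and who proposed it?")
```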
And then there's also the chain of thought thing, so I haven't 00:15:11.160 |
actually tested that yet I would like to so that is in there as well and then where are 00:15:15.960 |
we at the moment we are still doing step one so it's where we are for now that's all I'm 00:15:23.240 |
focusing on I think that would still take a little bit of time to build and it's definitely 00:15:28.720 |
going to take a couple of videos to actually walk you through everything so you know we're 00:15:33.680 |
going to be on this for a little while and then we'll move on to trying to make it into 00:15:37.680 |
more of a chatbot we're going to have like the endpoint and a nice like web app like 00:15:43.440 |
maybe I can use Angular or something, it's like the only framework I've ever used before, 00:15:49.040 |
I don't know if that's even relevant anymore, but my idea actually here was to ask ChatGPT 00:15:54.280 |
to help me build that, so that will be interesting to try out, let's see, and then finally I 00:16:00.200 |
want it to be in a super easy to use format so you can literally just like git clone it 00:16:04.680 |
and use it for your own things so yeah that's where we are at the moment you know for this 00:16:11.080 |
video I'm going to leave it there because there's a ton of code it's super messy but 00:16:15.680 |
I will take you through it in the next video where we're going to focus on actually getting 00:16:20.520 |
all these archive papers and we'll kind of explore both options there so we have like 00:16:26.360 |
that graph method or we have just kind of like the dumb method of bulk downloading 00:16:32.160 |
everything for now that's all for this video I hope it's been interesting I'm super excited 00:16:38.040 |
to see where this project goes and it already looks kind of promising so that'll be cool 00:16:44.360 |
to see so thank you very much for watching and I will see you again in the next one bye