OpenAI GPT 3.5 AI assistant with LangChain + Pinecone #1
Chapters
0:00 Building an AI assistant with GPT 3.5
1:43 AI assistant overview (LangChain etc.)
3:44 Asking some questions on arXiv
11:03 Rest of the idea (understand YT, articles)
13:30 What will the AI assistant look like?
16:10 Next video: scraping arXiv
00:00:00.000 |
Okay so I'm going to share something that I've been working on for the last few days 00:00:04.000 |
and it's going to take, I think, quite a few more days for this to progress into something 00:00:08.920 |
that is in a more showable state but I wanted to kind of share that progress and I think 00:00:16.480 |
it makes sense for us to walk through the steps of building it as they are built. 00:00:23.220 |
So the idea is something that I've actually, I think, wanted to build since I kind of came 00:00:29.060 |
into the space of NLP and machine learning and in fact I think it's one of the main reasons 00:00:34.460 |
I even got interested in NLP in the first place which is actually trying to build some 00:00:40.220 |
kind of AI assistant and I don't mean the AI assistant like Siri or Alexa because honestly 00:00:48.680 |
I don't think they're that useful like okay it's cool because you can speak and the lights 00:00:53.100 |
will turn on or a certain song will play, that's great, but it's not actually helping 00:00:59.100 |
me do things in my daily life that I, you know, personally would think of as productive. 00:01:05.300 |
So what I've always wanted to build is an AI assistant that can actually do that but 00:01:09.520 |
unfortunately it's always been kind of out of reach to do anything like this, that is 00:01:14.780 |
until very recently with the release of these crazy AI models we've seen ChatGPT, obviously 00:01:23.060 |
GPT-3 for quite a while, we have long-term memory, we have essentially all of these different 00:01:29.500 |
components are kind of coming into place where we're now at the stage that we can build an 00:01:35.860 |
AI assistant that is actually useful, and it shouldn't be too difficult to actually do. 00:01:44.020 |
So the idea I have so far is what you can see here, so I want to obviously build this 00:01:49.700 |
personal chatbot, I think it's better to call it an AI assistant than chatbot because that's 00:01:54.460 |
kind of like one component of it and the first thing that I'm going to work on is can it 00:01:59.900 |
help me with like researching papers from arXiv. 00:02:04.700 |
So this will be an incredibly helpful thing because I spend a lot of my time just kind 00:02:09.820 |
of like reading things and trying to research things and you know what I always find is 00:02:15.380 |
that I end up going down like a rabbit hole of papers, like there's this paper and to 00:02:20.220 |
understand what it's talking about I need to go and read another paper and then to understand 00:02:24.140 |
that I need to go and read another paper and that's fine but I think it would be so useful 00:02:29.940 |
to actually just be able to ask a question in like a chat or in like a search bar or 00:02:37.500 |
something and get an answer that kind of takes all of these relevant parts of the research 00:02:45.420 |
and summarizes it to me and also tells me where that's from so I can read into it more. 00:02:50.260 |
So that would be incredibly useful and that's my main focus right now actually building 00:02:56.540 |
this part of it and to some degree it's kind of there so let me just show you what we have 00:03:03.020 |
Okay so we're going to start with a couple of questions I've already asked and this is 00:03:08.220 |
just in a notebook at the moment so there's a ton of code that I've just run in order 00:03:12.260 |
to get to this point and then we'll just see a couple of questions just so you can get 00:03:16.140 |
a feel for what the idea behind this might look like when it is more kind of fleshed out. 00:03:23.740 |
So right now what we're working with is a very small portion of the dataset and of course 00:03:29.820 |
this is not really in a chatbot-like format; it only has the arXiv data in there, it can't do much more than that yet. 00:03:36.680 |
This is really like the first component of the idea of an AI assistant. 00:03:44.340 |
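As a rough sketch of what that retrieval component looks like in code, assuming an existing Pinecone index of embedded arXiv chunks with the original text kept in each vector's metadata; the index name, metadata field, and model choices below are illustrative assumptions, not necessarily what the notebook uses.

```python
import openai
import pinecone

openai.api_key = "OPENAI_API_KEY"
pinecone.init(api_key="PINECONE_API_KEY", environment="us-east1-gcp")
index = pinecone.Index("arxiv-assistant")  # hypothetical index name

def retrieve(query: str, top_k: int = 5) -> list[str]:
    # Embed the query with the same model used to embed the paper chunks.
    res = openai.Embedding.create(input=[query], engine="text-embedding-ada-002")
    query_vector = res["data"][0]["embedding"]
    # Pull back the most similar chunks from Pinecone, with their text metadata.
    matches = index.query(vector=query_vector, top_k=top_k, include_metadata=True)
    return [m["metadata"]["text"] for m in matches["matches"]]

contexts = retrieve("What is the latest research on reasoning and acting in language models?")
```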
Okay so now I'm asking something that is kind of in there: what is the latest research on reasoning and acting in language models? 00:03:53.240 |
So if I come down we get this paper here, "Towards Reasoning in Large Language Models: A Survey", 00:03:58.140 |
and straight away it seems like a very useful paper, but this is quite 00:04:04.340 |
a lot of text, maybe I don't want to read all this, and it's not in the best format either. 00:04:08.420 |
So what we can do is put them all together and feed them through an OpenAI completion 00:04:13.780 |
endpoint, text-davinci-003. So we come down here and then this is asking the 00:04:20.460 |
question now and this does take a little bit of time because I have a lot of information 00:04:25.620 |
going into the large language model and small amount coming out but it's you know all of 00:04:30.300 |
that together is considered by the large language model, and based on how much we have it's going to take more or less time. 00:04:38.140 |
So the latest research on reasoning and acting in language models includes techniques for 00:04:42.940 |
improving and eliciting reasoning in these models. 00:04:46.180 |
So this is, I think, probably referring to things like chain of thought, which is true; methods 00:04:53.620 |
and benchmarks for evaluating reasoning abilities; findings; implications; and previous research in the area. 00:05:00.820 |
So it's an okay review but it's not very specific. 00:05:04.140 |
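That "put them all together and feed them through the completion endpoint" step amounts to something like the sketch below; the prompt wording and token limits here are guesses rather than the exact ones used in the notebook.

```python
import openai

def answer(question: str, contexts: list[str]) -> str:
    # Stuff the retrieved paper chunks into the prompt, then ask the question.
    prompt = (
        "Answer the question based on the context below.\n\n"
        "Context:\n" + "\n\n---\n\n".join(contexts) + "\n\n"
        f"Question: {question}\nAnswer:"
    )
    res = openai.Completion.create(
        engine="text-davinci-003",  # completion model mentioned above
        prompt=prompt,
        temperature=0.0,
        max_tokens=400,
    )
    return res["choices"][0]["text"].strip()
```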
So I kind of figured it probably doesn't have that much specific information on this topic 00:05:08.300 |
in there at the moment because there isn't much I think I got like a month's worth of 00:05:12.740 |
papers in there which is not necessarily that much. 00:05:16.800 |
So run this and what we're going to do is ask this so what is the term that describes 00:05:22.200 |
how large language models seem to exhibit reasoning abilities when they get to a certain size? 00:05:27.940 |
All right, this is something you know the answer to and it will hopefully give it to us. 00:05:32.140 |
Okay so emergent behavior that is correct so essentially this is the idea that large 00:05:37.380 |
language models you keep increasing the size of them and once you get to a certain size 00:05:42.980 |
it's like they can all of a sudden just do this new thing right. 00:05:48.020 |
So kind of like what we're doing here once the GPT models got to a certain size all of 00:05:52.480 |
a sudden they could answer questions very easily and they could do like named entity 00:05:58.720 |
extraction just by you know telling them to within the prompt and all these like crazy 00:06:03.960 |
things that just kind of seem to emerge out of nowhere all of a sudden even though they 00:06:07.880 |
were not necessarily fine-tuned to do that in the first place. 00:06:12.600 |
So that's pretty cool and then okay let's try this. 00:06:15.600 |
Tell me about the idea behind emergent abilities in large language models. 00:06:21.760 |
Okay cool so took quite a while and that's definitely something you need to work on but 00:06:27.040 |
okay, tell me about the idea behind these emergent abilities, and it says okay, emergent abilities 00:06:32.200 |
of language models is a concept that larger models are more proficient at meta learning 00:06:37.200 |
than smaller models and can acquire abilities that are not present in smaller models. 00:06:43.160 |
This includes tasks such as few-shot prompting which is the idea that you can give a few 00:06:47.160 |
examples within your prompt and the model will learn how to do something based purely on those examples. 00:06:52.640 |
A transliteration from the international phonetic alphabet. 00:06:56.000 |
Now I don't know whether that's true or not this is the point where you'd like okay let's 00:07:00.160 |
take a look at these papers and see what they say. 00:07:02.920 |
Recovering a word from its scrambled letters and Persian question answering. 00:07:06.880 |
Now recovering word from scrambled letters I'm pretty sure I've seen that Persian question 00:07:11.240 |
answering now I have no idea but it would not surprise me it seems pretty realistic. 00:07:17.080 |
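As an aside, the few-shot prompting mentioned above is simply putting a couple of worked examples in the prompt and letting the model infer the pattern; a toy illustration (not from the notebook):

```python
# Toy few-shot prompt: two worked examples, then the new input the model completes.
few_shot_prompt = """Classify the sentiment of each review as positive or negative.

Review: The paper was clearly written and the results were convincing.
Sentiment: positive

Review: The experiments were impossible to reproduce.
Sentiment: negative

Review: The method is elegant and surprisingly simple to implement.
Sentiment:"""
```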
Furthermore large language models can be trained with more data which can potentially lead 00:07:22.920 |
to miscorrelation between different modalities so I'm not entirely sure what that means but 00:07:31.120 |
you know again we have these papers in there so we can read into it which is pretty cool. 00:07:35.800 |
To mitigate the risks associated with emergent abilities, researchers are urged to develop up-to-date 00:07:40.960 |
benchmarks to measure unforeseen behaviors in their large language models. 00:07:45.960 |
So this is something that people keep talking about it's like okay we have all these benchmarks 00:07:49.600 |
have been created pretty quickly but then the models are kind of saturating these benchmarks 00:07:55.560 |
almost as soon as they are made like they just become irrelevant because the models 00:08:00.240 |
are just so far beyond these benchmarks so quickly. 00:08:03.920 |
So that's definitely like an important thing right now. 00:08:06.760 |
And then final question, or final serious question, is what are chain of thoughts in large language 00:08:12.940 |
models, so this is kind of like a prompting technique, hopefully it will explain it a little. 00:08:18.920 |
So chain of thoughts are a technique used to enable complex reasoning and generate explanation 00:08:25.300 |
with large language models by forcing them to explicitly verbalize their reasoning steps 00:08:32.140 |
So this is kind of like you ask it a question and what it's going to do is first it's going 00:08:37.320 |
to kind of say okay this is your question this is the first step I'm going to take in 00:08:41.840 |
order to find the answer to your question and based on that first step it's going to 00:08:46.440 |
move on to the next step right and then the next step and then it will come to your answer. 00:08:51.160 |
So this is kind of a very human way of going about things when we are trying to answer 00:08:57.840 |
a question we will usually go through it step by step in our heads and this is essentially 00:09:02.540 |
what we're getting the models to do here as well and it works very well for just improving 00:09:06.880 |
the performance or the question answering abilities of these models. 00:09:11.760 |
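A toy illustration of that kind of chain-of-thought prompting (here the zero-shot "let's think step by step" variant, not the exact prompts from the papers being discussed):

```python
# Ask the model to verbalize intermediate steps before the final answer.
cot_prompt = (
    "Q: A reading list has 12 papers and I read 3 papers a week. "
    "How many weeks until I finish the list?\n"
    "A: Let's think step by step."
)
# A completion model will typically reply with the intermediate reasoning
# (12 papers / 3 papers per week = 4 weeks) before giving the final answer: 4 weeks.
```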
So it's really cool and this method has improved performance on a variety of tasks and sparked 00:09:16.760 |
the active development of further refinements which is true there are kind of further advances 00:09:22.480 |
on that, one of those is the ReAct thing that I mentioned earlier, so yeah this is kind of 00:09:28.880 |
getting there like the data set I have behind this is very small it's not even that relevant 00:09:33.680 |
to what I'm wanting to do and I'll kind of discuss that a little more in a moment and 00:09:39.920 |
yeah it's already doing relatively well which I think is pretty cool and then I wanted to 00:09:43.760 |
test so can you tell me why zebras are stripy I kind of tried to force it to not answer 00:09:49.240 |
questions that are not answerable by whatever you have within the data being given, so I 00:09:55.240 |
asked: if the question cannot be answered using the information in the context, answer "I don't 00:10:00.280 |
know". So usually, or at least some of the time, when you do this it will say "I don't know" 00:10:05.040 |
if it doesn't see the answer in the context but I've only had that work so far with smaller 00:10:12.280 |
context where we're feeding quite a lot into here so I think I need to kind of figure out 00:10:17.240 |
this prompt a bit better because when I last tried this it didn't work so come down yeah 00:10:23.640 |
so like it does actually give you the answer and I also checked in these papers to make 00:10:28.820 |
sure that they don't actually tell you why zebras are stripy inside them and they didn't 00:10:34.520 |
it is kind of just making this up, or rather not making this up, it knows this within the model weights 00:10:40.440 |
but it doesn't know it from the source knowledge which is the context that we try to provide 00:10:44.920 |
here so this is just something that I will be testing and hopefully at some point I can 00:10:49.640 |
get it so it will say I don't know even for a really obvious question because I don't 00:10:54.120 |
want it to be outputting false information. 00:11:01.000 |
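For reference, the kind of grounded prompt template described above looks roughly like this; the exact wording used in the notebook may differ:

```python
# Rough sketch of the "answer from context or say I don't know" prompt template.
grounded_prompt_template = """Answer the question as truthfully as possible using only the context below.
If the answer cannot be found in the context, answer "I don't know".

Context:
{context}

Question: {question}
Answer:"""
```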
Now that is the idea of what we have so far, the arXiv part of it. What we'll do is very quickly go through the rest of 00:11:04.880 |
it and then I'll probably leave it there and we'll go through some of the data pre-processing 00:11:09.160 |
stuff that I did in the next video. Right, so I wanted to be able to make good writing suggestions 00:11:15.280 |
so a lot of my work is writing articles so that would be insanely useful if I can get 00:11:20.920 |
good writing suggestions on these niche topics, because I've tried with like GPT and GPT-3 00:11:26.480 |
and when it comes to kind of like technical topics it doesn't really do very well it can 00:11:31.960 |
help a little bit with like ideation on kind of getting out of I want to say writer's block 00:11:37.960 |
but beyond that it's not actually that useful so it would be really good to get good suggestions 00:11:42.440 |
on these more technical topics I would really like this if we do go for like a chatbot type 00:11:47.960 |
thing, to be able to talk well, like ChatGPT can, so I can like go through a multi-step 00:11:53.600 |
process of answering questions I don't know how realistic that's going to be because the 00:11:59.040 |
research behind ChatGPT hasn't been released so you know there's no way of knowing how 00:12:05.260 |
they did that. So they will probably release a ChatGPT endpoint at some 00:12:09.880 |
point so you know maybe that will feature in this at a later date there are other data 00:12:15.240 |
sources I'd like to include like youtube videos articles so a ton of the you know when I'm 00:12:21.040 |
going through things and or building something I will end up kind of forgetting things over 00:12:26.640 |
time that I did you know like years ago and what I will do very often is either you know 00:12:31.760 |
find someone else who's explaining it or if I know that I've explained it in the past 00:12:34.840 |
I'll go back to my old videos go back to my old articles and actually use them to kind 00:12:39.320 |
of relearn because it's almost like I'm speaking to myself like it's just an older version 00:12:43.400 |
of myself so the language I'm using is super easy for me to understand so I would like 00:12:49.160 |
to be able to index all that stuff as well like just past things I've done and as well 00:12:54.400 |
as kind of including that would be my notes like in this here I'm using obsidian which 00:12:59.360 |
is like a note-taking app which I've just only just started using in the past week before 00:13:04.560 |
then my notes are just kind of everywhere but if I continue using this and I have a 00:13:09.040 |
lot of notes in here it'd be really good to index all of that code documentation so you 00:13:15.160 |
know, like LangChain; it's a new library, ChatGPT doesn't know anything about it, so you can't 00:13:20.360 |
ask it questions about it it would be really good to like be able to put that in our long-term 00:13:24.880 |
memory component and then actually we can answer questions about it like super quickly 00:13:29.400 |
which will be amazing. Now, what will it look like? Yeah, this: LangChain and Pinecone for storing 00:13:37.680 |
all the arXiv papers and everything else. So one thing I did want to mention, this is 00:13:42.400 |
right now what you just saw all the papers I got for that they've just kind of been brought 00:13:48.960 |
in bulk from scraping the arXiv website and using the arXiv API, and there was no 00:13:57.320 |
intelligent logic behind that what I have started working on and it's kind of you know 00:14:02.320 |
it's getting there is basically you give one paper and then it's going to work through 00:14:07.440 |
that paper it's going to extract the references and then it's going to find the other papers 00:14:11.400 |
that were mentioned in that paper and it's kind of like a graph or a tree that kind of 00:14:16.240 |
expands out or branches out from that that would be really cool because I think then 00:14:20.620 |
you can be a lot more specific on what you're indexing and not just kind of index the last 00:14:27.240 |
20 years of arXiv, which would be insane. So that would be really cool. Maybe you can 00:14:32.760 |
do a bit of both though; I will say, maybe some more recent papers you just kind of get them in and index them in bulk. 00:14:37.800 |
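The bulk option could look roughly like this, hitting the public arXiv API export endpoint; the category and result count are illustrative assumptions, and the actual scraping code is covered in the next video:

```python
import feedparser  # parses the Atom feed returned by the arXiv API

def fetch_recent_papers(category: str = "cs.CL", max_results: int = 100) -> list[dict]:
    # Query the public arXiv API for the most recently submitted papers in a category.
    url = (
        "http://export.arxiv.org/api/query?"
        f"search_query=cat:{category}&sortBy=submittedDate&sortOrder=descending"
        f"&start=0&max_results={max_results}"
    )
    feed = feedparser.parse(url)
    return [{"title": e.title, "summary": e.summary, "link": e.link} for e in feed.entries]
```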
Right, the ReAct framework is kind of what I mentioned 00:14:45.040 |
before. There are some notes here, I didn't actually mean to press that, so let's go back to here. 00:14:51.000 |
so this is basically okay your chatbot your AI system will be able to search the internet 00:14:57.640 |
it will be able to use Python to answer questions, and all of these kind of cool, crazy things that'll be really fun to try. 00:15:05.320 |
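In LangChain terms (as of early 2023), that kind of tool-using behaviour is roughly what the ReAct-style agents provide; a minimal sketch, assuming OpenAI and SerpAPI keys are set in the environment (the calculator tool stands in here for more general Python execution):

```python
from langchain.agents import initialize_agent, load_tools
from langchain.llms import OpenAI

# A ReAct-style agent that can search the web and do calculations.
llm = OpenAI(temperature=0)
tools = load_tools(["serpapi", "llm-math"], llm=llm)
agent = initialize_agent(tools, llm, agent="zero-shot-react-description", verbose=True)

agent.run("What is the ReAct framework for language models and who proposed it?")
```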
And then there's also the chain of thought thing, so I haven't 00:15:11.160 |
actually tested that yet I would like to so that is in there as well and then where are 00:15:15.960 |
we at the moment we are still doing step one so it's where we are for now that's all I'm 00:15:23.240 |
focusing on I think that would still take a little bit of time to build and it's definitely 00:15:28.720 |
going to take a couple of videos to actually walk you through everything so you know we're 00:15:33.680 |
going to be on this for a little while and then we'll move on to trying to make it into 00:15:37.680 |
more of a chatbot we're going to have like the endpoint and a nice like web app like 00:15:43.440 |
maybe I can use Angular or something, it's like the only framework I've ever used before, 00:15:49.040 |
I don't know if that's even relevant anymore, but my idea actually here was to ask ChatGPT 00:15:54.280 |
to help me build that, so that will be interesting to try out, let's see, and then finally I 00:16:00.200 |
want it to be in a super easy to use format so you can literally just like git clone it 00:16:04.680 |
and use it for your own things so yeah that's where we are at the moment you know for this 00:16:11.080 |
video I'm going to leave it there because there's a ton of code it's super messy but 00:16:15.680 |
I will take you through it in the next video where we're going to focus on actually getting 00:16:20.520 |
all these archive papers and we'll kind of explore both options there so we have like 00:16:26.360 |
that graph method or we have just kind of like the dumb method of bulk downloading 00:16:32.160 |
everything for now that's all for this video I hope it's been interesting I'm super excited 00:16:38.040 |
to see where this project goes and it already looks kind of promising so that'll be cool 00:16:44.360 |
to see so thank you very much for watching and I will see you again in the next one bye