Building efficient hybrid context query for LLM grounding: Simrat Hanspal

00:00:00.000 |
Hey hey hey, how's everyone? This is Simrat Hanspal, Technical Evangelist at Hasura, and 00:00:23.760 |
today I'm going to talk to you about building efficient hybrid RAG queries. Let us understand 00:00:30.640 |
this with the use case of product search in e-commerce domain. Present day, product search 00:00:36.500 |
is mostly keyword based. Keywords are not great at capturing the complete intent of the user's 00:00:41.840 |
search query. So you want to move to using natural language. But product search can be 00:00:46.380 |
either contextual where you're looking for, where you're searching for product based on 00:00:50.600 |
the descriptive nature or it can be completely structured where you're querying based on 00:00:55.480 |
the structured fields or it can be both. Large language models are great but they're frozen 00:01:01.280 |
in time and they cannot solve tasks on data they have not seen before. One of the ways 00:01:06.720 |
to expose the unseen data to large language model is by providing context to the question 00:01:11.840 |
alongside the question. This helps the large language model generate more accurate and grounded 00:01:17.300 |
answers. This powerful technique is called Retrieval Augmented Generation or RAG in short. So you 00:01:24.180 |
see, we need to build a RAG pipeline for our product search use case. We also need to make 00:01:29.220 |
sure that our RAG pipeline is production ready and will not leak any sensitive data even if prompted. 00:01:35.220 |
This security concern has been one of the primary concerns of enterprises when building Gen AI applications. 00:01:45.060 |
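As a rough illustration of the RAG idea described above (context supplied alongside the question), here is a minimal sketch of a prompt builder. The function name, the record fields (`name`, `category`, `price`, `description`) and the prompt wording are assumptions for illustration, not the code used in the demo.

```python
from __future__ import annotations


def build_grounded_prompt(question: str, context_records: list[dict]) -> str:
    """Place retrieved product records ahead of the question.

    The record keys used here are hypothetical; adapt them to your schema.
    """
    context_lines = [
        f"- {r['name']} ({r['category']}, price {r['price']}): {r['description']}"
        for r in context_records
    ]
    # Make the empty-retrieval case explicit so the model is not left guessing.
    context = "\n".join(context_lines) if context_lines else "(no matching products)"
    return (
        "Answer using only the product context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )
```

The resulting string would be sent as the user message to whichever chat completion API you use.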
Data driven applications have been around for a while. Then why are we talking about secure 00:01:49.940 |
data retrieval all over again for Gen AI applications? Well, this is because we are 00:01:55.620 |
seeing a paradigm shift in application development. With data driven applications, data is mostly 00:02:02.580 |
constant and it is the application or the software that evolves for any different or new functionality. 00:02:09.060 |
For example, product search on current e-commerce websites would pick constant data fields; only the 00:02:15.540 |
records or the results would change. While in context-driven or RAG applications, the data is no 00:02:26.020 |
longer a constant data packet and it needs to adapt to the dynamic needs of the user's natural language query. 00:02:32.820 |
With natural language queries, there are no structural limitations, and this gives scope for 00:02:40.420 |
malicious attacks. Good news, Hasura enables you to build secure data API over your multiple different 00:02:48.100 |
data sources in no time. Hasura APIs are GraphQL APIs and hence they are dynamic in nature. So you get unified, 00:02:57.140 |
dynamic, secure data API in no time. Just what we needed. So let's get started with building a RAG pipeline for 00:03:08.180 |
Let us again look at what are the different queries that we can expect for our RAG applications. We can 00:03:15.460 |
have semantic search where we are searching based on semantic similarity with product description from 00:03:21.540 |
product vector DB. We can also have structured search where we are searching based on structured fields 00:03:29.140 |
in the relational database, for example price and category in Postgres, and this requires converting 00:03:37.300 |
the natural language query into a structured query like SQL or GraphQL. Then we can also have hybrid queries. 00:03:45.060 |
These searches have the elements of both semantic and structured queries. With Hasura, 00:03:50.100 |
we don't need to build separate data APIs for each of them. We can build a unified data API for all three of them. 00:03:56.420 |
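To make the three query shapes concrete, here is a minimal sketch that builds a semantic, structured, or hybrid GraphQL query from the same function. The `product_vector` field, the `vector: {near_text: ...}` filter and the selected columns are assumptions about the schema; the `_gte` and `_lte` comparison operators follow Hasura's usual filter style, but check what your connector actually exposes.

```python
import json
from typing import Optional


def build_product_query(search_text: Optional[str] = None,
                        min_price: Optional[float] = None,
                        max_price: Optional[float] = None) -> str:
    """Build one GraphQL query covering semantic, structured and hybrid search.

    Field and filter names are hypothetical; price bounds are inclusive.
    """
    conditions = []
    if search_text is not None:
        # json.dumps yields a double-quoted, escaped string literal.
        conditions.append("vector: {near_text: " + json.dumps(search_text) + "}")

    price_ops = []
    if min_price is not None:
        price_ops.append(f"_gte: {min_price}")
    if max_price is not None:
        price_ops.append(f"_lte: {max_price}")
    if price_ops:
        conditions.append("price: {" + ", ".join(price_ops) + "}")

    where = "(where: {" + ", ".join(conditions) + "})" if conditions else ""
    return "query { product_vector" + where + " { id name price category } }"
```

Passing only `search_text` gives a semantic query, passing only price bounds gives a structured query, and passing both gives a hybrid query.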
So let's get started. We start by connecting our multiple different data sources with Hasura and 00:04:04.340 |
then we query it using a single GraphQL API. I've also built a Streamlit application which takes in the 00:04:11.380 |
user input, calls the large language model, generates a GraphQL API query which then gets executed on Hasura. 00:04:20.100 |
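The last step of that flow, executing the generated query on Hasura, could look roughly like the sketch below. It assumes the endpoint URL is the project's GraphQL endpoint (typically ending in `/v1/graphql`) and authenticates with the `x-hasura-admin-secret` header; the function names are illustrative and the demo app may be structured differently.

```python
import json
import urllib.request


def build_hasura_request(endpoint: str, admin_secret: str, query: str) -> urllib.request.Request:
    """Prepare a POST request carrying a GraphQL query for Hasura."""
    headers = {
        "Content-Type": "application/json",
        "x-hasura-admin-secret": admin_secret,
    }
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(endpoint, data=body, headers=headers, method="POST")


def run_query(request: urllib.request.Request) -> dict:
    """Send the request and decode the JSON response (performs a network call)."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

Keeping request construction separate from the network call makes the request easy to inspect before anything is sent.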
So let's head over to Hasura console to get a feel of what it looks like. To start, 00:04:26.660 |
we'll go to the data tab to connect all of our different data sources. I'm not going to do that 00:04:32.420 |
because I have my product Postgres table and product vector table already integrated. As I mentioned before, 00:04:40.740 |
you can use Hasura to query both your relational and vector DB and multiple data sources using 00:04:48.260 |
a single GraphQL API. But for the sake of simplicity of this demo, I'm going to be using only the vector 00:04:54.340 |
DB. So I'm using Weaviate in this case where I have my vectors and I have also got my price and category 00:05:03.620 |
structured fields here. One thing to note here is that I have used Hasura's event to auto vectorize 00:05:10.260 |
my records into my vector DB, which means as and when a new record got inserted into my Postgres table, 00:05:17.620 |
it got auto vectorized and saved in my vector DB. Amazing, I know. So let's go back to our API tab. 00:05:27.060 |
This is where you can play around, execute different queries and see the results. 00:05:32.340 |
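The auto-vectorization on insert mentioned a moment ago can be pictured as a webhook that receives Hasura's event payload and produces the object to store in the vector DB. This is only a sketch under assumptions: the column names (`id`, `description`, `price`, `category`) are hypothetical, and `embed` stands in for whatever embedding call you use. The payload shape (`event.op`, `event.data.new`) follows Hasura's event trigger format.

```python
from typing import Callable, Optional, Sequence


def vectorize_inserted_row(payload: dict,
                           embed: Callable[[str], Sequence[float]]) -> Optional[dict]:
    """Turn an INSERT event payload into a vector DB object.

    Returns None for events that are not inserts.
    """
    event = payload.get("event", {})
    if event.get("op") != "INSERT":
        return None
    row = event["data"]["new"]
    return {
        "id": row["id"],
        # Structured fields are stored alongside the vector for filtering.
        "price": row["price"],
        "category": row["category"],
        "vector": list(embed(row["description"])),
    }
```

The returned dictionary would then be written to the vector store by the client of your choice.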
Nice. Now that we have gotten a fairly decent sense of what Hasura console is like, 00:05:41.860 |
we can move to the Streamlit app that I have created. As you can see, there are a few configurations 00:05:48.980 |
on the left hand side panel. So you have Hasura's endpoint and admin secret. This is required to connect 00:05:54.500 |
with Hasura securely. And then I also have OpenAI's API key. This is required for the chat completion API 00:06:02.900 |
that I'm using. So let's begin. Let's begin with querying the three different contexts that we were 00:06:12.260 |
talking about that we want to fetch. So let's start with purely semantic one. Let's look at the different 00:06:17.940 |
product descriptions that we have and pick something. Let us pick products on essential oils. So let me say, 00:06:41.700 |
Great. So we've gotten the GraphQL query, which has identified essential oils for relaxation as the 00:06:47.540 |
descriptive part of the query, which we want to find in our vector DB by doing a semantic search. 00:06:54.340 |
And we can also see that we have gotten the results for this query. Nice. Let's go over and execute a 00:07:04.180 |
structured query. Price is a good field for a structured query. So let's say, um, actually, show me all 00:07:19.860 |
Wait, so it has rightly identified that there is a price filter with the less than condition. 00:07:33.300 |
And it shows you all the different products with price less than 500. Nice. Let's execute a hybrid query now. 00:07:41.700 |
Let's see, looking for essential oil diffusers in the price range of 500 to 1000 dollars. 00:07:58.100 |
Nice. So we got a GraphQL query where it identified amazing essential oil diffuser as the semantic search 00:08:07.300 |
query and then the price filter, which is between 500 to 1000 and we received our results. Nice. 00:08:15.460 |
So far we have executed only the happy flows. We have not looked at any queries for the unhappy flows, 00:08:25.700 |
but let's say I had an evil intent and I wanted to execute a malicious query, uh, which is not the 00:08:32.660 |
typical queries that we just looked at. So I have a malicious query. Let's execute this. 00:08:39.620 |
So this one is requesting to insert a hair oil product with the name special oil and 00:08:49.860 |
price of $10,000 category is home. Fantastic hair oil is the description. And let's also add 00:08:58.900 |
the product ID and say this is 7001. Okay. Let's execute this. 00:09:05.380 |
So as you can see, it has generated a GraphQL query of type insert mutation. 00:09:16.180 |
But what we see is that it has also executed the insert. So let's go back to our table and console and look for product ID 00:09:28.580 |
equal to 7001. Let me remove the quotes, because this is an integer field. And there you go. We have 00:09:42.020 |
the product, which has gotten inserted into the database. Um, this was not the intended behavior. 00:09:49.300 |
This is not what should have happened. So let us quickly go back to our Hasura console again. 00:09:54.900 |
And this time we are going to be defining a new role with very restricted permissions so that we 00:10:00.980 |
only provide select permission and such that this does not happen again. So I'm going to create a new role. 00:10:09.060 |
bot. And I'm going to provide only select permission. Let's go without any checks. I'm going to keep it 00:10:17.220 |
really simple. Let me allow all the product, all the columns to be accessible for this role. 00:10:23.380 |
That's about it. Nice. So the role has gotten inserted. Now let's query the same thing with the new role. 00:10:32.100 |
So let's say product search bot. But this time, let me just modify this query a little bit and say 00:10:40.020 |
seven thousand two. Okay, so let's execute this and see what happens. 00:10:45.220 |
Nice. So we got the same insert mutation query to be generated. But this time there was an error 00:10:55.460 |
executing this, and rightly so, because we have defined a role which does not have the permission for insert queries. 00:11:04.580 |
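On the client side, running as the restricted role comes down to sending the `x-hasura-role` header alongside the admin secret and treating an `errors` entry in the GraphQL response as a failure. The sketch below assumes this setup; the helper names, the role value and the example error text are illustrative, not taken from the demo app.

```python
class HasuraQueryError(Exception):
    """Raised when the GraphQL response contains an `errors` list."""


def headers_for_role(admin_secret: str, role: str) -> dict:
    """Headers that ask Hasura to evaluate the request with the given role's permissions."""
    return {
        "Content-Type": "application/json",
        "x-hasura-admin-secret": admin_secret,
        "x-hasura-role": role,
    }


def extract_data(response_body: dict) -> dict:
    """Return the `data` part of a GraphQL response, or raise if errors are present."""
    errors = response_body.get("errors")
    if errors:
        messages = "; ".join(e.get("message", "unknown error") for e in errors)
        raise HasuraQueryError(messages)
    return response_body.get("data", {})
```

With a select-only role, an insert mutation generated by the model should come back as an error response, which `extract_data` surfaces as an exception rather than silently returning nothing.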
Great. So this is all from me. Thank you, everyone. 00:11:07.140 |
Thank you once again. So let us really quickly recap. In this demo, we learned how we can use Hasura 00:11:14.500 |
to build hybrid query context for your sophisticated RAG applications like product search. 00:11:20.420 |
If you like the demo or would like to use Hasura for your RAG application, please reach out to me. 00:11:25.860 |
These are my contact details and thank you so much.