
Building efficient hybrid context query for LLM grounding: Simrat Hanspal



00:00:00.000 | Hey hey hey, how's everyone? This is Simrat Hanspal, Technical Evangelist at Hasura, and
00:00:23.760 | today I'm going to talk to you about building efficient hybrid RAG queries. Let us understand
00:00:30.640 | this with the use case of product search in the e-commerce domain. Present day, product search
00:00:36.500 | is mostly keyword based. Keywords are not great at capturing the complete intent of the user's
00:00:41.840 | search query. So you want to move to using natural language. But product search can be
00:00:46.380 | either contextual, where you're searching for a product based on its description,
00:00:50.600 | or it can be completely structured, where you're querying based on
00:00:55.480 | the structured fields, or it can be both. Large language models are great, but they're frozen
00:01:01.280 | in time and they cannot solve tasks on data they have not seen before. One of the ways
00:01:06.720 | to expose the unseen data to a large language model is by providing relevant context
00:01:11.840 | alongside the question. This helps the large language model generate more accurate and grounded
00:01:17.300 | answers. This powerful technique is called Retrieval Augmented Generation or RAG in short. So you
00:01:24.180 | see, we need to build a RAG pipeline for our product search use case. We also need to make
00:01:29.220 | sure that our RAG pipeline is production ready and will not leak any sensitive data even if prompted.
00:01:35.220 | This has been one of the primary security concerns of enterprises when building Gen AI applications.
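
To make the RAG idea above concrete, here is a minimal sketch of the retrieve-then-generate flow. The helper names retrieve_context and llm_complete are hypothetical placeholders, not part of the demo.

```python
# Minimal sketch of Retrieval Augmented Generation (RAG):
# retrieve relevant context, add it to the prompt, then generate.
# retrieve_context() and llm_complete() are hypothetical placeholders.

def answer_with_rag(question: str) -> str:
    # 1. Retrieval: fetch documents relevant to the question,
    #    e.g. via vector similarity search.
    context_docs = retrieve_context(question, top_k=3)

    # 2. Augmentation: provide the retrieved context alongside the question.
    prompt = (
        "Answer the question using only the context below.\n\n"
        "Context:\n" + "\n".join(context_docs) + "\n\n"
        "Question: " + question
    )

    # 3. Generation: the LLM answers grounded in the provided context.
    return llm_complete(prompt)
```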
00:01:45.060 | Data driven applications have been around for a while. Then why are we talking about secure
00:01:49.940 | data retrieval all over again for Gen AI applications? Well, this is because we are
00:01:55.620 | seeing a paradigm shift in application development. With data driven applications, data is mostly
00:02:02.580 | constant and it is the application or the software that evolves for any different or new functionality.
00:02:09.060 | For example, product search on current e-commerce websites would query fixed data fields; only the
00:02:15.540 | records or the results would change. In a context-driven or RAG application, the data is no
00:02:26.020 | longer constant, and it needs to adapt to the dynamic needs of the user's natural language query.
00:02:32.820 | With natural language queries, there are no structural limitations, and that gives scope for
00:02:40.420 | malicious attacks. Good news: Hasura enables you to build secure data APIs over your multiple different
00:02:48.100 | data sources in no time. Hasura APIs are GraphQL APIs and hence dynamic in nature. So you get a unified,
00:02:57.140 | dynamic, secure data API in no time. Just what we needed. So let's get started with building a RAG pipeline for
00:03:05.060 | our product search use case.
00:03:08.180 | Let us again look at the different queries that we can expect for our RAG application. We can
00:03:15.460 | have semantic search where we are searching based on semantic similarity with product description from
00:03:21.540 | the product vector DB. We can also have structured search, where we are searching based on structured
00:03:29.140 | fields in the relational database, for example price and category in Postgres, and this requires converting
00:03:37.300 | the natural language query into a structured query like SQL or GraphQL. Then we can also have hybrid queries.
00:03:45.060 | These searches have the elements of both semantic and structured queries. With Hasura,
00:03:50.100 | we don't need to build separate data APIs for each of them. We can build a unified data API for all three of them.
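
To illustrate the three query types, here are rough sketches of the GraphQL that might be generated for each, written as Python strings. The root field, search argument, and comparison operators are assumptions about the schema, not the exact queries from the demo.

```python
# Illustrative shapes only: field and argument names depend on your
# Hasura schema and on how the vector DB connector is configured.

SEMANTIC_QUERY = """
query {
  product(args: {search: "essential oils for relaxation"}) {
    name
    description
  }
}
"""

STRUCTURED_QUERY = """
query {
  product(where: {price: {_lt: 500}}) {
    name
    price
  }
}
"""

HYBRID_QUERY = """
query {
  product(
    args: {search: "essential oil diffuser"}
    where: {price: {_gte: 500, _lte: 1000}}
  ) {
    name
    price
  }
}
"""
```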
00:03:56.420 | So let's get started. We start by connecting our multiple different data sources with Hasura and
00:04:04.340 | then we query them using a single GraphQL API. I've also built a Streamlit application which takes in the
00:04:11.380 | user input, calls the large language model, and generates a GraphQL query, which then gets executed on Hasura.
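
A rough sketch of what such a Streamlit app could look like, assuming the OpenAI chat completions API and a plain HTTP call to Hasura's /v1/graphql endpoint. The prompt, model, and widget labels are illustrative, not the demo's actual code.

```python
import requests
import streamlit as st
from openai import OpenAI

# Configuration collected in the sidebar, as in the demo app.
HASURA_URL = st.sidebar.text_input("Hasura endpoint", "https://<your-project>/v1/graphql")
ADMIN_SECRET = st.sidebar.text_input("Hasura admin secret", type="password")
OPENAI_KEY = st.sidebar.text_input("OpenAI API key", type="password")

question = st.text_input("What are you looking for?")

if question:
    client = OpenAI(api_key=OPENAI_KEY)
    # Ask the LLM to translate the natural-language request into GraphQL.
    completion = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system", "content": "Translate the user's request into a GraphQL query "
                                          "for the product schema. Return only the query."},
            {"role": "user", "content": question},
        ],
    )
    graphql_query = completion.choices[0].message.content

    # Execute the generated query on Hasura.
    response = requests.post(
        HASURA_URL,
        json={"query": graphql_query},
        headers={"x-hasura-admin-secret": ADMIN_SECRET},
    )
    st.code(graphql_query, language="graphql")
    st.json(response.json())
```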
00:04:20.100 | So let's head over to the Hasura console to get a feel of what it looks like. To start,
00:04:26.660 | we'll go to the data tab to connect all of our different data sources. I'm not going to do that
00:04:32.420 | because I have my product Postgres table and product vector table already integrated. As I mentioned before,
00:04:40.740 | you can use Hasura to query both your relational and vector DB and multiple data sources using
00:04:48.260 | a single GraphQL API. But for the sake of simplicity of this demo, I'm going to be using only the vector
00:04:54.340 | DB. So I'm using Weaviate in this case, where I have my vectors, and I have also got my price and category
00:05:03.620 | structured fields here. One thing to note here is that I have used Hasura's event triggers to auto-vectorize
00:05:10.260 | my records into my vector DB, which means as and when a new record got inserted into my Postgres table,
00:05:17.620 | it got auto-vectorized and saved in my vector DB. Amazing, I know. So let's go back to our API tab.
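
Roughly how that auto-vectorization could be wired up: a Hasura event trigger on the Postgres products table calls a webhook on every insert, and the webhook embeds the description and writes the record into the vector DB. This sketch assumes the OpenAI embeddings API and the weaviate-client v3 syntax; the class and column names are illustrative.

```python
from fastapi import FastAPI, Request
from openai import OpenAI
import weaviate

app = FastAPI()
openai_client = OpenAI()
weaviate_client = weaviate.Client("http://localhost:8080")

@app.post("/vectorize")
async def vectorize(request: Request):
    payload = await request.json()
    # Hasura event trigger payloads carry the inserted row under event.data.new.
    row = payload["event"]["data"]["new"]

    # Embed the product description.
    embedding = openai_client.embeddings.create(
        model="text-embedding-ada-002",
        input=row["description"],
    ).data[0].embedding

    # Store the record alongside its vector in Weaviate.
    weaviate_client.data_object.create(
        data_object={
            "product_id": row["product_id"],
            "name": row["name"],
            "description": row["description"],
            "price": row["price"],
            "category": row["category"],
        },
        class_name="Product",
        vector=embedding,
    )
    return {"status": "ok"}
```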
00:05:27.060 | This is where you can play around, execute different queries and see the results.
00:05:32.340 | Nice. Now that we have gotten a fairly decent sense of what Hasura console is like,
00:05:41.860 | we can move to the Streamlit app that I have created. As you can see, there are a few configurations
00:05:48.980 | on the left-hand side panel. So you have Hasura's endpoint and admin secret. These are required to connect
00:05:54.500 | with Hasura securely. And then I also have OpenAI's API key. This is required for the chat completion API
00:06:02.900 | that I'm using. So let's begin with querying the three different contexts that we were
00:06:12.260 | talking about. Let's start with the purely semantic one. Let's look at the different
00:06:17.940 | product descriptions that we have and pick something. Let us pick products on essential oils. So let me say,
00:06:26.740 | show me essential oils for relaxation.
00:06:41.700 | Great. So we've gotten the GraphQL query, which has identified essential oils for relaxation as the
00:06:47.540 | descriptive part of the query, which we want to find in our vector DB by doing a semantic search.
00:06:54.340 | And we can also see that we have gotten the results for this query. Nice. Let's go over and execute a
00:07:04.180 | structured query. Price is a good field to execute a structured query on. So let's say: show me all
00:07:12.260 | products for price less than
00:07:14.900 | 500.
00:07:19.860 | Wait, so it has rightly identified that there is a price filter with the less than condition.
00:07:33.300 | And it shows you all the different products with price less than 500. Nice. Let's execute a hybrid query now.
00:07:41.700 | Let's see, looking for essential oil diffusers in the price range of 500 to 1000 dollars.
00:07:58.100 | Nice. So we got a GraphQL query where it identified essential oil diffusers as the semantic search
00:08:07.300 | query and then the price filter, which is between 500 and 1000, and we received our results. Nice.
00:08:15.460 | So far we have executed only the happy flows. We have not looked at any unhappy flows,
00:08:25.700 | but let's say I had an evil intent and I wanted to execute a malicious query, which is not like the
00:08:32.660 | typical queries that we just looked at. So I have a malicious query. Let's execute this.
00:08:39.620 | So this one is requesting to insert a hair oil product, with the name "special oil", a
00:08:49.860 | price of $10,000, category "home", and "Fantastic hair oil" as the description. And let's also add
00:08:58.900 | the product ID and say this is 7001. Okay. Let's execute this.
00:09:05.380 | So as you can see, it has generated a GraphQL query of type insert mutation.
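
The generated mutation presumably looked something like the following; the root field name follows Hasura's usual insert_<table>_one convention, and the exact column names are assumptions.

```python
# Roughly what the injected insert mutation could look like
# (illustrative: the actual root field and column names depend on the schema).
INJECTED_MUTATION = """
mutation {
  insert_product_one(object: {
    product_id: 7001
    name: "special oil"
    description: "Fantastic hair oil"
    category: "home"
    price: 10000
  }) {
    product_id
  }
}
"""
```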
00:09:16.180 | But what we see is that it has also executed the insert. So let's go back to our table in the console and look for product ID
00:09:28.580 | equal to 7001 (let me remove the quotes, because this is an integer field). And there you go. We have
00:09:42.020 | the product, which has gotten inserted into the database. This was not the intended behavior.
00:09:49.300 | This is not what should have happened. So let us quickly go back to our Hasura console again.
00:09:54.900 | And this time we are going to be defining a new role with very restricted permissions, so that we
00:10:00.980 | only provide select permission and this does not happen again. So I'm going to create a new role.
00:10:07.220 | Let's call it product search
00:10:09.060 | bot. And I'm going to provide only select permission. Let's go without any checks. I'm going to keep it
00:10:17.220 | really simple. Let me allow all the columns to be accessible for this role.
00:10:23.380 | That's about it. Nice. So the role has gotten inserted. Now let's query the same thing with the new role.
00:10:32.100 | So let's say product search bot. But this time, let me just modify this query a little bit and say
00:10:40.020 | 7002. Okay, so let's execute this and see what happens.
00:10:45.220 | Nice. So we got the same insert mutation generated. But this time there was an error
00:10:55.460 | executing it, rightly so, because we have defined a role which does not have permission for insert mutations.
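
In practice the restriction kicks in because the app sends the request as the restricted role: Hasura lets you pass an x-hasura-role header alongside the admin secret to act as that role. A small sketch, reusing names from the Streamlit sketch above and assuming the role identifier is product_search_bot.

```python
import requests

# Sketch: HASURA_URL, ADMIN_SECRET and graphql_query come from the
# Streamlit sketch above; the role name is an assumed identifier.
response = requests.post(
    HASURA_URL,
    json={"query": graphql_query},
    headers={
        "x-hasura-admin-secret": ADMIN_SECRET,
        "x-hasura-role": "product_search_bot",  # select-only role
    },
)
# With the select-only role, the insert mutation is rejected with a
# permission error instead of modifying data.
print(response.json())
```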
00:11:04.580 | Great. So this is all from me. Thank you, everyone.
00:11:07.140 | Thank you once again. So let us really quickly recap. In this demo, we learned how we can use Hasura
00:11:14.500 | to build hybrid context queries for sophisticated RAG applications like product search.
00:11:20.420 | If you like the demo or would like to use Hasura for your RAG application, please reach out to me.
00:11:25.860 | These are my contact details and thank you so much.