Hey hey hey, how's everyone? This is Simrit Hanspal, Technical Evangelist at Hasura and today I'm going to talk to you about building efficient hybrid drag queries. Let us understand this with the use case of product search in e-commerce domain. Present day, product search is mostly keyword based. Keywords are not great at capturing the complete intent of the user's search query.
So you want to move to using natural language. But product search can be either contextual where you're looking for, where you're searching for product based on the descriptive nature or it can be completely structured where you're querying based on the structured fields or it can be both. Large language models are great but they're frozen in time and they cannot solve tasks on data they have not seen before.
One of the ways to expose the unseen data to large language model is by providing context to the question alongside the question. This helps the large language model generate more accurate and grounded answers. This powerful technique is called Retrieval Augmented Generation or RAG in short. So you see, we need to build a RAG pipeline for our product search use case.
We also need to make sure that our RAG pipeline is production ready and will not leak any sensitive data even if prompted. This security concern has been one of the primary concerns of enterprises when building Gen AI applications. Data driven applications have been around for a while. Then why are we talking about secure data retrieval all over again for Gen AI applications?
Well, this is because we are seeing a paradigm shift in application development. With data driven applications, data is mostly constant and it is the application or the software that evolves for any different or new functionality. For example, product search on current e-commerce websites would pick constant data fields only the records or the results would change.
While in context context driven or RAG application, the data is no longer a constant data packet and it needs to adapt to the dynamic needs of the user's natural language query. With natural language query, there is no structural limitations and it can and it gives a scope for malicious attack.
Good news, Hasura enables you to build secure data API over your multiple different data sources in no time. Hasura APIs are GraphQL APIs and hence they are dynamic in nature. So you get unified, dynamic, secure data API in no time. Just what we needed. So let's get started with building a RAG pipeline for our product search use case.
Let us again look at what are the different queries that we can expect for our RAG applications. We can have semantic search where we are searching based on semantic similarity with product description from product vector DB. We can also have structured search where we are searching based on structured fields fields in the relation database like for example price and category in Postgres and this requires converting the natural language query into a structured query like SQL or GraphQL.
Then we can also have hybrid queries. These searches have the elements of both semantic and structured queries. With Hasura, we don't need to build separate data APIs for each of them. We can build a unified data API for all three of them. So let's get started. We start by connecting our multiple different data sources with Hasura and then we query it using a single GraphQL API.
I've also built a streamlet application which takes in the user input, calls the large language model, generates a GraphQL API query which then gets executed on Hasura. So let's head over to Hasura console to get a feel of what it looks like. To start, we'll go to the data tab to connect all of our different data sources.
I'm not going to do that because I have my product Postgres table and product vector table already integrated. As I mentioned before, you can use Hasura to query both your relational and vector DB and multiple data sources using a single GraphQL API. But for the sake of simplicity of this demo, I'm going to be using only the vector DB.
So I'm using VV8 in this case where I have my vectors and I have also got my price and category structured fields here. One thing to note here is that I have used Hasura's event to auto vectorize my records into my vector DB, which means as and when a new record got inserted into my Postgres table, it got auto vectorized and saved in my vector DB.
Amazing, I know. So let's go back to our API tab. This is where you can play around, execute different queries and see the results. Nice. Now that we have gotten a fairly decent sense of what Hasura console is like, we can move to the Streamlit app that I have created.
As you can see, there are a few configurations on the left hand side panel. So you have Hasura's endpoint and admin secret. This is required to connect with Hasura securely. And then I also have OpenAI's API key. This is required for the chat completion API that I'm using. So let's begin.
Let's begin with querying the three different contexts that we were talking about that we want to fetch. So let's start with purely semantic one. Let's look at the different product descriptions that we have and pick something. Let us pick products on essential oils. So let me say, show me essential oils for relaxation.
Great. So we've gotten the GraphQL query, which has identified essential oils for relaxation as the descriptive part of the query, which we want to find in our vector DB by doing a semantic search. And we can also see that we have gotten the results for this query. Nice. Let's go over and execute a structured query price is a good field to execute a structured query.
So let's say, um, actually, show me all products for less than price. 500 Wait, so it has rightly identified that there is a price filter with the less than condition. And it shows you all the different products with price less than 500. Nice. Let's execute a hybrid query now.
Let's see, looking for essential oil diffusers in the price range of 500 to 1000 dollars. Nice. So we got a GraphQL query where it identified amazing essential oil diffuser as the semantic search query and then the price filter, which is between 500 to 1000 and we received our results.
Nice. So far we have executed only the happy flows. Um, we have not looked at any other query where of unhappy flows, but let's say I had an evil intent and I wanted to execute a malicious query, uh, which is not the typical queries that we just looked at.
So I have a malicious query. Let's execute this. So this one is requesting to insert a product of hair hair oil product, um, with the name special oil and price of $10,000 category is home. Fantastic hair oil is the description. And let's also add the product ID and say this is 7001.
Okay. Let's execute this. So as you can see, it has generated a GraphQL query of type insert mutation. But what we see is that it has also inserted the query. So let's go back to our table and console and look for product ID. equal to 7001, which is remove the codes, because this is our integer field.
And there you go. We have the product, which has gotten inserted into the database. Um, this was not the intended behavior. This is not what should have happened. So let us quickly go back to our Hasura console again. And this time we are going to be defining a new role with very restricted permissions so that we only provide select permission and such that this does not happen again.
So I'm going to create a new role. Let's call it product search bot. And I'm going to provide only search permission. Let's go without any checks. I'm going to keep it really simple. Let me allow all the product, all the columns to be accessible for this role. That's about it.
Nice. So the role has gotten inserted. Now let's query the same thing with the new role. So let's say product search bot. But this time, let me just modify this query a little bit and say seven thousand two. Okay, so let's execute this and see what happens. Nice. So we got the same insert mutation query to be generated.
But this time there was an error executing this rightly so because we have defined a role which does not have the permission for insert queries. Great. So this is all from me. Thank you, everyone. Thank you once again. So let us really quickly recap. In this demo, we learned how we can use Hasura to build hybrid query context for your sophisticated RAG applications like product search.
If you like the demo or would like to use Hasura for your RAG application, please reach out to me. These are my contact details and thank you so much.