back to index"Data readiness" is a Myth: Reliable AI with an Agentic Semantic Layer — Anushrut Gupta, PromptQL

Chapters
0:00 Introduction
1:30 Data Domain
6:10 Tribal Knowledge
7:50 PromptQL
12:20 Learning
15:40 Day Zero
Hey folks, I'm Anushrut. I lead the applied research team here at PromptQL. You might have seen PromptQL as the sponsor for the reliability track here at the AI Engineer World's Fair. Today I'll talk about why data readiness is a myth. How many of you are trying to deploy some kind of AI system on some kind of data in a production environment? Okay, awesome. Who is trying to work with more than documents and vector databases? Okay. Whose data is perfect — clean, annotated, perfect column names, table names? Anyone? See, no hands. Okay.
And how much time does everyone... okay, I'm not going to make this a question. You all spend a lot of time making your data ready, right? So that the AI can understand it, so that the AI understands the meanings and the relationships in your data. But it's a pipe dream we're all chasing — this perfect-data dream, so that our AI can finally work reliably on it. That's never going to happen. So how do we still make it reliable, no matter how messy our data is?

So tell me: is this what your data looks like? That's how you name your tables. That's how you name your columns. Sometimes they have null values. Sometimes you have old values, old column names. Sometimes you have shorthand. CST_NM — does that mean customer name? Is it custom nomenclature? I don't know what that is. rev_amount_ust — that's how you name your revenue. is_active — is that binary? Is it a Boolean field? Is it zero, one, true, false, null, not null? I don't know what it is, right?
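To make that kind of ambiguity concrete, here's a tiny hypothetical sketch — the column names come from the example above, but the values and the two readings are invented:

```python
# A hypothetical row shaped like the messy schema described above; every field
# is ambiguous without tribal knowledge.
row = {
    "CST_NM": "ACME",          # customer name? custom nomenclature?
    "rev_amount_ust": 125000,  # cents or dollars? which currency?
    "is_active": 1,            # 0/1? true/false? could it be NULL?
}

# Two equally plausible readings of the same number differ by a factor of 100;
# nothing in the data itself can disambiguate them.
revenue_if_cents = row["rev_amount_ust"] / 100   # 1250.0
revenue_if_dollars = row["rev_amount_ust"]       # 125000
print(revenue_if_cents, revenue_if_dollars)
```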
Then you have other systems. That's just one table in one system. You have other systems with similar things, right? One has an organization name. One has total revenue. How does that map to your other systems? Is the revenue in cents? Is it in dollars? Decimal value? Floating point? What is it? You have no idea.
So in 2019, everyone was saying: let's standardize everything. Move everything into Snowflake, move everything into Databricks, and finally our problems will be fixed. It's still 40% complete. MDM will fix this — the Master Data Management team will fix this, that's their responsibility. They're still implementing it. Then in 2023, with the rise of AI and the rise of agents: we'll create semantic layers that understand our data domain. Except it breaks every quarter. Your data domain changes every quarter, right? You change your tables, you change your schemas, you change your workflows. And in 2025, we're saying AI needs perfect data to work — and it's still waiting, and it's never going to happen. McKinsey said that, on average, a Fortune 500 company loses $250 million because of poor data quality. So how do we fix this?
So who has tried playing with semantic layers? Mem0, Atlan, Semantic Kernel — you've tried playing with them, right? How has your experience been? [Audience member: "We actually use Jando to create a flexible model."] Okay, gotcha — a lot of pruning. Okay. And I'm assuming you're manually adding information to it?
Sure, yeah. Okay. So let's say you've added a definition, like: customer acquisition cost means marketing spend divided by new customers. Okay? Now, this is information your AI needs to answer your questions. But that's not enough, right? Which marketing spend — from the brand team, from the performance team? What does a new customer even mean — a first-purchase customer, a reactivated customer? What time period are we talking about? Does it include failed trials or not? Is it accounting for seasonality? There are so many things, and you can never capture all of that in a semantic layer — if you think you can manually add everything, you can't. You can't predefine every edge case.
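As a rough illustration of why one formula can't be enough, here's a hypothetical sketch — the fields, numbers, and both definitions are invented for the example — showing how two reasonable readings of "new customers" give different CAC values from the same data:

```python
from datetime import date

# Hypothetical marketing spend and customer records for one month.
marketing_spend = {"brand": 40_000, "performance": 60_000}
customers = [
    {"id": 1, "first_purchase": date(2025, 6, 3),  "reactivated": False},
    {"id": 2, "first_purchase": date(2025, 6, 12), "reactivated": True},   # returning customer
    {"id": 3, "first_purchase": date(2025, 6, 20), "reactivated": False},
]

total_spend = sum(marketing_spend.values())

# Definition 1: every customer acquired this month counts as "new".
cac_all = total_spend / len(customers)                       # ~33,333

# Definition 2: only first-purchase customers count; reactivations are excluded.
first_purchase_only = [c for c in customers if not c["reactivated"]]
cac_first_purchase = total_spend / len(first_purchase_only)  # 50,000

print(cac_all, cac_first_purchase)
```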
Knowledge graphs — who's played with knowledge graphs? Graph RAG? Heard of it, at least? Okay, a bunch of people. Awesome. So let's take a very simple example. Assume a sales data set where you have defined a graph: deals map to a stage, a date, and an owner. Okay? And I ask a very simple question: show me deals at risk. The graph knows that deals map to stages and stages map to close dates. But what does "at risk" mean? Does it mean the champion has just left? Does it mean the deal has been stuck in the same stage for two months? What does "at risk" mean to my business? How do you capture that in a graph database? And how do you capture a billion rows of a Snowflake table in a graph database? You can't.
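To see how underspecified that is, here's a small hypothetical sketch — the deal fields and thresholds are invented — of two equally defensible "at risk" predicates that flag different deals:

```python
from datetime import date, timedelta

TODAY = date(2025, 6, 1)

# Hypothetical deal records.
deals = [
    {"id": "D-1", "stage": "negotiation", "stage_entered": date(2025, 3, 1),  "champion_left": False},
    {"id": "D-2", "stage": "proposal",    "stage_entered": date(2025, 5, 20), "champion_left": True},
]

# Definition 1: "at risk" = stuck in the same stage for more than two months.
def at_risk_stuck(deal):
    return TODAY - deal["stage_entered"] > timedelta(days=60)

# Definition 2: "at risk" = the champion on the account has left.
def at_risk_champion(deal):
    return deal["champion_left"]

print([d["id"] for d in deals if at_risk_stuck(d)])     # ['D-1']
print([d["id"] for d in deals if at_risk_champion(d)])  # ['D-2']
```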
So the knowledge graph isn't the solution either. The real problem here is not that we need a better semantic layer, or a better graph RAG solution, or a better graph database system. No. The problem is that the AI does not speak your business language. GM in a finance domain can mean gross margin; in an HR domain it might mean general manager. What does conversion mean to you? What does quarter mean to you — what is the definition of your quarter? I'll show you an example later of an AI system tripping over exactly that. And what is an active customer? Every team has its own definition. So how does an AI learn this tribal knowledge, this tacit knowledge that you have developed by being at your company for so many years? Your AI does not know that. Your vanilla LLMs don't know that. They're super smart, incredible at so much cool stuff — but they don't understand your business, your domain.
So traditionally, we had these analysts, these engineers. Whenever a business user or a customer had a problem, a task, a question, they would go to this analyst or engineer who knew the business, who had this tribal knowledge in their head. They knew how to write code — SQL, whatever. They could talk to your underlying data systems: SQL, NoSQL, doesn't matter, any kind of data source. And they used that tribal knowledge to answer your question with 100% reliability, and they explained what they were doing. That's how you've all built trust in your colleagues, in your peers. That's what is missing with AI — this tribal knowledge piece that doesn't exist today. And that's the problem.
So the solution: the same semantic layer, but let's make it agentic. What that means is: let's not try to improve it by manually adding context to it continuously. No. How about we make an AI system that behaves like the analyst you just hired today? Day zero, the analyst comes to your company — super smart, can do a bunch of things, doesn't know a lot about your business yet. They start working with you. They mess up somewhere. You tell them: no, that's not what you should have done; this is what I mean when I say this. They learn, and learn, and learn. Ten years later, this is an experienced analyst at your company who knows everything about your business. Let's make an AI like that — an AI that keeps improving, keeps learning as you use it more and more, as you course-correct it, as you steer it. But the assumption is that your AI needs to be correctable, explainable, steerable, and already accurate in what it knows. So let's see how you build such an AI — how we replace that human part with AI. That's what we have been trying to do with PromptQL.
It's like a day-zero smart analyst. We take a foundational LLM — whatever LLM you bring — and we make it create PromptQL plans. PromptQL is basically a domain-specific language that can do three kinds of tasks: data retrieval, data compute and aggregation, and semantic operations. It's a deterministic DSL, and vanilla LLMs are incredible at generating it — we don't have to fine-tune them. So whenever a user asks a question, I ask the LLM to write a plan in this DSL. Then I execute that DSL in a deterministic runtime. I do not involve the LLM in the actual execution, the actual generation of the answer. Because if I let the LLM generate the answer, it's hallucinating by default, and I'm just hoping the hallucination is correct — that's how LLMs work. So I'm saying: decouple it. Let the LLM generate the plan, and we will execute that plan in a deterministic runtime, on top of a distributed query engine that talks to the different data sources, pulls out data, does whatever composition the DSL required, and shows the answer directly to the user. Don't give it back to the LLM. Let's not do RAG — let's not hand the data back to the LLM and make it generate the answer. That's what the PromptQL design is.
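The talk doesn't show PromptQL's actual DSL or runtime, so here is only a minimal hypothetical sketch of that decoupling in Python — the plan format and the call_llm / run_sql helpers are invented for illustration, not PromptQL's API:

```python
import json

def answer(question, call_llm, run_sql):
    """call_llm and run_sql are placeholders for whatever LLM and query engine you bring."""
    # 1. The LLM only writes a plan (a small, checkable program), never the final answer.
    plan = json.loads(call_llm(
        "Return a JSON list of steps, each {'op': 'retrieve'|'aggregate', ...}, "
        f"that answers: {question}"
    ))

    # 2. A deterministic runtime executes the plan; the LLM is not involved here,
    #    so the numbers in the result come from the data, not from a generation.
    result = None
    for step in plan:
        if step["op"] == "retrieve":
            result = run_sql(step["sql"])
        elif step["op"] == "aggregate":
            result = sum(row[step["column"]] for row in result)

    # 3. The result is shown directly to the user instead of being fed back to the LLM.
    return result
```

The point of the split is that whatever the plan gets wrong, the numbers it returns were computed deterministically from the data, never generated by the model.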
Let me show you PromptQL working in action. I have five minutes, so let's make this quick. Simple question: who are my top five customers by revenue? You'll say any AI system can answer that — it's a simple text-to-SQL question. Okay. So PromptQL first figures out what revenue means: revenue comes from your invoice items. Okay, cool, I'll do the math — execute the math — and here are your top five customers. Oops. Nope: "I notice we did not get any results." That's what a smart analyst does: on their first attempt they realize they messed up somewhere. "Okay, I see the issue now. We're looking for a succeeded status, but the actual statuses are paid and pending." See, your data is messy. It didn't know that up front; it figured it out. And now these are your top five based on the actual data that's under the hood. So that's possible.
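In code terms, the self-correction in that demo amounts to something like this hypothetical sketch — only the status values ("succeeded" vs. "paid"/"pending") come from the talk; the rows and field names are invented:

```python
# Hypothetical rows shaped like the invoice data in the demo.
invoice_items = [
    {"customer": "A", "amount": 500, "status": "paid"},
    {"customer": "B", "amount": 300, "status": "pending"},
]

# First attempt: filter on a status that sounds right but doesn't exist in the data.
first_try = [r for r in invoice_items if r["status"] == "succeeded"]
assert first_try == []   # no results -> the agent notices and inspects the real values

# Corrected attempt: use the statuses actually present in the data.
actual_statuses = {r["status"] for r in invoice_items}      # {'paid', 'pending'}
second_try = [r for r in invoice_items if r["status"] in actual_statuses]
print(second_try)
```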
Now let's run this query: Find the unique customers we serve. The org ID data is messed up, so don't use that — find unique orgs based on the email domains of the individual users. Then find the org with the third-highest revenue. I'll let it run, because it'll take time. Then for the org with the third-highest revenue, take a look at the latest 30 support tickets — so that's multiple databases plus your Zendesk support system — including the comments on those tickets. Summarize each ticket. Then use those summaries to extract their feelings towards our product, in five categories from bad to great, and tell me what is going well and what can be improved. And then issue up to $5,000 in credits to this org's project with the highest usage — $5,000 if their feeling towards us is bad, $1,000 if it's great. How do you think an AI can do this, spread across your databases, a SaaS application like Zendesk, and an API like Stripe to issue the credits, stuff like that?
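As a rough sketch of what a plan for a task like that might look like — every helper name (query_db, zendesk_tickets, summarize, classify_sentiment, stripe_issue_credit) and the intermediate credit tiers are hypothetical, invented to show the shape of the plan rather than PromptQL's actual DSL:

```python
def plan(query_db, zendesk_tickets, summarize, classify_sentiment, stripe_issue_credit):
    # 1. Derive unique orgs from user email domains (the org ID column is unreliable).
    users = query_db("SELECT email FROM users")
    orgs = {u["email"].split("@")[1] for u in users}

    # 2. Rank orgs by revenue and pick the third highest.
    revenue = {
        org: query_db(f"SELECT SUM(amount) AS total FROM invoices WHERE org_domain = '{org}'")[0]["total"]
        for org in orgs
    }
    third_org = sorted(revenue, key=revenue.get, reverse=True)[2]

    # 3. Pull the latest 30 Zendesk tickets (with comments) and summarize each.
    tickets = zendesk_tickets(org=third_org, limit=30, include_comments=True)
    summaries = [summarize(t) for t in tickets]

    # 4. Classify overall sentiment into five buckets from "bad" to "great".
    sentiment = classify_sentiment(summaries)

    # 5. Map sentiment to a credit amount and issue it via Stripe.
    credit = {"bad": 5000, "poor": 4000, "neutral": 3000, "good": 2000, "great": 1000}[sentiment]
    stripe_issue_credit(org=third_org, amount_usd=credit)
```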
So it's like: cool — first, I'll get all the users and extract their email domains. See, this is an analyst explaining their thought process. And you see that tiny pencil icon? I can edit its brain. I can tell it: no, this is not what I wanted you to do — don't do step three, do these three steps instead. I can be in charge of my AI, and every single time I have to nudge it, the AI is going to learn. It will understand: okay, that's what you wanted me to do, makes sense, that's how you do your business, I didn't know that, sorry, I've learned it now. So, okay, cool — I should have said "just show me intermediate results step by step," but anyway, it says it's about to issue $3,000; it probably got a neutral sentiment from the tickets. And see, that's exactly what's happening. It's saying: I got the top five domains by revenue, this is the sentiment, I summarized a bunch of tickets, I classified and extracted the sentiment out of them, and finally I'm figuring out what to issue. Okay — so, credit issue details. Perfect. Based on the analysis: petersthompson.biz is the third-highest-revenue customer, with this much revenue, sentiment analysis, recent support tickets, and so on. See, this AI just behaved like an analyst on a prompt that complicated. So now let's look at the learning part — we only have two and a half minutes.
Okay, so that's day zero. On day zero it had to figure stuff out, and it did well. On day X, once it's become a veteran analyst, it has learned a bunch — let me make this bigger for you. As you work with it, it learns. There's a PromptQL learning layer which improves the semantic graph and starts building your company's business language. Think of it as AcmeQL — assume Acme is the name of a company. Suddenly PromptQL becomes AcmeQL: GoogleQL, MicrosoftQL, AppleQL, CiscoQL, whatever company you come from. And I'll show you that learning process in action.
So this is an example where I have purposefully given my tables extremely bad names. And I ask the question: which employees are working in departments with more than US $10,000 in budget? It says: I have no idea what you're talking about — your data has three tables called Mork, Plug, and Zorp, I have no idea what those mean, I can't answer your question. I'm like, no worries: can you sample a few rows from each table and figure out which table contains the employees? That's exactly what I'd tell a human analyst — let's go figure it out. It comes back with: okay, now I see Zorp contains employee information, Plug contains department information, and Mork is a junction table. So here is the answer you were asking for. And I'm like, okay, but the data is in cents, not dollars — the budget is in cents, so can you divide by 100, please, and give me the right answer? There shouldn't be five employees. It says: cool, there are two employees. Perfect.
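In code terms, the correction it just learned is roughly this — the table names Zorp, Plug, and Mork come from the demo, but the columns and rows are invented for illustration:

```python
zorp = [{"emp_id": 1, "name": "Ada"}, {"emp_id": 2, "name": "Grace"}]               # employees
plug = [{"dept_id": 10, "budget": 1_500_000}, {"dept_id": 20, "budget": 900_000}]   # departments, budget in cents
mork = [{"emp_id": 1, "dept_id": 10}, {"emp_id": 2, "dept_id": 20}]                 # junction table

# Wrong reading: budget treated as dollars -> both departments appear to qualify.
wrong = {d["dept_id"] for d in plug if d["budget"] > 10_000}

# Corrected reading after the hint "the budget is in cents": divide by 100 first.
rich_depts = {d["dept_id"] for d in plug if d["budget"] / 100 > 10_000}
qualified = [e["name"] for e in zorp
             if any(m["emp_id"] == e["emp_id"] and m["dept_id"] in rich_depts for m in mork)]
print(qualified)   # ['Ada']
```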
Now, this was manual, but it also runs agentically in the background. All I have to do is say: suggest improvements to my semantic layer based on our recent threads. Look at what we've spoken about, how much I had to guide you, whatever hints I had to give you — learn from it and improve your semantic layer. And it's like: cool. Based on our interaction, I now have an improved semantic layer: Zorp and Plug are tables I need to add a lot more context about, and the department budget is in cents. Cool — apply the suggestion. Every single version of your semantic layer is version-controlled, so a new build is created and you can always fall back to a previous build.
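The talk doesn't show the internal format of the semantic layer, so here is only a hypothetical sketch of what a learned, versioned entry might look like — every key and value is invented:

```python
# Hypothetical, versioned semantic-layer snapshot after the interaction above.
semantic_layer_v2 = {
    "version": 2,                       # each applied suggestion creates a new build
    "previous_version": 1,              # so you can always fall back
    "tables": {
        "zorp": {"means": "employees"},
        "plug": {"means": "departments",
                 "columns": {"budget": {"unit": "cents", "convert": "divide by 100 for USD"}}},
        "mork": {"means": "employee-department junction table"},
    },
    "business_terms": {
        "fiscal_quarter_start_month": 2,   # learned later: "my quarter starts in February"
    },
}
```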
And this new semantic layer is generated by Autograph — that's what we call the feature. The next time I ask the same question — which employees are working in departments with more than US $10,000 in budget? — it creates the right plan and I get the right answer. Same thing if I ask something like: find the accounts with the most suspicious anti-money-laundering outgoing amounts for the first quarter, and for each, print the account ID and name. If I let my AI do this — the internet is a little slow, but I have this thread preloaded here — it gives me an answer, and then I say: no, my quarter starts in February, not in January; you should know that. I just tell it this once, and next time the semantic layer has learned it, exactly the same way it infers the meanings of those tables and finds the relationships across your tables.
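A correction like "my quarter starts in February" ultimately becomes a small deterministic rule once the semantic layer captures it — here's a hypothetical sketch (the function and its parameter are invented for illustration):

```python
from datetime import date

def fiscal_quarter(d: date, fiscal_year_start_month: int = 2) -> int:
    """Return the fiscal quarter of `d` for a fiscal year starting in the given month
    (2 = February, per the correction in the demo; 1 would give calendar quarters)."""
    months_into_year = (d.month - fiscal_year_start_month) % 12
    return months_into_year // 3 + 1

print(fiscal_quarter(date(2025, 1, 15)))   # 4 -> January belongs to Q4 of the prior fiscal year
print(fiscal_quarter(date(2025, 2, 15)))   # 1 -> the fiscal year (and Q1) starts in February
```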
And finally, with all of this, here's what you have. Day zero: your AI does not know what an enterprise customer means, it does not know how to match customer IDs across systems, it does not know when your financial quarter starts. Day 30: it has figured out 47 business terms, it has mapped relationships across six systems, it has discovered 12 calculation variants, and it's 100% accurate on your complex tasks. That's what an agentic semantic layer lets you do. It turns months of preparation into an immediate start: just deploy your AI today, let it start working on your data, let it improve itself — no wait time, no lag time in your AI deployments. It's self-improving and gets to 100% accuracy. That's what we've been hearing from our customers. There's a Fortune 500 food-chain company: they evaluated a hundred vendors, realized none of them worked, and finally saw PromptQL and it worked — 100%-reliability-level AI. Same with a high-growth fintech company: on the hardest questions, we were able to demonstrate 100% accuracy. So reach out to us if you have these big problems you want to solve and you want 100% accurate AI on top of your data — we're there for you. Thank you.