
"Data readiness" is a Myth: Reliable AI with an Agentic Semantic Layer — Anushrut Gupta, PromptQL


Chapters

0:00 Introduction
1:30 Data Domain
6:10 Tribal Knowledge
7:50 PromptQL
12:20 Learning
15:40 Day Zero

Whisper Transcript

00:00:00.000 | Hey folks, I am Anushrut. I lead the applied research team here at PromptQL. PromptQL, you might
00:00:23.400 | have seen as the sponsor for the reliability track here at the AI Engineers World Fair.
00:00:30.300 | So today I'll talk about how data readiness is a myth. How many of you are trying to deploy
00:00:37.400 | some kind of AI system on some kind of data in a production environment? Okay. Awesome.
00:00:44.840 | Who is trying to work with more than documents and vector databases? Okay. Whose data is
00:00:53.160 | perfect? Clean, annotated, perfect column names, table names? Anyone? See, no hands. Okay.
00:01:02.660 | And how much time does everyone... Okay, I'm not going to make this a question. You all spend
00:01:06.500 | a lot of time making your data ready, right? So that the AI can understand it, so that the
00:01:11.600 | AI understands the meanings and the relationships in your data. But it's just a pipe dream that
00:01:18.160 | we are all chasing, right? We are all chasing this perfect data dream so that our AI can finally
00:01:22.920 | work reliably on it. And that's never going to happen. So how do we still make it reliable,
00:01:27.420 | no matter how messy our data is? Okay. So tell me if this is a fact. Is this what your data
00:01:34.420 | looks like? Your... That's how you name your tables. That's how you name your columns. Sometimes
00:01:38.920 | they have null values. Sometimes you have old values, old column names. Sometimes you have
00:01:45.920 | shorthand. CST_NM. Does that mean customer name? Is it custom nomenclature? I don't know what that
00:01:51.920 | is. rev_amount_usd. That's how you name your revenue. is_active. Now, is this binary? Is
00:02:00.420 | it a Boolean field? Is it zero, one, true, false, null, not null? I don't know what it is, right?
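(To make that ambiguity concrete, here is a minimal illustrative sketch in Python; the column names mirror the ones above, and the candidate readings are the guesses a human or an AI has to choose between. Nothing here comes from a real schema.)

```python
# Hypothetical shorthand schema like the one described above; each column
# admits several plausible readings and nothing in the data settles it.
ambiguous_columns = {
    "CST_NM": ["customer name", "custom nomenclature"],
    "rev_amount_usd": ["revenue in dollars", "revenue in cents", "float or decimal?"],
    "is_active": ["0/1 integer", "true/false boolean", "NULL means unknown or inactive?"],
}

for column, readings in ambiguous_columns.items():
    print(f"{column}: could be {' | '.join(readings)}")
```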
00:02:04.420 | Then you have other systems. That's just one system. One table in one system. You have other
00:02:08.420 | systems which have similar things, right? It has organization name. It has total revenue.
00:02:12.420 | How does that map to your other systems? Is the revenue in cents? Is it in dollars? Decimal
00:02:19.120 | value? Floating points? What is it, right? You have no idea. Okay. So in 2019, everyone
00:02:25.420 | was saying, let's standardize everything. Move everything in Snowflake. Let's move everything
00:02:28.620 | in Databricks. And finally, our problems will be fixed. 40% complete. MDM will fix this. Master
00:02:35.740 | Data Management Team will fix this. That's their responsibility. They're still working on that
00:02:39.920 | implementation. So in 2023, with the rise of AI, the rise of agents, we said we'll create semantic layers
00:02:45.460 | that understand our data domain. I mean, it breaks every quarter. Your data domain changes
00:02:50.220 | every quarter, right? You change your tables. You change your schemas. You change your workflows.
00:02:55.760 | So in 2025, we are saying AI needs perfect data to work. And it's still waiting. And it's never
00:03:01.720 | going to happen. Right? And McKinsey said that on average, a Fortune 500 company loses $250
00:03:07.920 | million because of poor data quality. So how do we fix this?
00:03:13.460 | So who has tried playing with semantic layers? Mem0, Atlan, Semantic Kernel, you have tried
00:03:20.000 | playing with it, right? How has your experience been?
00:03:22.000 | We actually use Jando to create a flexible model.
00:03:27.000 | Okay. Gotcha. Lot of pruning. Okay. And I'm assuming you're manually adding information to it,
00:03:35.540 | maintaining it, and stuff like that. Right?
00:03:38.540 | Sure. Yeah. Okay. So let's say you've added a definition, like customer acquisition cost means
00:03:45.540 | marketing spend divided by new customers. Okay? Now, this is some information your AI needs to
00:03:51.080 | answer your questions. But that's not enough, right? Like, which marketing spend? Coming from
00:03:57.080 | the brand team, from the performance team? What does a new customer even mean? Right? First purchase
00:04:02.080 | customer, reactivated customer. For what time period are we talking about? Does it include failed
00:04:07.080 | trials or not? Is it accounting for seasonality? There are so many things that you need to do,
00:04:12.620 | and you can never capture all of that in a semantic layer just by -- if you think you can manually
00:04:17.240 | add everything, you can't. Right? So you can't predefine every edge case. Knowledge graphs. Who's played
00:04:23.920 | with knowledge graphs? GraphRAG? Heard of it, at least. Okay. A bunch of people. Okay. Awesome. So, let's take
00:04:31.860 | a very simple example. Assume a customer's -- assume a sales data set where you have defined this graph
00:04:40.160 | where deals map to a stage, a close date, and an owner. Okay? And a very simple question I ask. Show me deals at risk.
00:04:47.460 | Right? Very simple questions. The graph knows that deals map to stages, stages map to close dates. But what
00:04:52.960 | does at risk mean? Does it mean that the champion has just left? Does it mean it has been stuck in that
00:04:58.800 | stage for two months? Like, what does at risk mean to my business? Right? How do you capture that in
00:05:04.080 | a graph database? Right? How do you capture a billion rows of a Snowflake table in a graph database? You can't.
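(As an illustration of why a graph edge is not enough, here is a minimal Python sketch of one possible, company-specific definition of "at risk"; the field names and the 60-day threshold are assumptions, not anything stated in the talk.)

```python
from datetime import date, timedelta

# One hypothetical definition of "at risk"; another team might use completely
# different signals or thresholds. A deals -> stage -> owner graph encodes
# none of this business logic.
STUCK_THRESHOLD = timedelta(days=60)

def is_at_risk(deal: dict, today: date) -> bool:
    stuck_too_long = today - deal["stage_entered_on"] > STUCK_THRESHOLD
    champion_left = deal.get("champion_left", False)
    return stuck_too_long or champion_left

deal = {"stage_entered_on": date(2025, 3, 1), "champion_left": False}
print(is_at_risk(deal, today=date(2025, 6, 1)))  # True: stuck in stage for ~3 months
```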
00:05:13.040 | So graph is also -- knowledge graph is also not the solution. So the real problem here is not that we need a better
00:05:21.840 | semantic layer solution, a better GraphRAG solution, or a better graph database system. No.
00:05:29.680 | The problem is that the AI does not speak your business language. Like, a GM in a finance domain can mean gross margin.
00:05:37.680 | But, you know, in an HR domain it might mean general manager. Right? What does conversion mean to you? What does quarter mean to you?
00:05:41.680 | What is the definition of your quarter? I'll show you an example here with an AI system not working
00:05:47.520 | with that. And what is an active customer? Every team has their own definition. Right? So how does an AI
00:05:53.520 | speak this tribal knowledge, this tacit knowledge that you have developed while being in your company for so
00:05:58.880 | many years? Right? Your AI does not know that. Your vanilla LLMs don't know that. They're super smart,
00:06:03.920 | incredible at doing so many cool things. So much cool stuff. But they don't understand your business,
00:06:09.520 | your domain. Right? So traditionally, we had these analysts, these engineers, right? Whenever a business
00:06:16.960 | user or a customer had a problem, had a task, they had a question, they would go to this analyst or an
00:06:22.400 | engineer who knew about the business, who had this tribal knowledge in their head. They knew how to
00:06:27.120 | write code, SQL, whatever. They can talk to your underlying data systems. SQL, no SQL, doesn't matter.
00:06:32.480 | Any kind of data source. Right? And they have this tribal knowledge, which they use to answer your
00:06:38.560 | question with 100% reliability. They explain what they're doing. Right? And that's how you have all
00:06:43.280 | built trust in your colleagues, in your peers. Right? That's what is missing with AI. Right? This tribal
00:06:50.240 | knowledge piece that doesn't exist today with AI. And that's the problem. So the solution? The same
00:06:57.520 | semantic layer, but let's make it agentic. What that means is let's not try to improve it by hand. Let's not try
00:07:03.200 | to manually add context to it continuously. No. How about we make an AI system that behaves like the
00:07:12.080 | analyst you just hired today. Day one, day zero, the analyst comes to your company, super smart,
00:07:16.880 | can do a bunch of things. Doesn't know a lot about your business yet. They start working with you.
00:07:21.360 | They mess up somewhere. You tell them, no, that's not what you should have done. This is what I mean
00:07:24.720 | when I say this. It learns, learns, learns, learns. Now this analyst 10 years later is an experienced
00:07:29.600 | analyst in your company. They know everything about your business. Right? Let's make an AI like that.
00:07:34.640 | An AI that keeps improving, keeps learning as you use it more and more, as you course correct it,
00:07:39.520 | as you steer it. But the assumption is your AI needs to be correctable, explainable, steerable, and already accurate
00:07:47.120 | in what it knows. Right? So let's see how you build such an AI. Right? So we're trying to replace this
00:07:53.600 | human part with AI. Right? So that's what we have been trying to do with PromptQL. It's like a day zero
00:07:59.840 | smart analyst. Right? So we take a foundational LLM, whatever LLM you bring, and we make it
00:08:07.280 | create PromptQL plans. PromptQL is basically a domain-specific language which can do three tasks: data
00:08:14.320 | retrieval, data compute and aggregation, and semantics. And this is a deterministic domain language.
00:08:23.120 | And vanilla LLMs are incredible at generating it. We don't have to fine-tune them. Right? Now,
00:08:28.080 | within this DSL, I can ask the LLM to create a plan whenever the user asks a question.
00:08:34.480 | Now I can execute this DSL in a deterministic runtime. I do not involve the LLM in actual execution,
00:08:40.480 | actual generation of an answer. Because if I let the LLM generate the answer, it's by default
00:08:45.760 | hallucinating. And I'm just hoping the hallucination is correct. That's how LLMs work. Right? So I'm saying,
00:08:50.560 | decouple it. Let the LLM generate the plan, and we will execute this plan in a deterministic runtime.
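(A minimal sketch of that decoupling, with entirely hypothetical names; this is not the actual PromptQL DSL or API. The LLM only proposes a plan as structured data, a deterministic runtime executes it, and the result goes to the user without a second pass through the model.)

```python
from collections import defaultdict

INVOICES = [  # stand-in for a real data source behind the query engine
    {"customer": "acme", "amount": 120.0},
    {"customer": "globex", "amount": 340.0},
    {"customer": "acme", "amount": 80.0},
]

def llm_generate_plan(question: str) -> list:
    # Stand-in for the model call; in reality the LLM would emit the plan in a DSL.
    return [
        {"op": "retrieve", "source": "invoices"},
        {"op": "sum_by", "key": "customer", "value": "amount"},
        {"op": "top_k", "k": 5},
    ]

def run_plan(plan: list) -> list:
    """Deterministic runtime: the LLM is not involved in producing the answer."""
    data = None
    for step in plan:
        if step["op"] == "retrieve":
            data = list(INVOICES)  # would go through the distributed query engine
        elif step["op"] == "sum_by":
            totals = defaultdict(float)
            for row in data:
                totals[row[step["key"]]] += row[step["value"]]
            data = sorted(totals.items(), key=lambda kv: kv[1], reverse=True)
        elif step["op"] == "top_k":
            data = data[: step["k"]]
    return data  # shown directly to the user, not fed back to the LLM

print(run_plan(llm_generate_plan("Who are my top 5 customers by revenue?")))
```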
00:08:56.720 | And let it work on a distributed query engine, which will talk to the different data sources,
00:09:01.360 | pull out data, do whatever composition was required inside that DSL, show the answer directly to the
00:09:06.080 | user. Don't give it back to the LLM. Let's not do RAG. Right? Let's not give the data back to the LLM
00:09:10.640 | and make it generate the answer. Right? That's what the PromptQL design is. Let me show you PromptQL
00:09:17.040 | working in action. I have five minutes. So let's make this quick. Simple question. Who are my top
00:09:23.760 | five customers by revenue? You'll be like, any AI system can answer that question, dude. It's a simple
00:09:28.400 | text-to-SQL question. Okay. So PromptQL is like, first of all, it understands what revenue means. Revenue
00:09:35.200 | means your invoice items. Okay. Cool. I'll do the math. Execute the math. And here are your top five
00:09:43.120 | customers. Oops. Nope. It notices we did not get any results. Right? That's what a smart analyst
00:09:48.480 | does. In their first attempt, they realized they messed up somewhere. Okay. I see the issue now. We're
00:09:52.960 | looking for succeeded status, but the actual statuses are paid and pending. See, your data is messy. It did not
00:09:57.760 | know what was happening. Right? It figured that out. And now these are your top five based on the actual
00:10:02.320 | data that's under the hood. So that's possible. Let's run this query now. Okay? Find the unique
00:10:08.720 | customers we serve. The org ID data is messed up, so we don't use that. Find unique orgs based on the
00:10:13.440 | email domains of the individual users. Then find the org with the third highest revenue. I'll let it run
00:10:18.640 | because it'll take time. Find the unique orgs based on the email domains of the individual users. Then
00:10:24.160 | for the org with the third highest revenue, take a look at the latest 30 support tickets. So multiple
00:10:29.600 | database, then your Zendesk support system, including the comments on those tickets, then summarize each
00:10:35.600 | ticket. Then use those summaries and extract their feelings towards our product. Create five categories
00:10:40.400 | from bad to great. And then tell me what is going well, what can be improved. And then issue up to
00:10:45.600 | $5,000 to this org's project with the highest usage as, like, credits, which means $5,000 if their feeling
00:10:52.960 | is bad towards us and $1,000 if it's great towards us. And how do you think an AI can do this? Spread
00:10:59.280 | across your databases, your SaaS applications like Zendesk, your APIs like Stripe to issue the credits, stuff
00:11:07.280 | like that. So it's like, cool. First, I'll get all the users, extract their email domains. See, this is an
00:11:12.960 | analyst explaining their thought process. You see that tiny pencil icon? I can edit their brain. I
00:11:18.320 | can tell them, no, this is not what I wanted you to do. Don't do step three. Instead of that, do these
00:11:23.440 | three steps instead. Right? I can, I can be in charge of my AI, but still every single time if I
00:11:29.360 | have to nudge my AI, the AI is going to learn. They will understand. Okay, that's what you wanted me to
00:11:34.640 | do. Makes sense. That's how you do your business. I didn't know that. Sorry. I learned now. Right?
00:11:39.040 | So, okay, cool. I should have said, just show me intermediate results step by step. But anyway,
00:11:46.240 | it says I'm about to issue $3,000 refund. Probably got a neutral sentiment from it.
00:11:52.480 | And see, it's exactly what's happening. It's saying, I got the top five domains by revenue.
00:11:56.160 | This is the sentiment. I summarized a bunch of tickets. I classified, extracted the sentiment
00:12:01.120 | out of that. And then, finally, I'm figuring out what is happening there.
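(One consistent reading of the numbers in this demo, as a minimal sketch: a linear mapping from the five sentiment categories to credit amounts. Only the two endpoints, $5,000 for bad and $1,000 for great, come from the prompt above; the intermediate category names and amounts are assumptions.)

```python
# Hypothetical linear mapping from the five sentiment categories to credits.
# Only the endpoints ($5,000 for "bad", $1,000 for "great") are given above.
credit_by_sentiment = {
    "bad": 5000,
    "poor": 4000,
    "neutral": 3000,
    "good": 2000,
    "great": 1000,
}

print(credit_by_sentiment["neutral"])  # 3000, matching the $3,000 issued in the demo
```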
00:12:04.640 | Okay. So, credit issue details. Perfect. Based on analysis.
00:12:11.840 | petersthompson.biz, the third-highest revenue customer, with this much revenue, sentiment analysis,
00:12:15.760 | recent support tickets, blah, blah, blah. See, this AI just behaved like an analyst with such a complicated
00:12:21.040 | prompt. So, now, let's look back at the learning. Right? Just because we had two and a half minutes.
00:12:28.240 | Okay. So, that's day zero. Day zero, it had to figure stuff out, figure it out, did well. Day X,
00:12:35.440 | once it's become a veteran analyst, right? It has learned a bunch. Let me make this bigger for you.
00:12:40.000 | Learned a bunch, right? So, as you're using it, as you are working with it, right? It learns from it.
00:12:45.680 | There's a PromptQL learning layer, which basically improves the semantic graph and starts creating
00:12:51.200 | your company's business language. This becomes AcmeQL. Acme, assume it's the name of a company. Right?
00:12:56.480 | And now, suddenly, PromptQL becomes AcmeQL. It becomes GoogleQL, MicrosoftQL, AppleQL,
00:13:01.200 | CiscoQL, whatever company you come from. Right? And I'll show you that learning process in action. Okay?
00:13:07.760 | So, this is an example where I have purposefully named my tables extremely badly. Right? And I asked
00:13:12.960 | the question, which employees are working in departments with more than a US $10,000 budget.
00:13:17.760 | Okay? It says, I have no idea what you're talking about. Your data says there are three tables called
00:13:22.640 | Mork, Plug and Zorp. I have no idea what that means. I can't answer your question. I'm like,
00:13:27.280 | no worries. Can you sample a few rows from each table and figure out what table contains the employees?
00:13:32.400 | Like, cool. I think that's what I'd tell a human analyst too. Let's go figure it out. Right? It's like, okay,
00:13:36.640 | now I see Zorp contains employee information, Plug contains department information, and Mork is like
00:13:41.040 | a junction table that you have. Okay? So, now this is your answer that you were asking for. And I'm like,
00:13:46.720 | okay, but the data is in cents, not in dollars. The budget is in cents. So, can you divide by 100,
00:13:52.320 | please? And give me the right answer now. There shouldn't be five employees. It says, cool. There
00:13:57.520 | are two employees. Perfect. Now, this is manual, but this also runs agentically in the background.
00:14:06.000 | But all I have to do is say, suggest data improvements based on the recent threads. Right? Look at what
00:14:10.720 | we have spoken about. How much I had to guide you. Whatever hints I had to give you. Learn from it
00:14:17.440 | and improve your semantic layer. Right? It's like, cool. So, based on the interaction that we have,
00:14:23.440 | now I have this improved semantic layer where it's like, okay, Zorp and Plug are two tables. I need to
00:14:29.600 | add a lot more context to it in my own semantic layer. Right? And the department budget is in cents.
00:14:36.320 | Right? Cool. Apply the suggestion. Every single instance of your semantic layer is version controlled.
00:14:41.360 | So, a new build is created. You can always fall back to a previous build.
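(A minimal sketch of what that learning step could look like as data; the format is hypothetical and not PromptQL's actual semantic-layer artifact. Each applied suggestion becomes explicit context and a new, roll-back-able build.)

```python
# Hypothetical version-controlled semantic layer: every applied suggestion
# produces a new build, and older builds are kept so you can fall back.
semantic_layer_builds = [
    {"version": 1, "context": {}},  # day zero: nothing learned yet
]

learned_context = {
    "zorp": "employee records",
    "plug": "department records; budget stored in cents (divide by 100 for dollars)",
    "mork": "junction table linking employees to departments",
}

def apply_suggestion(builds: list, new_context: dict) -> dict:
    latest = builds[-1]
    build = {
        "version": latest["version"] + 1,
        "context": {**latest["context"], **new_context},
    }
    builds.append(build)
    return build

print(apply_suggestion(semantic_layer_builds, learned_context)["version"])  # 2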
00:14:44.720 | And now, the next time, it's generated by Autograph. That's what we call the feature.
00:14:48.320 | And the next time, I ask the same question. Which employees are working in departments with more
00:14:51.600 | than US$10,000 in budget? It creates the right plan. And I get the right answer. Right? So, same.
00:14:59.520 | If I had to say something like this. Find accounts with the maximum suspicious anti-money-laundering
00:15:05.920 | outgoing amounts for the first quarter; for each, print the account ID and name. Right? If I let my AI do this,
00:15:13.680 | the internet is a little bad. But I have this thread preloaded here. Okay. So, it gives me the answer.
00:15:22.000 | And then I'm like, no, my quarter starts in February, not in January. So, you should know that. Right?
00:15:28.160 | So, I just tell it this time. Next time, the semantic layer learns from it. Exactly the same way.
00:15:32.560 | It will infer the meanings of these tables. It will find the relationships across your tables. Right?
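(For the quarter correction specifically, a minimal sketch of the rule the layer has to learn, assuming only that the fiscal year starts in February instead of January.)

```python
import datetime

FISCAL_YEAR_START_MONTH = 2  # learned correction: "my quarter starts in February"

def fiscal_quarter(d: datetime.date) -> int:
    # Months elapsed since the fiscal year started, wrapped into 0..11, bucketed by 3.
    months_into_fy = (d.month - FISCAL_YEAR_START_MONTH) % 12
    return months_into_fy // 3 + 1

print(fiscal_quarter(datetime.date(2025, 1, 15)))  # 4: January falls in the prior fiscal year's Q4
print(fiscal_quarter(datetime.date(2025, 2, 15)))  # 1: February starts Q1
```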
00:15:37.280 | And finally, with all of this, what you have is day zero. Your AI does not know what an enterprise
00:15:48.240 | customer means. It does not know how to match customer IDs across systems. It does not know
00:15:52.800 | when your financial quarter starts. Day 30. It has figured out 47 business terms.
00:15:57.680 | It has mapped relationships across the six systems. It has discovered 12 calculation variants and is 100%
00:16:03.200 | accurate on your complex tasks. That's what an agentic semantic layer allows you to do. So,
00:16:08.480 | it reduces months of work to an immediate start. Like, just deploy your AI today. Let your AI
00:16:14.960 | start working on your data. Let it improve itself. No wait time. No lag time with your AI deployments.
00:16:20.800 | Right? It's self-improving and gets to 100% accuracy. That's what we've been hearing from our
00:16:26.080 | customers. There's a Fortune 500 food chain company. They evaluated 100 vendors. They realized, no, none of them worked.
00:16:32.720 | Finally, they saw PromptQL and it worked at the 100% level. Same with a high-growth fintech company.
00:16:38.000 | On the hardest questions, we were able to demonstrate 100% accuracy. So, reach out to us if you have these
00:16:46.560 | big problems that you want to solve and you want 100% accurate AI on top of that. We're there for you. Thank you.
00:17:00.000 | Thank you.