
EyeLevel Launch: Your RAG is Tripping, Here's the Real Reason Why



00:00:00.000 | Hi, my name's Ben. I'm co-founder of EyeLevel.ai, and I've been building
00:00:17.440 | applications powered with AI for the last 15 years. First at IBM Research, then at
00:00:24.240 | IBM Watson, later working with major brands like the Weather Channel, and now
00:00:29.400 | at EyeLevel, where we've built the world's most accurate and scalable RAG platform.
00:00:35.780 | Using our no-code tools and APIs, our users can upload documents and receive
00:00:41.760 | the most accurate retrievals in minutes. We've been developing our solution for
00:00:46.500 | the last four years, and we were among the first users admitted to the
00:00:50.640 | GPT-3 beta program. We found RAG easy to get started with and very difficult
00:00:56.940 | to master. In our own experience, RAG applications can have error or hallucination
00:01:02.580 | rates as high as 35%, especially when the knowledge base consists of the kinds of
00:01:07.800 | complicated documents that are commonly found in the enterprise. The source of
00:01:14.760 | these errors is rarely the LLMs or the prompts. Instead, it's typically RAG itself, or more
00:01:21.520 | specifically, the quality and relevance of retrieved content. And the problems with
00:01:26.980 | content generally fall into one of three categories: bad or improperly extracted text,
00:01:33.880 | missing information from the surrounding parts of the document that's lost during
00:01:39.040 | chunking, or visual elements that are not extracted at all. Most commonly, the problems
00:01:46.160 | with RAG are content ingestion problems, and advanced RAG techniques that help you solve
00:01:51.920 | these problems can take hundreds of hours to implement. We've spent the last four years
00:01:57.020 | tackling these difficult data engineering problems and have built the solutions to them into our
00:02:02.720 | ingestion pipeline. As a result, our users are able to build the most accurate RAG applications
00:02:09.220 | in just minutes, and our customers such as Air France and Dartmouth tell us that their RAG applications
00:02:14.980 | respond correctly more than 95% of the time. In a recent study, our platform achieved 98% accuracy
00:02:23.980 | against complicated real-world documents and outperformed some of the most popular solutions
00:02:29.980 | on the market by as much as 120%. I'm going to quickly walk you through the unique approach we take to achieve this high level of accuracy,
00:02:38.560 | and I'll start by telling you that we don't use vector databases at all, and in fact, we think
00:02:44.320 | they may not be the best technology solution for a lot of RAG applications. Instead, what we do is we
00:02:50.160 | create what we call semantic objects, and we do a multi-field search across the attributes of this object.
00:02:56.320 | I'll show you what that means with a real example from Air France. Air France has been using our platform for the last year
00:03:05.280 | to build a ChatGPT-like copilot for their call center agents.
00:03:09.040 | They needed it to understand
00:03:11.040 | their knowledge base, which consists of hundreds of thousands of documents just like this one,
00:03:15.040 | filled with tables, figures, and text scattered across the pages.
00:03:21.040 | In our ingestion pipeline, the first thing we do is run a vision model that we fine-tuned on millions of documents
00:03:27.040 | to identify where the images, the tables, and the text are. Then we run them through
00:03:32.800 | dedicated multimodal processing pipelines to extract the visual and written information.
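[A rough sketch of that first stage, for readers following along in code. Every name here, from detect_regions to the handler functions, is a hypothetical stand-in, not EyeLevel's actual API:]

```python
from dataclasses import dataclass

@dataclass
class Region:
    kind: str      # "figure", "table", or "text", as labeled by the vision model
    bbox: tuple    # (x0, y0, x1, y1) location on the page
    pixels: bytes  # cropped image data for this region

def detect_regions(page_image: bytes) -> list[Region]:
    """Stand-in for the fine-tuned vision model that labels page regions."""
    raise NotImplementedError

def describe_figure(region: Region) -> str:
    """Stand-in for the multimodal pipeline that extracts visual information."""
    raise NotImplementedError

def extract_table(region: Region) -> str:
    """Stand-in for the table-extraction pipeline."""
    raise NotImplementedError

def extract_text(region: Region) -> str:
    """Stand-in for the text-extraction pipeline."""
    raise NotImplementedError

# Route each detected region to the pipeline built for its content type.
HANDLERS = {"figure": describe_figure, "table": extract_table, "text": extract_text}

def ingest_page(page_image: bytes) -> list[dict]:
    return [
        {"kind": r.kind, "content": HANDLERS[r.kind](r)}
        for r in detect_regions(page_image)
    ]
```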
00:03:38.800 | When you do RAG, you have to break apart this document into smaller chunks.
00:03:44.800 | When you do that, you quite often lose information from around the chunks: things like what section of the document it came from,
00:03:52.560 | or even which document it came from.
00:03:54.560 | If you were to ask questions about a book and receive random paragraphs from the book, chances aren't great you'd get good answers.
00:04:02.560 | And that's kind of what's happening with chunking and the loss of the context.
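[A toy illustration of that loss, mine rather than the speaker's: fixed-size chunking strips away the document and section headers that give a passage its meaning.]

```python
doc = (
    "Document: A350 Cabin Guide\n"
    "Section 4: Seating\n"
    "Seat pitch is 32 inches in Economy and 40 inches in Premium Economy, "
    "and lie-flat seats are fitted in Business."
)

# Naive fixed-size chunking: split every 60 characters, keep nothing else.
chunks = [doc[i:i + 60] for i in range(0, len(doc), 60)]

for chunk in chunks:
    print(repr(chunk))
# Chunks pulled from the middle no longer say which document, section,
# or even aircraft they describe.
```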
00:04:07.040 | That's why we created semantic objects. A semantic object consists of the original chunk text as well as auto-generated metadata
00:04:14.320 | that preserves the information around the text. And then we rewrite the text into two ideal formats,
00:04:20.640 | one for search and one for completion.
00:04:30.240 | Let me show you what that looks like with an example.
00:04:36.320 | So this is a figure from one of Air France's documents.
00:04:39.440 | If you were to OCR this and extract the text from it,
00:04:42.640 | vectorize it, put it in your vector database, it would look something like this.
00:04:46.560 | Look at how much information is lost in the process, though.
00:04:50.000 | Instead, what comes out of our ingestion pipeline is something like this.
00:04:55.600 | And this includes both the search version as well as the completion version of the text.
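[Since the slide itself isn't visible in the transcript, here is a schematic of what such an object could look like. The field names and example values are my own illustration, not EyeLevel's real schema:]

```python
from dataclasses import dataclass, field

@dataclass
class SemanticObject:
    chunk_text: str       # the original extracted chunk
    search_text: str      # rewrite optimized for retrieval
    completion_text: str  # rewrite optimized for the LLM's context window
    metadata: dict = field(default_factory=dict)  # context recovered from around the chunk

obj = SemanticObject(
    chunk_text="Seat pitch is 32 inches in Economy and 40 inches in Premium Economy.",
    search_text="Air France A350 seat pitch Economy Premium Economy dimensions",
    completion_text=(
        "In the A350 Cabin Guide, Section 4: Seating, seat pitch is listed as "
        "32 inches in Economy and 40 inches in Premium Economy."
    ),
    metadata={"document": "A350 Cabin Guide", "section": "Seating", "page": 12},
)
```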
00:05:04.080 | When we receive a search query, we do something similar.
00:05:07.200 | We rewrite the query into a format that's compatible with the objects.
00:05:11.680 | Then we search the entire object, the original text, the auto-generated metadata,
00:05:17.040 | and the search version of the text.
00:05:19.200 | We use a fine-tuned LLM to re-rank the results and improve the accuracy.
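[Putting the query side together as a sketch, again with hypothetical stand-ins for the fine-tuned models and the index object:]

```python
from typing import Any

def rewrite_query(query: str) -> str:
    """Stand-in for the fine-tuned model that rewrites queries into the objects' format."""
    return query

def rerank(query: str, candidates: list[Any]) -> list[Any]:
    """Stand-in for the fine-tuned LLM re-ranker."""
    return candidates

def retrieve(query: str, index: Any, top_k: int = 5) -> list[Any]:
    rewritten = rewrite_query(query)
    # Multi-field search: match against the original text, the auto-generated
    # metadata, and the search rewrite, rather than a single embedding.
    candidates = index.search(
        query=rewritten,
        fields=["chunk_text", "metadata", "search_text"],
        limit=50,
    )
    return rerank(rewritten, candidates)[:top_k]
```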
00:05:24.800 | And in total, in our ingestion and search, there are more than nine models that are fine-tuned
00:05:31.360 | to help deliver this kind of accuracy.
00:05:34.400 | The end result is the world's most accurate RAG platform.
00:05:38.880 | And our users are able to build enterprise-quality, production-ready applications in minutes,
00:05:44.720 | not months.
00:05:45.280 | I invite you to try it for yourself at
00:05:48.720 | EyeLevel.ai/xray.
00:05:51.360 | Thank you very much.
00:05:52.400 | Thank you, Ben.
00:06:01.440 | For more information, visit us at EyeLevel.ai/xray.