How BlackRock Builds Custom Knowledge Apps at Scale — Vaibhav Page & Infant Vasanth, BlackRock


Chapters

0:30 Introduction to BlackRock's AI Initiatives
1:31 Classifying AI Applications
2:22 Use Case: New Issue Operations
3:59 Challenges with Scaling AI Knowledge Apps
7:02 Architecture of BlackRock's AI Framework
8:32 Demonstration of the Sandbox
15:52 Key Takeaways from the Discussion

Transcript

Hi, everyone, thank you for having us. I'm Infant Vasanth, director of engineering at BlackRock, and this is my colleague, Vaibhav Page, principal engineer. We both work for the data teams at BlackRock, and today we are going to talk about how we scale building custom applications at BlackRock, specifically AI applications and knowledge apps.

So just to level set before I get into the details: BlackRock is an asset management firm, the world's largest asset manager. Our portfolio managers and analysts get a torrent of information on a daily basis. They synthesize this information, develop an investment strategy, and then rebalance their portfolios, which ultimately results in particular trades.

Now, the investment operations teams: you can think of them as the backbone, the engine that makes sure all of the activities the investment managers perform on a day-to-day basis run smoothly. Right? These teams are responsible for everything from acquiring the data you need, to actually executing a trade, running through compliance, all the way to the post-trading activities. All of these teams have to build internal tools that are fairly complex for each of their domains. So building these apps and pushing them out relatively quickly is of utmost importance to us.

Right? So if you move on to the next slide: if you classify what kind of apps we are talking about, you'll see that they fall into four different buckets. One is everything to do with document extraction: I have a document, and I want to extract entities out of it. The second is everything to do with defining a complex workflow or an automation: I might want to run through some number of steps and then integrate with my downstream systems.

Then you have the normal Q&A-type systems, your chat interfaces. And finally, the agentic systems. In each of these domains, we see a big opportunity to leverage models and LLMs to either augment our existing systems or supercharge them.

Right? So that is the domain we are speaking about. I'll move quickly to one particular use case, one that came to us about three to four months back. We have a team within the investment operations space known as the new issue operations team.

This team is responsible for setting up securities whenever there is a market event: a company goes IPO, or there is a stock split for a particular organization. The team has to take the security and set it up in our internal systems before our portfolio managers or traders can act on it.

So we have to build a tool for the investment operations team to set up a particular security. Honestly, this is a super simplified version of what happens, but at a high level, we have to build an app that is able to ingest your prospectus or term sheet.

The app pushes it through a particular pipeline. Then you talk to your domain experts, and these are your business teams, your equity teams, ETF teams, et cetera; they actually know how to set up these complex instruments. You get some kind of structured output, and now that team works with the engineering teams to build this transformation logic and the like.

And then you integrate it with our downstream applications. You can see that this process takes a long time. You're building an app while introducing new model providers and trying out new strategies; there are a lot of challenges to get a single app out.
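
As a rough sketch of the flow just described, here's a minimal version in Python; every name (extract_fields, to_security_record, push_downstream) and the stubbed values are hypothetical, not BlackRock's actual code:

```python
# Hypothetical sketch: ingest a term sheet, extract structured fields,
# apply transformation logic, then hand off to downstream systems.

def extract_fields(text: str) -> dict:
    """Stand-in for the LLM-backed extraction pipeline."""
    # In practice this is where prompts, extraction templates, and an
    # LLM strategy (in-context, RAG, ...) would be applied.
    return {"issuer": "ACME Corp", "callable": True, "call_price": 101.5}

def to_security_record(fields: dict) -> dict:
    """Illustrative transformation onto an internal security-master shape."""
    return {"security": fields["issuer"], "attributes": fields}

def push_downstream(record: dict) -> None:
    """Stubbed integration point with downstream systems."""
    print("would publish:", record)

raw = "sample prospectus or term sheet text"  # stands in for an ingested document
push_downstream(to_security_record(extract_fields(raw)))
```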

We tried this with agentic systems; it doesn't quite work right now because of the complexity and the domain knowledge that lives in human heads. So the big challenges with scale fall into three categories. One is prompt engineering: you're spending a lot of time with your domain experts.

In the first phase, where we have to extract from these documents, they're very complex. Your prompt itself, in our simplest case, started with a couple of sentences; before you knew it, you were trying to describe a financial instrument and it was three paragraphs long.

So there's this challenge of, hey, I have to iterate over these prompts, I have to version and compare these prompts. How do I manage that effectively? And as even the previous speaker mentioned, you need evals and a data set: how well is your prompt performing?
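
A minimal sketch of what versioning prompts and scoring them against a labeled eval set might look like; the PromptVersion record, the stubbed run_model call, and the exact-match metric are all assumptions for illustration:

```python
from dataclasses import dataclass

@dataclass
class PromptVersion:
    version: int
    text: str

def run_model(prompt: str, document: str) -> str:
    """Stand-in for a call to whatever model provider is in use."""
    return "stubbed answer"

def eval_prompt(p: PromptVersion, dataset: list[tuple[str, str]]) -> float:
    """Fraction of eval examples where the output matches the label exactly."""
    hits = sum(run_model(p.text, doc) == label for doc, label in dataset)
    return hits / len(dataset)

versions = [
    PromptVersion(1, "Extract the issuer name."),
    PromptVersion(2, "Extract the issuer name. The issuer is the legal "
                     "entity offering the security, not the underwriter."),
]
dataset = [("...term sheet text...", "ACME Corp")]  # labeled eval examples
best = max(versions, key=lambda p: eval_prompt(p, dataset))
print("best prompt version:", best.version)
```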

So that's the first set of challenges in creating AI apps: how are you going to manage this, and in what direction? The second set of challenges is around LLM strategies. What I mean by this is that when you're building an AI app, you have to choose a strategy.

Am I going to use a RAG-based approach, or am I going to use a chain-of-thought-based approach? Even for a simple task like data extraction, this varies highly depending on what your instrument is. If you take a corporate bond, the vanilla one, it's fairly simple.

I can do it in-context: pass it to the model and get my results back, if the document size is small. But some documents are thousands of pages long, 10,000 pages long. Now suddenly you're like, oh, okay, I don't know if I can pass more than a million tokens into, say, the OpenAI models.

What do I do then? Okay, I need to choose a different strategy. Often we have to choose different strategies and mix them with our prompts in an iterative process: I have to play around with my prompts, play around with the different LLM strategies, and we want to make that process as quick as possible.
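
A minimal sketch of that kind of per-document strategy selection; the token ceiling and the characters-per-token heuristic are illustrative assumptions, not real model limits:

```python
CONTEXT_LIMIT_TOKENS = 1_000_000  # assumed ceiling for a large-context model

def estimate_tokens(text: str) -> int:
    # Rough heuristic: about four characters per token.
    return len(text) // 4

def choose_strategy(document: str) -> str:
    """Small documents go in-context; oversized ones fall back to RAG."""
    if estimate_tokens(document) <= CONTEXT_LIMIT_TOKENS:
        return "in-context"  # pass the whole document to the model
    return "rag"             # chunk, embed, and retrieve relevant passages

print(choose_strategy("short vanilla corporate bond term sheet"))
```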

That iteration is a challenge. Then you have, obviously, the context limitations, model limitations, different vendors, and you're trying and testing things for quite a while; this can stretch into months. Then the biggest challenge is, okay, fine, I've built this app. Now what? How do I get it to deployment?

And that's a whole other set of challenges. You have your traditional challenges to do with distribution and access control: how am I going to get the app to the users? But then in the AI space, you have this new challenge of, what type of cluster am I going to deploy this to?

Our equity team might come and say something like, hey, I need to analyze 500 research reports overnight, can you help me do this? Okay, if you're going to do that, I probably need a GPU-based inference cluster that I can spin up.

Contrast that with the use case I described earlier, the new issue setup. In that case, I don't really want to use my GPU inference cluster; what I use instead is a burstable cluster. All of that has to be defined so that our app deployment phase is as close to a CI/CD pipeline as possible.
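
A minimal sketch of declaring a deployment target per app so the pipeline can pick the right cluster; the cluster names and fields are assumptions, not BlackRock's actual configuration:

```python
# Hypothetical per-app deployment targets, matching the workloads described
# above: overnight batch analysis needs GPU inference capacity, while
# intermittent interactive work fits a burstable cluster.
DEPLOY_TARGETS = {
    "research-report-analyzer": {"cluster": "gpu-inference", "schedule": "batch-overnight"},
    "new-issue-setup": {"cluster": "burstable", "schedule": "on-demand"},
}

def target_for(app: str) -> dict:
    """Look up where a given app should be deployed."""
    return DEPLOY_TARGETS[app]

print(target_for("new-issue-setup"))
```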

Then you have cost controls. Again, this is not an exhaustive list; what I'm trying to highlight is the challenges of building AI apps. So, for what we did at BlackRock, I'll give you a high-level architecture, and then we can dive into the details and mechanics of how this works and how we are able to build apps relatively quickly.

It took us somewhere between three and eight months to build a single app for a complex use case, and we were able to compress that and bring it down to a couple of days.

We achieved that by building this framework. What I want to focus on is the top two boxes that you see, which are your sandbox and your app factory. As for the data platform and the developer platform, as the names suggest, they are there for ingesting data and so on.

You have an orchestration layer with a pipeline that transforms the data, brings it into some new format, and then you distribute that as an app or a report. What accelerates app development is being able to federate out those pain points, those bottlenecks: prompt creation, extraction templates, choosing an LLM strategy, having extraction plans, and then building out these logic pieces, which we're calling transformers and executors. If you can get that sandbox into the hands of the domain experts, then your iteration speed becomes really fast. You're saying, hey, I have these modular components; can I iterate across them really quickly, and then pass the result along to an app factory, which is our cloud-native operator that takes a definition and spits out an app? That's the super high level.

With that, a quick demo. Perfect. All right. Cool. So what I'm going to show you is a pretty slimmed-down version of the actual tool we use internally.

To start with, we have two core components: one is the sandbox, the other is the factory. Think of the sandbox as a playground for the operators to quickly build and refine extraction templates, run extractions on a set of documents, and then compare and contrast the results of those extractions.

To get started with the extraction template itself: you might have seen in other tools, both closed and open source, a similar concept of prompt template management, where you have certain fields that you want to extract out of the documents, their corresponding prompts, and some metadata you can associate with them, such as the data type you expect for the final result values.

But when these operators try to run extractions on these documents, they need far greater configuration capabilities than just configuring prompts and the data types they expect for the end result. They need things like, hey, I need multiple QC checks on the result values.

I need a lot of validations and constraints on the fields. And there might be inter-field dependencies among the fields that are getting extracted. As Infant mentioned with the new issue operations onboarding, there could be a case where the security or the bond is callable, and then other fields such as call date and call price now need to have a value.

So there are these inter-field dependencies that operators need to take into consideration and be able to configure. Here's what a sample extraction template looks like: again, this is an example template where we have the fields issuer, callable, call price, and call date set up.
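
A minimal sketch of such a template, using the field names from the talk but an invented schema; the required_if dependency encodes the rule that a callable bond must carry a call date and a call price:

```python
# Hypothetical extraction template: each field declares a data type, a
# source (extracted vs. derived), and optional inter-field dependencies.
TEMPLATE = {
    "issuer":     {"type": "string", "source": "extracted", "required": True},
    "callable":   {"type": "bool",   "source": "extracted", "required": True},
    "call_price": {"type": "float",  "source": "extracted",
                   "required_if": {"callable": True}},
    "call_date":  {"type": "date",   "source": "extracted",
                   "required_if": {"callable": True}},
}

def check_dependencies(values: dict) -> list[str]:
    """QC check: list fields whose dependency condition makes them
    mandatory but which are missing from the extracted values."""
    missing = []
    for name, spec in TEMPLATE.items():
        cond = spec.get("required_if", {})
        if cond and all(values.get(k) == v for k, v in cond.items()):
            if values.get(name) is None:
                missing.append(name)
    return missing

print(check_dependencies({"issuer": "ACME Corp", "callable": True}))
# -> ['call_price', 'call_date']
```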

To add new fields, we define the field name, the data type that is expected, and the source, whether it's extracted or derived. You don't always want to run an extraction for a field; there might be a derived field the operator expects, which is populated through some transformation downstream.

And again, whether the field is required, and the field dependencies: here is where you define what dependencies this field has, and the validations. So this is how they set up the extraction. The next thing is document management itself. This is where the documents are ingested from the data platform, tagged according to the business category, labeled, embedded, all of that.

Okay, while Vaibhav brings that back up: in essence, what we're saying is that we have built this tool, with a UI component and a framework, that lets you take these different pieces, these modular components, and put them in the hands of the domain experts to build out their app really quickly, right?

I think something happened there. So let me just walk you through what happens next. Once you have set up the extraction templates and document management, the operators run the extractions. That's where they see the values they expect from these documents and review them.

The thing we have seen with these operators is that most of the tools they have used in the past basically do extraction, and they do a pretty good job at extraction.

But when it comes to, hey, I now need to use this result that has been presented to me and pass it to the downstream processes, the process right now is very manual: they have to download a CSV or JSON file, run manual or ad hoc transformations, and then push it to the downstream process.

So what we have done, and again, I can't show you here, is build this low-code, no-code framework where the operators can build these transformation and execution workflows and have an end-to-end pipeline running.
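
A minimal sketch of that idea: chain the reviewed extraction result through transformers and an executor instead of exporting a CSV and wiring things up by hand. The function names and the list-of-callables design are assumptions, not the actual framework:

```python
from typing import Callable

def normalize_issuer(result: dict) -> dict:
    """Example transformer: clean up a field before handoff."""
    return {**result, "issuer": result["issuer"].strip().upper()}

def post_to_downstream(result: dict) -> None:
    """Example executor: the integration step with a downstream system."""
    print("would POST:", result)

def run_pipeline(result: dict,
                 transformers: list[Callable[[dict], dict]],
                 executor: Callable[[dict], None]) -> None:
    """Apply each transformer in order, then hand off to the executor."""
    for transform in transformers:
        result = transform(result)
    executor(result)

run_pipeline({"issuer": " acme corp ", "callable": False},
             transformers=[normalize_issuer],
             executor=post_to_downstream)
```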

I think we'll conclude with the key takeaways. I would say there are three. First, invest heavily in prompt engineering skills for your domain experts, especially in the financial world; defining and describing these documents is really hard.

Second is educating the firm on what an LLM strategy means and how to fit these different pieces to your particular use case. And the third, I would say, is that all of this is great in experimentation and prototyping mode, but if you want to take it further, you have to really evaluate your ROI: is spinning up an AI app actually going to be more expensive than an off-the-shelf product that does it quicker?

So those are the three key takeaways in terms of building apps at scale. One more thing I'll add: this notion of human in the loop is super important. We are all really tempted to go all agentic with this.

But in the financial space, with compliance, with regulations, you need that four-eyes check and you need the human in the loop. So design for human in the loop first if you are in a highly regulated environment. Yeah, and as Infant said, one thing we couldn't show is the whole app factory component, which takes everything the operators do through this iteration cycle in the sandbox.

They take all that knowledge, the extraction templates, the transformers and executors built through this workflow pipeline, and, through our app ecosystem within BlackRock, build custom applications that are then exposed to the users, so users of the app don't have to worry about how to configure templates or how to integrate the result values into final downstream processes.
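
A minimal sketch of the kind of declarative definition such an operator might consume, bundling the sandbox artifacts with a deployment target; every field name here is an assumption, not BlackRock's actual schema:

```python
# Hypothetical app definition the app factory's cloud-native operator
# could take in and turn into a running application.
APP_DEFINITION = {
    "name": "new-issue-setup",
    "extraction_template": "bond-callable-v2",  # built in the sandbox
    "transformers": ["normalize-issuer"],        # sandbox-built logic pieces
    "executor": "security-master-publisher",     # downstream integration
    "deployment": {"cluster": "burstable"},
    "access": {"groups": ["new-issue-operations"]},
}
```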

Users are presented with this whole end-to-end app where they can just go upload documents, run extraction, and get the whole pipeline running. Yeah. With that, we'll open up for questions. I think we have a minute or two left.

Good morning. I have a question which may be directly related to the architecture that you developed; you can tell me, or we can discuss later. But the question is about your key takeaways.

One of those key takeaways was to invest heavily in prompt engineering. You have essentially automated the process from the leaf level, for example, a company coming to an IPO, all the way to cataloging through ETL processes and then finally to the data analytics.

Now, your CEO, who looks at the balance sheet, assets and liabilities, will be using your AI the most. And for your CEO, what are the features involved at the lowest level? For example, term, maturity, duration. There are so many metrics at the leaf level. How are you transforming those features from the lowest level to the highest level?

I'm looking for an answer in reference to decentralized data. Yeah, I can give you a quick answer and then we can discuss in detail offline. Real quickly: the framework we built was specifically targeting the investment operations domain experts who are trying to build applications.

To your question of, hey, what does the CEO care about, can I construct a memo that gives me my assets, liabilities, X, Y, Z? Those would be different initiatives which may or may not use our particular framework. But, yes, there are many reusable components in here that people can use. Thank you.

Yeah. So I'm wondering about something similar. I do a lot of document products for an insurance company, and we run into pretty much the same problems as you guys. So I wonder, how do you build walls around your information extraction from the documents?

Because there are so many things that can go wrong, for example with OCR, or the LLM doesn't understand what all these terms actually mean no matter how you prompt it. All this stuff. That's what bothers me. Yeah. Again, we had all of that that we wanted to show.

A short answer to your question, in terms of information security and the boundaries we put in place so that we don't have data leakage or errors: you can think of it as different layers, all the way from your infra, platform, and application to the user level. There are different controls and policies in place at each, and also within your SDN network. There are policies across the stack, which we can get into in detail later, that address your concerns.

And also to your point, we have different strategies that we use based on the use case at hand. It's not just, hey, one RAG approach versus another; there are multiple model providers that we use, multiple different strategies, different engineering tweaks.

So it's quite a complex process. All right. Awesome. Thank you. We'll see you next time.