Spotlight on Databricks | Code w/ Claude

00:00:00.000 |
Thank you for sticking around in here, I suppose. 00:00:16.240 |
It'd probably be the most apropos thing to say. 00:00:20.540 |
I wanted to talk a little bit about how all of this technology 00:00:25.980 |
inside large organizations and large businesses. 00:00:29.640 |
As it would turn out, the ability for us to go prototype cool stuff 00:00:34.420 |
versus us going and delivering these things into the critical path 00:00:46.600 |
Before that I was at Google, where I was the leader of product 00:00:51.840 |
and before that I was the founding general manager of AWS SageMaker. 00:00:55.800 |
So I've been, as my wife says, continuing to strike out 00:00:59.920 |
as I try and get better and better at helping enterprises build AI. 00:01:07.000 |
I wanted to quickly just set a little bit of context on who Databricks is, 00:01:11.140 |
why Databricks, and why is Databricks here talking to you, 00:01:15.340 |
We are a leading data platform, cross-cloud data platform, 00:01:21.020 |
tens of thousands of customers, billions of dollars in revenue, 00:01:26.240 |
and moreover, the creator of a number of open source, 00:01:36.120 |
You know, we live in a world, Brad just a minute ago, 00:01:44.420 |
And the enterprises we work with have a kind of nightmarish data scenario 00:01:49.540 |
because, you know, you talk to these large multinational banks 00:01:54.980 |
and they've done dozens, if not scores, of acquisitions over the years, 00:02:00.940 |
and they have data on every cloud, in every possible vendor, 00:02:12.140 |
of this kind of transformational technological moment, 00:02:16.980 |
but they're doing it with kind of a mess in the back end, if you will, right? 00:02:21.140 |
And it turns out the problem is actually much worse than this 00:02:23.860 |
because it's not like they just have one data warehouse 00:02:29.540 |
And often the experts in one or two of these systems 00:02:33.620 |
are only experts in one or two of these systems, 00:02:40.740 |
or your streaming person isn't a Gen AI person, 00:02:46.180 |
of being able to bring your data into these systems 00:02:51.840 |
Now, I'm not going to go head on into Databricks. 00:02:53.980 |
Databricks, ultimately, we help you manage your data, 00:02:56.900 |
and then on top of that management of your data, 00:03:01.360 |
I'm going to really focus on our AI capabilities with Mosaic AI today. 00:03:06.480 |
Now, we think of this as a difference between what we call general intelligence 00:03:13.940 |
Both of these things are extraordinarily useful and extraordinarily important. 00:03:20.000 |
But as Brad talked about, particularly for businesses or large enterprises, 00:03:27.000 |
as they want to move into using this technology to automate more of their systems 00:03:32.560 |
or drive greater insights within their organization, 00:03:35.780 |
almost always it comes back to connecting it. 00:03:39.780 |
We saw here Brad connecting it to the web or connecting it to MCP servers, 00:03:44.480 |
but inevitably it comes back to trying to connect it to their data estate, right? 00:03:49.720 |
So for a really good example of this, FactSet. 00:03:52.820 |
I don't know if you guys have heard of FactSet. 00:03:54.300 |
FactSet is a financial services company that sells data about other companies. 00:03:59.420 |
They sell financial data about companies to banks and hedge funds and what have you. 00:04:07.560 |
which is now a yellow flag to me when considering employers. 00:04:12.900 |
If your employer has their own query language, 00:04:15.460 |
you've got to think about whether or not this is the right place to be. 00:04:21.160 |
who I think probably has a dozen of their own query languages. 00:04:28.060 |
which is that any customer they had who wanted to access their data, 00:04:32.380 |
they had to learn FQL, FactSet Query Language, creative name in there. 00:04:43.860 |
what if we could translate English into FactSet Query Language? 00:04:48.660 |
And so they went to their favorite cloud of choice. 00:04:54.600 |
I think they did a little more than the one-click RAG button, 00:04:57.160 |
but they basically showed up with this massive prompt 00:05:00.900 |
of a bunch of examples and a bunch of documentation 00:05:03.900 |
and then a massive vector DB of a bunch of prompts 00:05:07.660 |
and a bunch of documentation or a bunch of examples 00:05:14.560 |
They ended up with 59% accuracy and about 15 seconds of latency. 00:05:22.540 |
not just because it's an important customer experience metric 00:05:30.040 |
it's probably the closest thing we have to a cost metric, right? 00:05:35.700 |
And so that 15 seconds is basically 15 seconds of cost, right? 00:05:49.500 |
It's just slightly better than a coin flip kind of thing, right? 00:05:56.580 |
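The two numbers quoted here, accuracy and latency, can be measured together with a small harness. What follows is a minimal sketch, not FactSet's or Databricks' evaluation code: `evaluate`, `toy_translate`, and the placeholder query strings are hypothetical names invented for illustration, and exact string match is an assumed scoring rule.

```python
import time
from typing import Callable, List, Tuple


def evaluate(
    translate: Callable[[str], str],
    cases: List[Tuple[str, str]],
) -> Tuple[float, float]:
    """Return (accuracy, mean latency in seconds) over labeled cases.

    Accuracy is the fraction of cases where the produced query matches
    the expected query exactly (an assumed, deliberately strict rule).
    """
    correct = 0
    total_seconds = 0.0
    for question, expected in cases:
        start = time.perf_counter()
        produced = translate(question)
        total_seconds += time.perf_counter() - start
        if produced.strip() == expected.strip():
            correct += 1
    return correct / len(cases), total_seconds / len(cases)


def toy_translate(question: str) -> str:
    """Hypothetical stand-in for an English-to-query model call."""
    known = {"revenue of ACME last year": "PLACEHOLDER_QUERY_A"}
    return known.get(question, "")


# Placeholder cases; the query strings are not real FQL.
cases = [
    ("revenue of ACME last year", "PLACEHOLDER_QUERY_A"),
    ("headcount of ACME", "PLACEHOLDER_QUERY_B"),
]

accuracy, mean_latency = evaluate(toy_translate, cases)
print(f"accuracy={accuracy:.0%}, mean latency={mean_latency:.6f}s")
```

Tracking both values per run makes it possible to see whether a change that improves accuracy also raises latency, which the talk treats as a proxy for cost.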
and tried to understand what the opportunity was, 00:06:01.440 |
And really what we did was we just decomposed the prompt 00:06:09.740 |
that that prompt was being asked to use, right? 00:06:12.380 |
So effectively what we did was we took that prompt 00:06:22.300 |
to be able to solve this problem more wholly. 00:06:55.640 |
Last I talked to them, they had it into the 90s. 00:07:00.200 |
transitioning to Claude was one of their next roadmap items. 00:07:09.440 |
from the Berkeley Artificial Intelligence Research Lab, 00:07:15.520 |
yes, there's a little bit of cross-pollination 00:07:23.100 |
right after Gen AI kind of really hit its stride, 00:07:26.420 |
they went out and they looked at all the popular AI systems 00:07:32.620 |
And what they found was that none of these systems 00:08:10.300 |
where there is financial and reputational risk. 00:08:32.320 |
But if what you want to do is build something 00:09:01.740 |
is how do we help them drive those levels of, 00:09:45.080 |
And so being able to really start to quantify 00:09:52.740 |
We're talking about really governing the access, 00:10:34.940 |
And hopefully, we'll have news for you there. 00:10:37.320 |
So how do we get all of this to reason over your data? 00:10:42.460 |
we do that by injecting it with either the vector store 00:11:26.320 |
parameterizable function kind of thing, right? 00:11:28.420 |
So we see them creating access to these tools.
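One way to picture a tool as a "parameterizable function" is a plain function with typed parameters that is registered by name, described to the model, and then invoked with arguments the model supplies. The sketch below is an illustration under assumptions, not the Databricks or Mosaic AI API: `TOOLS`, `tool`, `describe`, `call_tool`, `revenue_for_region`, and the sample figures are all hypothetical.

```python
import inspect
from typing import Any, Callable, Dict

# Hypothetical in-memory registry of tools, keyed by function name.
TOOLS: Dict[str, Callable[..., Any]] = {}


def tool(fn: Callable[..., Any]) -> Callable[..., Any]:
    """Register a function so it can be looked up by name."""
    TOOLS[fn.__name__] = fn
    return fn


@tool
def revenue_for_region(region: str, year: int) -> float:
    """Return revenue (made-up figures) for a region and year."""
    data = {("EMEA", 2023): 12.5, ("APAC", 2023): 9.0}
    return data[(region, year)]


def describe(fn: Callable[..., Any]) -> Dict[str, Any]:
    """Build a simple description a model could be shown for this tool."""
    signature = inspect.signature(fn)
    return {
        "name": fn.__name__,
        "description": inspect.getdoc(fn),
        "parameters": {
            name: param.annotation.__name__
            for name, param in signature.parameters.items()
        },
    }


def call_tool(request: Dict[str, Any]) -> Any:
    """Dispatch a request of the form {'name': ..., 'arguments': {...}}."""
    fn = TOOLS[request["name"]]
    return fn(**request["arguments"])


print(describe(revenue_for_region))
print(call_tool({
    "name": "revenue_for_region",
    "arguments": {"region": "EMEA", "year": 2023},
}))
```

Because the model only fills in parameters, the query logic stays fixed and reviewable, which is one reason a function of this shape is easier to govern than free-form generated queries.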