The "Normsky" architecture for AI coding agents — with Beyang Liu + Steve Yegge of Sourcegraph

Chapters
0:00 Intros & Backgrounds
6:20 How Steve's work on Grok inspired Sourcegraph for Beyang
8:53 From code search to AI coding assistant
13:18 Comparison of coding assistants and the capabilities of Cody
16:49 The importance of context (RAG) in AI coding tools
20:33 The debate between Chomsky and Norvig approaches in AI
25:02 Code completion vs Agents as the UX
30:06 Normsky: the Norvig + Chomsky models collision
36:00 How to build the right context for coding
42:00 The death of the DSL?
46:15 LSP, SCIP, Kythe, BFG, and all that fun stuff
62:00 The Sourcegraph internal stack
68:46 Building on open source models
74:35 Sourcegraph for engineering managers?
86:00 Lightning Round
This is Alessio, partner and CTO-in-residence at Decibel Partners. 00:00:07.600 | 
And I'm joined by my co-host, Swyx, founder of Smol.ai. 00:00:10.760 | 
Hey, and today we're christening our new podcast studio 00:00:16.200 | 
And we have Beyang and Steve from Sourcegraph. 00:00:24.480 | 
We also are just celebrating the one year anniversary of ChatGPT 00:00:30.360 | 
But also we'll be talking about the GA of Cody later on today. 00:00:34.480 | 
But we'll just do a quick intros of both of you. 00:00:37.320 | 
Obviously, people can research you and check the show notes 00:00:40.880 | 
But Beyang, you worked in computer vision at Stanford, 00:00:55.120 | 
Well, the end user thing was Google Code Search. 00:00:58.100 | 
That's what everyone called it, or just like CS. 00:01:00.680 | 
But the brains of it were really the Trigram index and then 00:01:08.720 | 
Today it's called Kythe, the open source Google one. 00:01:15.640 | 
you've interviewed a bunch of other code search developers, 00:01:18.760 | 
including the current developer of Kythe, right? 00:01:24.200 | 
although we would love to if they're up for it. 00:01:27.480 | 
We had Kelly Norton, who built a similar system at Etsy. 00:01:43.120 | 
--I think heavily inspired by the Trigram index that 00:02:11.040 | 
I guess the back story was, I used Google Code Search 00:02:19.360 | 
and worked elsewhere, it was the single dev tool 00:02:23.840 | 
I felt like my job was just a lot more tedious and much more 00:02:29.840 | 
And so when Quinn and I started working together at Palantir, 00:02:32.420 | 
he had also used various code search engines in open source 00:02:38.440 | 
And it was just a pain point that we both felt, 00:02:49.120 | 
large financial institutions, folks like that. 00:02:57.840 | 
made our pain points feel small by comparison. 00:03:11.960 | 
And revealed-- and you've told many, many stories. 00:03:15.160 | 
I want every single listener of "Latent Space" 00:03:17.040 | 
to check out Steve's YouTube, because he effectively 00:03:25.240 | 
You just hit record and just went on a few rants. 00:03:34.640 | 
had some interesting thoughts on just the overall Google 00:03:38.320 | 
You joined Grab as head of Eng for a couple of years. 00:03:40.720 | 
I'm from Singapore, so I have actually personally 00:04:04.560 | 
about as a good startup that people admire or look up 00:04:08.880 | 
to, on the league that you, with all your legendary experience, 00:04:18.440 | 
They actually didn't even know that they were as good 00:04:22.600 | 
They started hiring a bunch of people from Silicon Valley 00:04:28.880 | 
could have been a little better, operational excellence 00:04:32.680 | 
And the only thing about Grab is that they get criticized a lot 00:04:41.240 | 
By Singaporeans who don't want to work there. 00:04:44.400 | 
OK, well, I guess I'm biased because I'm here, 00:04:54.520 | 
because they were more Westernized than the Sanders 00:04:57.880 | 
I mean, they had their success because they are laser-focused. 00:05:02.960 | 
I mean, they're executing really, really, really well. 00:05:23.200 | 
because they're just out there with their sleeves rolled up, 00:05:35.400 | 
Yeah, in the way that super apps don't exist in the West. 00:05:38.000 | 
It's one of the greatest mysteries, enduring mysteries 00:05:48.160 | 
And it was primarily because of bandwidth reasons 00:06:04.760 | 
Any-- I think-- and that's also where you discover some need 00:06:11.360 | 
Better programming languages, better databases, 00:06:15.000 | 
I mean, I started in '95, where there was kind of nothing. 00:06:21.400 | 
you first went to Grab, because you wrote that blog post, 00:06:41.560 | 
Yeah, so I guess the back story, from my point of view, 00:06:44.880 | 
is I had used Code Search and Grok while at Google. 00:06:49.360 | 
But I didn't actually know that it was connected to you, Steve. 00:06:52.720 | 
Like, I knew you from your blog posts, which were always 00:06:55.160 | 
excellent, kind of like inside, very thoughtful takes on-- 00:06:59.640 | 
from an engineer's perspective, on some of the challenges 00:07:08.000 | 
within the context of code intelligence and code 00:07:10.120 | 
understanding, was I watched a talk that you gave, 00:07:13.720 | 
I think, at Stanford about Grok when you were first 00:07:20.640 | 
who writes the extremely thoughtful, ranty blog posts, 00:07:27.520 | 
And so that's how I knew you were kind of involved in that. 00:07:57.400 | 
I had this dagger of jealousy stabbed through me, 00:08:00.400 | 
piercingly, which I remember, because I am not 00:08:11.580 | 
I got sucked back into the ads vortex and whatever. 00:08:14.440 | 
So thank god, Sourcegraph actually kind of rescued me. 00:08:27.560 | 
Is there anything else that people should know about you 00:08:51.840 | 
this has been a company 10 years in the making. 00:08:54.480 | 
And as Sean said, now you're at the right place. 00:08:59.520 | 
Now exactly, you spent 10 years collecting all this code, 00:09:02.480 | 
indexing, making it easy to surface it, and how-- 00:09:05.640 | 
And also learning how to work with enterprises 00:09:07.960 | 
and having them trust you with their code bases. 00:09:10.360 | 
Because initially, you were only doing on-prem, right, like VPC, 00:09:15.880 | 
So in the very early days, we were cloud only. 00:09:22.960 | 
And that was, I think, related to the nature of the problem 00:09:27.600 | 
just a critical, unignorable pain point once you're 00:09:32.920 | 
And now Cody is going to be GA by the time this releases. 00:09:38.360 | 
Congrats to your future self for launching this in two weeks. 00:09:42.440 | 
Can you give a quick overview of just what Cody is? 00:09:45.280 | 
I think everybody understands that it's an AI coding agent. 00:09:49.440 | 
But a lot of companies say they have an AI coding agent. 00:09:57.680 | 
from the several dozen other AI coding agents 00:10:04.320 | 
when we thought about building a coding assistant that 00:10:08.360 | 
would do things like code generation and question 00:10:11.800 | 
think we came at it from the perspective of we've 00:10:14.600 | 
spent the past decade building the world's best code 00:10:17.880 | 
understanding engine for human developers, right? 00:10:26.280 | 
if you want to go and dive into a large, complex code base. 00:10:30.360 | 
And so our intuition was that a lot of the context 00:10:35.640 | 
would also be useful context for AI developers to consume. 00:10:43.560 | 
Cody is very similar to a lot of other assistants. 00:10:49.640 | 
It does specific commands that automate tasks 00:10:55.640 | 
like generating unit tests or adding detailed documentation. 00:11:01.080 | 
But we think the core differentiator is really 00:11:08.280 | 
It's a bit like saying, what's the difference between Google 00:11:12.520 | 
There's not a quick checkbox list of features 00:11:15.880 | 
But it really just comes down to all the attention and detail 00:11:19.000 | 
that we've paid to making that context work well and be 00:11:24.760 | 
For human devs, we're now kind of plugging into the AI coding 00:11:30.020 | 
I mean, just to add, just to add my own perspective 00:11:40.920 | 
that the LLM has available that knows about your code. 00:11:45.000 | 
RAG provides basically a bridge to a lookup system 00:11:49.520 | 
Whereas fine-tuning would be more like on-the-job training 00:11:54.000 | 
If the LLM is a person, and you send them to a new job, 00:12:05.480 | 
because the expert knows your particular code base, 00:12:12.620 | 
And there's a chicken-and-egg problem, because we're like, 00:12:15.160 | 
well, I'm going to ask the LLM about my code. 00:12:34.640 | 
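[A minimal sketch of the RAG loop being described here, the LLM bridged to a lookup system over your code; search_codebase and llm are hypothetical stand-ins, not Cody's actual internals:]

```python
def answer_question(question: str, llm, search_codebase) -> str:
    # The "lookup system" the LLM is bridged to: retrieve relevant
    # code instead of hoping the model memorized your code base.
    snippets = search_codebase(question, limit=5)
    context = "\n\n".join(f"File: {s.path}\n{s.text}" for s in snippets)
    prompt = (
        "Answer using only this context from the code base.\n\n"
        f"{context}\n\nQuestion: {question}\nAnswer:"
    )
    return llm.complete(prompt)
```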
and using code search, and then starting to feel like without 00:12:40.760 | 
Once you start using these-- do you guys use coding assistants? 00:12:44.400 | 
I mean, we're getting to the point very quickly, right? 00:12:50.640 | 
almost like you're programming without the internet, right? 00:12:53.480 | 
It's like you're programming back in the '90s 00:12:59.480 | 
who have no idea about coding assistants, what they are. 00:13:08.920 | 
We had Codeium and Codium, very similar names. 00:13:13.180 | 
Griblet, Phind, and then, of course, there's Copilot. 00:13:26.760 | 
And I think it really shows the context improvement. 00:13:43.880 | 
Versus Cody was like, oh, these are the major functions 00:13:51.280 | 
And then the other one was, how do I start this up? 00:13:56.440 | 
even though there was no start command in the package JSON. 00:14:01.680 | 
Most projects use NPM start, so maybe this does too. 00:14:05.720 | 
How do you think about open source models and private-- 00:14:12.520 | 
And I think you guys use StarCoder, if I remember right. 00:14:21.080 | 
I don't think they've officially announced what model they use. 00:14:24.000 | 
- And I think they use a range of models based on what you're 00:14:28.960 | 
No one uses the same model for inline completion 00:14:31.260 | 
versus chat, because the latency requirements for-- 00:14:44.960 | 
to get it to output just the code and not, like, hey, 00:14:48.480 | 
here's the code you asked for, like that sort of text. 00:14:54.320 | 
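[One common workaround, not necessarily what any particular vendor does: post-process chat-style output so only the code survives. A rough sketch:]

```python
import re

FENCE = "`" * 3  # markdown code fence

def extract_code(response: str) -> str:
    # Prefer the body of a fenced block if the model emitted one.
    fenced = re.search(FENCE + r"\w*\n(.*?)" + FENCE, response, re.DOTALL)
    if fenced:
        return fenced.group(1)
    # Otherwise drop leading chatty prose ("Here's the code you
    # asked for...") until something code-like appears.
    lines = response.splitlines()
    while lines and not re.match(r"\s*(def |class |import |from |@|#)", lines[0]):
        lines.pop(0)
    return "\n".join(lines)
```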
We've kind of designed Cody to be especially model-- 00:15:07.680 | 
want to be able to integrate the best in class models, 00:15:11.040 | 
whether they're proprietary or open source, into Cody, 00:15:15.200 | 
because the pace of innovation in the space is just so quick. 00:15:21.760 | 
Like today, Cody uses StarCoder for inline completions. 00:15:25.640 | 
And with the benefit of the context that we provide, 00:15:29.440 | 
we actually show comparable completion acceptance rate 00:15:35.840 | 
that folks use to evaluate inline completion quality. 00:15:39.840 | 
what's the chance that you actually accept the completion 00:15:45.080 | 
which is at the head of the industry right now. 00:15:47.920 | 
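[For reference, completion acceptance rate is a simple ratio; a sketch of the metric, ignoring the weighting and dedup details real evaluations add:]

```python
def acceptance_rate(events: list[dict]) -> float:
    # events: one record per completion shown to a user,
    # with a boolean "accepted" field.
    shown = len(events)
    accepted = sum(1 for e in events if e["accepted"])
    return accepted / shown if shown else 0.0
```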
And we've been able to do that with the StarCoder model, which 00:15:50.420 | 
is open source, and the benefit of the context fetching stuff 00:15:55.020 | 
And of course, a lot of like prompt engineering 00:16:03.640 | 
"Cheating is All You Need" about what you're building. 00:16:07.460 | 
that everybody's fighting on the same axis, which 00:16:10.000 | 
is better UI and the IDE, maybe like a better chat response. 00:16:14.400 | 
But data moats are kind of the most important thing. 00:16:22.280 | 
How do you kind of think about what other companies are 00:16:31.840 | 
I feel like you see so many people, oh, we just 00:16:34.560 | 
got a new model, and it's like, it beats HumanEval. 00:16:36.920 | 
And it's like, wow, but maybe like that's not 00:16:42.960 | 
the importance of like the actual RAG in code? 00:16:47.040 | 
Yeah, I mean, I think that people weren't doing it much. 00:16:56.200 | 
so within the last year, I've heard a lot of rumblings 00:16:59.840 | 
Because they're undergoing a huge transformation 00:17:02.240 | 
to try to, of course, get into the new world. 00:17:07.160 | 
to go and train their own models or fine-tune their own models, 00:17:24.120 | 
Google loves to compete with themselves, right? 00:17:27.440 | 
And they had a paper on Duet, like, from a year ago. 00:17:29.880 | 
And they were doing exactly what Copilot was doing, 00:17:32.040 | 
which was just pulling in the local context, right? 00:17:38.440 | 
because we were talking about the splitting of the models. 00:17:40.840 | 
In the early days, it was the LLM did everything. 00:17:44.160 | 
And then we realized that for certain use cases, 00:17:47.000 | 
like completions, that a different, smaller, faster 00:17:53.040 | 
actually, we expected to continue and proliferate, 00:17:56.440 | 
Because fundamentally, we're a recommender engine right now. 00:18:02.080 | 
We're saying, may I interest you in this code 00:18:04.200 | 
right here so that you can answer my question? 00:18:09.180 | 
I mean, who are the best recommenders, right? 00:18:11.020 | 
There's YouTube, and Spotify, and Amazon, or whatever, right? 00:18:14.320 | 
Yeah, and they all have many, many, many, many, many models, 00:18:20.640 | 
and that's where we're headed in code, too, absolutely. 00:18:24.040 | 
Yeah, we just did an episode we released on Wednesday, 00:18:26.880 | 
which we said RAG is like RecSys for LLMs. 00:18:30.720 | 
You're basically just suggesting good content. 00:18:40.240 | 
is you embed everything through a vector database. 00:18:42.720 | 
You embed your query, and then you find the nearest neighbors, 00:18:49.720 | 
there's sample diversity and that kind of stuff. 00:18:52.360 | 
And then you're slowly gradient-descending yourself 00:18:58.040 | 
which has been traditional ML for a long time, 00:19:02.840 | 
Yeah, I almost think of it as a generalized search problem, 00:19:11.080 | 
and get all the potential things that could be relevant, 00:19:13.840 | 
and then there's typically a layer 2 re-ranking mechanism 00:19:20.240 | 
to get the relevant stuff to the top of the results list. 00:19:24.400 | 
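[A generic sketch of that two-layer pattern, with hypothetical embed and rerank_score functions standing in for whatever models you'd actually use:]

```python
import numpy as np

def retrieve(query, corpus, embed, rerank_score, k=50, n=5):
    # Layer 1: cheap, high-recall candidate fetch by cosine similarity.
    q = embed(query)
    vecs = np.stack([embed(doc) for doc in corpus])
    sims = (vecs @ q) / (np.linalg.norm(vecs, axis=1) * np.linalg.norm(q))
    candidates = [corpus[i] for i in np.argsort(-sims)[:k]]
    # Layer 2: expensive, high-precision re-rank of just those candidates.
    return sorted(candidates, key=lambda d: rerank_score(query, d),
                  reverse=True)[:n]
```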
Have you discovered that ranking matters a lot? 00:19:26.400 | 
So the context is that I think a lot of research 00:19:37.600 | 
and then apparently, Claude uses the bottom better. 00:19:44.360 | 
The skill with which models are able to take advantage 00:19:47.040 | 
of context is always going to be dependent on how 00:19:49.720 | 
that factors into the impact on the training loss. 00:19:53.400 | 
So if you want long context window models to work well, 00:19:56.240 | 
then you have to have a ton of data where it's 00:20:01.200 | 
and I'm going to ask a question about something that's 00:20:04.080 | 
embedded deeply into it, and give me the right answer. 00:20:09.560 | 
then of course you're going to have variability in terms 00:20:15.320 | 
the thing that you're talking about right now, 00:20:18.280 | 
to be something that we talked about recently. 00:20:20.840 | 
Did you really just say gradient descending yourself? 00:20:24.640 | 
Actually, I love that it's entered the casual lexicon. 00:20:28.520 | 
My favorite version of that is how you have to p-hack papers. 00:20:43.320 | 
I think the other interesting thing that you have 00:20:45.360 | 
is inline-assist UX that is, I wouldn't say async, 00:20:53.240 | 
So you can ask Cody to make changes on a code block, 00:20:55.840 | 
and you can still edit the same file at the same time. 00:21:08.040 | 
messing each other up as they make changes in the code? 00:21:12.920 | 
and what do you think about where that UX is going? 00:21:18.200 | 
So we actually had this feature in the very first launch 00:21:25.040 | 
And you could have multiple basically LLM requests 00:21:31.200 | 
And he wrote a bunch of codes to handle all of the diffing 00:21:40.960 | 
And it just felt like it was just a little before its time. 00:21:47.480 | 
was able to be reused for where inline's sitting today. 00:22:02.360 | 
and have the code update, to really like targeted features 00:22:11.320 | 
And the reason for that is, I think the challenge 00:22:16.120 | 
and we do want to get to the point where you could just 00:22:18.440 | 
fire it, forget, and have half a dozen of these running 00:22:24.720 | 
early on that a lot of people are running into now 00:22:27.200 | 
when they're trying to construct agents, which 00:22:29.920 | 
is the reliability of working code generation 00:22:36.280 | 
is just not quite there yet in today's language models. 00:22:40.920 | 
And so that kind of constrains you to an interaction 00:22:45.360 | 
where the human is always like in the inner loop, 00:22:56.840 | 
have to constrain it to a domain where today's language models 00:23:02.120 | 
So generating unit tests, that's like a well-constrained problem, 00:23:05.520 | 
or fixing a bug that shows up as a compiler error or a test 00:23:15.440 | 
this class that does x, y, and z using the libraries 00:23:21.080 | 
even with the benefit of really good context. 00:23:46.120 | 
you don't have to have a human in the loop every time. 00:23:48.440 | 
And there's also kind of like an LLM call at each stage, 00:24:15.880 | 
on the feasibility of agents with purely kind 00:24:20.680 | 
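[The compounding-error arithmetic behind that skepticism: if each LLM step succeeds independently with probability p, an n-step chain succeeds with probability p to the n:]

```python
# A 90%-reliable step looks great in a demo, but chain ten of
# them with no human checkpoint and no recovery:
p, n = 0.90, 10
print(p ** n)  # 0.3486..., i.e. the agent finishes ~35% of the time
```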
To your original question, like the inline interactions 00:24:24.960 | 
to be more targeted, like fix the current error 00:24:38.880 | 
and this is based on the user feedback that we've gotten-- 00:24:45.680 | 
you don't want to have a long chat conversation 00:24:50.200 | 
You'd rather just have it write the right thing 00:24:52.900 | 
and then move on with your life or not have to think about it. 00:24:55.480 | 
And that's what we're trying to work towards. 00:24:57.360 | 
I mean, yeah, we're not going in the agent direction. 00:25:03.600 | 
Instead, we're working on sort of solidifying 00:25:06.640 | 
our strength, which is bringing the right context in. 00:25:12.060 | 
to plug in your own context, ways for you to control 00:25:16.440 | 
happens before the request goes out, et cetera. 00:25:30.720 | 
They really mean greater automation, fully automated. 00:25:36.720 | 
And I don't have to think about it as a human. 00:25:41.840 | 
I think it's specifically the approach of, hey, 00:25:59.440 | 
It's just a reality of the behavior of language models 00:26:04.840 | 
And I think that's just a reflection of reality. 00:26:08.680 | 
Because if you look at the way that a lot of other AI tools 00:26:14.680 | 
have implemented context fetching, for instance, 00:26:23.080 | 
supposedly provides code-based level context, 00:26:27.040 | 
it has an agentic approach, where you kind of look 00:26:32.920 | 
And it feels like they're making multiple requests to the LLM, 00:26:43.480 | 
And it's a multi-hop step, so it takes a long while. 00:26:51.800 | 
And then at the end of the day, the context it fetches 00:26:59.280 | 
and then maybe crawl through the reference graph a little bit. 00:27:04.840 | 
That doesn't require any sort of LLM invocation at all. 00:27:08.520 | 
And we can pull in much better context very quickly. 00:27:13.040 | 
So it's faster, it's more reliable, it's deterministic, 00:27:20.000 | 
We just don't think you should cargo cult or naively go, 00:27:25.240 | 
try to implement agents on top of the LLMs that exist today. 00:27:29.760 | 
I think there are a couple of other technologies 00:27:35.800 | 
before we can get into these multi-stage, fully automated 00:27:40.520 | 
We're very much focused on developer inner loop right now. 00:27:50.680 | 
tackling the agents problem that you don't want to tackle? 00:27:56.960 | 
are after maybe like the same high level problem, which 00:28:05.320 | 
And can an automated system go build that software for me? 00:28:20.440 | 
Coding in some senses, it's similar and dissimilar 00:28:25.620 | 
think producing code is more difficult than playing chess 00:28:33.560 | 
And if you look at the best AI chess players, 00:28:38.440 | 
People have showed demos where it's like, oh, yeah, 00:28:40.560 | 
GPT-4 is actually a pretty decent chess move suggester. 00:28:44.760 | 
But you would never build a best-in-class chess player 00:28:58.400 | 
And then you have a way to explore that search space 00:29:02.880 | 
There's a bunch of search algorithms, essentially, 00:29:04.920 | 
where you're doing tree search in various ways. 00:29:11.840 | 
You might use an LLM to generate proposals in that space 00:29:18.840 | 
But the backbone is still this more formalized tree search 00:29:31.800 | 
that the way that we get to this more reliable multi-step 00:29:36.000 | 
workflows that can do things beyond generate unit test, 00:29:41.400 | 
it's really going to be like a search-based approach, where 00:29:43.960 | 
you use an LLM as kind of like an advisor or a proposal 00:29:54.560 | 
But it's probably not going to be the thing that 00:29:58.400 | 
Because I guess it's not the right tool for that. 00:30:07.300 | 
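[A toy sketch of that shape: a beam search where a hypothetical llm_propose generates candidate moves or edits, and a formal evaluate function (engine score, compiler, test suite) is the backbone that judges them:]

```python
def beam_search(start, llm_propose, apply_move, evaluate, is_done,
                beam=5, depth=4):
    # The LLM only proposes; the deterministic evaluator decides.
    frontier = [start]
    for _ in range(depth):
        candidates = [apply_move(s, m)
                      for s in frontier for m in llm_propose(s)]
        done = [s for s in candidates if is_done(s)]
        if done:
            return max(done, key=evaluate)
        if not candidates:
            break
        frontier = sorted(candidates, key=evaluate, reverse=True)[:beam]
    return max(frontier, key=evaluate)
```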
the words, the philosophical Peter Norvig type discussion. 00:30:11.560 | 
Maybe you want to introduce that divide in software. 00:30:20.400 | 
They're probably familiar with the classic Chomsky 00:30:24.120 | 
No, actually, I was prompting you to introduce that. 00:30:27.760 | 
So if you look at the history of artificial intelligence, 00:30:33.800 | 
I don't know, it's probably as old as modern computers, 00:30:40.680 | 
to producing a general human level of intelligence. 00:30:51.320 | 
which, roughly speaking, includes large language 00:30:58.840 | 
Basically, any model that you learn from data 00:31:04.400 | 
most of machine learning would fall under this umbrella. 00:31:06.700 | 
And that school of thought says, just learn from the data. 00:31:10.800 | 
That's the approach to reaching intelligence. 00:31:16.000 | 
like compilers, and parsers, and formal systems. 00:31:22.320 | 
about how to construct a formal, precise system. 00:31:26.120 | 
And that will be the approach to how we build 00:31:31.080 | 
Lisp, for instance, was originally an attempt to-- 00:31:38.400 | 
could create rules-based systems that you would call AI. 00:31:42.360 | 
Yeah, and for a long time, there was this debate. 00:31:47.840 | 
and others that were more in the Norvig camp. 00:31:53.760 | 
is that Norvig definitely has the upper hand right now 00:31:56.840 | 
with the advent of LLMs, and diffusion models, 00:31:59.280 | 
and all the other recent progress in machine learning. 00:32:03.840 | 
But the Chomsky-based stuff is still really useful, 00:32:17.260 | 
that you want to explore with your AI dev tool. 00:32:25.600 | 
It's a lot of what we've invested in the past decade 00:32:28.040 | 
at Sourcegraph, and what you built with Grok. 00:32:34.480 | 
construct these very precise knowledge graphs that 00:32:37.640 | 
are great context providers, and great guardrails enforcers, 00:32:41.400 | 
and safety checkers for the output of a more data-driven, 00:32:48.720 | 
fuzzier system that uses like the Norvig-based models. 00:32:57.500 | 
Basically, it's like, OK, so when I was in college, 00:33:02.000 | 
I was in college learning Lisp, and Prolog, and Planning, 00:33:04.500 | 
and all the deterministic Chomsky approaches to AI. 00:33:08.240 | 
And I was there when Norvig basically declared it dead. 00:33:12.440 | 
I was there 3,000 years ago when Norvig and Chomsky 00:33:29.160 | 
He's got so many famous short posts, amazing things. 00:33:32.080 | 
He had a famous talk, "The Unreasonable Effectiveness 00:33:38.560 | 
convinced everybody that the deterministic approaches had 00:33:41.360 | 
failed, and that heuristic-based, data-driven, 00:33:44.280 | 
statistical approaches, stochastic were better. 00:33:53.360 | 
--was that, well, the steam-powered engine-- no. 00:33:58.080 | 
The reason was that the deterministic stuff didn't 00:34:01.800 | 
They were using Prolog, man, constraint systems 00:34:07.400 | 
Today, actually, these Chomsky-style systems do scale. 00:34:11.080 | 
And that's, in fact, exactly what Sourcegraph has built. 00:34:19.240 | 
the marriage of the Chomsky and the Norvig models, 00:34:22.360 | 
conceptual models, because we have both of them. 00:34:26.260 | 
And, in fact, there's this really interesting overlap 00:34:29.760 | 
between them, where the AI or our graph or our search engine 00:34:33.400 | 
could potentially provide the right context for any given 00:34:35.720 | 
query, which is, of course, why ranking is important. 00:34:38.360 | 
But what we've really signed ourselves up for 00:34:46.760 | 
you were saying that GPT-4 tends to favor the front of the context 00:34:53.580 | 
Yeah, and so that means that if we're actually 00:35:00.920 | 
to test putting it at the beginning of the window 00:35:04.280 | 
make the right decision based on the LLM that you've chosen. 00:35:15.400 | 
We're generating tests, fill-in-the-middle type tests 00:35:19.320 | 
to basically fine-tune Cody's behavior there, yeah? 00:35:25.080 | 
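[A sketch of that kind of experiment, assuming fill-in-the-middle test cases plus hypothetical llm and passes helpers: the same retrieved context is placed at the head or the tail of the prompt, and each placement is scored per model:]

```python
def placement_scores(cases, llm, passes):
    # cases: fill-in-the-middle tests with .context, .prefix, .suffix,
    # and a passes(case, completion) correctness check.
    scores = {"head": 0, "tail": 0}
    for case in cases:
        prompts = {
            "head": case.context + "\n" + case.prefix + case.suffix,
            "tail": case.prefix + case.suffix + "\n" + case.context,
        }
        for where, prompt in prompts.items():
            if passes(case, llm.complete(prompt)):
                scores[where] += 1
    return scores
```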
I also want to add, I have an internal pet name 00:35:28.400 | 
for this hybrid architecture that I'm trying to make catch on. 00:35:45.120 | 
I mean, it's obviously a portmanteau of Norvig 00:35:52.280 | 
for non-agentic, rapid, multi-source code intelligence. 00:36:07.000 | 
that we're not trying to pitch you on agent hype, right? 00:36:12.040 | 
The things it does are really just use developer tools 00:36:17.680 | 
like parsers and really good search indexes and things 00:36:23.200 | 
Rapid, because we place an emphasis on speed. 00:36:25.440 | 
We don't want to sit there waiting for multiple LLM 00:36:28.920 | 
requests to return to complete a simple user request. 00:36:35.600 | 
about what pieces of information and knowledge 00:36:43.680 | 
and then you add in the reference graph, which 00:36:49.920 | 
But then even beyond that, sources of information, 00:37:01.680 | 
in your production logging system, in your chat, 00:37:09.520 | 
Like there's so much context that's embedded there. 00:37:12.840 | 
and you're trying to be productive in your code base, 00:37:15.080 | 
you're going to go to all these different systems 00:37:16.600 | 
to collect the context that you need to figure out 00:37:21.520 | 
And I don't think the AI developer will be any different. 00:37:32.760 | 
We hope through kind of like an open protocol 00:37:38.420 | 
And this is something else that should be, I guess, 00:37:41.960 | 
like accessible by December 14th in kind of like a preview 00:37:48.400 | 
this notion of the code graph beyond your Git repository 00:37:51.480 | 
to all the other sources where technical knowledge 00:38:03.080 | 
How do you guys think about the importance of-- 00:38:05.600 | 
it's almost like data pre-processing in a way, 00:38:07.800 | 
which is bring it all together, tie it together, make it ready. 00:38:14.640 | 
that good, what some of the innovation you guys have made? 00:38:18.240 | 
We talk a lot about the context fetching, right? 00:38:20.900 | 
I mean, there's a lot of ways you could answer this question. 00:38:23.400 | 
But we've spent a lot of time just in this podcast 00:38:33.340 | 
and you've got more context than you can fit. 00:38:42.320 | 
by an embedding or a graph call or something? 00:38:46.640 | 
Or do you just need the top part of the function, 00:38:53.920 | 
to get each piece of context down into its smallest state, 00:39:04.800 | 
And so recursive summarization and all the other techniques 00:39:07.840 | 
that you've got to use to stuff stuff into that context window 00:39:12.200 | 
And you have to test them across every configuration of models 00:39:22.160 | 
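[A minimal sketch of the recursive summarization just mentioned, assuming an LLM-backed summarize call and a token_len counter; it chunks, summarizes, and recurses until the text fits the budget:]

```python
def shrink(text, budget, summarize, token_len, chunk_chars=4000):
    # Assumes summarize() reliably returns something shorter than
    # its input; otherwise this would recurse forever.
    if token_len(text) <= budget:
        return text
    chunks = [text[i:i + chunk_chars]
              for i in range(0, len(text), chunk_chars)]
    return shrink("\n".join(summarize(c) for c in chunks),
                  budget, summarize, token_len, chunk_chars)
```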
to a lot of the cool stuff that people are shipping today, 00:39:26.760 | 
whether you're doing like RAG or fine tuning or pre-training. 00:39:34.800 | 
because it is basically garbage in, garbage out, right? 00:39:39.440 | 
Like if you're feeding in garbage to the model, 00:39:53.680 | 
able to extract the key components of a particular file 00:39:58.320 | 
of code, separate the function signature from the body, 00:40:00.760 | 
from the doc string, what are you even doing? 00:40:17.760 | 
We've had a tool since computers were invented 00:40:20.120 | 
that understands the structure of source code 00:40:28.760 | 
is to know about the code in terms of structure. 00:40:39.400 | 
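[As a small illustration, Python's built-in ast module can do exactly that dissection for Python source; a multi-language tool would use tree-sitter-style parsers instead:]

```python
import ast

def dissect(source: str):
    # Split each function into signature, docstring, and body:
    # the fine-grained semantic chunks worth indexing separately.
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            doc = ast.get_docstring(node)
            stmts = node.body[1:] if doc is not None else node.body
            yield {
                "signature": f"def {node.name}({ast.unparse(node.args)})",
                "docstring": doc or "",
                "body": "\n".join(ast.unparse(s) for s in stmts),
            }
```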
just because now we have really good data-driven models that 00:40:45.800 | 
When I called it a data moat in my cheating post, 00:40:53.000 | 
because data moat sort of sounds like data lake 00:41:00.080 | 
on this giant mountain of data that we had collected. 00:41:06.400 | 
that can very quickly and scalably basically dissect 00:41:09.600 | 
your entire code base into very small, fine-grained semantic 00:41:20.000 | 
Yeah, if anything, we're hypersensitive to customer data 00:41:24.880 | 
So it's not like we've taken a bunch of private data 00:41:42.000 | 
I think that's a very real concern in today's day and age. 00:41:50.720 | 
it's very easy both to extract that knowledge from the model 00:42:01.560 | 
About a year ago, I wrote a post on LLMs for developers. 00:42:05.040 | 
And one of the points I had was maybe the depth of the DSL. 00:42:13.640 | 
But it's not as performant, but it's really easy to read. 00:42:18.560 | 
maybe they're faster, but they're more verbose. 00:42:21.760 | 
And when you think about efficiency of the context 00:42:39.240 | 
Do you see in the future the way we think about DSL and APIs 00:42:48.520 | 
Whereas maybe it's harder to read for the human, 00:42:52.400 | 
but the human is never going to write it anyway. 00:42:57.400 | 
There are some data science things, like spin-up the spandex. 00:43:07.880 | 
Well, so DSLs, they involve writing a grammar and a parser. 00:43:18.600 | 
And we do them that way because we need them to compile, 00:43:23.240 | 
and humans need to be able to read them, and so on. 00:43:30.600 | 
more or less unstructured, and they'll deal with it. 00:43:35.600 | 
for communicating with the LLM or packaging up 00:43:42.560 | 
like that that are sort of peeking into DSL territory, 00:43:48.480 | 
have to learn DSLs, like regular expressions, 00:43:53.600 | 
I think you're absolutely right that the LLMs are really, 00:43:57.000 | 
And I think you're going to see a lot less of people 00:44:01.080 | 
They just have to know the broad capabilities, 00:44:07.560 | 
I think we will see kind of like a revisiting of-- 00:44:13.400 | 
is that it makes it easier to work with a lower level 00:44:17.320 | 
language, but at the expense of introducing an abstraction 00:44:22.280 | 
And in many cases today, without the benefit of AI co-generation, 00:44:36.800 | 
I think there's still places where that trade-off 00:44:40.280 | 
But it's kind of like, how much of source code 00:44:45.320 | 
through natural language prompting in the future? 00:44:56.200 | 
Maybe for a large portion of the code that's written, 00:45:00.800 | 
the DSL that is Ruby, or Python, or basically 00:45:04.840 | 
any other programming language that exists today. 00:45:07.000 | 
I mean, seriously, do you guys ever write SQL queries now 00:45:14.920 | 
And so we have kind of passed that bridge, right? 00:45:18.200 | 
Yeah, I think to me, the long-term thing is like, 00:45:25.360 | 
It's like, hey-- the basic thing is like, hey, 00:45:33.080 | 
And the follow-on question, do you need the engineer 00:45:38.880 | 
That's kind of the agent's discussion in a way, 00:45:42.960 | 
but slowly you're getting more of the atomic units 00:45:48.400 | 
I kind of think of it as like, do you need a punch card 00:45:52.640 | 
And so I think we're still going to have people 00:46:02.600 | 
versus the higher-level, more creative tasks is going to 00:46:20.040 | 
And the first step is the AI-enhanced engineer 00:46:22.440 | 
that is that software developer that is no longer doing 00:46:28.280 | 
because they're just enhanced by tools like yours. 00:46:35.960 | 
And because we're releasing this as you go GA, 00:46:40.040 | 
you hope for other people to take advantage of that? 00:46:48.820 | 
to make your system, whether it's chat, or logging, 00:46:52.760 | 
or whatever, accessible to an AI developer tool like Cody, 00:46:58.840 | 
here is kind of like the schema by which you can provide 00:47:08.200 | 
did this for kind of like standard code intelligence. 00:47:10.600 | 
It's kind of like a lingua franca for providing 00:47:16.200 | 
There might be also analogs to kind of the original OpenAI 00:47:20.720 | 
kind of like plugins API, where it's like, hey, 00:47:27.440 | 
that might be useful for an LM-based system to consume. 00:47:31.520 | 
And so at a high level, what we're trying to do 00:47:33.920 | 
is define a common language for context providers 00:47:38.560 | 
to provide context to other tools in the software 00:47:43.640 | 
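[A hedged guess at the shape of such a protocol, purely illustrative and not the actual spec: a provider is anything that can answer a query with context items:]

```python
from dataclasses import dataclass
from typing import Protocol

@dataclass
class ContextItem:
    title: str    # e.g. "Deploy runbook, section 3"
    uri: str      # where a human (or the UI) can follow up
    content: str  # the text handed to the LLM

class ContextProvider(Protocol):
    """Anything that knows things: a docs site, a logging system,
    a chat archive, an issue tracker."""
    def query(self, q: str, limit: int) -> list[ContextItem]: ...
```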
Do you have any critiques of LSP, by the way, 00:47:48.200 | 
One of the authors wrote a really good critique recently. 00:47:59.360 | 
I think LSP is great for what it did for the developer 00:48:08.120 | 
it's much easier now to get code navigation up and running 00:48:13.440 | 
--in a bunch of editors by speaking this protocol. 00:48:17.440 | 
is looking at the different design decisions made, 00:48:30.560 | 
I think the critique of LSP from a Kythe point of view 00:48:34.920 | 
have an actual model, a symbolic model, of the code. 00:48:51.200 | 
And that's the thing you feed into the language server. 00:48:56.860 | 
that you should jump to if you click on that range. 00:48:59.000 | 
So it kind of is intentionally ignorant of the fact 00:49:02.400 | 
that there's a thing called a reference underneath your 00:49:04.760 | 
cursor, and that's linked to a symbol definition. 00:49:07.100 | 
Well, actually, that's the worst example you could have used. 00:49:09.640 | 
You're right, but that's the one thing that it actually 00:49:18.240 | 
Whereas Kythe attempts to model all these things explicitly. 00:49:25.520 | 
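[Roughly, the contrast looks like this, with the JSON-RPC messages shown as Python dicts and the Kythe schema heavily simplified for illustration:]

```python
# LSP is location-in, location-out: "what is at line 14, col 8?"
lsp_request = {
    "method": "textDocument/definition",
    "params": {
        "textDocument": {"uri": "file:///src/app.py"},
        "position": {"line": 14, "character": 8},
    },
}
lsp_response = {  # "jump to this range"; no notion of a symbol
    "uri": "file:///src/util.py",
    "range": {"start": {"line": 3, "character": 4},
              "end": {"line": 3, "character": 16}},
}

# A Kythe-style fact names the symbols and the edge between them
# (identifiers simplified here; real Kythe tickets are richer).
kythe_edge = {
    "source": "python//src/app.py#main",
    "edge_kind": "ref/call",
    "target": "python//src/util.py#parse_config",
}
```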
And so Google's internal protocol is gRPC-based. 00:49:34.440 | 
Basically, you make a heavy query to the back end, 00:49:40.920 | 
So we've looked at LSP, and we think that it's just-- 00:49:45.960 | 
I mean, it's a great protocol, lots and lots of support 00:49:48.740 | 
But we need to push into the domain of exposing 00:49:59.160 | 
developed a protocol of our own called SCIP, which is, I think, 00:50:02.020 | 
at a very high level, trying to take some of the good ideas 00:50:04.440 | 
from LSP and from Kythe, and merge that into a system that, 00:50:10.540 | 
but I think in the long term, we hope it will 00:50:13.840 | 
And I would say, OK, so here's what LSP did well. 00:50:20.840 | 
"dumb" in air quotes, because I'm not ragging on it-- 00:50:30.060 | 
to kind of bypass the hard problem of modeling language 00:50:35.040 | 
So if all you want to do is jump to definition, 00:50:37.200 | 
you don't have to come up with a universally unique naming 00:50:40.320 | 
scheme for each symbol, which is actually quite challenging. 00:50:57.800 | 
you're fetching this from, whether it's the public one 00:51:03.800 | 
And by just going from a location-to-location-based 00:51:07.680 | 
approach, you basically just throw that out the window. 00:51:11.720 | 
Just make that work, and you can make that work 00:51:14.240 | 
without having to deal with all the complex global naming 00:51:29.760 | 
And I want to incorporate that semantic model of how 00:51:32.800 | 
the code operates, or how the code relates to each other 00:51:35.880 | 
at a static level, you can't do that with LSP, 00:51:44.560 | 
in order to do a find references and then jump to definition, 00:51:53.600 | 
And it just adds a lot of latency and complexity 00:51:58.000 | 
this thing clearly references this other thing. 00:52:02.440 | 
And I think that's the thing that Kythe does well. 00:52:04.440 | 
But then I think the issue that Kythe has had with adoption 00:52:07.520 | 
is, because it's a more sophisticated schema, I think. 00:52:15.960 | 
that you have to implement to get a Kythe implementation 00:52:24.280 | 
Kite also has the problem-- all these systems 00:52:26.560 | 
have the problem, even SCIP, or at least the way 00:52:30.560 | 
that they have to integrate with your build system 00:52:36.520 | 
the code in a special mode to generate artifacts instead 00:52:41.440 | 
by the way, earlier I was saying that xrefs were in LSP, 00:52:46.240 | 
but it's actually-- I was thinking of LSP plus LSIF. 00:53:00.040 | 
It's supposed to be sort of a model, a serialization 00:53:04.360 | 
But it basically just does what LSP needs, the bare minimum. 00:53:13.440 | 
to kind of quickly bootstrap from cold start. 00:53:15.840 | 
But it's a graph model with all of the inconvenience of the API 00:53:23.960 | 
So one of the things that we try to do with SCIP 00:53:32.120 | 
some of the more symbolic characteristics of the code 00:53:34.960 | 
that would allow us to essentially construct this 00:53:39.560 | 
useful for both the human developer through Sourcegraph 00:53:44.600 | 
So anyway, just to finish off the graph comment 00:54:07.240 | 
I should probably have to do a blog post about it 00:54:09.920 | 
to walk you through exactly how they're doing it. 00:54:12.600 | 
But it's a very AI-like, iterative, experimentation 00:54:16.800 | 
sort of approach, where we're building a code graph based 00:54:23.640 | 
But we're building it quickly with zero configuration, 00:54:25.880 | 
and it doesn't have to integrate with your build system 00:54:30.680 | 
And so it just happens when you install the plug-in 00:54:38.240 | 
and providing that knowledge graph in the background 00:54:42.320 | 
This is a bit of secret sauce that we haven't really-- 00:54:46.800 | 
I don't know, we haven't advertised it very much lately. 00:54:49.800 | 
But I am super excited about it, because what they do 00:54:52.480 | 
is they say, all right, let's tackle function parameters 00:54:56.000 | 
Cody's not doing a very good job of completing function call 00:54:58.800 | 
arguments or function parameters in the definition, right? 00:55:03.840 | 
And then we can actually reuse those tests for the AI context 00:55:07.760 | 
So fortunately, things are kind of converging on. 00:55:10.040 | 
We have half a dozen really, really good context sources. 00:55:16.880 | 
So anyway, BFG, you're going to hear more about it probably, 00:55:24.240 | 
Yeah, I think it'll be online for December 14th. 00:55:29.640 | 
BFG is probably not the public name we're going to go with. 00:55:32.720 | 
I think we might call it Graph Context or something like that. 00:55:46.480 | 
look at current AI inline code completion tools 00:55:50.760 | 
and the errors that they make, a lot of the errors 00:55:53.400 | 
that they make, even in kind of the easy single line case, 00:56:04.120 | 
And it suggests a variable that you define earlier, 00:56:08.480 | 
And that's the sort of thing where it's like, well, 00:56:23.280 | 
without the context of the types or any other broader 00:56:36.920 | 
that any baseline intelligent human developer would 00:56:43.440 | 
click some find references, and pull in that graph context 00:56:53.480 | 
So that's sort of like the MVP of what BFG was. 00:57:02.920 | 
that AI coding tools make just by pulling in that context. 00:57:06.840 | 
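[A sketch of that MVP, with a hypothetical graph index that can return the definition for a symbol name; the point is just to put real signatures in front of the model before it completes:]

```python
import re

def completion_context(file_text, cursor, graph, window=500):
    # Look at the code just before the cursor, find identifiers,
    # and pull their real definitions from the code graph so the
    # model stops inventing plausible-but-wrong call signatures.
    nearby = file_text[max(0, cursor - window):cursor]
    signatures = []
    for ident in sorted(set(re.findall(r"[A-Za-z_]\w+", nearby))):
        defn = graph.definition(ident)  # hypothetical lookup
        if defn is not None:
            signatures.append(defn.signature)
    return "\n".join(signatures) + "\n\n" + nearby
```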
Yeah, but the graph is definitely our Chomsky side. 00:57:15.200 | 
And I think it's just a very useful and also kind of nicely 00:57:18.960 | 
nerdy way to describe the system that we're trying to build. 00:57:25.640 | 
was trying to make earlier to your question, Alessio, about, 00:57:31.520 | 
they thought, oh, are compilers going to replace programming? 00:57:36.920 | 
And I think AI is just going to level us up again. 00:57:39.240 | 
So programmers are still going to be building stuff 00:57:42.120 | 
until agents come along, but I don't believe. 00:57:47.680 | 
Yeah, to be clear, again, with the agent stuff 00:57:52.460 | 
I think that's still the kind of long-term target. 00:57:57.160 | 
you can have Cody draft up an execution plan. 00:58:00.160 | 
It's just not going to be the sort of thing where you can't 00:58:05.880 | 
Like, we think that with Cody, it's like, you could give Cody 00:58:10.340 | 
It would do a reasonable job of fetching context and saying, 00:58:16.480 | 
can actually suggest code changes to make to those files. 00:58:19.200 | 
And that's a very nice way to resolve issues, 00:58:21.640 | 
because you're kind of on the rails for most of the time, 00:58:24.720 | 
but then now and then you have to intervene as a human. 00:58:28.960 | 
to get to complete automation, where it's like the sort 00:58:31.720 | 
of thing where a non-software engineer, someone 00:58:41.520 | 
that is still, I think, several key innovations away 00:58:47.400 | 
And I don't think the pure transformer-based LLM 00:58:51.400 | 
orchestrator model of agents that is kind of dominant today 00:58:58.960 | 
Just what you're talking about triggered a thread 00:59:04.480 | 
I've been working on for a little bit, which is we're going 00:59:15.520 | 
to need a bigger moat, which is a great Jaws reference for those 00:59:22.300 | 
--how quickly models are evolving. 00:59:36.680 | 
And actually, there's a pretty good cadence 00:59:39.240 | 
from GPT-2, 3, and 4 that you can-- if you project out. 00:59:42.360 | 
So 4 is based on George Hotz's concept of 20 petaflops 00:59:52.080 | 
GPT-4 took about 100 years in terms of human years 01:00:10.680 | 
And if you just project it out, 9 is every human on Earth, 01:00:18.960 | 
And he thinks he'll reach there by the end of the decade. 01:00:32.160 | 
We're at the start of the curve with Moore's law. 01:00:37.080 | 
Gordon Moore, I think, thought it would last 10 years. 01:00:45.600 | 
And we're just trying to extrapolate the curve out 01:00:50.040 | 
So all I'm saying is this agent stuff that we dealt 01:00:56.240 | 
And I don't know how you plan when things are not 01:01:20.240 | 
we hear things like things are not practical today, 01:01:30.220 | 
I do think that there will be something like a Moore's law 01:01:34.920 | 
I mean, definitely, I think, at the hardware level, like GPUs. 01:01:39.800 | 
I think it gets a little fuzzier the higher you move up 01:01:44.400 | 
But for instance, going back to the chess analogy, 01:01:50.000 | 
at what point do we think that GPT-X or whatever, 01:01:54.520 | 
a pure transformer-based LLM model will be state of the art 01:02:00.440 | 
or outperform the best chess-playing algorithm today? 01:02:07.480 | 
Where you completely overlap 01:02:13.960 | 
I think that would kind of disprove the thesis that I just 01:02:16.320 | 
stated, which is kind of like the pure transformer, 01:02:25.000 | 
versus, oh, we actually have to take a step back and think-- 01:02:37.200 | 
is going to be one piece of a system of intelligence 01:02:41.740 | 
that's going to take advantage-- that we'll have to take 01:02:44.120 | 
advantage of, like many other algorithms and approaches? 01:02:53.800 | 
All right, sorry for that digression. 01:02:57.480 | 
So one thing I did actually want to check in on, 01:03:00.000 | 
because we talked a little bit about code graphs and reference 01:03:08.480 | 
Well, I mean, how did you find graph databases? 01:03:18.420 | 
that Postgres was performing as well as most of the graph 01:03:35.640 | 
But we basically tried to dump a non-trivially sized data set, 01:03:40.260 | 
but also not the whole universe of code, right? 01:03:46.180 | 
compared to what we're indexing now into the database. 01:03:55.360 | 
And we're like, OK, let's try another approach. 01:04:08.620 | 
I mean, at the end of the day, all the databases, 01:04:14.660 | 
If all your queries are single hops in this-- 01:04:20.060 | 
Which they will be if you denormalize 01:04:27.100 | 
Seventh normal form is just a bunch of files. 01:04:36.460 | 
about the actual query load, or the traffic patterns, 01:04:46.020 | 
just go with the tried and true, dumb, classic tools 01:04:52.260 | 
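[In practice that can be as plain as an edges table; a sketch, not the actual Sourcegraph schema, where a denormalized single-hop lookup is one indexed query:]

```python
SCHEMA = """
CREATE TABLE IF NOT EXISTS code_edges (
    from_symbol TEXT NOT NULL,
    to_symbol   TEXT NOT NULL,
    kind        TEXT NOT NULL   -- 'reference', 'definition', ...
);
CREATE INDEX IF NOT EXISTS edges_to ON code_edges (to_symbol, kind);
"""

def find_references(conn, symbol):
    # One indexed hop: who points at this symbol?
    with conn.cursor() as cur:  # e.g. a psycopg2 connection
        cur.execute(
            "SELECT from_symbol FROM code_edges "
            "WHERE to_symbol = %s AND kind = 'reference'",
            (symbol,),
        )
        return [row[0] for row in cur.fetchall()]
```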
I mean, there's a bunch of stuff 01:04:54.260 | 
like that in the search domain, too, especially right now, 01:04:56.700 | 
with embeddings, and vector search, and all that. 01:05:00.900 | 
But classic search techniques still go very far. 01:05:04.020 | 
And I don't know, I think in the next year or two maybe, 01:05:10.680 | 
start to see the gap emerge, or become more obvious to more 01:05:17.060 | 
people about how many of the newfangled techniques 01:05:20.100 | 
actually work in practice, and yield a better product 01:05:27.880 | 
a bunch of other people trying to build AI tooling. 01:05:34.320 | 
Obviously, you build a lot of proprietary stuff in-house, 01:05:42.020 | 
do you have a prompt engineering management tool? 01:05:48.540 | 
Pre-processing orchestration, do you use Airflow? 01:05:54.500 | 
Ours is very duct-taped together at the moment. 01:06:06.460 | 
There's the knowledge graph, the code knowledge graph 01:06:09.220 | 
that we built, which is using indexers, many of which 01:06:12.620 | 
are open source, that speak the SCIP protocol. 01:06:21.860 | 
Traditionally, we supported regular expression search 01:06:24.540 | 
and string literal search with a trigram index. 01:06:28.060 | 
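[For readers who haven't met it: a trigram index maps every 3-character substring to the documents containing it, so a literal search (or a regex decomposed into literals) only has to exact-scan a small candidate set. A toy version:]

```python
from collections import defaultdict

def build_trigram_index(docs):
    index = defaultdict(set)
    for name, text in docs.items():
        for i in range(len(text) - 2):
            index[text[i:i + 3]].add(name)
    return index

def candidates(index, literal):
    # A doc can only match if it contains every trigram of the
    # query; only these candidates need the exact scan.
    grams = [literal[i:i + 3] for i in range(len(literal) - 2)]
    sets = [index.get(g, set()) for g in grams]
    return set.intersection(*sets) if sets else set()
```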
And we're also building more fuzzy search on top of that 01:06:31.300 | 
now, kind of like natural language or keyword-based 01:06:36.820 | 
And we use a variety of open source and proprietary models. 01:06:40.140 | 
We try to be pluggable with respect to different models, 01:06:42.820 | 
so we can easily swap the latest model in and out 01:06:49.460 | 
I'm just hunting for, is there anything out there 01:06:52.620 | 
that you're like, these guys are really good. 01:06:56.700 | 
So for example, you talked about recursive summarization, 01:06:59.500 | 
which is something that LangChain and LlamaIndex do. 01:07:05.500 | 
I think the stuff that LlamaIndex and LangChain 01:07:12.420 | 
like we're still in the application end user use case 01:07:17.060 | 
And so adopting an external infrastructure or middleware 01:07:25.020 | 
tool just seems overly constraining right now. 01:07:29.540 | 
need to be able to iterate rapidly up and down the stack. 01:07:32.260 | 
But maybe at some point, there'll be a convergence, 01:07:34.620 | 
and we can actually merge some of our stuff into theirs 01:07:50.620 | 
Also, plug for Fireworks as an inference platform. 01:08:06.140 | 
She was the co-manager of PyTorch for five years. 01:08:22.900 | 
And that's made it so that we just don't have 01:08:24.820 | 
to think about building up an inference stack. 01:08:27.860 | 
And so that's great for us, because it allows us to focus 01:08:30.340 | 
more on the data fetching, the knowledge graph, 01:08:35.500 | 
and model fine-tuning, which we've also invested a bit in. 01:08:40.820 | 
We've got multiple AI workstreams in progress now, 01:08:51.700 | 
And the guy we hired, Rashab, is absolutely world-class. 01:08:56.140 | 
And he immediately started multiple workstreams, 01:09:17.140 | 
run against the benchmark, or we'll make our own benchmark 01:09:20.740 | 
But we'll be forcing people into the quantitative comparisons. 01:09:24.740 | 
And that's all happening under the AI program 01:09:30.420 | 
heard that there's a v2 of StarCoder coming on. 01:09:41.320 | 
Can you guys believe how amazing it is that the open source 01:09:44.420 | 
models are competitive with GPT and Anthropic? 01:09:50.260 | 
I mean, that one Googler that was predicting that open source 01:09:53.420 | 
would catch up, at least he was right for completions. 01:10:06.100 | 
We still use Claude and GPT-4 for chat and also commands. 01:10:11.980 | 
But the ecosystem is going to continue to evolve. 01:10:24.620 | 
that they're doing in kind of driving the ecosystem forward. 01:10:31.300 | 
It's always kind of like a constant evaluation process. 01:10:33.980 | 
I don't want to come out and say, hey, this model's 01:10:39.580 | 
for the sorts of context that we're fetching now 01:10:42.460 | 
and given the way that our prompt's constructed now. 01:10:44.580 | 
And at the end of the day, it was like a judgment call. 01:10:53.140 | 
Like, if someone comes up with a neat new context fetching 01:10:55.680 | 
mechanism-- and we have a couple coming online soon-- 01:11:00.820 | 
against the kind of array of models that are available 01:11:04.860 | 
and see how this moves the needle across that set. 01:11:14.260 | 
What did we have to build that we wish we could have used? 01:11:25.700 | 
like a very nice, clean data set of both naturally occurring 01:11:34.820 | 
Yeah, could someone please give us their data moat? 01:11:39.100 | 
It's just like, I feel like most models today, 01:11:41.380 | 
they still use a combination of The Stack and The Pile 01:11:55.020 | 
I think there's still more alpha in synthetic data. 01:12:01.020 | 
think fine-tuning some models on specific coding tasks 01:12:08.500 | 
where it's reliable enough that we can fully automate it, 01:12:14.700 | 
And synthetic data is playing a part of that. 01:12:17.060 | 
But I mean, if there were like a synthetic data provider-- 01:12:19.760 | 
I don't think you could construct a provider that has 01:12:25.200 | 
No company in the world would be able to sell that to you. 01:12:35.940 | 
I don't know if there's a business around that. 01:12:37.860 | 
But that's something that we definitely love to use. 01:12:41.320 | 
I mean, but that's also like the secret weapon, right? 01:12:48.220 | 
So I doubt people are going to be, oh, we'll see. 01:12:57.940 | 
I would say that would be the bull case for Repl.it, 01:13:01.500 | 
that you want to be a coding platform where you also offer 01:13:05.980 | 
And then you eventually bootstrap your own proprietary 01:13:14.580 | 
this is from nobody at Repl.it that I'm hearing. 01:13:17.680 | 
But also, they're just not leveraging that actively. 01:13:21.660 | 
They're actually just betting on OpenAI to do a lot of that, 01:13:30.540 | 
Yeah, they're definitely great at executing and-- 01:13:50.340 | 
And this whole room in the new room was just like, 01:13:58.060 | 
I mean, it would have real implications for us, too. 01:14:07.140 | 
Yeah, I mean, that would have been the break glass plan. 01:14:13.180 | 
think we'd have a lot of customers the day after being 01:14:16.140 | 
like, how can you guarantee the reliability of your services 01:14:22.020 | 
But I'm really happy they got things sorted out 01:14:31.340 | 
So we kind of went through everything, right? 01:14:37.300 | 
why inline completion is better, all of these things. 01:14:42.180 | 
How does that bubble up to who manages the people, right? 01:14:46.820 | 
Because as engineering managers, and I never-- 01:14:52.140 | 
I was mostly helping people write their own code. 01:14:55.020 | 
So even if you have the best inline completion, 01:15:04.220 | 
Yeah, so that's a really interesting question. 01:15:07.580 | 
And I think it sort of gets at this issue, which 01:15:10.420 | 
is I think basically every AI dev tools creator or producer 01:15:22.700 | 
kind of focusing on the wrong problem in a way. 01:15:26.340 | 
Because the real problem of modern software development, 01:15:30.340 | 
I think, is not how quickly can you write more lines of code. 01:15:34.180 | 
It's really about managing the emergent complexity 01:15:41.340 | 
and how to make efficient development tractable again. 01:15:47.060 | 
Because the bulk of your time becomes more about understanding 01:15:51.540 | 
how the system works and how the pieces fit together currently 01:15:56.140 | 
so that you can update it in a way that gets you 01:16:00.220 | 
your added functionality, doesn't break anything, 01:16:03.340 | 
and doesn't introduce a lot of additional complexity 01:16:08.100 | 
And if anything, the inner loop developer tools 01:16:15.020 | 
yes, they help you get your feature done faster. 01:16:19.780 | 
But they might make this problem of managing large complex code 01:16:25.820 | 
Just because now, instead of having a pistol, 01:16:33.100 | 
And there's going to be a bunch of natural language prompted 01:16:35.740 | 
code that is generated in the future that was produced 01:16:38.500 | 
by someone who doesn't even have an understanding of source 01:16:43.460 | 
And so how are you going to verify the quality of that 01:16:45.780 | 
and make sure it not only checks the low-level boxes, 01:16:49.820 | 
but also fits architecturally in a way that's 01:16:57.980 | 
have a lot of ideas around how to make code bases, 01:17:01.260 | 
as they evolve, more understandable and manageable 01:17:05.020 | 
to the people who really care about the code base as a whole-- 01:17:08.300 | 
tech leads, engineering leaders, folks like that. 01:17:11.340 | 
And it is kind of like a return to our ultimate mission 01:17:16.820 | 
at Sourcegraph, which is to make code accessible to all. 01:17:19.340 | 
It's not really about enabling people to write code. 01:17:21.640 | 
And if anything, the original version of Sourcegraph 01:17:29.220 | 
because there's already enough people doing that. 01:17:34.700 | 
I mean, Quinn, myself, and you, Steve, at Google-- 01:17:54.020 | 
And any developer who falls below a threshold, 01:17:56.180 | 
a button lights up where the admin can fire them. 01:18:02.940 | 
But I'm kind of only half tongue-in-cheek here. 01:18:06.260 | 
We've got some prospects who are kind of sniffing down 01:18:15.320 | 
like Beyang was saying-- much greater whole-codebase 01:14:17.700 | 
understanding, which is actually something that Cody is, 01:14:20.260 | 
I would argue, the best at today in the coding assistance space, 01:14:23.020 | 
right, because of our search engine and the techniques 01:18:27.880 | 
is so important for any sort of a manager who just 01:18:34.340 | 
or whether people are writing code that's well-tested 01:18:42.580 | 
This is not the developer inner loop or outer loop. 01:18:48.540 | 
The manager inner loop is staring at your belly button, 01:18:54.220 | 
Waiting for the next Slack message to arrive? 01:18:58.280 | 
What they really want is a batch mode for these assistants 01:19:00.700 | 
where you can actually take the coding assistant 01:19:08.180 | 
it's told you all the security vulnerabilities. 01:19:11.980 | 
It's an insanely expensive proposition, right? 01:19:14.060 | 
You know, just the GPU cost, especially if you're 01:19:17.580 | 
So it's better to do it at the point the code enters 01:19:20.380 | 
And so now we're starting to get into developer outer loop 01:19:23.220 | 
And I think that's where a lot of the-- to your question, 01:19:25.900 | 
A lot of the admins and managers and the decision makers, 01:19:28.820 | 
anybody who just kind of isn't coding but is involved, 01:19:32.540 | 
they're going to have, I think, well, a set of tools, right? 01:19:40.980 | 
Our code search actually serves that audience as well, 01:19:48.300 | 
And they use our search engine and they go find it. 01:19:50.380 | 
And AI is just going to make that so much easier for them. 01:19:56.180 | 
to put my anecdote of how I used Cody yesterday. 01:19:59.380 | 
I was actually trying to build this Twitter scraper thing. 01:20:02.020 | 
And Twitter is notoriously very challenging to work with 01:20:11.960 | 
It was really big that had the Twitter scraper thing in it. 01:20:20.420 | 
But then I noticed that on your landing page, 01:20:24.100 | 
Like, I typically think of Cody as a VS Code extension. 01:20:27.900 | 
But you have a web version where you just plug in any repo 01:20:44.800 | 
The search thing is like, oh, this is old Sourcegraph. 01:20:55.880 | 
that's hidden in the upper right hand corner. 01:21:05.660 | 
Well, you didn't embed it, but you indexed it. 01:21:09.720 | 
that have emerged among power users where they kind of do-- 01:21:15.780 | 
You can kind of replicate that, but for arbitrary frameworks 01:21:20.340 | 
Because there's also an equally hidden toggle, which you may 01:21:22.900 | 
not have discovered yet, where you can actually 01:21:30.540 | 
let's say you want to build a stock ticker that's 01:21:33.280 | 
React-based, but uses this one tick data fetching API. 01:21:42.480 | 
Track the tick data of Bank of America, Wells Fargo 01:21:55.160 | 
just because the wow factor of that is just pretty incredible. 01:21:58.360 | 
It's like, what if you can speak apps into existence 01:22:00.800 | 
that use the frameworks and packages that you want to use? 01:22:07.380 | 
It's just taking advantage of your RAG pipeline. 01:22:20.700 | 
Yeah, but I guess getting back to the original question, 01:22:25.620 | 
I think would be interesting for engineering leaders. 01:22:32.100 | 
that you really ought to be doing with respect to, like, 01:22:34.520 | 
ensuring code quality, or updating dependencies, 01:22:42.680 | 
that humans find toilsome and tedious and just don't want 01:22:45.800 | 
to do, but would really help uplevel the quality, security, 01:22:51.480 | 
Now we potentially have a way to do that with machines. 01:23:08.520 | 
to do it in the same way that you can measure marketing, 01:23:11.920 | 
or sales, or other parts of the organization. 01:23:14.560 | 
And I think, what is the actual way you would do this 01:23:18.000 | 
that is good, if you had all the time in the world? 01:23:20.960 | 
I think, as an engineering manager or an engineering 01:23:23.320 | 
leader, what you would do is you would go read 01:23:25.660 | 
through the Git log, maybe like line by line. 01:23:28.160 | 
Be like, OK, you, Sean, these are the features 01:23:31.560 | 
that you built over the past six months or a year. 01:23:36.680 | 
These are the things that delivered that you helped drive. 01:23:39.120 | 
Here's the stuff that you did to help your teammates. 01:23:52.760 | 
Now connect that to the things that matter to the business. 01:24:05.960 | 
on the metrics that moved the needle for the business 01:24:08.200 | 
and ultimately show up in revenue, or stock price, 01:24:12.480 | 
or whatever it is that's at the very top of any for-profit 01:24:29.380 | 
Plus, it's also tedious, like reading through Git log 01:24:32.660 | 
and trying to understand what a change does and summarizing 01:24:36.620 | 
It's just-- it's not the most exciting work in the world. 01:24:46.260 | 
does a lot of the tedium and helps you actually 01:24:50.140 | 
And I think that is maybe the ultimate answer to how 01:24:55.580 | 
that a CFO would be like, OK, I can buy that. 01:24:59.380 | 
The work that you did impacted these core metrics 01:25:10.420 | 
And that's what we really want to drive towards. 01:25:12.020 | 
I think that's what we've been trying to build all along, 01:25:21.820 | 
now just puts that much sooner in reach, I think. 01:25:26.740 | 
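[A sketch of that workflow with a hypothetical LLM-backed summarize helper; git does the collecting and the model does the tedious reading:]

```python
import subprocess

def impact_report(repo, author, since, summarize):
    # Pull the author's commits, messages plus patches, from git...
    log = subprocess.run(
        ["git", "-C", repo, "log", f"--author={author}",
         f"--since={since}", "--patch", "--no-color"],
        capture_output=True, text=True, check=True,
    ).stdout
    # ...and let the model roll them up into business-legible terms.
    return summarize(
        "Summarize what this engineer shipped, grouped by feature, "
        "in terms a non-engineer could connect to team goals:\n\n" + log
    )
```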
But I mean, we have to focus, also, small company. 01:25:30.460 | 
And so our short-term focus is lovability, right? 01:25:41.460 | 
about enabling all of the non-engineering roles, 01:25:59.820 | 
Which we always forget to send the questions ahead of time. 01:26:04.300 | 
So we usually have three, one around acceleration, 01:26:11.780 | 
something that already happened in AI that is possible today 01:26:16.740 | 
I mean, just LLMs and how good the vision models are now. 01:26:24.740 | 
Well, I mean, back in the day, I got my start machine learning 01:26:35.020 | 
And in those days, everything was statistical-based. 01:26:43.160 | 
And so I was very bearish after that experience 01:26:54.800 | 
So yeah, it came up faster than I expected it to. 01:27:04.340 | 
that we're not tapping into, potentially even 01:27:12.940 | 
is probably not the steady state that we're seeing long-term. 01:27:18.900 | 
and you'll always have chat, and commands, and so on. 01:27:21.420 | 
But I think we're going to discover a lot more. 01:27:25.820 | 
some kind of new ways to get your stuff done. 01:27:30.540 | 
So yeah, I think the capabilities are there today. 01:27:35.720 | 
When I sit down, and I have a conversation with the LLM 01:27:41.340 | 
talking to a senior engineer, or an architect, or somebody. 01:27:46.740 | 
And I think that people have very different working models 01:27:50.460 | 
Some people are just completion, completion, completion. 01:27:55.000 | 
they write a comment, and then telling them what to do. 01:27:58.340 | 
But I truly think that there are other modalities that we're 01:28:01.040 | 
going to stumble across, and just kind of latently, 01:28:14.960 | 
I mean, the one we talked about earlier, nonstop coding 01:28:19.140 | 
a whole bunch of requests to refactor, and so on. 01:28:24.540 | 
We talk about agents, that's kind of out there. 01:28:26.540 | 
But I think there are kind of more inner loop type ones 01:28:31.220 | 
And we haven't looked at all that multimodal yet. 01:28:41.260 | 
One, which is effectively architecture diagrams 01:28:47.180 | 
There's probably more alpha in synthesizing them 01:28:49.700 | 
for management to see, which is, you don't need AI for that. 01:29:13.260 | 
about how someone just had an always-on script, 01:29:16.540 | 
just screenshotting and sending it to GPTVision 01:29:21.620 | 
And it would just autonomously suggest stuff. 01:29:27.300 | 
and just being a real co-pilot, rather than having 01:29:39.660 | 
So the reason I know this is we actually did a hackathon, 01:29:41.980 | 
where we wrote that project, but it roasted you while you did 01:29:46.820 | 
it, so it's like, hey, you're on Twitter right now. 01:29:52.820 | 
And that can be a fun co-pilot thing, as well. 01:29:57.860 | 
Exploration, what do you think is the most interesting 01:30:02.900 | 
It used to be scaling, right, with CNNs and RNNs, 01:30:15.120 | 
I feel like-- do you mean like the pure model, like AI layer? 01:30:21.120 | 
how do you get reliable first try working code generation? 01:30:30.380 | 
Because I think if you want to get to the point 01:30:33.340 | 
where you can actually be truly agentic or multi-step 01:30:40.540 | 
is the single step has to be robust and reliable. 01:30:49.400 | 
Because once you have that, it's a building block 01:30:51.400 | 
that you can then compose into longer chains. 01:31:02.780 | 
I mean, I think for me it's just like the best 01:31:11.700 | 
to leverage many different forms of intelligence. 01:31:14.780 | 
Calling back to that like Normsky architecture, 01:31:19.740 | 
You should call it something cool like S* or R*. 01:31:24.500 | 
Just one letter and then just let people speculate. 01:31:37.660 | 
And I think Normsky encapsulates the two big technology areas 01:31:46.140 | 
will be very important for producing really good DevTools. 01:31:51.460 | 
And I think it's a big differentiator that we 01:32:00.900 | 
that not all developers today are using coding assistants. 01:32:08.380 | 
and it didn't immediately write a bunch of beautiful code 01:32:12.060 | 
And they were like, ah, too much effort, and they left. 01:32:29.640 | 
to actually make coding assistants work today. 01:32:33.880 | 
they'll give you the runaround, just like doing a Google search 01:32:36.720 | 
But if you're not putting that effort in and learning 01:32:39.560 | 
the sort of footprint and the characteristics of how 01:32:42.600 | 
LLMs behave under different query conditions and so on, 01:32:46.040 | 
if you're not getting a feel for the coding assistant, 01:32:48.560 | 
then you're letting this whole train just pull out 01:32:54.560 | 
Yeah, thank you guys so much for coming on and being