
Hello, Yury. Hi, Armin. How's it going? Good. How about you? Pretty good. So we're going to discuss agentic coding and how useful AI is for us. And potentially we're taking opposing sides here, is my guess. I guess that's the idea, right? That's the idea, yeah. So maybe also to set some sort of level here.
Right now I'm basically alone with a co-founder, so I don't have a team, more or less, and so my agentic coding experience is a lot of just two people. And you're probably in a different position. I'm in a slightly different position. Sad. Well, first of all, I'm joining this to play devil's advocate, because I kind of believe in AI coding tools.
I'm using them myself, sometimes productively, sometimes less. That said, the stuff that I want to discuss is real. I think those problems are real, and I think ultimately it will help if people talk about those issues. So I'm not going to be aggressive, but I am going to be aggressively making my point that those issues are important.
Even maybe sometimes focusing too much on things that ultimately I personally don't care much about. But that's my role in this conversation. I don't want pro-AI people to find where I live, so, to be clear: I love AI, but there are some issues with it that we need to discuss.
And I'm excited about this. So, I'm Yury, Yury Selivanov. I'm a Python core developer, since 2013. I contributed a lot of stuff to Python, like the async/await syntax and the infamous asynchronous generators. I maintained asyncio, created uvloop. And I'm a co-founder and CEO of Gel Data, where we try to fix some things in Postgres.
So that's me. I've been a software engineer for a long, long time, and I'm taking full advantage of AI. At least I want to think of myself that way. I'm super excited about this technology. I think this is something new, and that's why it's exciting.
And sometimes I have some wins with it. That said, Armin invited me to be the devil's advocate, so that's my role. Yeah. And so I am basically unemployed since April this year. I did ten years of Sentry before that. Did a lot of Python and Rust over the years,
built Flask and a bunch of other things. And in the last couple of months I fell into a hole of going really, really deep on agentic things, and started a company basically two months ago, where I'm now taking full advantage of agentic coding, maybe to a degree that's slightly unhealthy.
But I was very excited about the technology. Do you want to play devil's advocate for a change? So let's maybe start with — one extra piece of context here is that, because I did a lot of Python over the years, I've also decided that this time around I'm just going to subject myself to the choices of the computer.
And I had the AI sort of run evals a couple of months ago, where I was trying to figure out which language I should use if I want to do agentic coding. And the AI had the best result with Go. So now I'm a Go programmer.
I think your chances of finding a co-founder, like, drop the more of those things you do. I already have a co-founder, so I'm done, I'm past that point. Oh yeah, AI is your co-founder, I think. Yeah, AI is my co-founder. Yeah. Sonnet 4.5. Wasn't there a thing at Y Combinator where the goal was the single-person, billion-dollar company?
Right? Yeah. So that doesn't count if AI is the co-founder. Well, you have a few: you have GPT-5 and you have three co-founders. Okay. Let's start with building prototypes with AI. I think that's probably the part where there's the most agreement that there's some value in it.
I'm guessing, because seemingly Lovable exists and they make a bunch of money, and Bolt and whatever. Maybe we start with: is it possible to take what we all seemingly agree on — that you can actually prototype really quickly with this thing — and apply that also to prototyping in the context of an existing code base, or at the very least using it there?
And then, to which degree does the same apply, from your perspective, to just building in an existing code base? Yeah. So this is basically where you and I probably agree. I think that the coding tools — all of them, not necessarily Lovable, but Claude Code as well,
and Cursor obviously, even vanilla ChatGPT in a chat window if you want to copy-paste code — they work amazingly for prototyping. My problem is that when you attempt to prototype something or build something within the context of an existing code base, suddenly it's much harder.
Suddenly you might find yourself spending a lot of time unproductively, trying to replicate your recent success with prototyping something that worked, and basically just facing a wall. And this is kind of the start of the conversation, I guess: I think everybody agrees that AI is absolutely amazing for creating prototypes.
Productizing them is sometimes much, much harder. And if you have a team of people, this is where things start slowly falling out of context for you. Yeah, I had a lot of success with building prototypes, but again — so why does the prototype work for you?
And why does it not work for you in the context of a code base? Because I can actually very simply take the opposing view here. I swear to you, more than 90% of my infrastructure code right now is AI-written, and it's a really big infrastructure component. It's basically a piece that gives agents mailboxes for email, and it's all generated.
And it's not slop. It's good code, I think. But my point is mostly: this is no longer a prototype. Now it's real. And every iterative change that I'm doing to it is writing into a code base that has a pretty big size at this point.
In some ways I'm actually impressed by how well it still works, despite it being more than 40,000 lines of code at this point — which is maybe not the largest code base in the world, but it's definitely past the context window size. So I kind of wonder: why does it not work for you, and why does it seemingly still work for me?
And why does it seemingly still work for me? What's the, what's the challenge? I can basically, uh, give an example. So I'm building a new tool actually for AI, uh, and to, to, to have it better integrated in my workflow. And, uh, to do that, we're, uh, this is like a big spoiler or something that we want to launch, uh, relatively soon.
But that thing plugs into your terminal, uh, like literally it acts as a transparent proxy. And I wanted to understand how to build this thing first. And there are multiple different approaches. One approach is, uh, to basically take T-Max for example, and, and just instrument, uh, around it. And, uh, the other approach would be to tap into Pty and whatnot, uh, and then like actually do the proxy when you sort of like intercept, start in, start out and do all of that.
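To make that second approach concrete, here is a minimal sketch of the general "transparent PTY proxy" idea, using only Python's standard library and assuming a POSIX system — it's illustrative, not the actual tool being described. The proxy forks a shell under a pseudo-terminal and sits between your real terminal and the child, so every keystroke and every byte of output can be observed or rewritten in flight:

```python
import os
import pty
import select
import sys
import termios
import tty


def run_proxy(argv=("/bin/sh",)):
    pid, master_fd = pty.fork()         # child gets the PTY as its controlling tty
    if pid == 0:
        os.execvp(argv[0], list(argv))  # child process: run the real shell

    stdin_fd = sys.stdin.fileno()
    old_attrs = termios.tcgetattr(stdin_fd)
    tty.setraw(stdin_fd)                # raw mode: forward keystrokes unbuffered
    try:
        while True:
            ready, _, _ = select.select([stdin_fd, master_fd], [], [])
            if stdin_fd in ready:
                data = os.read(stdin_fd, 1024)
                if not data:
                    break
                # intercept keystrokes here (shortcuts, logging, multiplexing...)
                os.write(master_fd, data)
            if master_fd in ready:
                try:
                    data = os.read(master_fd, 1024)
                except OSError:         # child exited and the PTY closed
                    break
                if not data:
                    break
                # intercept program output here before echoing it back
                os.write(sys.stdout.fileno(), data)
    finally:
        termios.tcsetattr(stdin_fd, termios.TCSADRAIN, old_attrs)
        os.waitpid(pid, 0)


if __name__ == "__main__":
    run_proxy()
```

The latency complaint that comes up next lives exactly in that read/write loop: every keystroke makes an extra round trip through this process, which is why a pure-Python version struggles once something like Vim is running inside it.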
So my first task was to understand to what degree I can push tmux. And I'm not an active tmux user — I've used it a couple of times — but this is where I think Claude Code shined. At that time there was Sonnet 3.5, I think, and Opus 4.1.
Sonnet did not work for me at all, so I had to use the expensive one, but the expensive one was amazing. I had one day where I built six prototypes of this thing. The first one was using tmux, the next was a slightly different thing.
And then I realized: let me try to tap into those things at a more fundamental IO level, because with tmux I just couldn't see how I could get a decent user experience. Again, thanks to AI, the prototype worked quickly. I started with Python and obviously hit the performance wall immediately.
I just realized, hey, I'm going to have milliseconds of latency on every keystroke; this will not work. Claude and I managed to improve the performance to be decent in a plain shell, but then you run Vim inside that thing, and suddenly you can just see it: you press a key and it takes some time for it to appear on the screen.
Okay, so that didn't work. So then, well, obviously Rust, right? But I'm not a good Rust engineer. I can read Rust code, I can write some Rust code. You know what's funny about this? Sorry, I've got to interrupt you for a second. My first proper vibe coding experiment was this sort of night of not sleeping with Mario Zechner and Peter Steinberger, where we basically built a thing called VibeTunnel, which was a PTY interceptor that streams the keystrokes to the internet.
So it sounds very familiar. It's in the air. And we rewrote it three times: once in Node, once in Rust, once in Swift. Oh my God. It's in the air. So then I tried to make it in Rust. And there were a lot of things that I wanted to tap into.
For example, I wanted it to react to certain shortcuts. I wanted it to be able to multiplex different terminals on one screen, kind of like tmux does, but on a specific shortcut. And amazingly, I built all those things in one day.
But then I showed it to a Rust engineer at the company who is actually extremely qualified, and I got two reactions from him. First: oh my God, how much did you pay for this? And I'm like, 600 bucks in a day, which is insane, obviously. But then his reaction was — first he was shocked, but then — hmm, this probably sounds about right, because it would take us probably a couple of weeks to build all those prototypes.
And we just saved all this time. But then we started looking at the code, and obviously it's just bad code. But for me, that's not visible: I'm an expert Python engineer and I'm decent, I guess, at Rust, but this is where the gap was. To me the Rust looked fine; to him it looked like a mess.
It was, it looked like a mess. So then it's easy to extrapolate. If, if I had, had to do this myself and I would just be, uh, uh, vibe coding this thing further and further and further building product, I would 100% hit the wall where I would not be able, uh, to progress much without me building myself deeper understanding of the, uh, of the programming language and the problem of, of, of, of everything essentially.
And, uh, and we're going to chat about like that specific trap, uh, in, in a little bit, but that's my complaint about building prototypes is that at some point of time, uh, you have to evolve that thing. You have to productize it. And this is where currently there are limits.
If you are an expert engineer and you know, well, the domain and you know, well, uh, the programming language, then you can harness the power, I guess, of, of, of the, of the tool. And this is what you're doing. This is your success. I'm an expert programmer now. Yeah.
You are. Well, that's interesting, actually. For me, the interesting thing is: I think you're right. I called this once: you can vibe-code a code base to death, which we've definitely done.
So my co-founder built a really great prototype, for what it's worth, for a particular thing that we're exploring, which is basically a machine that connects to your email and does interesting things in the context of email. But it was built on top of a combination of Mailgun and Cloudflare Workers and TypeScript and D1.
And it was with very, very little input from me. And after three weeks, the code base it had created had multiple duplicate tables and all kinds of stuff in it. But you can have many views on this, right? One of which is: well, that's irresponsible. But it's not really, because the whole point was to figure out whether we even want to build this in the first place.
And so my version of this was then to say: okay, so what is it that we're actually doing here? What are the pieces that we want? And I also used a combination of Codex and Claude for it, but I think in a responsible way, because there are certain things in software engineering where it's really, really critical that you get them right.
Where's your data? How is your data laid out? Is the problem that you're solving even scaling in the right way? Not everything is a scaling problem, but you can definitely build a bunch of shit in a way that doesn't make it very far. And then, obviously, the code can't look like complete trash.
What I found, at least, is that in my experience there's a huge difference between the kind of code you get when you have an AI write Go, as an example, or even Java or PHP, versus, for instance, when it writes JavaScript, or to some degree Python, but definitely Rust.
And I think it has a lot to do with the complexity of the abstractions that are typically in place. One of the reasons the TypeScript code base ended up in such a terrible state is that it just created the same abstractions over and over in different files and never found them again.
And in Go it didn't even get the idea to write the abstraction in the first place, because Go is just such a simple language. So to some degree you definitely have to use your brain. Even in the Go code base, if I hadn't refactored the hell out of it, it would probably have become a terrible mess too.
But for as long as you commit yourself to upholding a standard and a quality bar, I think it works. It doesn't come for free, though, but I'm still quicker than if I had to write every single line by hand. I think that's my... You think that.
That's the main problem. I think we should move to my next talking point — I have a little list of things — and my next one is directly related to this. It's time and expectation management: you have no idea whether it will work for you on a given day or not.
Sometimes you want to do something complex, you prompt it, and it just one-shots it for you. And it's an amazing feeling. It's great. It just worked. You maybe do a little bit of cleanup, write the commit message so it doesn't look like it's AI-written,
and that's it. Go solve your next task or go walk in the park. But sometimes something easy can take hours of work. It doesn't work this time, but it's almost there — it feels like 15 more minutes and you're done. So you spend the time, and suddenly three hours pass, and it's still basically where it started.
And then you can end up in a situation where the whole day is wasted and you haven't done it yourself — maybe you would have solved this task in two hours of coding. And that is the problem. I sometimes look at AI coding tools as this perfect dopamine cycle: it gives you a kick when it works, and then it doesn't work,
so it gets you slightly depressed, but then it works again. And I'm not sure that ultimately it saves time. Again, there are clear examples when it does. I think AI perfectly replaces Stack Overflow a lot of the time: seven times out of ten it will give you the correct result,
it saves you time, and when it doesn't, it's kind of easy to see and check. So ultimately, 100% time savings — I can buy that. With coding, though, I'm not so sure anymore. Perceptually, I have a feeling that I save a lot of time using AI. But when I look back at that time objectively, I'm no longer sure that was the case.
So I think it saves me time, and I think objectively so. But my strong suspicion about how to make it work — and this is greatly extrapolated from a data point of one, me — is... well, first of all, my opinion has dramatically changed.
And I've told people this before: prior to this, I thought it was just a bunch of nonsense. But the reason I feel like I can read the machine — like I have an understanding of what the hell it's doing — is that, weirdly, there is no learning curve.
There's a hill to walk up, a very, very long hill. And you walk up this hill for like two months, and then you feel like there's enlightenment, because now you know what the machine does and doesn't do. So it's not that there's a learning curve and you have to learn the machine;
it's that you have to feel it. It sounds so ridiculous, but the whole point is that you need to understand which tasks it will not be able to do, because otherwise you're going to run into the situation where you spent three hours on a goddamn thing that would have taken you 15 minutes by hand.
And because of this dopamine thing, you don't really feel it, because there's progress all the fucking time — it's just a pull request. If you actually measured the time you spend iterating on the damn pull request, it would actually be really bad. But if you spend the time to onboard yourself into it properly, the number of cases where you run into this greatly diminishes, because you recognize ahead of time what's going to work or not.
And actually, I had the same experience at one point with Rust, where I felt the only way I could be productive at all in Rust was to figure out what I couldn't do in the language. And whenever I had a problem, it was like: you have a self-referential borrow,
you have to, I don't know, do a bunch of stuff the language can't do. You just have to recognize very early on that you're doing something that will not lead to success, because otherwise you're just grinding away with no progress. And so I think for as long as you steer away from the shit that doesn't work, it does actually save you time.
My problem is that, again, you walk up that hill and then a new model is released, and then the hill is suddenly slightly different for you. And sometimes you have to go downhill now. It's uneven. That's the thing. With any tool that we've had before — be that a programming language or an IDE or something like that —
all the knowledge you have accumulates, and you build on that knowledge. With LLMs, you don't know. Sometimes, if you forget to reset the context, the context overflows and Claude just starts mumbling and doing stupid things, and you have to actively pull yourself out and realize: hmm, I should probably just start a new chat.
This thing is done, no matter what I say anymore. And then sometimes weird stuff happens. It just gives up and says: you should do that. And I'm like, no, you should do that — do this yourself, you are the machine. And: no, I will not,
you do that. And it's kind of ridiculous and cool, and sometimes even refreshing to find yourself in this, but oftentimes it's also just frustrating. You know what I mean? And you can end up in this situation pretty easily. Again, my main point here is that it's not clearly a hill.
It's some surface that you can see maybe 30 meters in front of you, and you have no idea if it's uphill or downhill. What is even going on? Am I learning anything? Am I learning how to use it or not? Or maybe I just forgot to reset the context.
The irony is that I don't think I'd have figured this out unless I'd actually had time without pressure. I mean, I had pressure to start a company, right? But that was an internal pressure; it wasn't like it had to happen by day X. And so I basically had a bunch of time to deal with a bunch of shit and to figure this out.
And what I realized when I talked to a bunch of people at companies — who'd say, well, my leadership wants me to use AI tools and it doesn't work for me — was: yeah, you'd basically have to use your 20% time, or whatever they call it at Google, to figure out how this shit works.
Otherwise you won't make progress, because it requires trial and error and experimentation. Hey, I think you're right in one sense: there's no guarantee that your problem tomorrow is going to be within your feeling of what the machine can or can't do.
And the model might change, and they might, I don't know, regress something. There was a time when Anthropic had a bunch of server errors where the quality of the model went down and you wouldn't know. You just get a post more than a month after:
yeah, your shit didn't work quite as well as it used to. So individually you feel gaslit by this thing all the time, because I feel like this kind of problem I did successfully before, and now it doesn't want to do it.
So I think that is correct, and to some degree it doesn't fully go away. But man, I think the reason it's called vibe coding, to some degree, is that there are these feelings where you get a sense of whether it works or not. And it's mind-bending in a way, because as a programmer you're used to determinism — or you chase determinism as much as you can.
And now it's like: yeah, fuck determinism, it just vibes. So that is really frustrating, particularly if you try to use these things for building an agent or something. I'm with you on that part of it. But I think the question is: what's the percentage?
For me, the feeling of the percentage of shittiness and weirdness is like 20%, not 90%, because it's only that small part. I still feel like I get a lot of value out of it. But I've heard a lot of people say:
well, 90% of my problem is that the shit doesn't work like I want, so the 10% improvement is just not worth it. Yeah, if that's how you feel about it, I can see it. I just don't feel like that. My vibes are different.
By the way, I have a feeling that by the end of this conversation you might stop using AI, because I feel like I'm convincing you. So if you want to end the call early... I have more — the list goes on. I do have a thing here, which is time management and expectation management,
and whether it saves you time. I think the main way in which it saves me time is concurrency and parallelism. I feel like I'm solving multiple problems simultaneously in one way or another. A pretty big way in which I actually do save a lot of time is that there's a bunch of problems going on that are fully solvable while I'm solving another problem.
And for someone who runs a very lean organization right now, it feels like I can do more, right? Even Mitchell Hashimoto mentioned this recently: you can spend time with your kids and still feel productive. And I leverage that a lot.
I really leverage this a lot, this idea that I can keep things moving even while I'm doing something else. Maybe it's unhealthy too — there's probably some version of this where you should really turn off your brain at a certain time of the day. But the fact that I can multitask is, for me, really the thing that gives me the biggest feeling of: this is good.
I'm not an expert on this, but I've heard multiple times that people who multitask have the perception of being productive when in reality they are not. So I think this is one of those traps again, where you can be fully convinced that you're saving time, but in reality you are not.
And the problem here is that it's really, really hard to benchmark it and get a definitive answer whether it works for you or not. You can't solve the same problem twice, one week with AI and another week without AI — it's just not going to work, right? And finding equivalent problems and benchmarking against those is also almost impossible.
So we will never know. I guess what will happen is that those tools will keep improving, and at some point the advantage of using them will be so clear, so undeniable, that it will be hard for people like me to even make this argument.
But at least right now, I find myself, as a senior software engineer, not fully convinced that it saves me time. I'm still doing it because I enjoy part of this process. Sometimes I want to be this dopamine junkie and play with the stack. And sometimes I have a feeling that I 100% know that AI will work here,
so it would be silly of me not to use it. But I can't definitively say yet that it's a 100% improvement over me writing the code myself. Sometimes, yes — like the example with the prototypes, AI 100% helped. Day to day, in a more complex code base, I'm not so sure. I had plenty of examples where it felt like I wasted days of time trying to make AI work.
I got something in return, so it's not a complete loss. But this is where I wonder: is this a problem that's hard for an engineer, or is it a problem that's hard for the AI? Because I, for instance — I don't work at Sentry anymore, but I did play around a lot
over the last couple of months with how well I can do agentic coding on the Sentry code base. It's fair source, out there on GitHub, so I can still fuck around with it. And I found it equally frustrating to work on that code base with the agent as by myself,
because we have created a monster of a multi-service, complex thing. Who knew collecting errors is hard, right? If you build a large thing with lots of components at scale, a bunch of stuff is not fun. And I'm not joking — I'm just pointing out that it is actually hard.
Production services, yeah, it's complicated. And if I take a certain lens, I'd say the problems the agent runs into are the problems every engineer runs into. One of the ways to become productive at a large company is to figure out which things not to do, because there's pain on the other side of them.
In some ways, the reason we have developer experience teams in companies is just to reduce the total amount of potential code changes where you'll run into a wall, and the wall is just terrible developer experience. So I wonder to which degree what you run into is also in part because it's actually not great for a human to work on this either.
I'm not saying you have a shitty code base, just to be clear. I just wonder to which degree this is related. Yeah, I think it's quite related. But I want to shift gears a little bit and talk about the implementation of those tools. And I think it's a little bit related to this problem.
Okay. We have a super complex code base. It will not fit in the context. And even if the context is huge — or advertised to be huge, like millions of tokens — in reality we know that the performance degrades after like 32,000 tokens or something like that. So this is the problem of context management.
And this is the weirdest part of this whole AI revolution for me, because we have these amazing LLMs and it really feels like the future, but under the hood it's powered by principles from the 1980s, like expert systems and whatnot. You as an engineer pick what you want to feed that beast.
And that feeding happens on a lot of assumptions, a lot of hard-coding and luck. Sometimes it works, sometimes it doesn't. You can't trust an LLM to form the context itself. So you have to feed it with something — with your prompt. And then your IDE adds some context around it:
maybe your open tabs in your editor, maybe some settings from your, I don't know, pyproject.toml or Cargo.toml or something like that. And then that's it — you just wait for magic to happen. But it's an important bit where it actually feels like we haven't made any progress at all,
this notion of forming the context. When you are just building a prototype, the context is empty. All you have is your prompt and maybe a couple of desires, like "write it for me in Go" or whatever. Or when the project is small, it might all fit there.
In a complex code base, context forming is extremely hard. The files you have open in your IDE might not necessarily be relevant. And asking you to meticulously drag and drop tabs for every prompt, to select which files it should focus on, also feels like a drag, right?
So that's my problem with it. The whole context management thing feels completely out of sync with how forward-looking the rest of this is. So I guess my question, to some degree, is: how do you work? Because I'm the kind of person that really only makes progress on anything I'm doing when I talk to another person, in some bizarre way.
And maybe I'm talking to another version of me, but I have to talk through a problem. Sometimes there's a weird idea like: this shit is going to work. And then as I'm working my way through it, I realize there are a bunch of holes in it.
And so for me, almost a natural part of solving any problem is to work my way through it. And that lends itself really well to agentic coding, because that turns out to be how I feed all the context into my brain anyway, and now I just need to get the machine to have it too.
So I just talk to my computer a lot, and as a byproduct the context problem is almost, in quotes, "solved". To me it feels like I'm almost always working in a way where I build up the context anyway, either for myself or for the machine.
So it's not such a huge departure to be a little bit more descriptive to the machine about what I want it to do. And maybe that's also why I've encountered fewer problems. I think you called it a drag, providing all the files in context —
I don't know, for me that has always been natural. I used to maintain these files where I write down all the steps that I want to do prior to doing them, because I feel like that gives me a better understanding of what my change is, or how I have to structure my change.
A lot of it is refactoring — it's always: you've got to do this first, because otherwise the second step is much harder. And so to me that is context engineering that you do even without AI. That is always how I've worked, in a way. So I don't feel it's a drag, I think.
And I don't know if I even want a machine that figures this shit out itself, because then what is my role in it? I feel like my role is to provide the context, right? So I don't know — doesn't it feel related to you, in the sense that this is the part of engineering that's actually, like...
No, no, no, it's how you piece it together. It's true, what you're saying. For example, I have an ultra-wide monitor. Why? Because when I'm writing software, I want to have multiple columns — I want the context for myself. Sometimes I have the same file open side by side at different locations, because I need to stare at them to make progress.
So just like you, I also have to have context before I'm writing code. I have to understand what files it will touch and what files are related in the project and all of that. Naturally, I think any engineer does that. Maybe one of the failure points for me is that I'm not talking to the computer.
I just recently started experimenting with that — I bound a double tap on a function key to capture audio and transcribe it. And it worked fine, but it's not a habit for me yet, so I don't have a definitive answer yet on whether it works. But still, sometimes you have a file that has 5,000 lines of code in it, which, for what it's worth, is questionable —
like, why do you have files that long? It's weird. But I can see Claude losing the thread often in a complex code base when you have too much of that stuff. Sometimes the problem is just that you have to touch 20 files to solve it,
and it's hard for you to make the context smaller, to subdivide the problem into smaller problems. Sometimes I realize: hmm, it will take me three times longer to write this prompt, to explain everything in nitty-gritty detail, what it's supposed to be doing — and if I do all of that work, maybe it's just easier for me to go and do it myself at that point, since I'm explaining everything anyway.
So it's really, really hard, at least for me. Maybe our project is specific. Did you see a difference — I know that you also work on CPython, right — between working on CPython versus on Gel, for instance, in your effectiveness with AI?
So I'm not actively working on CPython, but recently I rewrote the UUID implementation from Python to C. It definitely helped — I have a feeling that it helped — but it was also a clean-slate problem. And LLMs are amazing at translating a thing from one language to another.
So this is where it helped. It still did a lot of things that are non-idiomatic for CPython, so ultimately I doubt there is a single line of code that survived — I rewrote the whole thing — but it did save time, because a lot of the boilerplate was handled for me.
I didn't have to copy, paste and change things as I usually do; the LLM can do that for me and then I can edit. So that's my recent contribution. And again, it's a bad example because it was essentially a greenfield thing. I should try to use it more often.
I found one kind of interesting failure case — I talked with Łukasz at EuroPython about this — which is that I actually found it much harder to work on code that is in the training set than on code that's not in the training set. Interesting. And I think it's counterintuitive, but it might also make sense, because one of the problems is that you're working on something that's overrepresented in the training set.
Like the CPython code base — there's probably a lot of it in the training set — but the version you're working on is actually not the one it has seen. It's the nine-months-newer version of it. And I've also encountered this with the Sentry code base.
There are millions of forks of Sentry out there, some of which are really, really old. So it just hallucinates code together that hasn't been in the code base for a long time, because it thinks it's still there. I found that really interesting, because that's an entire class of problems where — I don't know if it's going to be representative going forward — you'd think that if you have this huge context and all of your shit is in the training data, then it should be so much better.
But the reality, for me at least, is that unless it's up to date day to day, which it's unlikely to be, it's actually not helpful. So now I actually remember there was one problem with my UUID work at the Python core sprint, writing it in C. The problem is that the Python C API has an old-style way of declaring a Python module in C, and there is a new way.
The new way is multi-phase initialization. There's a notion of module state, where you put your global variables and whatnot. It's required if you want to support free-threaded Python in the future, or subinterpreters, stuff like that. So it's required.
And obviously, if you are working on the CPython code base, anything new that you do must follow that new approach. And it's quite a different API — internally, just a different arrangement of the code and the type declarations — and there are some gotchas there. And this is where it failed miserably.
It was essentially insisting on doing things the old way, and yeah, I had to actively fight it. Ultimately, I think I gave up and just focused on little things, like: write me the body of this function. That's the part where I kind of want to talk about open source a little bit, and the impact this whole thing has on it.
Because that's the part where I might actually take the opposing view and say: it's going to be terrible. Yeah, I think we might align. And again, you might stop using AI after this conversation. I have a lot of thoughts on that. I'm not sure. But I actually worry about this quite a bit, because I could take the view here that unless we're really careful, we're going to make a huge mess of it.
The whole thing with open source, for me, has never been that we need more open source. I always felt that what you want is open source libraries for common problems, where a lot of people band together and deliver the best quality code there can be, so that overall we can all build better companies.
So to me the idea was: if there's a really hard problem, you get the best people together, you cooperate, you build this thing, and then hopefully everybody is going to leverage it. With the cost of code generation going down, and seemingly everybody loving the idea that GitHub stars are all that matter,
and that we're going to have millions of modules on npm, and that having thousands of dependencies in an application is actually the way to build shit — because that is such a predominant view, I think this can only end terribly, with so much more actual, real slop going around.
So I don't know, do I have a counter to that? Because my view on this is actually, unless... I have a lot of thoughts on that. First, just to get the elephant out of the room: I think there could be a more fundamental problem with all of this, because it's proven now that if AI is trained on content generated by AI, the quality degrades significantly, and open source is just this perfect mixer of things.
Because now you have pull requests partially written by AI, and it's really hard to separate which part was written by AI and which part was written by a human — even for humans. That's one of my complaints, coming in a minute. But it's also hard for AI, and for all those training pipelines, or whatever they have, to create those models.
So I'm not an AI researcher, but I have enough in my context to know that this is a problem, and I'm a little bit worried about it. We're going to see a lot of new code written — potentially 10x, maybe 100x more. There's going to be a sea of new stuff, and who knows which part of it is AI and which part is human brilliance.
So with that out of the way, the problems that I see. Let's get back, for example, to this comfortable example of the UUID work. It's sort of required by the CPython contribution policy that you notify people whether you used AI or not. And here I am — I really don't know how to answer this question.
It's 2,100 lines of C code, which is significant. Most of this code is kind of mundane. This is not, I don't know, dictobject.c, where we have complicated pointer-arithmetic magic and whatnot. It's mundane C, but it's still C.
So it's sharp — you can die there easily and take the interpreter with you. So it has to be reviewed. And I wrote that code responsibly: there isn't a single line that is just AI-generated that I haven't touched, rewritten, or reviewed really carefully.
But the instant I say I used AI for this, the whole thing gets dismissed, because people think: that's 2,000-plus lines of C, he's not an insane person, he probably generated half of it. So it dismisses my work now. But I'd be the same in a similar situation — even with someone I know, like Łukasz, for example: if he submitted that and said "I used AI",
I would also be dismissive in this case. So how can I even trust people now with this kind of stuff? What is the social dynamic here? The social dynamic is really, really hard. And this is one of the reasons why I don't even want to take a side on anything when it comes to the social dynamic, because this is just going to have to play out somehow.
But one of the areas where I definitely noticed this is all kind of wonky: I have definitely released source code out there which is a hundred percent AI-generated. One of them was this Vite plugin. It's a very simple Vite plugin — it just forwards the browser's console log to the Vite logger output, so that you can see the console log from the browser in the terminal.
A hundred percent AI-generated. And then I was like, okay, I'm going to publish this now, and I put a license on it, Apache 2. And I also wrote "if that applies", because quite frankly, courts in the US have already said that's not human-generated output.
So there's also the question: if you actually have a significant amount of code being created by AI, does it still cross the threshold of what we'd call genuine human creation, worthy of copyright? So even on that level, I'm now starting to look differently at a lot of source code out there.
Like, I ran into a company that has code on GitHub — I don't know if it's open source licensed or if it just happens to be on GitHub — but I found an uppercase IMPLEMENTATION_SUMMARY.md full of emojis somewhere in the code base.
And I looked at the code and thought: someone is vibing hard here. And if you hang around the startup ecosystem right now, there's so much of people throwing shit on GitHub which is probably just pure AI output.
Yeah. And how are we going to respond to that? It's one thing to see this as a pull request against an established open source project, where maybe there's at least something in place. But there's going to be a whole range of people creating this amalgamation of different kinds of things, which is just regurgitated, in some way, human output.
Yeah. It erodes trust. That's the problem. For sure. Because it's not that I'd say: hey, I'm not going to use code written by you anymore. But, for example, let's say Łukasz — I know Łukasz well, he's a brilliant engineer. So here's a library written by him.
And I'm pretty sure that — let's assume it's something to do with high-performance IO — Łukasz would look at everything that touches the core logic, and that will be solid. But there is a lot of code on the outskirts of it.
So if there is a disclaimer, "I used AI here", can I trust that that code is fine? Maybe he didn't even review it. I don't know. Maybe it just appears to be working. So this whole thing of me trusting someone personally, or some organization,
and, just because it's that organization, trusting that the code is good — I feel that no longer applies, and that's huge. Yeah. No, I share that concern a lot, actually. Because there's one thing where I know what the machine does for me, right?
But it's a data point of one, in a rather narrow set of things. And so I can't even argue "I trust the AI" or "I don't trust the AI". The only view I can take is: whatever I create together with the AI, I'm responsible for.
Yeah. But I don't think that's the view most people take. And there's no social standard for it, nor anything. And it would be irresponsible for me to say that because my experience is this, that's what everybody's experience is going to be.
And we're also very early in this, because in some ways we've been doing this version of open source agentic coding for like six months, give or take — even less, I think, in some ways. Because for sure, what I see is that after Claude Code and after the new Codex, there's a lot more of it. We're way past the little bit of autocomplete, where you were at least very actively paying attention to every single thing that it autocompleted.
Now it's like: well, let it run for 15 minutes and then let's do some code review here. So I'm with you on the trust part. And I also think that one of the problems with this... look, there are also positive things, right?
Creating repro cases for bug reports, for instance — a great time saver in the context of open source. You used to get these bug reports of "well, shit doesn't work", and the only thing I could do was say: works on my machine, give me some more details.
Now I get to copy-paste the report, try to make a repro case, and I usually get one. So those are positive things. But I'm definitely suspicious of pull requests now. And it has made a lot of things much cheaper that were actually pretty good precisely because they were not cheap before.
Because some of the issue reports are clearly not even human-generated — they're AI-generated too. And security reports against libraries: partially, people just have Claude invent some security issues so they can annoy people into giving them a CVE. All of it is just insane now.
It's objectively insane. It's insane that a well-written issue is a red flag. Or code with comments — no, I'm not going to touch it. I used to get a long issue against one of my projects and think: oh, I'm going to spend some time reading this, this is great.
Now I get a long issue and it's: oh my God, someone went hard on Claude here. And this is the trust-eroding part I really, really hate. Because that's not so much about what the machine does with you;
it's about what we as a group of engineers do. I find it irresponsible when people shit some AI stuff into an issue tracker and don't declare it. Or into my mailbox, for what it's worth. I get so much email now that looks like slop, and I don't even want to engage with that person.
Well, the idea is that you don't. You ask AI to do that, and just never check your email again. No, but that's really trust-eroding and I hate it. I really, really hate it. Yes. And it makes people lazy — we're going to talk about that in a second.
That's the thing. Trust erodes, not just between me and somebody I know or know about. Trust erodes within the team. That's the thing. It's hard for people to know what works and what doesn't work. And sometimes you can see a pull request —
I know that the code in that pull request is written by a human, and I know those humans, so that is fine, I can review it. But then, the tests. And I see that the tests are generated by AI. And it's such a common sentiment online: hey, just have it write the tests.
But you know that having a big test suite might not necessarily be a good thing. If you have duplicate tests, or if you have too many mocked tests, it's worse than not having tests at all. Sometimes tests are extremely expensive to maintain, to evolve, to do anything with.
And if you heavily rely on mocking, for example, you might not have a test suite at all — you might have an illusion of having a test suite. And AI is really good at creating this illusion of being able to write tests. And because tests are always treated like a second-class citizen, a lot of engineers say: I'm not going to invest much time into reviewing that part,
and let's just hope that it gets it right. I can see myself just really being scared of this situation: we're just not going to write high-quality software, because high-quality software demands having good tests. So I think this is sort of a meta point for me in general.
I've written about this on Twitter at some point: the quality of AI-generated tests is so bad. That part is real slop. But I also think that this is to a large degree because we as engineers also suck at writing tests — really, really badly. I've seen many more bad tests than good ones. Mocks, for instance — I hate them.
There are so many situations where a test mocks out exactly the part that's most likely to fail, and then that's the part that fails. Or so many tests where people just write integration tests that rely on bizarre side effects in the system somewhere — let's put a sleep here,
let's do some of this — all of that. And you're going to get a lot of these tests generated by AI now, but they were already pretty bad test patterns before. And yeah, we treat them as second-class citizens. I think, as engineers, at scale we don't know how to write good tests, and we're even rewarding people for writing bad tests.
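To make the "illusion of a test suite" point concrete, here is a small, deliberately contrived pytest-style sketch — the invoice/tax names are invented for illustration, not taken from any project discussed here. The first test mocks out the only interesting logic and then asserts on the mock's own return value, so it passes no matter how wrong the real code is; the second one actually exercises it:

```python
# Contrived illustration of an over-mocked test vs. a real one.
# Names (apply_tax, calculate_invoice) are hypothetical.
from unittest.mock import patch

TAX_RATE = 0.10


def apply_tax(amount: float) -> float:
    return round(amount * (1 + TAX_RATE), 2)


def calculate_invoice(subtotal: float) -> float:
    return apply_tax(subtotal)


def test_invoice_with_everything_mocked():
    # Patches the only piece of real logic, then checks the mock's own value:
    # this still passes even if apply_tax computes a completely wrong number.
    with patch(f"{__name__}.apply_tax", return_value=110.0) as fake_tax:
        assert calculate_invoice(100.0) == 110.0
        fake_tax.assert_called_once()


def test_invoice_for_real():
    # Actually runs the tax math; fails if TAX_RATE or the rounding changes.
    assert calculate_invoice(100.0) == 110.0
```

The mocked version still produces coverage numbers and a green checkmark; only the second one tests the behavior.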
That's the problem with tests: the more you progress in your software engineering career, the more you understand that tests are hard — and yes, mocked tests are horrible. That knowledge internalizes over time. Usually companies go down because of the lack of that knowledge. But tests are also extremely demanding, as you said, and not having mocks means building infrastructure for tests.
And sometimes you might end up building more infrastructure for your tests than for the actual production code. And then once you build that infrastructure, you need tests for that infrastructure. It's a fractal of a problem to have a good, reliable test suite.
Running it, parallelizing it, making sure it doesn't run for hours on GitHub CI and slow everything down — this is really hard. And yes, AIs are not trained on that. There are not too many good examples of good test suites. And even where they exist, those test suites are usually so highly specialized to that specific project, such an integral part of that code base,
that you can't just separate them out. So it takes deep knowledge to extract any good insights that you can then apply to another project. It's really hard. And I have a feeling that it will take a couple of years of LLM progress before they can extract that information and reapply it to your domain.
I think the only way that is going to get better is if we maybe also solve the other problem, which is that the quality of AI goes down because it's trained on LLM output.
At some point, I think, someone has to find a way to judge the quality of a code base and be more selective on the learning side. And for tests I think that will be necessary to some degree, because there are just so many bad tests out there.
And there are entire test frameworks that encourage shitty tests, and there has been a generation of programmers who really believe those shitty tests are exactly the gold standard of what a test should be. And I think that's actually a problem for open source to a large degree.
Whole books have been written about writing tests in a horrible way, by prominent publishers. I mean, this is the weird thing — that's actually why I feel like my job is going to be secure for generations to come: because eventually you realize that getting really good means being very countercultural, because the culture gravitates toward the median of software engineering, which is where not-good quality gets created.
Yeah. But yeah. So I have a feeling that the problem here is even more fundamental, and it probably has to do with the current technology of LLMs. Again, I'm not an AI researcher, but I'm hearing here and there from prominent AI researchers that LLMs are either at a dead end, or that we need the next big revolution in LLMs to happen.
And to me personally, it all boils back to this active context management, because LLMs don't have memory. All they have is context. Every new task is completely new to them. A person can learn. A person can internalize the information about a project, the knowledge, the mission of the project, some of the meta, and an understanding of the domain you're trying to solve.
But an LLM doesn't have that. And writing good tests requires all of that. And CLAUDE.md will not help you capture all of that knowledge. So until we see the next generation of these AIs — either LLMs or, God knows, maybe something else — that somehow has this context management inside the loop, inside the model, I don't know how, we will continue to have this problem where some areas that appear to be easy, or appear to be unimportant — like good tests for your production application —
are not going to be solved without intense human input. And a deep culture adjustment needs to happen too. I think my only counterargument here is that we as an industry have already created so much complex, over-engineered shit and so many really bad tests that now the question is just: is it actually going to get worse?
Because the correct solution here is actually in a team to push back on slop either human generated or machine generated. And at least I found over the years, both in a company and on other projects in open source or elsewhere, it is actually very, very hard to take a principled stance on things that sort of most people think is actually a good thing.
That's actually really hard. And I think in a lot of projects, you end up with the kind of code that you know after a while, you should never accept, but other contributors on the project will accept it or it's sort of industry standard, they need this stuff. I don't know to which degree that is necessarily an AI problem.
I think is, I guess a little bit my call here. If we concentrate on tests as an example, are you going to get worse tests now with AI than you get? I think you get overall more tests, but the percentage of the shitty ones are probably like all things equal, you're just going to get more tests.
So you're going to get more shitty ones, but the percentage might be the same. Because I actually think that the AI sort of perfectly creates sort of the standard crap that we have created. The reason it's so hard to work on large code bases, at least what has come out of the Silicon Valley in the last 15 years is like super complex systems, like overly, overly complicated.
Everything needs to be outsourced to some third parties, like infrastructure startup. Like it's just insane. I don't know. I feel like that there's, that is at the root of that evil to some degree. And I actually found AI to be at least some sort of pinnacle of hope here, where rather than me having to go and use this infrastructure component that some random company might give me, because like, we need to do it this way because otherwise it doesn't work.
Now it's like: okay, you know what? I get the 80% solution to this. It's a 500-line module somewhere in my utility package, tweaked for exactly the size of my problem. I'm going to use that. And as a result, I have less crazy stuff going on.
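To make that concrete, here's roughly the kind of thing I mean. This is a hypothetical sketch in Go, not code from my actual project; the util package and the Retry helper are made up for illustration, a tiny in-house replacement for what would otherwise be yet another dependency.

```go
// retry.go - a deliberately small, in-house alternative to pulling in a
// third-party retry/backoff library. It does only what this one project
// needs: call a function a few times with exponential backoff.
package util

import (
	"context"
	"time"
)

// Retry calls fn up to attempts times, sleeping between tries with
// exponential backoff starting at base. It stops early if fn succeeds
// or the context is cancelled, and returns the last error otherwise.
func Retry(ctx context.Context, attempts int, base time.Duration, fn func() error) error {
	var err error
	delay := base
	for i := 0; i < attempts; i++ {
		if err = fn(); err == nil {
			return nil
		}
		select {
		case <-ctx.Done():
			return ctx.Err()
		case <-time.After(delay):
			delay *= 2 // plain doubling, no jitter - good enough at this scale
		}
	}
	return err
}
```

It deliberately skips jitter, metrics, and configuration knobs, and that's the point: the 80% version, sized to the actual problem.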
So I feel like there's also a flip side to that. It's going to be so funny if it turns out to be bad, mock-heavy unit tests that break the AI's back, and people just say: no, this is the end, let's not use AI after all. But it might well happen.
Yeah. Because bad tests slow things down, not only for people but for AI as well. Testing the wrong things, codifying incorrect behavior in a test, creates a bad feedback loop in development for everyone, and especially for AI. All of those things are unsolved.
I'm with you that part of this problem comes from humans. Established projects, where you already have a good functional test suite with a good harness to run it, will likely benefit. But even those will degrade if people just blindly outsource the task to the LLM.
So I think some education has to happen, and it might happen naturally, because people will observe these problems more and more and finally understand the value of a properly written, minimal functional test suite. But again, it's definitely going to be a learning curve, not just for AIs and LLMs but for software engineers themselves, 100%.
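As a rough illustration of the difference we keep circling around, here's a hypothetical, self-contained Go sketch; the kv package, the Store interface, and both tests are invented for this example, not taken from any real codebase. The first test asserts call counts against a mock and codifies implementation details; the second runs the real handler against a real in-memory store and asserts on what a client can actually observe.

```go
// kv_test.go - everything lives in one file purely for illustration.
package kv

import (
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
	"sync"
	"testing"
)

// Store is the dependency the HTTP handler talks to.
type Store interface {
	Put(key, value string)
	Get(key string) (string, bool)
}

// MemStore is a trivial real implementation, fast enough to use in tests.
type MemStore struct {
	mu sync.Mutex
	m  map[string]string
}

func NewMemStore() *MemStore      { return &MemStore{m: map[string]string{}} }
func (s *MemStore) Put(k, v string) { s.mu.Lock(); s.m[k] = v; s.mu.Unlock() }
func (s *MemStore) Get(k string) (string, bool) {
	s.mu.Lock()
	defer s.mu.Unlock()
	v, ok := s.m[k]
	return v, ok
}

// Handler exposes the store over HTTP at /items/<key>.
func Handler(s Store) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		key := strings.TrimPrefix(r.URL.Path, "/items/")
		switch r.Method {
		case http.MethodPut:
			body, _ := io.ReadAll(r.Body)
			s.Put(key, string(body))
			w.WriteHeader(http.StatusNoContent)
		case http.MethodGet:
			if v, ok := s.Get(key); ok {
				io.WriteString(w, v)
				return
			}
			w.WriteHeader(http.StatusNotFound)
		}
	})
}

// mockStore counts calls - the kind of test double that codifies
// implementation details instead of behavior.
type mockStore struct{ puts int }

func (m *mockStore) Put(k, v string)            { m.puts++ }
func (m *mockStore) Get(k string) (string, bool) { return "", false }

// Mock-heavy style: only checks that Put was called once. A handler that
// stores the wrong value still passes; a harmless refactor can break it.
func TestPutCallsStoreOnce(t *testing.T) {
	m := &mockStore{}
	req := httptest.NewRequest(http.MethodPut, "/items/a", strings.NewReader("1"))
	Handler(m).ServeHTTP(httptest.NewRecorder(), req)
	if m.puts != 1 {
		t.Fatalf("expected 1 Put call, got %d", m.puts)
	}
}

// Functional style: real handler, real in-memory store, assertions on what
// a client observes - put a value, read it back.
func TestPutThenGetRoundTrip(t *testing.T) {
	srv := httptest.NewServer(Handler(NewMemStore()))
	defer srv.Close()

	put, _ := http.NewRequest(http.MethodPut, srv.URL+"/items/a", strings.NewReader("1"))
	if _, err := http.DefaultClient.Do(put); err != nil {
		t.Fatal(err)
	}
	resp, err := http.Get(srv.URL + "/items/a")
	if err != nil {
		t.Fatal(err)
	}
	defer resp.Body.Close()
	got, _ := io.ReadAll(resp.Body)
	if string(got) != "1" {
		t.Fatalf("expected %q back, got %q", "1", got)
	}
}
```

The mocked test passes even when the behavior is wrong; the round-trip test catches that and survives refactors of the storage layer, which is exactly what makes it a better feedback loop for humans and agents alike.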
And ultimately, this is good. We can actually say that this might be a net positive out of all of it... Look, there's optimism. Yeah. There is some optimism that people will actually come to understand the value of this, because there's also a lot of misalignment on tests in general.
Should we have them or not, and so on. Well, if you want to use AI, then you have to; there's just no way around it. We should get to the question of how hyped this all is, but before that, let's quickly chat about the other thing, which is laziness. Laziness and brain numbness, or, the way I wanted to put it at first, brain smoothness. That's my other complaint about AI.
So as I said before, I can write Rust, I can understand it, but I'm not good at Rust. Not to the degree that I'm good at C or Python or some other languages like TypeScript. Rust is still new to me. And what I've observed is that there are a couple of codebases where I contribute and where it's socially acceptable to contribute code that, say, AI generated; people can review it and I'm open about it.
That's fine. But I'm not learning, that's the thing. I'm not exercising my brain. It generates the Rust code, I make some fixes here and there, but that's nothing compared to writing the code myself. So it feels like my progress has stalled, and I have to make a conscious decision not to use AI in order to get better.
But there's significant pushback against that, because the kind of Rust code I need to write is actually extremely simple: an RPC method here, an RPC method there. AI excels at that kind of stuff. I'm just harnessing powerful code already written by humans and exposing it.
That's easy, but I'm not the one doing it. It's like I'm on a Smith machine doing bench press and my personal trainer lifts the bar for me. I'm going through the movement, but I'm not getting any stronger from it. And it's such a deep trap. So you say you're better at Go now.
And I'm curious: are you? Because maybe you have the perception of being a good Go engineer now, but in reality you're not. I don't know. So, I'm definitely a better Go engineer than I was six months ago, for sure. And that's fairly objective, because I basically couldn't write any Go code before.
Well, I could, but barely. There was a lot of stuff in the standard library I'd have to go to the Go docs for all the time, and now I'm like, okay, I know how this shit works. I definitely learned a lot. But I think it's very easy to fall into the trap where you don't learn.
And I learned that the hard way, for sure, and not just in the context of programming. I feel like the biggest thing I've learned over the last two years of working with AI is that if you turn off your brain, really bad things can happen.
And so, to stay with your gym example, it doesn't help that you know you have to lift. You have to make it a habit: going there, doing it regularly, increasing the weights.
That's not something that comes naturally, and neither does working responsibly with AI. You have to go through that reinforcement phase. But once you understand the dangers and how to work with it, you can make it more of a habit.
So I think that's what makes it work. I really liked, and I've said this before, that Anthropic marketing campaign line: there was never a better time to have problems. Because that's how I feel. I have a problem now.
I work on it and use it like a better search engine, talking myself through it. I've learned a bunch of things I'd wanted to learn before, because now it can dumb them down and transpose the problem into a space where I actually feel engaged.
So if you want to learn, it's great to learn with, but you have to make a conscious effort to actually do that. And if it's just vibes, if you don't even try, you feel like you're making progress, but in reality you will feel bad about it, or you should feel bad about it.
That's the thing. Logically I hear you, you're making a good argument, but then I look back at myself and I have two modes of operation. Well, it's really a spectrum, but at one end of it I'm a beast.
I'm writing code, I'm debugging it, I'm in the loop, I'm wired to the keyboard, I know the problem and I'm laser-focused on it. I will solve this problem, I will fix this bug eventually. In the other mode, I'm just like jelly in my chair, lazily typing prompts.
Claude and I are going in circles, and we both understand this is unproductive. Claude hates me, I hate Claude, and we're stuck in this thing. And I know that the bug, the wall it can't get past, I could cross easily myself. I just don't want to, because I'm in this weird regime where I'm not actively engaged with it.
I'm just wasting time. And that's my problem: when you use AI, you're not as focused, not as alert. The TikTok of programming. Yes, but that's exactly the laziness. And when you're in that mode, it's really hard to learn, because you learn when you're alert and focused. When you're not focused, when this thing is doing the work for you and you're tired, suddenly it works.
You're just like, okay, the PR is ready, and you don't want to learn anymore. To my earlier point, from a societal point of view I have this deep-rooted concern that this turns into yet another social-network problem, where people said: this is going to be great for humanity.
We're all going to talk more to each other. And what we actually got is that we're smartphone zombies now, with a lot of psychological problems on top. Right. So that is my concern here. But individually, I feel like I've solved this for myself for now. Still, I hear you there.
There's definitely a version of this where, man, if you don't catch yourself in the moment where you give in to the machine and turn off your brain, it stops being great. Yeah, I mean, for sure.
I think part of the problem here, and this is what I'm trying to solve now, hopefully I'll finish that project with my team soon, is that AI is a little disjointed from your actual day-to-day workflow as a software engineer.
It's like this chat box where you type things, but it doesn't affect how you use tools yourself. So I want to fix that problem, hence the whole terminal magic. I want to be part of that, but I think something like that is required: AI actually augmenting tools instead of replacing them.
I'm not saying I will solve this problem by any means, but I think that's the turn the direction has to take eventually: AI augmenting your workflow rather than replacing it. I feel one of the biggest problems with AI right now is that it, and the marketing, and the social pressure around it, all attempt to replace you instead of making you ten times more productive at what you already know how to do, just making you more powerful.
So that's, I guess, my biggest complaint about the current landscape. And we might actually already have enough AI technology to make what I'm talking about happen. It's just that as an industry we're not focused on that yet; we're drinking this Kool-Aid of generating new stuff in the chat window.
And one of the problems I have is that, because we're so focused on the idea that this is the revolution, any attempt to bring some nuance to anything is immediately either ridiculed or met with real pushback.
One thing, for instance, maybe not so much among programmers, but I notice it a lot on, well, I'm still on this social network called X: there's so much pushback against Anthropic, for instance, because they keep talking about how it's going to replace jobs and things like that.
And they sort of come out of this, what's called the EA movement. But I actually think it's good that people inside the AI companies are talking about this a little: yeah, this might have some impact on society.
And maybe you should think about that. Because unless we actually start thinking about some of the consequences, about being responsible in how we use it, we're not going to be responsible in any way, shape, or form. Look at us, by the way, we so smoothly transitioned to the comfortable topic of AI being overhyped.
Overhype is what I'm concerned with. And what you said is essentially part of it: I have a deep concern that AI is simply being overhyped. The whole thing about replacing humans, maybe not software engineers, but, I don't know, support people and so on.
So far there aren't many examples where that has actually been successful, but it's being touted as the next big thing by everyone, by the CEOs of those shops and labs, for sure. It's always three months until we have full AGI. And it's not even funny anymore.
I'm just exhausted. I'm exhausted from reading that, from reading the reactions to it, from people talking about it at this point. And I personally get an enormous amount of value out of AI, out of ChatGPT, out of Claude, out of all of it.
I'm worried that this overhype will create a bubble, inflate it artificially, and to a degree it probably already has. And then this beautiful thing might collapse before I've even had a chance to really enjoy it, because it got overhyped.
So I'm a little worried about that. I think people have unreasonable expectations about the effectiveness of AI, and they make plans around it and talk about it without actually building proper expertise. It feels like a lot of wrong decisions and opinions are being formed about AI.
At least that's what it feels like to me. So I just wish we would go: whoa, whoa, whoa, let's calm down. Let's not try to replace everyone with AI tomorrow. Let's build better tools and better tests, not the code tests we were talking about earlier, but tests for AI.
Understand how you could actually replace anyone with AI, because you need tests for that. You need to make sure that whatever agent you create actually works, and that's an open area of research right now. How do you even do that? I think, on the hype in particular, I still think it's going to change the world, 100%.
And I don't necessarily know the exact numbers, but I think there will be more programmers. That's generally my view, because more people will program, although the definition of a programmer might change and so on. But what is really, really frustrating is that the discourse around AI in any shape or form is engagement bait, and then there are these ridiculous bets going on about when we're going to have superintelligence and AGI and whatever.
Someone has to get burned in that somewhere, and I'm pretty sure it's going to be the wrong people, because statistically it most likely is. Exactly. So there's definitely some bubble going on here, for sure. But I don't think that if it pops, the conclusion is going to be: oh, we didn't want to do AI after all.
So we're going to do something else instead. I don't think we're going to end up like Dune, where we say this is too powerful, let's just say no to AI. I just don't see that. My view is that even with the existing models, the existing technology, we're going to end up somewhere that's actually pretty cool.
But some of the bets going on are just pretty insane. Are we going to need all that energy? Maybe what we're doing is also just a little bit insane, I don't know. At some point I think we have to look at this and say: some of the people currently paying someone's bills are doing it because they actually think there's value in it, but they might themselves at some point decide it wasn't sustainable.
And then they stop paying for it, and the service they were paying stops paying for the services it was paying for in turn. All that money going to different companies up and down the AI stack right now, I don't think it's sustainable long-term.
Someone made a bad bet somewhere. I don't know who, and maybe I'm wrong, maybe this is the forever bubble, but some of it looks deeply unsustainable. Yeah, I agree. I don't have much to add on top of that. It's just a deep concern of mine, and I don't have any answers to it.
I'm concerned about this technology being impacted in a negative way by the consequences of it being overhyped. I also have another, more fundamental problem with the whole thing: how can private companies be this fucking valuable at this point? If you want to argue that capitalism is great because people can participate in the public markets, well, there's no way for me to participate in that growth in any reasonable way, because we've created these behemoths of private capital that hold all of it.
That's a problem in its own right, but it's just insane how they grow like that. What was Google worth when it went public? A fraction of what all of those companies are worth now, even adjusting for inflation and for how the S&P 500 has developed since. To me, this insanity of creating these enormous supranational private companies is just...
It's the pump phase for sure. Maybe it will collapse. I really hope there isn't going to be a dump phase anytime soon, because that's going to be massive. Anyway, people will learn something here; I don't know what they will learn. It seems like the conversation has converted you into an AI pessimist.
No, I'm super optimistic. I love it, it's great. Am I optimistic about what society will do with it? I don't know, and I don't want to deal with that right now. Individually I feel supercharged. Yeah. Well, me too. Ultimately I'm using AI, and I think a lot of our own practices might actually improve because of it, which will ultimately let us build faster and better.
I also think there are going to be a lot more software engineers in the end, because somebody will have to fix the AI mess and actually make something good out of it. So I'm not worried at all about software engineers being replaced, despite what a lot of people on the street think.
So yeah, I think the future is bright. It just feels like we're really in a lot of uncharted territory right now, and we don't have good answers to a lot of genuinely hard and imminent questions. I'm not worried about AGI happening anytime soon, because what I see right now is something that can barely consume a CLAUDE.md reliably, let alone launch nukes at us.
So I'm not worried about any of that AGI-ness, but I am worried about the impact on the field, and about AI ultimately slowing down some parts of software engineering instead of accelerating it, unless we get better at using it, at instrumenting it, and, honestly, at writing software in general.
So it's an interesting intersection of problems, for sure. All right then, thanks for the conversation. Thank you, Armin. Okay, let's see the feedback on this, I'm kind of curious. Well, I don't think we said anything controversial. I did shit on some people indirectly. You never know.