
Claudette source walk-thru - Answer.AI dev chat #1


Whisper Transcript

00:00:00.000 | OK, hi, everybody.
00:00:01.400 | I'm Jeremy.
00:00:02.360 | And this is the first of our Answer.ai developer chats,
00:00:09.480 | where I guess we have two audiences.
00:00:12.320 | One is our fellow R&D folks at Answer.ai,
00:00:17.040 | who hopefully this will be a useful little summary of what
00:00:19.600 | we've been working on for them.
00:00:20.840 | But we thought we'd also make it public because why not?
00:00:23.960 | That way everybody can see it.
00:00:26.040 | So we've got Jono here.
00:00:27.960 | We've got Alexis here.
00:00:29.080 | We've got Griffin here.
00:00:30.560 | Hi, all.
00:00:33.400 | And we're going to be talking about a new library I've
00:00:37.720 | been working on called Claudette, which is Claude's
00:00:43.480 | friend.
00:00:45.440 | And Jono has been helping me a bit with the library,
00:00:49.360 | but he's going to, I think, largely feign ignorance
00:00:51.640 | about it today in order to be an interviewer to attempt
00:00:56.800 | to extract all its secrets out of my head.
00:00:59.320 | Does that sound about right, Jono?
00:01:01.840 | I think so, yeah.
00:01:04.440 | I'm ready when you are.
00:01:06.520 | Cool.
00:01:07.000 | Well, maybe we should start with--
00:01:08.960 | do you want to pull up the landing page?
00:01:11.520 | And then I think there's a few different directions
00:01:14.640 | that I'd love to hear from you.
00:01:16.040 | One is the specifics of this library, how does it work?
00:01:20.000 | But maybe also, since especially this is the first developer
00:01:22.640 | chat and preview, we can also go into some of the meta questions
00:01:25.720 | like, when does something become a library like this?
00:01:29.720 | How is it built? What's the motivation?
00:01:32.200 | Et cetera, et cetera.
00:01:33.400 | Sounds good.
00:01:34.360 | All right.
00:01:34.840 | So here we are, the landing page.
00:01:40.720 | So it's a GitHub repo.
00:01:43.480 | It's a public repo.
00:01:45.640 | And in the top right is a link to the documentation.
00:01:51.160 | And the documentation that you see here, the index,
00:01:53.600 | is identical to the README.
00:01:57.880 | But it's better to read it here because it looks a bit better.
00:02:02.800 | Cool.
00:02:03.360 | Fantastic.
00:02:03.840 | And this is a library that people can pip install?
00:02:07.040 | Yep, exactly.
00:02:08.240 | Here it is, pip install Claudette.
00:02:10.920 | And you can just follow along.
00:02:14.960 | Hopefully, when you type the things in here,
00:02:18.080 | you'll get the same thing.
00:02:20.280 | Or you could-- so that main page is actually
00:02:28.840 | also an index.ipynb.
00:02:31.760 | So you could also open that up in Colab, for example.
00:02:36.400 | And if you don't want to install it locally,
00:02:38.760 | it should all work fine.
00:02:40.680 | Cool.
00:02:41.200 | So I'll start with the big question, which
00:02:42.560 | is, why does this exist?
00:02:43.800 | What is the point of this library in a nutshell?
00:02:47.360 | OK, so by way of background, I started working on it
00:02:56.480 | for a couple of reasons.
00:02:57.880 | One is just I feel like--
00:03:02.440 | felt like-- still feel a bit like Claude is a bit underrated
00:03:07.160 | and underappreciated.
00:03:08.160 | I think most people use OpenAI because that's
00:03:12.360 | kind of what we're used to.
00:03:13.560 | And it's pretty good.
00:03:15.720 | And with GPT-4o, it's just got better.
00:03:18.920 | But Claude's also pretty good.
00:03:20.560 | And the nice thing about some of these models
00:03:24.160 | now, with also Google there, is they all
00:03:26.320 | have their things they're better at and things
00:03:29.080 | that they're worse at.
00:03:30.000 | So for example, I'm pretty interested in Haiku
00:03:34.120 | and in Google's Flash.
00:03:36.440 | They both seem like pretty capable models
00:03:42.200 | that maybe don't know as much,
00:03:44.480 | but they're pretty good at doing stuff.
00:03:47.520 | And so they might be good with retrieval,
00:03:49.320 | which is where you can help it with not knowing stuff.
00:03:52.840 | So yeah, I was pretty interested, particularly
00:03:54.800 | in playing with Haiku.
00:03:57.520 | And then a second reason is just I
00:04:01.840 | did this video called A Hacker's Guide to LLMs last year.
00:04:05.600 | And I just recorded it for a conference,
00:04:08.760 | really just to help out a friend, to be honest.
00:04:11.960 | And I put it up online publicly because I
00:04:14.120 | do that for everything.
00:04:15.600 | And it became, kind of to my surprise,
00:04:17.520 | my most popular video ever.
00:04:18.840 | It's about to hit 500,000 views.
00:04:22.000 | And one of the things I did in that was to say, like, oh, look,
00:04:24.720 | you can create something that has the kind of behavior
00:04:32.000 | you're used to seeing in things like Instructor and LangChain
00:04:35.280 | and whatever else in a dozen lines of code.
00:04:38.800 | So you don't have to always use big, complex frameworks.
00:04:42.320 | And a lot of people said to me, like, oh, I
00:04:44.640 | would love to be able to use a library that's that small
00:04:47.240 | so I don't have to copy and paste yours.
00:04:49.440 | And so I thought, like, yeah, OK, I'll
00:04:51.200 | try and build something that's super simple, very
00:04:57.040 | transparent, minimum number of abstractions
00:05:03.160 | that people can use.
00:05:04.080 | And that way, they still don't have to write their own.
00:05:06.400 | But they also don't have to feel like it's a mysterious thing.
00:05:09.080 | So yeah, so Claudette is designed
00:05:11.240 | to be this fairly minimal thing, with really very few abstractions
00:05:19.920 | or weird new things to learn, taking
00:05:22.080 | advantage of just plain Python stuff for Claude,
00:05:27.680 | but also to be pretty capable and pretty convenient.
00:05:34.960 | Cool.
00:05:35.640 | You think it would be fair to say that this is more
00:05:38.040 | replacing the maybe more verbose code that we've copied
00:05:40.960 | and pasted from our own implementations a few times
00:05:43.200 | versus introducing too many completely new abstractions?
00:05:46.080 | Is that the kind of level that it's in?
00:05:47.720 | Yeah, I think a lot of people got their start with LLMs
00:05:52.200 | using stuff like LangChain, which I think
00:05:55.840 | is a really good way to get started in some ways
00:05:58.120 | and that you can--
00:05:59.080 | it's got good documentation and good demos.
00:06:01.480 | But a lot of people kind of come away feeling like,
00:06:05.400 | I don't really know what it's doing.
00:06:07.480 | I don't really know how to improve it.
00:06:09.360 | And I don't feel like I'm really learning at this point.
00:06:12.440 | And also, I don't really know how
00:06:13.960 | to use all my knowledge of Python
00:06:15.520 | to build on top of this because it's
00:06:17.840 | a whole new set of abstractions.
00:06:21.320 | So partly, it's kind of for those folks to be like, OK,
00:06:24.160 | here's how you can do things a bit closer to the bone
00:06:28.160 | without doing everything yourself from scratch.
00:06:30.160 | And for people who are already reasonably capable Python
00:06:35.160 | programmers who feel like, OK, I want to
00:06:37.760 | jump into LLMs and leverage my existing programming knowledge.
00:06:41.600 | This is a path that doesn't involve first learning
00:06:44.880 | some big new framework full of lots of abstractions
00:06:47.120 | and tens of thousands of lines of code to something with,
00:06:50.760 | I don't know what it is, maybe a couple of hundred lines of code,
00:06:54.480 | all of which is super clear and documented.
00:06:58.240 | And you can see step-by-step exactly what it's doing.
00:07:02.080 | Cool.
00:07:02.580 | Well, that sounds good.
00:07:04.040 | Do you want to start with a demo of what it does,
00:07:05.520 | or do you want to start straight with those hundred lines of code
00:07:08.240 | and step us through it?
00:07:09.680 | You know what?
00:07:10.440 | I'm inclined-- normally, I'd say do the demo,
00:07:12.760 | but I'm actually inclined to step through the code
00:07:15.040 | because the code's a bit, as you know, weird in that the code is--
00:07:23.800 | so if you click on Claudette's source,
00:07:25.800 | we can read the source code.
00:07:29.160 | This is the source code.
00:07:31.560 | It doesn't look like most source code.
00:07:33.800 | And that's because I tried something slightly different
00:07:38.400 | to what I've usually done in the past, which I've tried to create
00:07:40.800 | a truly literate program.
00:07:44.040 | So the source code of this is something that we can
00:07:46.120 | and will read top to bottom.
00:07:47.920 | And you'll see the entire implementation,
00:07:52.160 | but it also is designed to teach you about the API
00:07:57.400 | it's building on top of and the things
00:07:59.800 | that it's doing to build on top of that and so forth.
00:08:02.000 | So I think the best way to show you what it does
00:08:05.480 | is to also show you how it does it.
00:08:08.480 | So I'm here in a notebook.
00:08:11.520 | And so that source code we were viewing
00:08:15.240 | was just the thing called Quarto, which
00:08:18.440 | is a blogging platform that, amongst other things,
00:08:20.760 | can render notebooks.
00:08:21.640 | So we're just seeing the rendered version of this notebook.
00:08:25.280 | And so the bits that you see in gray
00:08:32.680 | are code cells in the notebook.
00:08:37.920 | Here's this code cell.
00:08:40.360 | Here's this code cell.
00:08:41.360 | And then you'll see some bits that
00:08:51.560 | have this little exported source thing here,
00:08:54.840 | which you can close and open.
00:08:56.200 | You can close them all at once from this menu here,
00:08:58.600 | hide all code.
00:09:00.000 | And that basically will get rid of all the bits that
00:09:02.320 | are actually the source code of the module.
00:09:04.320 | And all you'll be left with is the examples.
00:09:07.920 | And if we say show all code, then you can see, yeah,
00:09:11.080 | this is actually part of the source code.
00:09:12.920 | And so the way that works is these things
00:09:17.920 | that say exports.
00:09:18.880 | So these are the bits that actually become
00:09:20.760 | part of the library itself.
00:09:23.200 | OK, so the idea of this notebook is, as I said,
00:09:30.120 | as well as being the entire source code of the library,
00:09:32.400 | it's also by stepping through it, we'll see how Claude works.
00:09:37.240 | And so Claude has three models, Opus, Sonnet, and Haiku.
00:09:43.320 | So we just chuck them into a list
00:09:45.640 | so that anybody now who's using Claudette
00:09:48.400 | can see the models in it.
00:09:51.160 | And so the way we can see how to use it
00:09:53.280 | is that the readme and the home page,
00:10:01.720 | again, is actually a rendered version of a notebook.
00:10:04.920 | And it's called index.ipynb.
00:10:07.080 | And so we can import from Claudette.
00:10:09.920 | And so you can see, for example, if I say models,
00:10:12.560 | it shows me the same models that came from here.
00:10:17.200 | OK, so that's how these work.
00:10:18.640 | It ends up as part of this Claudette notebook.
00:10:21.160 | So that's the best Claude model, middle, worst.
00:10:28.000 | But I like using this one, Haiku,
00:10:29.880 | because it's really fast and really cheap.
00:10:31.680 | And I think it's interesting to experiment
00:10:33.440 | with how much more you can do with these fast, cheap models
00:10:36.280 | So that's the one I thought we'd try out.
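That models list can be sketched like this (a minimal sketch; the dated ids below are the Claude 3 names current around the time of recording and may have changed since):

```python
# Full model names in best -> worst order, so you never have to
# remember the dated ids yourself.
models = ("claude-3-opus-20240229",
          "claude-3-sonnet-20240229",
          "claude-3-haiku-20240307")
model = models[-1]  # the "worst" one: Haiku, fast and cheap
```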
00:10:43.880 | Any questions or comments so far from anybody?
00:10:48.320 | I guess, so far, this is--
00:10:50.680 | like, there's no reason you couldn't just
00:10:52.400 | write the full name of the model every time,
00:10:54.240 | like a lot of people do.
00:10:55.680 | And if a new one comes out, you can do that.
00:10:58.800 | But this is just trying to make it as smooth as possible,
00:11:01.560 | even to these tiny little details, right?
00:11:03.800 | Yeah, I don't want to have to remember these things.
00:11:05.960 | And obviously, I wouldn't remember those dates
00:11:07.480 | or whatever.
00:11:08.000 | So otherwise, I could copy and paste them.
00:11:10.480 | But yeah, I find I've been enjoying--
00:11:12.200 | I've been using this for a few weeks now.
00:11:15.680 | Once it got to a reasonably usable point--
00:11:17.680 | and definitely, this tiny minor thing is something I found nice
00:11:20.720 | is to not have to think about model names ever again,
00:11:23.400 | and also know that it goes, like, best, middle, worst.
00:11:26.240 | So I don't even have to--
00:11:27.480 | I can just go straight to, like, OK, worst one.
00:11:31.000 | Cool.
00:11:32.520 | So they provide an SDK, Anthropic.
00:11:36.640 | So their SDK gives you this Anthropic class
00:11:42.600 | you could import from.
00:11:43.960 | So you can pip install it.
00:11:45.040 | If you pip install Claudette, you'll get this for free.
00:11:49.600 | So I think it's nice if you're going to show somebody
00:11:54.760 | how to use your code, you should, first of all,
00:11:56.800 | show how to use the things that your code uses.
00:11:59.120 | So in this case, basically, the thing we use
00:12:00.920 | is the Anthropic SDK.
00:12:03.160 | So let's use it, right?
00:12:04.960 | So the way it works is that you create the client.
00:12:10.520 | And then you call messages.create.
00:12:13.160 | And then you pass in some messages.
00:12:16.080 | So I'm going to pass in a message, I'm Jeremy.
00:12:19.440 | Each message has a role of either user or assistant.
00:12:25.160 | And in fact, they always--
00:12:26.800 | this is, like, if you think about it,
00:12:28.360 | it's actually unnecessary, because they always
00:12:30.600 | have to go user assistant, user assistant, user assistant.
00:12:35.960 | So if you pass in the wrong one, you get an error.
00:12:39.160 | So strictly speaking, they're kind of redundant.
00:12:42.160 | So in this case, and they're just dictionaries, right?
00:12:44.520 | So I'm going to pass in a list of messages.
00:12:47.080 | It contains one message.
00:12:48.080 | It's a user message.
00:12:49.200 | So this is something I've said, whereas the assistant is
00:12:52.120 | something the model said.
00:12:53.480 | And it says, I'm Jeremy.
00:12:55.040 | And then you tell it what model to use.
00:12:57.400 | And then you can pass in various other things.
00:12:59.160 | As you can see, there's a number of other things
00:13:05.560 | that you can pass in, like a system prompt, stop sequences,
00:13:08.720 | and so forth.
00:13:10.440 | And you can see here, they actually
00:13:12.000 | check for what kind of model you want.
00:13:17.280 | So if I go ahead and run that, I get back a message.
00:13:22.280 | And so messages can be dictionaries,
00:13:25.400 | or they can also be certain types of object.
00:13:28.880 | And on the whole, it doesn't really matter which you choose.
00:13:32.360 | When you build them, it's easier just to make them dictionaries.
00:13:35.920 | So a message has an ID.
00:13:37.160 | I haven't used that for anything, really.
00:13:38.960 | And it tells you what model you used.
00:13:40.960 | Now this one's got a role of assistant.
00:13:42.880 | And it's a message.
00:13:47.680 | And it tells you how many tokens we used.
00:13:51.920 | If you're not sure what tokens are and basics like that,
00:13:57.440 | then check out this Hacker's Guide to Language Models,
00:14:01.080 | where I explain all those kinds of basics.
00:14:05.920 | But the main thing is the content.
00:14:09.000 | And the content is text.
00:14:10.520 | It could also reply with images, for instance.
00:14:12.800 | So this is text.
00:14:14.160 | And the text is, that's what it has to say for me.
00:14:19.480 | So that's basically how it works.
00:14:22.600 | It's a nice, simple API design.
00:14:25.400 | I really like it.
00:14:26.920 | The OpenAI one is more complicated to work with,
00:14:31.200 | because they didn't decide on this basic idea of like, oh,
00:14:33.880 | user assistant, user assistant.
00:14:36.560 | OK, so one thing I really like--
00:14:39.080 | Can I ask a question?
00:14:39.960 | Yeah, hit me.
00:14:41.760 | So one thing I know, and I'm sure lots of other people
00:14:45.120 | do as well, is that often when you interact with an assistant,
00:14:47.640 | you provide a system message or guidance
00:14:50.920 | about how the assistant should see their role.
00:14:54.680 | Here, you didn't.
00:14:55.360 | You just started right off with a role from yourself as a user.
00:14:58.760 | Is that because the API or this library
00:15:02.080 | already starts with the default guidance to the assistants?
00:15:05.480 | There's a system prompt here.
00:15:07.600 | And the default is not given.
00:15:10.680 | So yeah, language models are perfectly
00:15:14.800 | happy to talk to you without a system prompt.
00:15:17.520 | Just means they have no extra information.
00:15:22.200 | But when they went through instruction fine-tuning
00:15:25.320 | and RLHF, some of those examples would
00:15:28.080 | have had no system prompt.
00:15:29.320 | So they know how to have some kind of default personality,
00:15:33.080 | if you like.
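In other words, the system prompt is just an optional keyword on the same call; a sketch (the prompt text here is made up for illustration):

```python
# Keyword arguments for messages.create; `system` is optional, and
# omitting it leaves the model with its default personality.
kwargs = dict(
    messages=[{"role": "user", "content": "I'm Jeremy"}],
    model="claude-3-haiku-20240307",
    max_tokens=1024,
)
kwargs["system"] = "You are a concise, helpful assistant."  # optional
```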
00:15:34.800 | Yeah.
00:15:35.440 | Cool, thanks.
00:15:38.160 | OK, so I always think, like, my notebook
00:15:42.880 | is both where I'm working.
00:15:44.000 | So I want it to be clear and simple to use.
00:15:45.840 | And I also know it's going to end up rendered
00:15:47.800 | as our documentation and as our kind of rendered source.
00:15:51.920 | So I don't really want things to look like this.
00:15:56.040 | So the first thing I did was to format the output.
00:15:57.960 | So here is part of the API.
00:16:02.520 | This is exported.
00:16:03.960 | So the first thing I wanted to do
00:16:05.760 | was just, like, find the content in here.
00:16:11.400 | And so there's a number-- this is an array,
00:16:15.280 | as you can see, of blocks.
00:16:16.960 | So the content is the text block.
00:16:19.880 | So this is just something that finds the first text block.
00:16:23.880 | I mean, it's tiny.
00:16:25.080 | And so that means that now at least I've
00:16:29.600 | kind of got down to the bit that has--
00:16:32.800 | the bit I normally care about, because I don't normally
00:16:35.000 | care about the ID.
00:16:36.200 | I already know what the model is.
00:16:37.600 | I know what the role's going to be, et cetera.
00:16:41.640 | And then so from the text block, I want to pull out the text.
00:16:46.120 | So this is just something that pulls out the text.
00:16:48.760 | And so now from now on, I can always just say contents
00:16:52.800 | and get what I care about.
00:16:55.040 | So something I really like, though, is like, OK,
00:16:57.200 | this is good, but sometimes I want
00:16:58.920 | to know the extra information, like the stop sequence
00:17:01.360 | or the usage.
00:17:03.000 | So in Jupyter, if you create this particular named method,
00:17:08.800 | _repr_markdown_, for an object,
00:17:14.000 | then it displays that object using that Markdown.
00:17:18.200 | So in this case, I'm going to put
00:17:20.560 | the contents of the object followed
00:17:24.080 | by the details as a list.
00:17:28.520 | And so you can see what that looks like here.
00:17:31.280 | There's the contents, and there's the details.
00:17:35.880 | And if you're wondering, like, OK,
00:17:38.160 | how did Jeremy add this behavior to Anthropic's class?
00:17:45.520 | Now, this is a nice little fast core thing called patch,
00:17:48.800 | where if you define a function, and you say patch,
00:17:51.160 | and you give it one or more types,
00:17:53.400 | it changes those existing types to give it this behavior.
00:17:56.920 | So this is now, if we look at ToolsBetaMessage
00:18:02.920 | dot _repr_markdown_, there we go.
00:18:05.600 | We just put it in there.
00:18:07.720 | So that's nice.
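The idea behind fastcore's `patch` can be shown with plain Python on a toy class rather than the real SDK type (a sketch; `@patch` just automates this attachment using the function's type annotation):

```python
class Msg:  # toy stand-in for the SDK's message class
    def __init__(self, text, toks): self.text, self.toks = text, toks

def _repr_markdown_(self):
    # Jupyter renders this as Markdown: contents first, details after.
    return f"{self.text}\n\n- tokens: {self.toks}"

# What @patch does for you: attach the function to the existing class,
# so every instance (even ones created earlier) gains the method.
Msg._repr_markdown_ = _repr_markdown_
```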
00:18:08.480 | And so the other--
00:18:09.320 | yeah.
00:18:10.560 | - I was going to say, there's like a trade-off
00:18:13.200 | in terms of time here, where if you only ever
00:18:15.400 | had to look at something once, you just manually type out
00:18:18.760 | response dot messages zero dot block whatever dot choices
00:18:23.840 | dot text, right?
00:18:24.720 | You type all that up.
00:18:26.080 | You have to do it a million times.
00:18:27.520 | It's very nice to have these conveniences.
00:18:29.440 | - Yeah.
00:18:30.000 | Also, for the docs, right?
00:18:31.440 | Like, every time I want to show what the response is,
00:18:35.440 | this is now free.
00:18:37.560 | I think that's nice.
00:18:39.800 | Yeah.
00:18:41.720 | So I don't-- yeah.
00:18:43.080 | So I actually-- what I'm describing here
00:18:47.120 | is not the exact order it happened in in my head.
00:18:50.040 | Because, yeah, it wasn't until I did this a couple of times
00:18:52.800 | and was trying to find the contents and blah, blah, blah,
00:18:55.800 | that I was like, oh, this is annoying me.
00:18:57.480 | And I went back, and I added it.
00:18:59.080 | This is probably like 15 minutes later, I went back.
00:19:01.920 | And it's like, yeah, I wish that existed.
00:19:05.800 | I did know that usage tracking was
00:19:07.240 | going to be important, like how much money you're spending
00:19:09.880 | depends on input and output tokens.
00:19:12.360 | So I decided to make it easy to keep track of that.
00:19:16.360 | So I created a little constructor for usage.
00:19:19.720 | I just added a property to the usage class
00:19:24.240 | that adds those together.
00:19:26.400 | And then I added a representation.
00:19:28.360 | This one is used for strings as well.
00:19:30.600 | This is part of Python itself.
00:19:33.040 | If you add this in--
00:19:34.280 | so now if I say usage, I can see all that information.
00:19:36.760 | So that was nice.
00:19:37.960 | And then since we want to be able to track usage,
00:19:40.000 | we have to be able to add usage things together.
00:19:42.720 | So if you override dunder add in Python, it lets you use plus.
00:19:47.720 | So that's something else I decided to do.
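Those three conveniences together can be sketched as (a toy class, not the SDK's Usage type):

```python
class Usage:
    def __init__(self, input_tokens=0, output_tokens=0):
        self.input_tokens, self.output_tokens = input_tokens, output_tokens
    @property
    def total(self):  # the property that adds those together
        return self.input_tokens + self.output_tokens
    def __repr__(self):  # readable display in the notebook
        return f"In: {self.input_tokens}; Out: {self.output_tokens}; Total: {self.total}"
    def __add__(self, other):  # lets you sum usages with `+`
        return Usage(self.input_tokens + other.input_tokens,
                     self.output_tokens + other.output_tokens)
```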
00:19:52.720 | And so at this point, yeah, I felt
00:19:54.200 | like all these basic things I'm working with all the time,
00:19:58.000 | I should be able to use them conveniently.
00:20:00.200 | And it only took a few minutes to add those.
00:20:04.720 | And then ditto, I noticed a lot of people--
00:20:06.760 | in fact, nearly everybody, including
00:20:08.600 | the Anthropic documentation, manually writes these.
00:20:12.760 | I mean, again, it doesn't take long.
00:20:14.720 | But it doesn't take very long to write this once either.
00:20:17.760 | And now if you just-- something as simple as defaulting
00:20:21.480 | the role, then it's just a bit shorter.
00:20:23.880 | I can now say makeMessage.
00:20:25.560 | And it's just creating that dictionary.
00:20:28.680 | So now I can do exactly the same thing.
00:20:34.960 | And so then since it always goes user, assistant, user,
00:20:38.800 | assistant, user, assistant, I thought, OK,
00:20:41.000 | you should be able to just send in a list of strings.
00:20:43.440 | And it just figures that out.
00:20:44.880 | So this is just using i % 2 to jump
00:20:48.480 | between user and assistant.
00:20:50.160 | And makeMessage, I then realized, OK,
00:20:57.160 | we should change it slightly.
00:20:58.520 | So if it's already a dictionary, we
00:21:00.240 | shouldn't change it and stuff like that.
00:21:02.120 | But basically, as you can see here,
00:21:04.920 | I can now pass in a list of messages a bit more easily.
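A hypothetical sketch of the two helpers described (simplified; the real ones handle more cases):

```python
def mk_msg(content, role="user"):
    # Pass dicts through unchanged; grab .content from response-like
    # objects; otherwise wrap the string with a defaulted role.
    if isinstance(content, dict): return content
    content = getattr(content, "content", content)
    return {"role": role, "content": content}

def mk_msgs(msgs):
    # Alternate user/assistant roles using index parity (i % 2).
    return [mk_msg(m, ("user", "assistant")[i % 2])
            for i, m in enumerate(msgs)]
```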
00:21:10.480 | So my prompt was I'm Jeremy.
00:21:15.720 | R is a response.
00:21:19.200 | And then I've got another string.
00:21:24.520 | So if I pass in something which has a content attribute,
00:21:34.920 | then I use that.
00:21:38.560 | And so that way, you can see the messages now.
00:21:40.720 | I've got I'm Jeremy.
00:21:41.640 | And then the assistant contains the response
00:21:45.280 | from the assistant.
00:21:46.360 | So it's happy with that as well.
00:21:48.560 | It doesn't have to be a string.
00:21:50.240 | And so this is how--
00:21:52.400 | and OK, again, from people, if you've
00:21:54.160 | watched my LLM Hacker's Guide, you know this.
00:21:58.800 | Language models currently have no state.
00:22:01.440 | Like when you chat with ChatGPT, it looks like it has state.
00:22:06.360 | You can ask follow-up questions.
00:22:08.720 | But actually, the entire previous dialogue
00:22:10.600 | gets sent each time.
00:22:12.320 | So when I say I forgot my name, can you remind me, please?
00:22:14.720 | I also have to pass it all of my previous questions
00:22:17.720 | and all of its previous answers.
00:22:19.840 | And that's how it knows what's happened.
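Concretely, the follow-up call sends something like this (a sketch; the assistant text is invented):

```python
# The model is stateless: every turn re-sends the whole dialogue,
# roles alternating, with the new question appended at the end.
history = [
    {"role": "user",      "content": "I'm Jeremy"},
    {"role": "assistant", "content": "Hello Jeremy, nice to meet you."},
    {"role": "user",      "content": "I forgot my name. Can you remind me, please?"},
]
```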
00:22:22.880 | And so then hopefully--
00:22:26.760 | OK, I don't know your name.
00:22:27.960 | I referred to you as Jeremy.
00:22:29.320 | All right, well, so you do know my name.
00:22:31.000 | Thank you.
00:22:31.520 | So it turns out my name's Jeremy.
00:22:33.320 | OK, so I feel like something as simple as this
00:22:36.360 | is already useful for experimenting and playing
00:22:40.000 | around.
00:22:40.520 | And for me, I would rather generally use something
00:22:45.200 | like a notebook interface for interacting with a model
00:22:48.080 | than the kind of default ChatGPT thing or Claude thing.
00:22:54.240 | This is where I can save my notebooks.
00:22:56.400 | I can go back and experiment.
00:22:57.920 | I can do things programmatically.
00:23:00.880 | So this was a big kind of thing for me is like, OK, I want to--
00:23:05.600 | I want to make a notebook at least as ergonomic as chatGPT
00:23:11.680 | plus all of the additional usability of a notebook.
00:23:14.560 | So these are the little things that I think help.
00:23:17.280 | Passing the model each time seems weird
00:23:22.760 | because I generally pick the model once per session.
00:23:26.240 | So I just created this tiny class
00:23:28.200 | to remember what model I'm working with.
00:23:30.840 | So that's what this client class does.
00:23:32.640 | And the second thing it's going to do
00:23:34.200 | is it's going to keep track of my usage.
00:23:37.440 | So maybe the transition that I see happening right now
00:23:40.680 | is everything up until this point was like housekeeping
00:23:44.080 | of I'm doing exactly the same things
00:23:45.520 | as the official API can do, but I'm just
00:23:48.000 | making my own convenience functions for that.
00:23:50.880 | But then the official API doesn't
00:23:52.280 | give you tracking usage over multiple conversations,
00:23:54.520 | keeping track of the history and all of that.
00:23:56.400 | So it seems like now we're shifting to like, OK,
00:23:58.560 | I can do the same things that the API allows me to do,
00:24:01.400 | but now I don't have to type as much.
00:24:03.560 | I've got my convenience functions.
00:24:05.280 | But now it's like, OK, what else would I like to do?
00:24:06.880 | I'd like to start tracking usage--
00:24:08.440 | Yeah, exactly.
00:24:09.040 | --of the persistent model setting.
00:24:10.520 | And this is kind of important to me
00:24:12.000 | because I don't want to spend all my money unknowingly.
00:24:17.000 | So I want it to be really easy.
00:24:18.280 | And so what I used to do was to always go back
00:24:20.560 | to the OpenAI or whatever web page
00:24:22.800 | and check the billing because you can actually
00:24:25.800 | blow out things pretty quickly.
00:24:28.000 | So this way, it's just me saying like, OK, well,
00:24:31.040 | let's just start with a use of 0.
00:24:35.040 | And then I just wrote this tiny little private thing here.
00:24:38.640 | We'll ignore prefill for now, which just stores
00:24:42.320 | what the last result was and adds the usage.
00:24:46.760 | So now when I call it a few times, each time I call it,
00:24:51.520 | it's just going to remember the usage.
00:24:54.600 | And so again, I was going to ignore stream for a moment.
00:24:59.400 | So then I define dunder call.
00:25:01.960 | So dunder call is the thing that lets you
00:25:05.440 | create an object, and then
00:25:07.240 | pretend it's a function.
00:25:08.920 | So it's the thing that makes it callable.
00:25:10.840 | And so when I call this function,
00:25:13.800 | I'll come back to some of the details in a moment.
00:25:15.920 | But the main thing it does is it calls make messages
00:25:20.880 | on your messages.
00:25:22.560 | And then it calls the messages.create.
00:25:27.480 | And then it remembers the result and keeps track of the usage.
00:25:33.040 | So basically, the key behavior now
00:25:34.760 | is that when I start, it's got zero usage.
00:25:38.240 | I do something, and I've now tracked the usage.
00:25:43.360 | And so if I call it again, that 20 should be higher, now 40.
00:25:48.320 | So it's still not remembering my chat history or anything.
00:25:53.440 | It's just my usage history.
00:25:56.080 | So I like to do very little at a time.
00:26:00.320 | So you'll see this is like a large function by my standards.
00:26:05.120 | It's like 1, 2, 3, 4, 5, 6, 7, 8 whole lines.
00:26:08.680 | I don't want to get much bigger than that
00:26:10.400 | because my brain's very small.
00:26:11.960 | So I can't keep it all in my head.
00:26:14.480 | So that's just a small amount of stuff.
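A sketch of that tiny client: it remembers the model, is callable via dunder call, and accumulates usage across calls. The fake messages object below stands in for the real `anthropic` SDK so the sketch runs on its own; the names are assumptions, not the library's actual ones.

```python
from types import SimpleNamespace

class FakeMessages:
    "Stand-in for the SDK's client.messages; returns canned usage."
    def create(self, messages, model, **kw):
        return SimpleNamespace(usage=SimpleNamespace(input_tokens=10,
                                                     output_tokens=10))

class Client:
    def __init__(self, model, messages_api):
        self.model, self.api = model, messages_api
        self.inp = self.out = 0              # running usage totals
    def _r(self, r):
        # Store the last result and accumulate its usage.
        self.result = r
        self.inp += r.usage.input_tokens
        self.out += r.usage.output_tokens
        return r
    def __call__(self, msgs, **kw):          # makes the object callable
        return self._r(self.api.create(messages=msgs,
                                       model=self.model, **kw))
```

With the canned 10/10 usage above, calling the client twice leaves the running totals at 20 input and 20 output tokens, mirroring the "now 40" behavior shown in the demo.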
00:26:16.600 | So there's a couple of other things we do here.
00:26:19.720 | One is we do something which Anthropic
00:26:23.960 | is one of the few companies to officially support,
00:26:27.440 | which is called prefill, which is where you can say
00:26:31.760 | to Anthropic, OK, this is my question.
00:26:34.240 | What's the meaning of life?
00:26:36.240 | And you answered with this starting point.
00:26:41.480 | You don't say, please answer with this.
00:26:44.320 | It literally has to start its answer with this.
00:26:47.080 | That's called prefill.
00:26:48.440 | So if I call it, that's my object.
00:26:53.080 | With this question, with this prefill,
00:26:55.400 | it forces it to start with that answer.
00:26:59.400 | So yeah, so basically, when you call this little tracking
00:27:04.640 | thing, which keeps track of the usage,
00:27:07.240 | this is where you also pass in the prefill.
00:27:09.400 | And so if you want some prefill, then as you can see,
00:27:11.600 | it just adds it in.
00:27:15.000 | And it also adds the prefill to the answer,
00:27:17.080 | because Anthropic doesn't put it in the answer itself.
00:27:19.240 | And the way Anthropic actually implements
00:27:22.680 | this is that the messages, it gets
00:27:32.800 | appended as an additional assistant message.
00:27:35.360 | So it's the messages plus the prefill.
00:27:37.480 | So basically, you pass in an assistant message at the end.
00:27:40.200 | And then the assistant's like, oh, that's
00:27:41.920 | what I started my answer with.
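The mechanism can be sketched as a pair of hypothetical helpers (the real code folds this into the call path):

```python
def add_prefill(msgs, prefill=""):
    # Prefill rides along as a trailing assistant message.
    return msgs + [{"role": "assistant", "content": prefill}] if prefill else msgs

def full_answer(prefill, reply_text):
    # The reply continues from the prefill rather than repeating it,
    # so prepend the prefill to get the complete answer.
    return prefill + reply_text

msgs = add_prefill([{"role": "user", "content": "What's the meaning of life?"}],
                   prefill="According to Douglas Adams,")
```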
00:27:44.320 | This isn't documented necessarily in their API
00:27:46.640 | because it's like, oh, this is how you send a user message
00:27:49.600 | and we'll respond with an assistant message.
00:27:51.400 | And you have to kind of dig a little bit more to say, oh,
00:27:53.760 | if I send you an assistant message as the last message
00:27:56.040 | in the conversation, this is how we'll interpret it.
00:27:58.360 | We'll continue on in that message and like--
00:28:00.200 | Yeah, they've got it here.
00:28:01.440 | So they've got-- Anthropic's good with this.
00:28:04.360 | They actually understand that prefill is incredibly powerful.
00:28:12.040 | Particularly, Claude loves it.
00:28:14.240 | Claude does not listen to system prompts much at all.
00:28:17.280 | And this is why each different model,
00:28:21.160 | you have to learn about its quirks.
00:28:23.440 | So Claude ignores system prompts.
00:28:25.840 | But if you tell it, oh, this is how
00:28:28.040 | you answered the last three questions,
00:28:30.360 | it just jumps into that role now.
00:28:32.240 | It's like, oh, this is how I behave.
00:28:34.560 | And it'll keep doing that.
00:28:36.800 | And you can maintain character consistency.
00:28:40.160 | So I use this a lot.
00:28:45.680 | And here's a good example.
00:28:47.080 | Start your assistant response with an open curly brace.
00:28:50.040 | I mean, they support tool calling or whatever.
00:28:52.200 | But this is a simple way.
00:28:53.960 | So sometimes I will start my response
00:28:57.680 | with backtick, backtick, backtick Python.
00:29:00.320 | That forces it to start answering the Python code.
00:29:04.000 | So yeah, lots of useful things you can do with prefill.
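[Editor's note: the prefill mechanism described above can be sketched in a few lines. This is a hypothetical illustration, not Claudette's actual API; the helper names `make_msgs` and `join_reply` are invented for this example. The two ideas are: the prefill is sent as a trailing assistant message, and since Claude's completion omits the prefill, the client glues it back onto the front of the reply.]

```python
def make_msgs(prompt, prefill=""):
    "Build an Anthropic-style messages list, appending prefill as an assistant turn."
    msgs = [{"role": "user", "content": prompt}]
    if prefill:
        # The model treats this trailing assistant message as the start of
        # its own answer, and continues from it.
        msgs.append({"role": "assistant", "content": prefill})
    return msgs

def join_reply(prefill, completion):
    "Claude's completion doesn't repeat the prefill, so prepend it ourselves."
    return prefill + completion

msgs = make_msgs("What's the meaning of life?", prefill="According to Douglas Adams,")
full = join_reply("According to Douglas Adams,", " the answer is 42.")
```

For example, prefilling with `{` is the trick mentioned above for forcing a JSON answer, and prefilling with a ```` ```python ```` fence forces the reply to start with code.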
00:29:08.240 | Have you noticed personally that the improvement is significant
00:29:12.400 | when you use prefill?
00:29:13.600 | I mean, I see they're recommending it.
00:29:15.280 | But I'm just curious what your anecdotal impression is.
00:29:17.520 | I mean, it answers it with that start.
00:29:21.520 | So yeah, if you want it to answer with that start,
00:29:23.840 | then it's perfect.
00:29:25.080 | Unfortunately, GPT-4o doesn't generally do it properly.
00:29:33.280 | Google's Gemini Pro does.
00:29:35.400 | And Google's Gemini Flash doesn't.
00:29:38.080 | So they're a bit all over the place at the moment.
00:29:40.320 | So then, yeah, the other thing you can do
00:29:45.200 | is streaming.
00:29:46.120 | Now, streaming is a bit hard to see with a short question.
00:29:49.880 | So we'll make a longer one.
00:29:56.000 | And so you can see it generally comes out, bop, bop, bop, bop,
00:29:59.600 | I don't have any pre-written funny-- that's terrible.
00:30:02.800 | Oh, you know why?
00:30:04.480 | Because we're using Haiku.
00:30:06.440 | And Haiku is not at all creative.
00:30:11.960 | So if we go c.model equals model 0,
00:30:17.880 | we can upscale to Opus.
00:30:21.720 | Try again.
00:30:22.360 | All right.
00:30:28.120 | Slow enough.
00:30:28.720 | I don't think I want to wait to do the whole thing anyway.
00:30:32.700 | It's just going to keep going, isn't it?
00:30:38.320 | Stop.
00:30:39.560 | All right.
00:30:43.020 | Let's go back to this one.
00:30:46.180 | Yeah.
00:30:46.820 | So streaming, it's one of these things
00:30:52.900 | I get a bit confused about.
00:30:55.780 | It's as simple as calling messages.stream
00:31:01.940 | instead of messages.create.
00:31:04.740 | But the thing you get back is an iterator,
00:31:07.540 | which you have to yield.
00:31:09.580 | And then once it's finished doing that,
00:31:12.060 | it stores the final message in here.
00:31:15.820 | And that's also got the usage in it.
00:31:17.900 | So anyway, this is some little things
00:31:19.940 | which, without the framework, would be annoying.
00:31:25.020 | So with this little tiny framework, it's automatic.
00:31:28.020 | And you see this in the notebook in its final form, right?
00:31:30.400 | But this call method was first written without any streaming.
00:31:34.020 | Like, get it working on the regular case first.
00:31:38.620 | No prefill first.
00:31:39.860 | So then it's three lines.
00:31:41.660 | And then it's OK.
00:31:43.020 | The original version is that.
00:31:48.260 | Yeah.
00:31:50.220 | Yeah.
00:31:52.140 | And then we can test the stream function by itself
00:31:55.020 | and test it out with the smaller primitives first
00:31:58.180 | and then put it into a function that then finally
00:32:00.980 | gets integrated in.
00:32:03.260 | Yeah.
00:32:04.580 | Exactly.
00:32:05.940 | And this is like one of these little weird complexities
00:32:10.220 | of Python is--
00:32:13.780 | John was asking me yesterday.
00:32:15.060 | It's like, oh, we could just refactor this and move this
00:32:18.500 | into here.
00:32:19.060 | And then we don't need a whole separate method.
00:32:21.260 | But actually, we can't.
00:32:22.380 | As soon as there's a yield anywhere in a function,
00:32:24.900 | the function is no longer a function.
00:32:28.020 | And it's now a generator function.
00:32:28.020 | And it behaves differently.
00:32:29.420 | So this is kind of a weird thing where
00:32:31.700 | you have to pull your yields out into separate methods.
00:32:35.540 | Yeah, it's a minor digression.
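[Editor's note: the Python quirk Jeremy mentions can be shown concretely. A single `yield` anywhere in a function body changes what calling the function means, which is why the yielding code has to be pulled out into its own method. The `Client` class below is an invented stand-in, not Claudette's actual class; it only illustrates the delegate-to-a-generator pattern.]

```python
import types

def gen():
    # The mere presence of `yield` means calling gen() runs no body code at
    # all; it just constructs a generator object.
    for x in [1, 2, 3]:
        yield x

g = gen()                              # nothing has executed yet
assert isinstance(g, types.GeneratorType)
assert list(g) == [1, 2, 3]            # the body runs lazily as we iterate

# So a method that only *sometimes* streams can't inline a yield; the common
# pattern is to delegate the yielding to a separate helper method:
class Client:
    def call(self, stream=False):
        if stream:
            return self._stream()      # returns a generator
        return "full answer"           # normal, eager return path

    def _stream(self):
        for piece in ["full", " ", "answer"]:
            yield piece

c = Client()
```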
00:32:37.700 | So yeah, you can see it's nice that it's now
00:32:44.660 | tracking our usage across all these things.
00:32:49.180 | And we can add the two together, prefill and streaming.
00:32:55.740 | Yeah, any questions or comments so far?
00:32:59.500 | Mm-hmm.
00:33:00.500 | Yeah.
00:33:02.500 | And is there a way to try to reset the counter if you wanted
00:33:06.300 | just to be able to start over at some point?
00:33:10.020 | I mean, the way I would do it was I would just
00:33:13.060 | create a new client, C equals client.
00:33:17.020 | But you could certainly go C.use equals usage 0, 0.
00:33:22.260 | In fact, 0, 0 is the default. So actually, now I
00:33:25.540 | think about it, we could slightly improve our code
00:33:28.540 | to remove three characters, which
00:33:32.300 | would be a big benefit because we don't like characters.
00:33:34.700 | We could get rid of those.
00:33:36.460 | Well, there you go.
00:33:39.460 | So yeah, you could just say C.use equals usage.
00:33:47.420 | And in general, I think people don't
00:33:54.580 | do this manipulating the attributes of objects
00:33:59.900 | directly enough.
00:34:01.420 | Why not?
00:34:02.420 | You don't have to do everything through--
00:34:04.940 | people would often create some kind of set usage method
00:34:07.980 | or something.
00:34:08.740 | No, don't do that.
00:34:09.860 | Paranoid Java style that--
00:34:11.380 | Exactly.
00:34:12.340 | --emerged like a decade ago for some reason.
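[Editor's note: the point about directly assigning to attributes, rather than writing setter methods, can be sketched like this. `Usage` and `Client` here are invented stand-ins, not Claudette's actual classes; they just show the reset-the-counter idea.]

```python
from dataclasses import dataclass

@dataclass
class Usage:
    input_tokens: int = 0
    output_tokens: int = 0

class Client:
    def __init__(self):
        self.use = Usage()

c = Client()
c.use.input_tokens += 120    # pretend a few calls accumulated some usage
c.use.output_tokens += 45

# No set_usage() method needed: just assign a fresh default-constructed Usage.
# (And since 0, 0 are the defaults, `Usage()` is all it takes.)
c.use = Usage()
```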
00:34:14.460 | Oh, more stuff.
00:34:14.980 | Speaking of--
00:34:15.660 | Yeah, oh, yeah?
00:34:17.020 | Yeah, it was longer.
00:34:18.740 | Speaking of directly manipulating
00:34:22.180 | properties of objects, so you showed
00:34:24.780 | how we can use prefill to predefine
00:34:27.780 | the beginning of the assistance response.
00:34:30.500 | If I've had a multi-turn exchange with an assistant,
00:34:33.140 | can I just go in there and clobber
00:34:35.940 | one of the earlier assistant messages
00:34:38.220 | to convince the assistant that it said something it didn't?
00:34:40.940 | Because sometimes that's actually useful.
00:34:43.140 | Yeah, because we don't have any state in our class, right?
00:34:49.180 | So we're passing in--
00:34:51.980 | so here, we're passing in a single string, right?
00:34:58.140 | But we could absolutely pass in a list.
00:35:02.740 | So I said hi, and the model said hi to you too.
00:35:15.580 | I am Plato.
00:35:16.740 | I am Socrates.
00:35:26.180 | Tell me about yourself.
00:35:29.340 | I don't know what will happen here,
00:35:30.820 | but we're just convincing it that this is a conversation
00:35:33.180 | that it's occurred.
00:35:34.740 | So now Claude is probably going to be slightly confused
00:35:37.780 | by the fact that it reported itself not to be Claude.
00:35:42.340 | No, I mean, I don't-- we haven't set a system message
00:35:44.540 | to say it's Claude.
00:35:45.460 | So there you go.
00:35:48.780 | No, not at all confused.
00:35:50.100 | I am Socrates.
00:35:53.500 | So as I said, Claude's very happy to be told what it said,
00:36:00.260 | and it will go along with it.
00:36:02.060 | I'm very fond of Claude.
00:36:03.060 | Claude has good vibes.
00:36:05.620 | Oh, you may be surprised to hear that I am actually Australian.
00:36:16.420 | This is the point of the video where
00:36:21.900 | we get sidetracked and talk to Rupes for a good long time.
00:36:24.540 | [LAUGHTER]
00:36:27.100 | Oh, not very interesting.
00:36:32.300 | What do you say, mate?
00:36:33.460 | [LAUGHTER]
00:36:35.460 | Fair enough.
00:36:36.780 | So yeah, you can tell it anything you like,
00:36:38.980 | is the conversation, because it's got no state.
00:36:42.220 | Now it's forgotten everything it's just said.
00:36:44.300 | The only thing it remembers is its use.
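[Editor's note: because the client holds no conversation state, the history is just a plain list you construct yourself, so fabricating or clobbering earlier assistant turns is ordinary list manipulation. The helper names below are invented for this sketch.]

```python
def user(txt):      return {"role": "user", "content": txt}
def assistant(txt): return {"role": "assistant", "content": txt}

# A hand-crafted history: the model never actually said the assistant line,
# but since the full history is sent on every call, it can't tell.
history = [
    user("Hi."),
    assistant("Hi to you too. I am Socrates."),
    user("Tell me about yourself."),
]

# Clobbering an earlier assistant message is just a list assignment:
history[1] = assistant("Hi to you too. I am Plato.")
```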
00:36:46.700 | So that's what we've done so far.
00:36:49.940 | So remembering-- oh, actually, so before we do that,
00:36:55.260 | we'll talk about tool use.
00:36:57.740 | So yeah, basically, I wanted to, before we
00:37:01.740 | got into multi-turn dialogue automatic stuff,
00:37:04.100 | I wanted to have the basic behavior that Anthropic's SDK
00:37:12.700 | provides.
00:37:13.460 | I wanted to have it conveniently wrapped.
00:37:17.580 | So tool use is officially still in beta,
00:37:24.580 | but I'm sure it won't be for long.
00:37:27.860 | Can I ask one more pre-tool use case
00:37:29.860 | that I think occurs to me right away?
00:37:31.540 | And so maybe it'll occur to other people
00:37:33.220 | if they're curious.
00:37:35.580 | One thing you often find yourself
00:37:37.340 | doing when you're experimenting with prompts
00:37:39.380 | is going through a lot of variations of the same thing.
00:37:42.300 | So you have your template, and then you
00:37:43.980 | want very different parts of it.
00:37:46.180 | And before you write code to churn out variations,
00:37:50.580 | you're usually doing it a bit ad hoc.
00:37:52.500 | So using this API the way it is now,
00:37:54.780 | if I had a client and a bit of an exchange already built up,
00:37:59.180 | and then I wanted to fork that and create five of them
00:38:02.020 | and then continue them in five different ways,
00:38:04.700 | can I just duplicate--
00:38:08.020 | would the right way to do that would
00:38:09.540 | be to duplicate the client?
00:38:10.620 | Would the right way to do that just
00:38:12.080 | to be to extract the list that represents
00:38:14.020 | the exchange and create new clients?
00:38:15.540 | I'm just getting a sense for what
00:38:16.900 | would be the fluid way of doing it with this API?
00:38:19.460 | Let's do it.
00:38:32.200 | Options equals-- how do you spell Zimbabwean, Jono?
00:38:38.880 | Zimbab--
00:38:41.480 | E-W-E, yep.
00:38:44.560 | Print contents-- oh, you know what?
00:39:00.160 | This is boring again, because I think we've gone back
00:39:02.360 | to our old model.
00:39:04.560 | C.model equals models 0.
00:39:09.440 | No wonder it's got so dull.
00:39:12.240 | Haiku is just really doesn't like pretending.
00:39:16.400 | Oh, come on.
00:39:21.360 | Look what it's doing.
00:39:22.200 | All right, anyway, Claude is being a total disappointment.
00:39:37.240 | So the fact that it's reasonable to do this just
00:39:39.280 | by slamming a new list into the C function
00:39:42.600 | is an indication of what you just said,
00:39:44.960 | which is that there's no state hiding inside the C function
00:39:47.640 | that we need to worry about mangling when we do that.
00:39:49.880 | That's right.
00:39:50.380 | There's no state at all.
00:39:52.160 | Got it.
00:39:54.240 | So when we recall it, it knows nothing
00:39:56.160 | about being Socrates whatsoever.
00:39:57.960 | Everyone is a totally independent REST call to a--
00:40:01.040 | there's no session ID.
00:40:02.480 | There's no nothing to tie these things together.
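[Editor's note: the forking question answered above can be sketched directly. Since there is no session ID and no hidden state, forking an exchange into five variations is just copying the messages list five times and extending each copy differently. This is a generic illustration, not Claudette code.]

```python
import copy

# A shared prefix of conversation, built up however you like:
base = [
    {"role": "user", "content": "Write a limerick."},
    {"role": "assistant", "content": "There once was a model named Claude..."},
]

# Fork: one deep copy per variation, each continued a different way.
variants = ["Zimbabwean", "French", "Chilean"]
forks = []
for v in variants:
    h = copy.deepcopy(base)   # deep copy so forks can't share message dicts
    h.append({"role": "user", "content": f"Now make it {v}."})
    forks.append(h)
```

Each fork can then be passed to the client as an independent call; nothing ties them together on the server side.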
00:40:07.680 | All right, we probably just spent cents on that question.
00:40:12.480 | It's so funny, they're like a few dollars per million tokens
00:40:19.480 | or something.
00:40:20.200 | I look at this and like, whoa, all those tokens.
00:40:22.200 | I'm like, oh, yeah, it's probably
00:40:23.620 | like $0.01 or something less.
00:40:25.440 | I've got to get used to not being too scared.
00:40:27.680 | OK, so tool use refers to things like this,
00:40:37.920 | get the weather in the given location.
00:40:39.880 | So there's a tool called getWeather.
00:40:41.640 | And then how would it work?
00:40:43.080 | I don't know.
00:40:43.760 | It would call some other weather API or something.
00:40:48.120 | So both in OpenAI and in Lord, tools
00:40:53.160 | are specified using this particular format, which
00:40:56.720 | is called JSON schema.
00:40:57.960 | And my goal is that you should never have to think about that.
00:41:06.760 | For some reason, nearly everything
00:41:08.240 | you see in all of their documentation
00:41:10.680 | writes this all out manually, which
00:41:12.440 | I think is pretty horrible.
00:41:14.160 | So instead, we're going to create a very complicated tool.
00:41:17.240 | It's something that adds two things together.
00:41:20.160 | And so I think the right way to create a tool
00:41:22.240 | is to just create a Python function.
00:41:26.600 | So the thing about these tools, as you see from their example,
00:41:33.280 | is they really care a lot about descriptions of the tool,
00:41:38.240 | descriptions of each parameter.
00:41:40.520 | And they say quite a lot in their documentation
00:41:44.720 | about how important all these details are to provide.
00:41:48.640 | So luckily, I wrote a thing a couple of years
00:41:56.040 | ago called docments that makes it really easy to add
00:41:59.640 | information to your functions.
00:42:02.520 | And it basically uses built-in Python stuff.
00:42:05.480 | So the names of each parameter is just
00:42:07.280 | the name of the parameter.
00:42:08.360 | The type of the parameter is the type.
00:42:10.400 | The default of the parameter is the default.
00:42:12.320 | And the description of the parameter
00:42:13.820 | is the comment that you put after it.
00:42:15.920 | Or if you want more room, you can put the comment before it.
00:42:20.400 | Docments is happy with either.
00:42:23.600 | And you can also put a description of the result.
00:42:27.280 | You can also put a description of the function.
00:42:29.760 | And so if you do all those things, then you can see here.
00:42:36.480 | I said tools equals getSchema.
00:42:38.040 | So this is the thing that creates the JSON schema.
00:42:40.120 | So if I say tools, there you go.
00:42:42.680 | You can see it's created the JSON schema from that,
00:42:46.480 | including the comments have all appeared.
00:42:50.760 | And the return comment ends up in this returns.
00:42:56.600 | Yeah.
00:42:57.240 | And if you didn't do any of that,
00:42:58.720 | like if you just wrote a function sums that
00:43:00.520 | took two untyped variables A and B,
00:43:02.920 | you would still get something functional.
00:43:04.880 | The model would probably still be able to use it.
00:43:07.240 | But it just wouldn't be recommended.
00:43:08.880 | Is that right?
00:43:11.480 | I think-- well, I mean, my understanding
00:43:14.560 | is you have to pass in a JSON schema.
00:43:18.880 | So if you don't pass in a JSON schema,
00:43:21.120 | so you would have to somehow create that JSON schema.
00:43:24.480 | I don't know if it's got some default thing that
00:43:26.520 | auto-generates one for you.
00:43:28.360 | Oh, I'm more thinking like if we don't follow the docments
00:43:33.880 | format, for example.
00:43:34.840 | Oh, yeah.
00:43:35.360 | So if we got rid of these, so if we got rid of the docments,
00:43:42.240 | yep, you could get rid of the doc string.
00:43:44.440 | You could get rid of the types and the defaults.
00:43:49.480 | You could do that, in which case the--
00:43:52.160 | OK, so it does at least need types.
00:43:58.600 | So let's add types.
00:44:01.320 | Ah, well, that's a bit of a bug in my--
00:44:03.480 | You need to annotate it.
00:44:06.560 | No, yeah.
00:44:07.720 | It appears like you have to annotate it there.
00:44:10.360 | Well, I'll fix that.
00:44:11.680 | It shouldn't be necessary.
00:44:14.200 | Description.
00:44:15.080 | Oh, OK.
00:44:16.520 | It at least wants a doc string.
00:44:17.960 | OK, so currently, that's the minimum that it needs.
00:44:25.600 | And I don't know if it actually requires a description.
00:44:28.560 | I suspect it probably does, because otherwise, maybe
00:44:31.360 | I guess it could guess what it's for from the name.
00:44:34.560 | But yeah, it wouldn't be particularly useful.
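[Editor's note: the schema-generation idea described above can be sketched with a toy version. The real `get_schema` builds on docments and also picks up the per-parameter comments; this minimal sketch uses only what's in the standard library (annotations, defaults, and the docstring), so names and output shape here are illustrative, not Claudette's exact output.]

```python
import inspect

# Map Python annotations to JSON-schema type names.
PY2JSON = {int: "integer", float: "number", str: "string", bool: "boolean"}

def get_schema(f):
    "Toy sketch: derive a JSON-schema tool definition from a plain function."
    sig = inspect.signature(f)
    props, required = {}, []
    for name, p in sig.parameters.items():
        props[name] = {"type": PY2JSON.get(p.annotation, "string")}
        if p.default is inspect.Parameter.empty:
            required.append(name)          # no default means required
        else:
            props[name]["default"] = p.default
    return {
        "name": f.__name__,
        "description": inspect.getdoc(f) or "",
        "input_schema": {"type": "object", "properties": props,
                         "required": required},
    }

def sums(a: int, b: int = 1) -> int:
    "Adds a + b."
    return a + b

schema = get_schema(sums)
```

This is why, as seen above, at minimum the annotations and docstring matter: without them there is nothing for the schema builder to read.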
00:44:37.640 | So OK, so now that we've got a tool, when we call Claude,
00:44:48.240 | we can tell it what tools there are.
00:44:50.800 | And now we're also going to add a system prompt.
00:44:53.560 | And I'm just going to use that system prompt.
00:44:57.920 | You don't have to, right?
00:44:59.280 | If you don't say you have to use it,
00:45:01.160 | then sometimes it'll try to do the addition itself,
00:45:03.640 | but it's not very good at adding.
00:45:05.120 | So I would like to--
00:45:08.240 | I also think user-facing, I think it's weird the way
00:45:11.880 | Claude tends to say, OK, I will use the sum tool
00:45:15.000 | to calculate that sum.
00:45:16.200 | It loves doing that.
00:45:17.400 | OpenAI doesn't.
00:45:19.360 | I think this is because Anthropic's a bit--
00:45:21.160 | like, they haven't got as much user-facing stuff.
00:45:23.160 | They don't have any user-facing tool use yet.
00:45:25.680 | So yeah, I don't think their tool use is quite
00:45:28.080 | as nicely described.
00:45:31.320 | So if we pass in this prompt, what is that plus that?
00:45:39.080 | We get back this.
00:45:44.200 | So we don't get back an answer.
00:45:46.800 | Instead, we get back a tool use message.
00:45:52.040 | The tool use says what tool to use
00:45:55.400 | and what parameters to pass it.
00:45:59.400 | So I then just wrote this little thing
00:46:05.480 | that you pass in your tool use block.
00:46:10.320 | So that's this thing.
00:46:12.800 | And it grabs the name of the function to call.
00:46:16.280 | And it grabs that function from your symbol table.
00:46:21.320 | And it calls that function with the input that was requested.
00:46:26.720 | So when I said the symbol table or the namespace,
00:46:29.760 | basically, this is just a dictionary
00:46:32.000 | from the name of the tool to the definition of the tool.
00:46:38.320 | So if you don't pass one, it uses globals, which,
00:46:40.720 | in other words, is every Python function
00:46:42.560 | you currently have available.
00:46:44.480 | You probably don't want to do that if it's
00:46:47.160 | like os.unlink or something.
00:46:51.960 | So this little make namespace thing
00:46:54.720 | is just something that you just pass
00:46:56.880 | in a bunch of functions to.
00:46:59.520 | And it just creates a mapping from the name to the function.
00:47:03.040 | So that way, this way, I'm just--
00:47:04.800 | yeah.
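[Editor's note: the dispatch just described can be sketched in a few lines. The model sends back a tool name and an input dict; we look the name up in a namespace, which is just a plain dictionary, and call the function. Note that the string is only ever used as a dictionary key; nothing is eval'd. Helper names here are illustrative, not Claudette's exact signatures.]

```python
def mk_ns(*funcs):
    "Map each function's name to the function itself."
    return {f.__name__: f for f in funcs}

def call_func(name, inputs, ns):
    "Look the tool up by name and call it with the model's requested inputs."
    return ns[name](**inputs)

def sums(a, b=1):  return a + b
def mults(a, b=1): return a * b

ns = mk_ns(sums, mults)     # {'sums': sums, 'mults': mults}
result = call_func("sums", {"a": 604542, "b": 6458932}, ns)
```

Restricting the namespace to an explicit dict like this is the safe alternative to falling back on `globals()`, which would expose every function in scope.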
00:47:06.800 | I'm just going to say, if I'm a somewhat beginner,
00:47:09.240 | I'm approaching LLMs.
00:47:11.120 | I've seen your Hacker's Guide.
00:47:13.040 | This screen full of code is quite a lot
00:47:14.880 | of fairly deep Python stuff.
00:47:16.560 | We've got some typing going on.
00:47:18.120 | And I might not know what mappings are or callables.
00:47:20.840 | There's namespaces and getattr and dicts and isinstance.
00:47:25.520 | How should I approach this code versus maybe the examples that
00:47:28.560 | are being interleaved?
00:47:29.560 | Because this is the source code of this library.
00:47:32.480 | But you're not writing this with lots of comments or explanations.
00:47:35.440 | It's more like the usage.
00:47:37.080 | So what should I-- like, if I come to this library
00:47:39.120 | and I'm reading the source code, how much
00:47:41.120 | should I be focusing on the deep Python internals
00:47:44.280 | versus the usage versus like the big picture?
00:47:46.880 | That's a good question.
00:47:47.840 | So for someone who doesn't particularly
00:47:54.480 | want to learn new Python things but just
00:47:56.680 | wants to use this library, this probably
00:48:00.320 | isn't the video for you.
00:48:01.560 | Instead, just read the docs.
00:48:05.520 | And none of that-- like, you can see in the docs,
00:48:07.960 | there's nothing weird, right?
00:48:09.280 | The docs just use it.
00:48:11.040 | And you don't need this video.
00:48:17.280 | It's really easy to use.
00:48:20.280 | So yeah, the purpose of this discussion
00:48:23.400 | is for people who want to go deeper.
00:48:27.880 | And yeah, the fact that I'm skipping over these details
00:48:33.880 | isn't because either they're easy
00:48:36.000 | or that everybody should understand them or any of that.
00:48:39.520 | It's just that they're all things
00:48:44.880 | that Google or ChatGPT or whatever
00:48:47.240 | will perfectly happily teach you about.
00:48:49.240 | So these are all things that are built into Python.
00:48:54.440 | But yeah, that'd probably be part of something
00:48:56.600 | called Python Advanced Course or something.
00:48:59.080 | So one of the things a lot of intermediate Python programmers
00:49:01.720 | tell me is that they like reading my code
00:49:05.440 | to learn about bits of Python they didn't know about.
00:49:07.960 | And then they use it as jumping off points to study.
00:49:13.280 | And that's also why, like, OK, why do I not have many comments?
00:49:17.240 | So my view is that comments should describe
00:49:21.760 | why you're doing something, not what you're doing.
00:49:25.880 | So for something that you could answer, like, oh,
00:49:29.040 | what does isinstance(x, abc.Mapping) do?
00:49:32.400 | You don't need a comment to tell you that.
00:49:34.160 | You can just Google it.
00:49:35.200 | And so in this case, all of the things I'm doing,
00:49:38.760 | once you know what the code does, why is it doing it
00:49:41.920 | is actually obvious.
00:49:43.760 | Like, why do we get the name of the function from the object?
00:49:48.720 | Or why do we pass the input to the function?
00:49:51.880 | I mean, that's literally what functions are.
00:49:53.720 | They're things you call them, and you pass in the input.
00:49:57.000 | Yeah.
00:49:57.840 | So I think that's a good question.
00:50:00.400 | Let's say, like, yeah, don't be--
00:50:04.320 | you actually don't need to know any of these details.
00:50:07.240 | But if you want to learn about them,
00:50:11.720 | yeah, the reason I'm using these features of the language
00:50:14.040 | is because I think they're useful features
00:50:15.800 | of the language.
00:50:16.640 | And if I haven't got a comment on them,
00:50:19.240 | it's because I'm using them in a really normal, idiomatic way
00:50:21.880 | that isn't worthy of a comment.
00:50:23.640 | So that means if you learn about how
00:50:25.160 | to use this thing for this reason,
00:50:27.640 | that's a perfectly useful thing to learn about.
00:50:30.800 | And you can experiment with it.
00:50:32.120 | And I'll add that, like, I'm learning this stuff
00:50:34.080 | as we code together on this as well, right?
00:50:35.960 | Like, you don't have to know any of this to be a good programmer,
00:50:39.000 | but it's really fun as well.
00:50:40.200 | And I think, like, some of these things we wrote multiple ways,
00:50:43.360 | maybe one that was more verbose first, and then we say,
00:50:45.600 | oh, I think we can do this in this more clever way
00:50:47.760 | if we condense it down.
00:50:48.800 | So if you are watching this and you are wanting to learn
00:50:51.080 | and you're still like, oh, I still
00:50:52.520 | don't know what some of these things are,
00:50:54.000 | I can't remember what the double--
00:50:55.400 | like, yeah, dig in and find out.
00:50:57.120 | But it's also, like, it's totally OK if you're not,
00:50:59.240 | like, comfortable at this.
00:51:01.120 | Yeah, the other thing I would say
00:51:02.880 | is the way I write all of my code, pretty much,
00:51:07.560 | is I don't write it in a function.
00:51:10.000 | I write nearly all of it outside of a function in cells.
00:51:16.360 | So you can do the same thing, right?
00:51:19.240 | So, like, let's set ns to none, so then I can run this.
00:51:24.680 | It's like, oh, what the hell is globals?
00:51:27.400 | It's like, oh, wow, everything in Python is a dictionary.
00:51:31.240 | Now, this is a really powerful thing, which
00:51:33.560 | is well worth knowing about.
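[Editor's note: the "everything in Python is a dictionary" point is easy to verify yourself. Module-level names live in the dict returned by `globals()`, so a function can be fetched by its string name, exactly as the tool lookup does with its namespace.]

```python
def greet():
    return "hello"

ns = globals()              # the module's symbol table, a real dict
assert "greet" in ns        # the function's name is just a key
same = ns["greet"]          # look the function up by string
assert same is greet        # it's the very same object
assert same() == "hello"
```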
00:51:38.240 | If I could offer just--
00:51:39.960 | yeah, sorry.
00:51:41.120 | But just offer one perspective to maybe make a little bridge
00:51:43.600 | from the kind of user point of view
00:51:45.320 | to the why these internals might be unfamiliar point of view,
00:51:49.000 | just to recap and make sure I understand it right.
00:51:51.800 | From the user point of view here, when we use tools,
00:51:54.520 | we get a response back from Claude,
00:51:59.000 | in the way we're doing it now, that describes a function
00:52:02.160 | that we now want to execute.
00:52:04.160 | Correct?
00:52:04.800 | That's the function to execute, and that's
00:52:06.560 | the input to provide to it.
00:52:09.600 | So with this library, I can write a function in Python
00:52:13.600 | and then tell Claude to call the function that's
00:52:16.600 | sitting there on my system, right?
00:52:19.560 | Yeah, if it wants to.
00:52:21.000 | For that to work, if it wants to, if it chooses to.
00:52:23.640 | But for that to work, this library
00:52:25.800 | needs to do the magic of reading a text string that
00:52:29.680 | is Claude's response, and then in Python,
00:52:33.680 | having that not be a text string,
00:52:35.240 | but having that become Python code that runs in Python.
00:52:38.720 | And that's a somewhat unfamiliar thing to do in Python.
00:52:41.800 | And that's what's called eval in JavaScript,
00:52:44.520 | or back in Lisp, where a lot of this stuff got started.
00:52:47.560 | And because that's not that sort of--
00:52:50.000 | Well, it's actually not.
00:52:51.640 | We're not actually doing an eval, right?
00:52:54.160 | OK, that's interesting.
00:52:55.560 | Yeah, we're definitely not doing eval.
00:52:59.400 | So in the end, this is the function we want to call.
00:53:04.200 | So I can call that, and there's the answer.
00:53:07.920 | In Python, this is just syntax sugar, basically,
00:53:13.160 | for passing in a dictionary and dereferencing it.
00:53:19.080 | So those are the same.
00:53:20.520 | Those are literally the same thing,
00:53:22.840 | as far as Python is concerned.
00:53:27.280 | So we were never passed a string of code to eval or execute.
00:53:36.200 | We were just told, call this tool
00:53:39.040 | and pass in these inputs.
00:53:41.600 | So to find the tool by string, we look it up
00:53:46.080 | in the symbol table.
00:53:47.440 | So let's just change fc.name to fc_name.
00:53:49.760 | And the name it's giving us is the one
00:53:55.880 | that we provided earlier.
00:53:57.120 | Yeah, it's the name that came from our schema, which
00:54:03.120 | is this name.
00:54:04.880 | Yeah, so if you look back at our tool schema,
00:54:09.520 | this tool has a name.
00:54:10.680 | And you can give it lots of tools.
00:54:12.640 | So later on, we might see one where we've
00:54:14.280 | got both sums and multiply.
00:54:15.960 | And it can pick.
00:54:19.840 | We'll see this later, can pick and choose.
00:54:22.160 | So the flow is, we write our function in Python.
00:54:26.760 | The library automatically knows how to interpret the Python
00:54:29.760 | and turn it into a structured representation,
00:54:31.640 | the JSON schema, that is then fed to Claude.
00:54:35.040 | It's fed to Claude.
00:54:35.880 | We're also feeding it the name for the function
00:54:38.880 | that it's going to use when it wants to come back to us
00:54:41.400 | and say, hey, now call the function.
00:54:43.200 | When it comes back to us and says, hey, call the function,
00:54:45.640 | it uses that name.
00:54:46.440 | We look up the original function, and then we execute.
00:54:48.600 | Yeah, and so it decides.
00:54:50.360 | It knows it's got a function that can do this
00:54:53.640 | and that it can return this.
00:54:56.360 | And so then if it gets a request that can use that tool,
00:55:04.240 | then it will decide of its own accord, OK,
00:55:06.720 | I'm going to call the function that Jeremy provided,
00:55:09.640 | the tool that Jeremy provided.
00:55:11.960 | Yeah, so we'll see a bunch of examples of this.
00:55:14.400 | And this is generally part of what's called the ReAct
00:55:17.560 | framework, nothing to do with React, the JavaScript GUI
00:55:21.280 | thing, but ReAct was a paper that basically said like, hey,
00:55:24.520 | you can have language models use tools.
00:55:28.720 | And again, my LLM Hackers video is the best place
00:55:32.640 | to go to learn about the ReAct pattern.
00:55:36.360 | And so here we're implementing the ReAct pattern,
00:55:38.400 | or at least we're implementing the things necessary for Claude
00:55:41.160 | to implement the ReAct pattern using
00:55:43.320 | what it calls tool calling.
00:55:45.320 | So we look up the function, which
00:55:47.280 | is a string into this dictionary,
00:55:50.640 | and we get back the function.
00:55:54.320 | And so we can now call the function.
00:55:59.040 | So that's what we're doing.
00:56:00.160 | And so I think the key thing here
00:56:05.480 | is this idea that all this is in a notebook.
00:56:08.760 | The source code here to this whole thing
00:56:11.600 | is in a notebook, which means you can play with it, which
00:56:16.320 | I think is fantastically powerful because you never
00:56:19.880 | have to guess what something does.
00:56:21.320 | You literally can copy and paste it into a cell and experiment.
00:56:25.400 | And it's also worth learning these keyboard shortcuts
00:56:28.200 | like C and V to copy and paste the cell, and like Cmd-A,
00:56:35.200 | Cmd-left square bracket, Ctrl-Shift-hyphen.
00:56:39.640 | There's all these nice things worth learning,
00:56:42.360 | all these keyboard shortcuts to be able to use this Jupyter
00:56:46.520 | tool quickly.
00:56:49.480 | Anyway, the main thing to know is
00:56:50.880 | we've now got this thing called call function, which
00:56:54.800 | can take the tool use request from Claude, this function call
00:56:58.600 | request, and call it.
00:57:01.440 | And it passes back a dictionary with the result of the tool
00:57:08.000 | call, and when it asked us to make this call,
00:57:16.040 | it included an ID.
00:57:18.680 | So we have to pass back the same ID
00:57:20.800 | to say this is the answer to this question.
00:57:24.320 | And that's the bit that says this is the answer
00:57:28.120 | to this question.
00:57:30.120 | That's the answer.
00:57:32.120 | And so we can now pass that back to Claude,
00:57:37.800 | and Claude will say, oh, great, I got the answer,
00:57:40.160 | and then it will respond with text.
00:57:44.040 | So I put all that together here and make a tool response
00:57:47.640 | where you pass in the tool response request from Claude,
00:57:52.360 | the namespace to search for tools,
00:57:54.400 | or an object to search for tools,
00:57:58.200 | and we create the message from Claude.
00:58:03.240 | We call that call function for every tool use request.
00:58:07.360 | There can be more than one, and we add that to their response.
00:58:12.880 | And so if you have a look now here, when we call that,
00:58:18.800 | it calculates the sum, and it's going to pass back the--
00:58:24.720 | going to add in the tool use request and the response
00:58:27.960 | to that request.
00:58:30.800 | So we can now go ahead and do that,
00:58:36.840 | and you can see Claude returns the string, the response.
00:58:43.240 | So it's turned the result of the tool request into a response.
00:58:48.240 | And so this is how stuff like Code Interpreter in ChatGPT
00:58:52.520 | works.
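[Editor's note: the round trip just described can be sketched end to end. The model's tool_use block carries an id, and the tool_result we send back must echo that same id so Claude can match answer to request. The shapes below are simplified from Anthropic's tool-use message format, and the id `toolu_123` and helper name `mk_toolres` are made up for the example.]

```python
def mk_toolres(tool_use, ns):
    "Run the requested tool and wrap its output as a tool_result message."
    out = ns[tool_use["name"]](**tool_use["input"])
    return {"role": "user",
            "content": [{"type": "tool_result",
                         "tool_use_id": tool_use["id"],   # echo the same id back
                         "content": str(out)}]}

def sums(a, b=1):
    return a + b

# What a tool_use block from the model roughly looks like:
req = {"type": "tool_use", "id": "toolu_123", "name": "sums",
       "input": {"a": 604542, "b": 6458932}}

res = mk_toolres(req, {"sums": sums})
```

Appending `res` to the message history and calling the model again is what turns the numeric result back into a natural-language answer.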
00:58:53.020 | So it might be easier to see it all in one place,
00:58:58.400 | and this is like another demo of how we can use it.
00:59:00.920 | Instead of calling functions, we can also call methods.
00:59:03.600 | So here's sums again.
00:59:05.560 | But this time it's a method of a class.
00:59:08.440 | So we can do the same thing, get schema dummy dot sums.
00:59:13.320 | Yeah, so we make the message containing our prompt.
00:59:16.680 | So that's the question, what's this plus this?
00:59:20.320 | We pass that along to Claude.
00:59:22.320 | Claude decides that it wants us to create a tool request.
00:59:25.720 | We make the tool request, calculate the answer,
00:59:29.800 | add that to the messages, and put it all together.
00:59:33.960 | Oops, crazy.
00:59:34.800 | And there we go.
00:59:38.960 | OK, anything worth adding to that?
00:59:54.120 | So if you're not comfortable and familiar with the ReAct
00:59:58.160 | framework, this will feel pretty weird.
01:00:04.520 | Definitely worth spending time learning about,
01:00:07.560 | because it's an incredibly powerful technique
01:00:12.360 | and opens up a lot of opportunities to--
01:00:19.440 | because I think a lot of people, I certainly
01:00:22.160 | feel this way, that there's so many things that language
01:00:26.520 | models aren't very good at.
01:00:28.480 | But they're very good at recognizing
01:00:30.960 | when they need to use some tool.
01:00:33.440 | If you tell it like, oh, you've got access to this proof
01:00:37.960 | checking tool, or you've got access to this account creation
01:00:42.480 | tool, or whatever, it's good at using those.
01:00:45.240 | And those tools could be things like reroute this call
01:00:53.400 | to a customer service representative.
01:00:55.560 | They don't have to be text generating tools.
01:00:59.400 | They can be anything.
01:01:01.800 | And there's also no reason--
01:01:03.000 | you're not under obligation to send the response back
01:01:05.280 | to the model, right?
01:01:05.880 | It can actually be a useful endpoint.
01:01:07.400 | It's like, oh, I tell the model to look
01:01:10.280 | at this query from a customer and then respond appropriately.
01:01:13.280 | And one of the tools is like escalate.
01:01:15.560 | Well, if it sends a tool use request
01:01:19.160 | for that function,
01:01:21.320 | that could be like, oh, I should exit this block,
01:01:24.360 | forget about it, throw away the history
01:01:25.960 | because now I need to bump this up
01:01:27.640 | to some actual human in the loop,
01:01:29.360 | or store the result somewhere.
01:01:31.120 | It's just a very convenient way to get--
01:01:33.600 | Yeah, we're going to see a bunch more examples
01:01:36.520 | in the next section because there's a whole module called
01:01:40.560 | tool loop, which has a really nice example, actually,
01:01:43.520 | that came from the Anthropic examples of how
01:01:46.520 | to use this for customer service interaction.
01:01:51.640 | But for now, yeah, you can put that aside.
01:01:54.200 | Don't worry about it because we're
01:01:56.960 | going to go on to something much more familiar to everybody,
01:01:59.360 | which is chat.
01:02:01.880 | So chat is just a class which is going
01:02:08.560 | to keep track of the history.
01:02:10.120 | So self.h is the history.
01:02:11.720 | And it's going to start out as an empty list.
01:02:13.680 | There's no history.
01:02:14.480 | And it's also going to contain the client, which
01:02:23.680 | is the thing we just made.
01:02:25.720 | And so if you ask the chat for its use,
01:02:28.200 | it'll just pass it along to the client to get its use.
01:02:30.520 | You can give it some tools.
01:02:35.920 | And you can give it a system prompt.
01:02:37.440 | OK, so the system prompt, pass it in, no tools, no usage,
01:02:46.320 | no history.
01:02:49.080 | Again, there's a stream version and a non-stream version.
01:02:53.080 | So you can pass in stream as true or false.
01:02:55.520 | If you pass in stream, it'll use the stream version.
01:02:58.840 | Otherwise, it won't.
01:03:01.360 | So again, we patch in dunder call.
01:03:10.960 | Now, of course, we don't need to use patch.
01:03:13.320 | We could have put these methods directly in inside here.
01:03:17.440 | But I feel like I really prefer to do things much more
01:03:21.080 | interactively and step by step.
01:03:22.680 | So this way, I can create my class.
01:03:25.280 | And then I can just gradually add a little bit to it
01:03:27.480 | at a time as I need it.
01:03:28.920 | And I can also document it a little bit at a time,
01:03:31.840 | rather than having a big wall of code, which
01:03:34.000 | is just, I find, overwhelming.
01:03:35.480 | So all right, so there's a prompt.
01:03:45.160 | So if you pass in the prompt, then we
01:03:46.840 | add that to the history as a message.
01:03:50.200 | Now, get our tools.
01:03:57.560 | So I just call get schema for you automatically.
01:03:59.800 | And then at the end, we'll add to the history
01:04:06.000 | the results, which may include tool use.
01:04:08.640 | So now I can just call chat.
01:04:10.560 | And then I can call chat again.
01:04:17.640 | And as you can see, it's now got state.
01:04:20.240 | It knows my name.
01:04:21.040 | And the reason why is because each time it calls the client,
01:04:28.000 | it's passing in the entire history.
01:04:29.640 | So again, we can also add pre-fill, just like before.
01:04:36.520 | We can add streaming, just like before.
01:04:40.640 | And that's it, right?
01:04:41.560 | So you can see adding chat required almost no code.
01:04:47.680 | Really, it's just a case of adding everything
01:04:50.600 | to the history, and every time you call the client,
01:04:52.720 | passing in the whole history.
01:04:54.320 | So that's all a stateful-seeming language model is.
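The whole idea can be sketched in a few lines. This is illustrative only, with a stand-in client rather than Claudette's real one: statefulness is just replaying the history on every call.

```python
class Chat:
    "Sketch of a stateful-seeming chat: the history `h` is replayed each call."
    def __init__(self, client, sp=""):
        self.client, self.sp = client, sp
        self.h = []  # the history starts out as an empty list

    def __call__(self, prompt):
        self.h.append({"role": "user", "content": prompt})
        reply = self.client(self.h, system=self.sp)  # full history every time
        self.h.append({"role": "assistant", "content": reply})
        return reply

# Stand-in client that just reports how much history it received.
def fake_client(history, system=""):
    return f"seen {len(history)} message(s)"

chat = Chat(fake_client)
chat("Hi, I'm Jeremy")
out = chat("What's my name?")
print(out)  # -> seen 3 message(s)
```

The second call sees three messages because the first exchange is replayed along with the new prompt.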
01:05:00.320 | So I don't actually have to write anything
01:05:05.960 | to get tool use to work, which is nice.
01:05:08.400 | I can just pass in my tools.
01:05:11.160 | And the nice thing, the kind of interesting thing
01:05:13.160 | here is that because the tool use request and response are
01:05:19.680 | both added to the history, to do the next step,
01:05:23.920 | I don't pass in anything at all.
01:05:25.360 | That's already in the history.
01:05:27.000 | So I just run it, and it goes ahead and tells me the answer.
01:05:37.120 | Anything to say about that?
01:05:39.200 | So I know in ChatGPT, it sometimes
01:05:41.760 | asks, would you like to go ahead with this tool activation?
01:05:44.920 | And here, the model is responding
01:05:46.720 | with the tool use block, like it would like to use this tool.
01:05:49.400 | Do you have a way of interrupting
01:05:51.440 | before it actually runs the code that you gave it?
01:05:53.720 | Maybe you want to check the inputs or something like that?
01:05:57.720 | So you would need to put that into your function.
01:06:03.200 | So I've certainly done that before.
01:06:06.280 | So one of the things-- in fact, we'll see it shortly.
01:06:09.440 | I've got a code interpreter.
01:06:11.320 | And you don't want to run arbitrary code,
01:06:13.640 | so it asks you if you want to complete it.
01:06:15.800 | And part of the definition of the tool
01:06:19.040 | will be, what is the response that you
01:06:21.000 | get, Claude, if the user's declined
01:06:23.280 | your request to run the tool?
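That confirmation pattern might look something like this. Everything here is a toy sketch for illustration: the function name, the prompt text, and the `eval`-based executor are all made up, and a real version would use a proper sandbox.

```python
def run_code(code: str, ask=input) -> str:
    "Execute `code` if the user approves; return 'Declined' otherwise."
    if ask(f"Run this code?\n{code}\n(y/n) ").strip().lower() != "y":
        # The tool's docstring would tell Claude what "Declined" means.
        return "Declined"
    return str(eval(code))  # toy executor; real code would use a sandbox

print(run_code("1 + 1", ask=lambda _: "y"))  # -> 2
print(run_code("1 + 1", ask=lambda _: "n"))  # -> Declined
```

Passing `ask` as a parameter makes the gate easy to test or replace with a GUI prompt.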
01:06:26.080 | Cool.
01:06:28.960 | So I think people might imagine--
01:06:31.960 | Good question.
01:06:32.600 | Yeah.
01:06:33.800 | So I had my earlier question before we introduced chat,
01:06:36.720 | where we forked a conversation, as it were,
01:06:40.360 | just by forcing stuff into earlier exchanges.
01:06:43.400 | And at that point, we were talking about how
01:06:45.360 | the interface was stateless, because there
01:06:47.120 | were no session IDs.
01:06:49.240 | Now that we have these tool interactions with tool IDs,
01:06:52.600 | does that change the story?
01:06:54.600 | Like, let's say I had a sequence of interactions
01:06:57.960 | that involved tool use, and now I
01:07:00.960 | want to create three variations to explore
01:07:03.480 | different ways I might respond.
01:07:06.120 | Is that problematic?
01:07:09.000 | No, not at all.
01:07:10.080 | I mean, let's do it.
01:07:10.840 | So actually--
01:07:11.600 | OK, let's try it.
01:07:12.400 | Actually, I'm not Jeremy.
01:07:13.440 | I'm actually Alexis.
01:07:17.640 | You might want to zero index there.
01:07:23.520 | Thanks.
01:07:24.020 | So at this point, it's now going to be very confused,
01:07:35.880 | because I'm Alexis.
01:07:39.720 | It's nice to meet you, Jeremy.
01:07:41.760 | What's my name?
01:07:42.440 | Your name is Jeremy.
01:07:43.960 | So yeah, let's try.
01:07:46.680 | Poor thing.
01:07:50.600 | Claude, really?
01:08:04.600 | Does that answer your question, Alexis,
01:08:06.480 | if that is your real name?
01:08:08.120 | This is abuse of Claude.
01:08:09.240 | One day, this will be illegal.
01:08:13.920 | Yeah.
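Since nothing server-side tracks a session, forking a conversation is just copying and editing the history list, roughly like this (the message dicts are illustrative):

```python
import copy

history = [{"role": "user", "content": "Hi, I'm Jeremy"},
           {"role": "assistant", "content": "Nice to meet you, Jeremy!"}]

# Fork the conversation: deep-copy the history and rewrite the past in the copy.
fork = copy.deepcopy(history)
fork[0]["content"] = "Hi, I'm Alexis"

print(history[0]["content"])  # -> Hi, I'm Jeremy  (original branch untouched)
print(fork[0]["content"])     # -> Hi, I'm Alexis
```

Each branch can then be sent to the API independently; tool IDs are only matched within the messages of a single request, so they don't pin you to one branch.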
01:08:18.880 | And I also had a question, too.
01:08:24.600 | If Claude returns a tool block,
01:08:29.920 | is that added as a tool block to the history?
01:08:35.480 | Does it have to be converted to a string?
01:08:39.480 | No, no, no.
01:08:42.080 | It's just the tool block as part of the history.
01:08:44.040 | The history is perfectly-- the messages
01:08:46.440 | can be those message objects.
01:08:49.040 | They don't have to be dictionaries.
01:08:50.760 | The contents don't have to be strings.
01:08:54.280 | OK, cool.
01:08:55.320 | Yeah.
01:08:57.160 | All right.
01:08:58.160 | So I was delighted to discover how straightforward images are
01:09:05.840 | to deal with.
01:09:07.960 | So yeah, I mean, we can read in our image,
01:09:16.800 | and it's just a bunch of bytes.
01:09:21.440 | And Anthropic's documentation describes
01:09:26.480 | how they expect images to come in.
01:09:31.040 | Here we are.
01:09:35.280 | So yeah, basically, this is just something
01:09:43.960 | which takes in the bytes of an image
01:09:46.520 | and creates the message that they expect,
01:09:49.720 | which is type image.
01:09:51.400 | And the source is a dictionary containing base64,
01:09:56.360 | a MIME type, and the data.
01:09:57.760 | Anyway, you don't have to worry about any of that
01:09:59.760 | because it does it all for you.
01:10:01.960 | And so if we have a look at that,
01:10:09.480 | that's what it looks like.
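That block can be built with a few lines of standard library code. This is a sketch — the helper name is illustrative — but the block shape matches Anthropic's documented base64 image format.

```python
import base64

def img_msg(data: bytes, media_type: str = "image/jpeg") -> dict:
    "Wrap raw image bytes in the content block shape Anthropic expects."
    return {"type": "image",
            "source": {"type": "base64",
                       "media_type": media_type,
                       "data": base64.b64encode(data).decode()}}

raw = b"\x89PNG\r\n\x1a\n"  # stand-in for real image bytes
msg = img_msg(raw, media_type="image/png")
print(msg["type"], msg["source"]["media_type"])  # -> image image/png
```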
01:10:12.160 | And so they're kind of composable like this.
01:10:16.040 | You can have multiple images and multiple pieces of text
01:10:18.960 | in a request.
01:10:19.840 | They can be interleaved, whatever.
01:10:21.920 | So to do that, it means you can't just pass in strings.
01:10:25.440 | You have to pass in little dictionaries with type
01:10:28.560 | text and the string.
01:10:30.680 | So here we can say, all right, let's create--
01:10:33.560 | this is a single message containing multiple parts.
01:10:37.760 | So maybe these functions should be called image part and text
01:10:40.800 | part.
01:10:41.280 | I don't know.
01:10:42.560 | But they're not.
01:10:44.240 | A single message contains an image and this prompt.
01:10:51.040 | And so then we pass that in.
01:10:52.560 | And you see I'm passing in a list
01:10:54.120 | because it's a list of messages.
01:10:55.880 | And the first message contains a list of parts.
01:10:58.520 | And yep, it does contain purple flowers.
01:11:01.880 | And then it's like, OK, well, there's
01:11:07.120 | no particular reason to have to manually do these things.
01:11:09.920 | We can perfectly well just look and see, oh, it's a string.
01:11:13.080 | We should make it a text message, or it's bytes.
01:11:15.120 | We should make it an image message.
01:11:16.600 | So I just have a little private helper.
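A helper like that just dispatches on the Python type of each part, roughly as below. The names are illustrative, and this sketch hardcodes JPEG for brevity.

```python
import base64

def mk_content(part):
    "Turn a plain string or raw bytes into the right content-block dict."
    if isinstance(part, bytes):  # bytes -> image block (assume JPEG here)
        return {"type": "image",
                "source": {"type": "base64", "media_type": "image/jpeg",
                           "data": base64.b64encode(part).decode()}}
    return {"type": "text", "text": part}  # strings -> text block

parts = [mk_content(p) for p in
         ("In brief, what color flowers are in this image?", b"\xff\xd8\xff")]
print([p["type"] for p in parts])  # -> ['text', 'image']
```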
01:11:20.080 | And then finally, I've changed makeMessage.
01:11:23.160 | This is something I remember Jono and I talked about.
01:11:25.360 | Jono was saying--
01:11:26.320 | I think you said you feel like this is kind of like part
01:11:29.440 | of the Jeremy way of coding is I don't go back and refactor
01:11:33.280 | things, but I just redefine them later in my notebook.
01:11:36.960 | And so I previously hadn't exported makeMessage.
01:11:39.560 | I don't export it till now.
01:11:42.480 | And so here's my final version that's
01:11:44.160 | now going to actually call makeContent to automatically
01:11:49.640 | handle images as well.
01:11:52.160 | And so now we can just pass in-- we can call our client.
01:11:54.840 | We can pass in a list of one message.
01:11:58.120 | The list of one message contains a list of parts,
01:12:02.640 | as you can see.
01:12:06.400 | So behind the scenes, when we then run the last cell,
01:12:15.040 | it actually generates a Python file containing
01:12:23.000 | all of the exported code.
01:12:24.960 | So it's 229 lines, which isn't much,
01:12:29.240 | particularly when you look at how much empty space there is.
01:12:32.160 | And all these things say which cell it comes from, and so forth.
01:12:34.800 | So in terms of actual code, it'll
01:12:36.560 | be well under 200 lines of code.
01:12:37.960 | OK, so that is the first of two modules to look at.
01:12:47.200 | Any thoughts or questions before we move on to the tool loop?
01:12:50.600 | I think it's coming through.
01:12:55.760 | Maybe do you want to go into your objective
01:12:59.680 | when you started this, if it was beyond what you've already
01:13:02.840 | shown, like what was the goal always
01:13:07.000 | to keep this a simple self-contained thing?
01:13:08.920 | Is there plans for this to grow into a fully stateful chat
01:13:13.680 | thing that can offer up different functionality?
01:13:15.880 | What's the journey of, oh, I should write this thing that's
01:13:20.440 | going into this?
01:13:21.720 | I mean, I imagine I must be a very frustrating person
01:13:24.800 | to work with because I'd never have any plans, really.
01:13:28.840 | I just have this vague, intuitive feeling
01:13:32.360 | that maybe I should do something in this general direction.
01:13:34.920 | And then people ask me, like, oh, why are you doing that?
01:13:37.600 | So it's like, I don't know.
01:13:41.160 | Just seems like, why not?
01:13:43.120 | Seems like a good idea.
01:13:44.120 | So yeah, I don't think I had any particular plans
01:13:47.560 | to where it would end up.
01:13:48.600 | Just a sense of like--
01:13:49.840 | the way I saw things being written,
01:13:57.800 | including in the Anthropic documentation for Claude,
01:14:00.080 | seemed unfairly difficult. I didn't
01:14:04.400 | think people should have to write stuff like that.
01:14:07.840 | And then when I started to write my own thing using
01:14:11.240 | the Anthropic client, I didn't find
01:14:13.640 | it very ergonomic and nice.
01:14:16.400 | I looked at some of the things that are out there,
01:14:20.040 | kind of general LLM toolkits, APIs, libraries.
01:14:25.120 | And on the whole, I found them really complicated, too long,
01:14:31.800 | too many new abstractions, not really
01:14:34.200 | taking advantage of my existing Python knowledge.
01:14:36.800 | So I guess that was my high-level hope.
01:14:40.560 | Simon Willison has a nice library
01:14:42.160 | called LLM, which Jono and I started looking at together.
01:14:47.200 | But it was missing a lot of the features that we wanted.
01:14:50.800 | And we did end up adding one as a PR.
01:14:54.160 | Not that it's been merged yet.
01:14:55.400 | But yeah, in the end, I guess the other thing about--
01:15:00.680 | so the interesting thing about Simon's approach with LLM
01:15:03.840 | is it's a general front end to dozens of different LLM
01:15:09.760 | backends, open source, and proprietary,
01:15:12.560 | and inference services.
01:15:15.160 | And as a result, he kind of has to have this lowest common
01:15:20.280 | denominator API of like, oh, they all support this.
01:15:24.040 | So that's kind of all we support.
01:15:26.800 | So this was a bit of an experiment in being like,
01:15:28.840 | OK, I'm going to make this as Claude-friendly as possible.
01:15:32.440 | Which is why I even gave it a name based on Claude.
01:15:35.720 | Because I was like, I want this to be--
01:15:38.440 | you know, that's why I said this is Claude's friend.
01:15:41.720 | I wanted to make it like something that
01:15:43.400 | worked really well with Claude.
01:15:44.800 | And I didn't know ahead of time whether that would turn out
01:15:47.200 | to be something I could then use elsewhere
01:15:49.520 | with slight differences or not.
01:15:52.880 | So that was kind of the goal.
01:15:54.160 | So where it's got to--
01:16:00.120 | I think what's happened in the few weeks
01:16:05.560 | since I started writing it is there's
01:16:07.680 | been a continuing standardization.
01:16:12.600 | Like, the platforms are getting-- which is nice,
01:16:14.960 | more and more similar.
01:16:16.680 | So the plan now, I think, is that there
01:16:20.840 | will be GPT's friend and Gemini's friend as well.
01:16:28.560 | GPT's friend is nearly done, actually.
01:16:30.240 | Maybe they'll have an entirely consistent API.
01:16:35.520 | We'll see, you know, or not.
01:16:39.440 | But again, I'm kind of writing each of them
01:16:41.680 | to be as good as possible to work with that LLM.
01:16:44.800 | And then I'll worry about like, OK,
01:16:46.480 | is it possible to make them compatible with each other
01:16:49.200 | later?
01:16:51.320 | And I think that's something--
01:16:52.560 | I mean, I'd be interested to hear your thoughts, Jono.
01:16:54.760 | But like, when we wrote the GPT version together,
01:17:01.360 | and we literally just duplicated the original Claudette notebook,
01:17:06.640 | started at the top cell, and just changed the--
01:17:10.880 | and did a search and replace of Anthropic with OpenAI,
01:17:15.080 | and of Claude with GPT, and then just went through each cell one
01:17:18.760 | at a time to see how do you port that to the OpenAI API.
01:17:24.200 | And I found that it took us, what, a couple of hours?
01:17:27.280 | It felt like a very--
01:17:28.200 | It was very quick, yeah.
01:17:29.440 | Simple.
01:17:30.080 | Didn't have to use my brain much.
01:17:32.840 | Yeah.
01:17:33.360 | I mean, that's maybe worth highlighting,
01:17:35.040 | is that this is not the full and only output of the AnswerAI
01:17:38.600 | organization over the last month.
01:17:40.000 | This is like, oh, you saw things Jeremy is tinkering with just
01:17:43.600 | on the side.
01:17:44.320 | So maybe, yeah, it's good to set expectations appropriately.
01:17:46.800 | But also, yeah, it didn't really feel like it was pretty easy,
01:17:48.680 | especially because I think they've all
01:17:50.520 | been inspired by, is the generous way of saying it,
01:17:52.520 | each other.
01:17:52.960 | And OpenAI, I think, maybe led the way
01:17:54.520 | with some of the API stuff.
01:17:55.640 | So yeah, it's chat.completions.create
01:17:58.240 | versus anthropic.client.messages.create
01:18:01.120 | or something.
01:18:02.200 | In a standard IDE environment, I think
01:18:04.520 | I would have found it a lot harder.
01:18:07.920 | You know, because it's so sequential, in some ways,
01:18:10.800 | it could feel like a bit of a constraint.
01:18:12.480 | But it doesn't mean you can do it from the top
01:18:16.720 | and you go all the way through until you get to the bottom.
01:18:19.120 | You don't have to jump around the place.
01:18:21.040 | Right.
01:18:21.560 | And the only part that was even mildly tricky
01:18:24.040 | then ended up being a change that we
01:18:25.960 | made for the OpenAI one, which instantly got mirrored back
01:18:28.600 | to Claude.
01:18:29.120 | And then, again, because they were built in the same way,
01:18:30.880 | it was like, oh, we've tweaked the way we do.
01:18:32.680 | I think it was streaming.
01:18:33.840 | One of the things that-- yeah.
01:18:35.120 | Like, OK, we've figured out a nice way
01:18:36.320 | to do that in the second rewrite.
01:18:37.960 | It was very easy to just go and find the equivalent function,
01:18:40.520 | because the two are so close together.
01:18:43.120 | Yeah, so I also-- it's quite a nice way
01:18:44.720 | to write software, especially for this kind of like--
01:18:47.080 | it's not going to grow too much in scope
01:18:49.240 | beyond what is one or two notebooks of stuff.
01:18:51.520 | I don't think it's not--
01:18:52.600 | If it did, I would add another project or another notebook.
01:18:56.720 | Like, I wouldn't change these.
01:18:58.560 | These are kind of like the bases which we can build on.
01:19:02.240 | Yeah.
01:19:03.120 | Yeah.
01:19:03.640 | OK, let's keep going then.
01:19:06.320 | So there's just one more notebook.
01:19:08.520 | And this, hopefully, will be a useful review
01:19:14.160 | of the ReAct framework.
01:19:16.400 | So yeah, Anthropic has this nice example
01:19:20.560 | in their documentation of a customer service agent.
01:19:24.200 | And again, there's a lot of this boilerplate.
01:19:30.960 | And then it's all a second time, because it's now the functions.
01:19:40.080 | And so basically, the idea is here,
01:19:41.680 | there's like a little pretend bunch of customers
01:19:46.080 | and a pretend bunch of orders.
01:19:48.400 | And I made this a bit more--
01:19:50.200 | a little bit more sophisticated.
01:19:51.520 | These customers don't have orders.
01:19:53.040 | The orders are not connected to customers.
01:19:54.800 | In my version, I have the orders separately.
01:19:59.760 | And then each customer has a number of orders.
01:20:03.440 | So it's a kind of a relational, but it's more like MongoDB
01:20:07.400 | style or whatever, denormalized.
01:20:10.760 | Not a relational database.
01:20:13.400 | Yeah, so they basically describe this rather long, complex
01:20:19.800 | process.
01:20:20.320 | And as you can see, they do absolutely everything
01:20:22.400 | manually, which maybe that's fine
01:20:25.440 | if you're really trying to show people the very low level
01:20:28.760 | details.
01:20:29.800 | But I thought it'd be fun to do exactly the same thing,
01:20:32.600 | but make it super simple.
01:20:33.920 | And also make it more sophisticated
01:20:35.520 | by adding some really important features
01:20:37.760 | that they never implemented.
01:20:39.720 | So the first feature they implement is getCustomerInfo.
01:20:44.400 | You pass in a customer ID, which is a string,
01:20:48.040 | and you get back the details.
01:20:52.400 | So that's what it is, customers.get.
01:20:55.720 | And so you'll see here we've got the documents,
01:20:59.160 | we've got the doc string, and we've got the type.
01:21:02.960 | So everything necessary to get a schema.
01:21:09.440 | Same thing for order details, orders.get.
01:21:14.400 | And then something that they didn't quite implement
01:21:18.080 | is a proper cancel order.
01:21:20.400 | So if order ID not in orders, so you
01:21:23.760 | can see we're returning a bool.
01:21:27.040 | So if the order ID is not there, we
01:21:28.640 | were not able to cancel it.
01:21:30.360 | If it is there, then we'll set the status to cancelled
01:21:34.520 | and return true.
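Put together, the demo tools look roughly like this. The data and function names here are made up for illustration; the point is that the docstrings and type hints carry everything the schema generator needs.

```python
# Denormalized in-memory "database": orders live separately, and each
# customer holds references to their orders.
orders = {"O1": {"id": "O1", "product": "Widget A", "status": "Shipped"},
          "O2": {"id": "O2", "product": "Widget B", "status": "Processing"}}
customers = {"C1": {"name": "Jeremy", "email": "j@example.com",
                    "orders": [orders["O1"], orders["O2"]]}}

def get_customer_info(customer_id: str) -> dict:
    "Retrieves a customer's info and their orders based on the customer ID."
    return customers.get(customer_id, "Unknown customer")

def cancel_order(order_id: str) -> bool:
    "Cancels an order based on the order ID; returns True if successful."
    if order_id not in orders:
        return False
    orders[order_id]["status"] = "Cancelled"
    return True

print(cancel_order("O2"), cancel_order("O6"))  # -> True False
```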
01:21:36.560 | OK, so this is interesting now.
01:21:40.440 | We've got more than one tool.
01:21:43.640 | And the only reason that Claude can possibly
01:21:48.480 | know what tool to use when, if any,
01:21:51.240 | is from their descriptions in the doc string here.
01:21:55.600 | So if we now go chat.tools, because we passed it in,
01:22:02.920 | you can see all the functions are there.
01:22:05.120 | And so when it calls them, it's going to, behind the scenes,
01:22:08.120 | automatically call getSchema on each one.
01:22:13.560 | But to see what that looks like, we could just do it here.
01:22:16.920 | And so getSchema is actually defined
01:22:24.200 | in a different library, which we created called toolslm.
01:22:34.600 | OK, getSchema, oops, O for O in chat.tools, there you go.
01:22:46.800 | So you basically end up with something
01:22:48.400 | pretty similar to what Anthropic's version had manually.
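Here's a toy version of what a schema generator like getSchema does, reading the signature, type hints, and docstring. This is heavily simplified for illustration; toolslm's real version handles far more cases.

```python
import inspect

# Map Python annotations to JSON-schema type names (tiny subset).
PYTYPES = {int: "integer", str: "string", bool: "boolean", float: "number"}

def get_schema(f):
    "Build an Anthropic-style tool schema from a function's signature."
    sig = inspect.signature(f)
    props = {name: {"type": PYTYPES.get(p.annotation, "string")}
             for name, p in sig.parameters.items()}
    return {"name": f.__name__,
            "description": inspect.getdoc(f),
            "input_schema": {"type": "object", "properties": props,
                             "required": list(props)}}

def sums(a: int, b: int) -> int:
    "Adds a + b."
    return a + b

schema = get_schema(sums)
print(schema["input_schema"]["properties"])
# -> {'a': {'type': 'integer'}, 'b': {'type': 'integer'}}
```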
01:22:54.160 | So yeah, we can say, tell me the email address of customer C1.
01:22:59.080 | And I mean, Claude doesn't know.
01:23:03.400 | So it says, oh, you need to do a tool use.
01:23:08.720 | You need to call this function.
01:23:10.680 | You need to pass in this input.
01:23:13.400 | And so remember, with our thing, that's already now
01:23:17.040 | already got added to the history.
01:23:18.760 | So we just call chat.
01:23:20.120 | And it automatically calls it on our history.
01:23:24.120 | And there it is.
01:23:25.400 | And you can see this retrieving customer C1
01:23:27.480 | is because we added a print here.
01:23:31.240 | So you can see, as soon as we got that request,
01:23:35.640 | we went ahead and retrieved C1.
01:23:37.640 | And so then we call chat.
01:23:38.800 | It just passes it back.
01:23:40.280 | And there we go.
01:23:41.400 | There's our answer.
01:23:43.680 | So can I channel my inner--
01:23:45.600 | our dear friend Hamel has a thing about saying,
01:23:47.720 | you've got to show me the prompt.
01:23:49.080 | I always want to be able to inspect what's going on.
01:23:51.320 | Maybe we could do this at a couple of different levels.
01:23:53.600 | But can we see what was fed to the model?
01:23:55.560 | What was the history?
01:23:56.600 | Or what was the most recent request?
01:23:57.800 | Something like that.
01:23:58.600 | Yeah, so there's our-- here's our history.
01:24:00.800 | So the first message we passed in
01:24:03.800 | was, tell me the email address.
01:24:06.160 | It passed back an assistant message, which
01:24:08.320 | was a tool use block asking for calling this function
01:24:14.080 | with these parameters.
01:24:16.080 | And then we passed back-- and that had a particular ID.
01:24:18.480 | We passed back saying, oh, that tool ID request,
01:24:20.720 | we've answered it for you.
01:24:23.000 | And this was the response we got.
01:24:30.360 | And then it told us--
01:24:33.440 | OK, there's-- it's just telling us what we told it.
01:24:37.880 | Right.
01:24:39.520 | And then if I was really paranoid,
01:24:40.840 | like I wanted to see the actual tool definitions and things,
01:24:43.440 | and the actual requests, is there
01:24:44.920 | a way to dig to that deeper level
01:24:46.960 | beyond just looking at the history?
01:24:48.520 | Yeah, so we can do this.
01:24:52.680 | It's a bit of fun.
01:24:58.320 | So that has to be done before you import Anthropic.
01:25:02.680 | So we'll set it to debug.
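The trick is just an environment variable that the SDK reads when it's first imported — assuming the `ANTHROPIC_LOG` variable described in the SDK's docs — so it has to be set before the import:

```python
import os

# Must be set before `import anthropic`: the SDK configures its logger
# when the module is first imported.
os.environ["ANTHROPIC_LOG"] = "debug"

# import anthropic  # every HTTP request/response would now be logged
```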
01:25:05.880 | And so now if we call that, OK, it
01:25:15.280 | tells us everything that's going on.
01:25:17.440 | And so here is the request method post URL, headers,
01:25:25.000 | JSON data, HTTP request.
01:25:29.240 | Nice.
01:25:33.160 | Yeah.
01:25:34.600 | So if we do that here.
01:25:38.040 | So now this is including all of that same information again,
01:25:44.160 | because the model on Anthropic's side is not stateful.
01:25:47.240 | So we pass the full history.
01:25:48.400 | We can see, OK, we've still got all of the tools,
01:25:52.480 | the definitions in there.
01:25:53.800 | We've still got all the previous messages.
01:25:56.160 | Yeah, so this is like, it's a bit of a pain
01:25:58.320 | to have all this output all of the time.
01:26:00.000 | But if you're playing around with this,
01:26:01.600 | I'd recommend turning this on until you
01:26:03.320 | can trust that the library does what you want.
01:26:07.480 | And it's nice to be able to have--
01:26:08.920 | Thank you, Anthropic, for having that environment variable.
01:26:11.400 | It's very nice.
01:26:13.080 | Yeah, because in the end, if you're stuck on something,
01:26:15.680 | then all that's happening is that those pieces of text
01:26:18.880 | are being passed over an HTTP connection
01:26:21.160 | and passed back again.
01:26:22.120 | So there is nothing else.
01:26:24.400 | So that's a full debug.
01:26:26.920 | Thanks.
01:26:27.400 | That's a good question, Jono.
01:26:28.640 | So yeah, this is an interesting request.
01:26:33.080 | Please cancel all orders for customer C1.
01:26:36.480 | So this is interesting, because it can't be done in one go.
01:26:41.760 | So the answer it gave us was, OK, tell me about customer C1.
01:26:48.600 | But that doesn't finish it.
01:26:50.840 | So what actually happens?
01:26:52.960 | Well, I mean, we could actually show that.
01:26:54.720 | So if we pass it back, then it says, OK, there are two orders.
01:27:06.840 | Let's cancel each one.
01:27:10.000 | And it has a tool use request.
01:27:15.560 | So it's passed back some text and a tool use request
01:27:19.040 | to cancel order A1.
01:27:22.000 | It's not being that smart.
01:27:23.240 | And I think it's because we're using Haiku.
01:27:25.000 | If we are using Opus, it probably
01:27:27.200 | would have had both tool use requests in one go.
01:27:31.040 | Well, let's find out.
01:27:32.000 | So if we change this model to model 0--
01:27:38.760 | Definitely slower.
01:27:45.120 | Definitely slower.
01:27:47.120 | Yeah.
01:27:49.200 | Oh, you can see here.
01:27:50.120 | So this is something interesting that it does.
01:27:52.400 | It has these thinking blocks.
01:27:57.960 | That's something that Opus, in particular, does.
01:28:00.560 | So then-- no, OK, it's still only doing one at a time.
01:28:10.120 | So it's fine.
01:28:10.920 | Does it only use those thinking blocks
01:28:15.480 | when it needs to do tool use?
01:28:16.360 | I haven't seen them before when I do API access.
01:28:19.360 | OK, that's why I haven't seen them.
01:28:20.920 | As far as I know.
01:28:22.560 | OK, so basically, you can see we're
01:28:24.280 | going to have to-- given that it's only doing one at a time,
01:28:26.800 | it's going to take at least three goes--
01:28:28.640 | one to get the information about the customer,
01:28:30.840 | then to cancel order A1, and then to cancel order A2.
01:28:33.920 | But each time that we get back another tool use request,
01:28:38.200 | we should just do it automatically.
01:28:39.680 | There's no need to manually do this.
01:28:42.400 | So we've added a thing here called tool loop.
01:28:46.280 | And for up to 10 steps, it'll check whether it's
01:28:51.480 | asked for more tool use.
01:28:53.680 | And if so, it just calls self again.
01:29:01.880 | That's it.
01:29:03.520 | Just like we just called-- because self is chat, right?
01:29:06.960 | Just keeps doing it again and again.
01:29:09.880 | Optionally, I added a function that you could just
01:29:12.120 | call, for example, trace func equals print.
01:29:14.200 | It'll just print out the request each time.
01:29:16.480 | And I also added a thing called continuation func, which
01:29:19.440 | is whether you want to continue.
01:29:21.600 | So if these are both empty, then nothing happens.
01:29:24.160 | It's just doing that again and again and again.
01:29:28.000 | So super simple function.
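The loop itself can be sketched like this. The names are illustrative stand-ins for Claudette's API, the `stop_reason` values (`tool_use`, `end_turn`) match the Anthropic API, and a fake chat stands in for the real one.

```python
def toolloop(chat, prompt, max_steps=10, trace_func=None):
    "Keep calling `chat` while the model keeps asking for tools."
    r = chat(prompt)
    for _ in range(max_steps):
        if trace_func:
            trace_func(r)
        if r.get("stop_reason") != "tool_use":
            break          # the model gave a final text answer
        r = chat(None)     # tool result is already in history; just continue
    return r

# Fake chat: asks for two tools, then answers.
replies = [{"stop_reason": "tool_use"},
           {"stop_reason": "tool_use"},
           {"stop_reason": "end_turn", "text": "All orders cancelled."}]
def fake_chat(prompt=None):
    return replies.pop(0)

final = toolloop(fake_chat, "Please cancel all orders for customer C1")
print(final["text"])  # -> All orders cancelled.
```

A `cont_func` hook, as mentioned above, would just be one more check inside the loop deciding whether to keep going.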
01:29:30.280 | So now, if we say, can you tell me the email address
01:29:35.200 | for customer C1, we never have to--
01:29:39.520 | we never have to do that.
01:29:43.200 | Just does it for us until it's finished.
01:29:44.960 | And it says, sure, there you go.
01:29:46.840 | It's like, OK, please cancel all orders for customer C1.
01:29:51.000 | Retrieving, canceling.
01:29:52.880 | Now, why did it only do O2?
01:29:57.680 | Oh, I think we already canceled O1.
01:30:00.360 | Let's do that again.
01:30:01.280 | There we go.
01:30:07.200 | So we are agentic, are we not?
01:30:11.320 | Yes, definitely.
01:30:14.880 | Yeah, so when people say I've made an agent,
01:30:16.680 | it's like, oh, congratulations.
01:30:17.960 | You have a for loop that calls the thing 10 times.
01:30:23.400 | It's not very fancy, but it's nice.
01:30:28.360 | It's such a simple thing.
01:30:32.200 | And so now we can ask it, like, OK, how
01:30:35.680 | did we go with O2 again?
01:30:35.680 | And remember, it's got the whole history, right?
01:30:40.400 | So it now can say, like, oh, yeah,
01:30:42.000 | you told me to cancel it.
01:30:43.080 | It is canceled.
01:30:44.360 | Cool, cool.
01:30:46.920 | So something I never tried is--
01:30:58.240 | great, now cancel order O6.
01:31:04.920 | I think it should get back false, and it should know.
01:31:09.880 | Yeah, there we go.
01:31:10.640 | Not successful, that's good.
01:31:17.040 | Nice.
01:31:19.200 | So here's a fun example.
01:31:21.560 | Let's implement Code Interpreter,
01:31:23.640 | just like ChatGPT, because Claude doesn't
01:31:25.360 | have the Code Interpreter.
01:31:26.800 | So now it does.
01:31:29.480 | So I created this little library called ToolsLM.
01:31:32.800 | So we've already used one thing from it,
01:31:40.200 | which is get_schema, which is this little thing here.
01:31:45.840 | And it's actually got a little example of a Python Code
01:31:53.200 | Interpreter there.
01:31:55.760 | Yeah, it's also got this little thing called Shell.
01:32:00.360 | So yeah, we're going to use that.
01:32:09.880 | So get_shell is just a little Python interpreter.
01:32:12.600 | So we're going to create a subclass of chat
01:32:20.400 | called CodeChat.
01:32:23.920 | And CodeChat is going to have one extra method in it,
01:32:28.800 | which is to run code.
01:32:31.520 | So code to execute in a persistent session.
01:32:34.880 | So this is important to tell it all this stuff,
01:32:37.080 | like it's a persistent IPython session.
01:32:39.480 | And the result of the expression on the last line
01:32:41.720 | is what we get back.
01:32:43.360 | If the user declines request to execute, then it's declined.
01:32:51.080 | And so you can see here, I have this little confirmation
01:32:54.320 | message.
01:32:54.880 | So I call input with that message.
01:32:57.560 | And if they say no thank you, then I return declined.
01:32:59.840 | And I try to encourage it to actually do
01:33:06.200 | complex calculations yourself.
01:33:08.800 | And I have a list of imports that I do automatically.
01:33:13.840 | So that's part of the system prompt.
01:33:17.120 | You've already got these imports done.
01:33:20.240 | And just a little reminder, Haiku is not so smart.
01:33:22.960 | So I tend to be a little bit more verbose about reminding it
01:33:26.880 | about things.
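A hedged sketch of the idea behind `run_cell`: one shared namespace persists across calls, and the value of a trailing expression is returned. The real CodeChat uses the IPython shell from toolslm's `get_shell` and asks for confirmation via `input()` first; this stand-in uses plain `exec`/`eval` instead.

```python
import ast

def make_run_cell():
    "Return a 'run code' tool whose namespace persists across calls."
    ns = {}
    def run_cell(code):
        "Execute `code`; return the value of the last expression, if any."
        tree = ast.parse(code)
        last = None
        # If the final statement is a bare expression, eval it separately
        # so we can return its value, like an IPython cell does.
        if tree.body and isinstance(tree.body[-1], ast.Expr):
            last = ast.Expression(tree.body.pop().value)
        exec(compile(tree, "<cell>", "exec"), ns)
        if last is not None:
            return eval(compile(last, "<cell>", "eval"), ns)
    return run_cell

run_cell = make_run_cell()
run_cell("def double(x): return x * 2")  # defines `double` in the session
out = run_cell("double(21)")             # a later call can still see it
```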
01:33:27.920 | And I wanted to see if it could combine the Python
01:33:32.520 | tool with other tools.
01:33:33.600 | So I created a simple little thing here called getUser.
01:33:36.160 | It just returns my name.
01:33:37.280 | So if I do CodeChat--
01:33:44.400 | so I'm going to use Sonnet, which
01:33:47.840 | is less stupid than Haiku.
01:33:50.600 | So in trying to figure out how to get Haiku to work--
01:33:56.720 | in fact, let's use Haiku.
01:33:59.440 | One thing that I found really helped a lot
01:34:01.440 | was to give it more examples of how things ought to work.
01:34:05.240 | So I actually just set the history here.
01:34:08.040 | So I said, oh, I asked you to do this.
01:34:17.040 | And then you gave me this answer.
01:34:18.720 | And then I asked you to do this.
01:34:20.240 | And you gave me this answer.
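Seeding the history looks something like this sketch. The dicts follow the general Anthropic messages shape, but the contents here are made up for illustration:

```python
# Few-shot "priming": plain-text examples of the behavior we want,
# installed directly as prior conversation turns.
# Roles must alternate user / assistant.
history = [
    {"role": "user",
     "content": "Create a 1-line function `hi` that returns 'hello'"},
    {"role": "assistant",
     "content": "hi = lambda: 'hello'"},
    {"role": "user",
     "content": "Now use it to greet me"},
    {"role": "assistant",
     "content": "hi()"},
]
roles = [m["role"] for m in history]
```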
01:34:21.560 | So these aren't the actual messages
01:34:32.160 | that would include the actual tool calling syntax or anything?
01:34:35.000 | That doesn't cause trouble--
01:34:35.840 | No, I didn't bother with that.
01:34:37.120 | --in plain text?
01:34:37.760 | Yeah.
01:34:39.000 | It seems to be enough for it to know what I'm talking about.
01:34:44.080 | Yeah.
01:34:45.000 | If you wanted that full thing, I guess
01:34:46.800 | you could have this conversation with it,
01:34:49.080 | install the history or something like that.
01:34:50.880 | Well, I'm going to add it.
01:34:53.120 | So for the OpenAI one, I just added today, actually,
01:34:56.440 | something called mockToolUse, which
01:34:59.640 | is a function you can call because GPT does care.
01:35:03.560 | So we might add the same thing here, mockToolUse.
01:35:08.440 | And you just pass in the--
01:35:10.480 | yeah, here's the function you're pretending to call.
01:35:12.680 | Here's the result we're pretending that function had.
01:35:14.920 | Yeah.
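A sketch of what such a mock-tool-use helper might do, purely to illustrate the idea being described. The dict shapes loosely follow Anthropic's tool-use message format; none of this is the actual Claudette or Cosette API:

```python
def mock_tooluse(history, func_name, args, result):
    "Append a pretend tool call and its pretend result to the history."
    history.append({"role": "assistant", "content": [
        {"type": "tool_use", "name": func_name, "input": args}]})
    history.append({"role": "user", "content": [
        {"type": "tool_result", "content": str(result)}]})

history = []
mock_tooluse(history, "get_user", {}, "Jeremy")
```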
01:35:15.420 | OK, so create a one-line--
01:35:25.240 | no, must have broken it at some point.
01:35:29.200 | OK, we'll use Sonnet.
01:35:35.400 | Create a one-line function for a string s
01:35:38.040 | that multiplies together the ASCII values of each character
01:35:40.720 | in s using reduce.
01:35:44.480 | Call tool loop with that.
01:35:48.120 | And OK, press Enter to execute or N to skip.
01:35:52.160 | So that's just coming from this input with this message.
01:35:57.120 | So it's actually, if you enter anything at all, it'll stop.
01:36:08.240 | We'll press N. OK, so it responded with a tool use
01:36:17.400 | request to run this code.
01:36:19.920 | And because it's in the tool loop, it did run that code.
01:36:28.480 | And it also responded with some text.
01:36:31.000 | So it's responded with both text as well as a tool use request.
01:36:37.720 | And this doesn't return anything.
01:36:44.240 | The print doesn't return anything.
01:36:45.840 | So all that's happened is that behind the scenes,
01:36:52.840 | we created this interactive Python shell
01:37:00.960 | called a self.shell.
01:37:02.760 | Self.shell ran the code it requested.
01:37:05.880 | So that shell should now have in it
01:37:08.960 | a function called checksum.
01:37:12.040 | So in fact, we can have a look at that.
01:37:14.920 | There's the shell.
01:37:16.920 | And we can even run code in it.
01:37:18.800 | So if I just write checksum, that should show it to me.
01:37:24.600 | Result equals function lambda.
01:37:33.240 | Checksum, there you go.
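For reference, the one-liner being asked for looks something like this (my reconstruction, not necessarily the exact code Claude produced):

```python
from functools import reduce

# Multiply together the ASCII values of each character in s.
checksum = lambda s: reduce(lambda acc, ch: acc * ord(ch), s, 1)

result = checksum("ab")  # 97 * 98
```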
01:37:41.080 | So you can play around with the interpreter yourself.
01:37:43.880 | So you can see it has the interpreter has now
01:37:45.640 | got this function defined.
01:37:46.960 | And so this is where it gets quite interesting.
01:37:48.920 | Use it to get the checksum of the username of this session.
01:37:51.960 | So it knows that one of the tools it's been told exists
01:37:57.640 | is getUser.
01:37:59.240 | And in this CodeChat, I automatically
01:38:01.880 | append self.run_cell to the tools list.
01:38:12.720 | So it now knows that it can get the user and it can run a cell.
01:38:15.640 | Sorry, it can run a cell.
01:38:18.080 | So if I now call that, you can see it's called getUser.
01:38:24.440 | Found out the name's Jeremy.
01:38:25.960 | Then asked to get the checksum of Jeremy.
01:38:28.080 | There it goes.
01:38:28.840 | So you can see this is doing a tool
01:38:31.120 | use with multiple tools, including our code interpreter,
01:38:34.760 | which I think is a pretty powerful concept, actually.
01:38:43.440 | And if you wanted to see the actual code it was writing,
01:38:46.160 | you could change the trace function,
01:38:47.720 | or look at the history, or inspect that in some other way.
01:38:50.440 | Yeah, so we could change the trace function
01:38:53.480 | to print, for example.
01:38:55.960 | So we've used showContents, which is specifically just
01:38:59.760 | trying to find the interesting bit.
01:39:01.440 | If we change it to print, it'll show everything.
01:39:05.120 | Or yeah, you can do whatever you like in that trace function.
01:39:07.620 | You don't really have to show things.
01:39:10.400 | And of course, we could also set the Anthropic library's
01:39:13.920 | debug logging to see all the requests going through.
01:39:16.880 | So yeah, none of this needs to be mysterious.
01:39:22.720 | Yeah, so at the end of all this, we
01:39:25.880 | end up with a pretty convenient wrapper,
01:39:33.480 | where the only thing I bother documenting on the home page
01:39:36.240 | is chat, because that's what 99.9% of people use.
01:39:39.760 | You just call chat, you just pass in a system prompt,
01:39:42.880 | you pass in messages, you can use tools,
01:39:49.760 | and you can use images.
01:39:52.000 | So for the user, there's not much to know, really.
01:39:56.680 | Right.
01:39:57.200 | It's only if you want to mess around making your own code
01:39:59.640 | interpreter, or trying something that you'd even
01:40:02.880 | look a little bit deeper.
01:40:06.000 | Yeah.
01:40:06.920 | Yeah.
01:40:07.560 | Exactly.
01:40:08.060 | I don't know.
01:40:11.680 | I mean, I feel like I'm quite excited about this way
01:40:15.040 | of writing code as being something
01:40:19.200 | like I feel like I can show you guys.
01:40:22.120 | Like, all I did just now was walk through the source
01:40:25.880 | code of the module.
01:40:28.880 | But in the process, hopefully, you
01:40:33.040 | might have done it all already, but if you didn't,
01:40:35.160 | you learned something about Claude,
01:40:38.280 | and the anthropic API, and the React pattern,
01:40:42.360 | and blah, blah, blah.
01:40:45.400 | And I remember I asked you, Jono, when I first
01:40:48.200 | wrote it, actually, and I said, oh,
01:40:50.160 | could you read this notebook and tell me what you think?
01:40:52.480 | And you said the next day, OK, I read the notebook.
01:40:54.600 | I now feel ready that I could both use and work
01:40:58.680 | on the development of this module.
01:41:01.720 | I was like, OK, that's great.
01:41:04.040 | Right, yeah.
01:41:05.120 | And I think it definitely depends
01:41:06.960 | where you're coming from, how comfortable you
01:41:08.760 | are at those different stages.
01:41:10.040 | I think using it, it's very nice, very approachable.
01:41:12.920 | You get the website with the documentation right there.
01:41:15.240 | The examples that you use to develop the pieces
01:41:17.880 | are always useful examples.
01:41:20.440 | Yeah, reading through the source code definitely felt like, OK,
01:41:23.520 | I think I could grasp where I needed to make changes.
01:41:26.320 | I had to add something for a different project
01:41:28.320 | we were working on.
01:41:29.120 | It was, OK, I think I can see the bits that
01:41:30.920 | are important for this.
01:41:32.880 | Yeah, so it's quite a fun way to build stuff.
01:41:35.040 | I think it's quite approachable.
01:41:40.080 | I think the bits that I'd expect people
01:41:44.280 | might find a little tricky if they
01:41:45.680 | haven't seen this sort of thing before.
01:41:47.560 | There's two bits.
01:41:48.320 | One is you have some personal metaprogramming practices that
01:41:54.720 | aren't part of normal Python.
01:41:55.960 | So patching into classes instead of defining in classes,
01:41:59.960 | liberal use of delegates, liberal use of double star
01:42:04.200 | keyword arg unpacking, stuff like that.
01:42:07.480 | And then the second is the not exactly eval,
01:42:12.280 | but eval-ish metaprogramming around interpreting--
01:42:14.960 | I mean, it's using the symbol table as a dictionary.
01:42:18.160 | Yeah, yeah.
01:42:19.080 | I mean, these are all things that you
01:42:20.800 | would have in an advanced Python course.
01:42:24.200 | They're beyond loops and conditionals.
01:42:29.040 | And I think they're all things that
01:42:33.880 | can help people to create stuff that otherwise
01:42:38.680 | might be hard to create or might have otherwise
01:42:40.800 | required a lot of boilerplate.
01:42:42.600 | So in general, my approach to coding
01:42:44.400 | is to not write stuff that the computer should
01:42:50.680 | be able to figure out for me.
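For anyone who hasn't seen fastcore's `patch`, here is a minimal re-implementation of the idea (the real `patch` also copies metadata, supports properties, and so on): it reads the class out of the `self` annotation and attaches the function as a method, much like a Swift extension.

```python
def patch(f):
    "Add `f` as a method of the class named in its `self` annotation."
    cls = f.__annotations__["self"]
    setattr(cls, f.__name__, f)
    return f

class Chat:
    def __init__(self, model): self.model = model

# Defined outside the class body, but callable as a normal method:
@patch
def greet(self: Chat): return f"hello from {self.model}"
```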
01:42:55.320 | And you can take the same approach,
01:42:58.400 | even if you're not quite the Jeremy, where
01:43:00.760 | instead of importing from fastcore and from the toolslm
01:43:04.760 | library that you've written separately,
01:43:06.640 | it's like, oh, often I'll have a utils notebook, right?
01:43:09.480 | Which is like, oh, there's things
01:43:10.880 | that are completely orthogonal to what I'm actually doing,
01:43:14.000 | which is like a tool loop with a chatbot.
01:43:15.760 | It's like, oh, I have a thing for getting a list
01:43:21.200 | into a different format or a thing for reading files
01:43:23.800 | and changing the image type and resizing it to a maximum size.
01:43:27.480 | You can still have the same idea where, like,
01:43:29.360 | here's the main notebook.
01:43:30.400 | This is me implementing the pieces one by one.
01:43:32.360 | Maybe there's somewhere else where, like, oh, there's
01:43:34.560 | a few utility functions that are defined and documented
01:43:37.080 | in their own separate place so they don't clutter things up.
01:43:39.640 | This more general literate notebook-driven small example
01:43:43.880 | of atomic piece-driven development
01:43:45.440 | doesn't require that you've written the FastCore library,
01:43:48.280 | but it's helped along by having those pieces at hand.
01:43:51.440 | Yeah.
01:43:51.960 | And also, in general, FastCore, particularly FastCore.basics,
01:43:56.520 | is designed to be the things I feel like maybe
01:44:00.320 | could have been in Python.
01:44:03.440 | I almost never write any script or notebook
01:44:06.880 | without importing from that and without using stuff from that
01:44:09.600 | because I'm, like, the things that I just
01:44:12.000 | think you use all the time.
01:44:15.000 | I'll say, like, it's something I've definitely noticed.
01:44:17.520 | There's two reactions I see to people reading my code, which,
01:44:26.360 | as you say, it's kind of like it's
01:44:28.720 | got a particular flavor to it.
01:44:30.840 | And it's very intentionally not the same as everybody else's.
01:44:35.040 | Some people get really angry.
01:44:36.960 | And they're like, why don't you just do everything the same way
01:44:41.480 | as everybody else?
01:44:42.880 | And some people go, like, wow, there's a bunch of things here
01:44:47.120 | I haven't seen before.
01:44:48.800 | I'm so excited this is an opportunity
01:44:50.320 | to learn those things.
01:44:51.560 | And so, like, our friend Hamill, in particular,
01:44:54.760 | this happens all the time.
01:44:55.800 | He's just like, oh, I was just reading your source code
01:44:57.560 | to that thing you did, and I learned about this whole thing.
01:45:00.400 | That's really helpful.
01:45:01.720 | So I don't mind.
01:45:03.760 | People can react in either way.
01:45:05.200 | [LAUGHTER]
01:45:07.040 | If they've made it to this stage in the video,
01:45:08.760 | they're likely in the second class.
01:45:10.400 | Probably.
01:45:11.840 | Otherwise, they would have given up long ago.
01:45:14.160 | I think also it helps--
01:45:15.200 | What's this star, star, fuck, dot, args?
01:45:17.460 | [LAUGHTER]
01:45:19.840 | If people can relate it to things in other languages,
01:45:21.640 | it doesn't seem so alien.
01:45:22.600 | Like, every time you do patch, I'm just like, OK,
01:45:24.200 | it's a Swift extension.
01:45:25.520 | Or every time you, you know, a lot of these things
01:45:27.840 | have analogies in other languages.
01:45:28.720 | Yeah, it's exactly a Swift extension.
01:45:30.280 | Or in Ruby, they even call it monkey patching.
01:45:32.640 | Yeah.
01:45:33.360 | But because it's, like, not built
01:45:35.840 | into the standard library, some Python programmers are like,
01:45:38.360 | no, you're not allowed to use the dynamic features
01:45:41.960 | of this dynamic language that were created to make it dynamic.
01:45:47.160 | Anyhow, yeah.
01:45:48.520 | Nice.
01:45:49.200 | Cool.
01:45:49.720 | Well, thank you, Jeremy.
01:45:50.680 | I think this is hopefully--
01:45:52.120 | this is like the ultra in-depth.
01:45:54.280 | If you were reading the source code and you wanted more,
01:45:56.840 | well, I think you got all the more you could want.
01:46:00.480 | Yeah, hopefully we might try and do these for a few other
01:46:02.920 | projects as well and work through the backlog
01:46:04.920 | of the little side things that haven't really been documented.
01:46:08.000 | But yeah, is there anything else you wanted to add for this
01:46:10.680 | or, like, introduce this series, I guess?
01:46:14.360 | I mean, series is probably too grand a word, you know?
01:46:18.280 | I think it's just like, from my point of view,
01:46:22.040 | I wanted an opportunity to--
01:46:23.720 | mainly to hear from other folks at answer.ai
01:46:31.040 | more about their work.
01:46:32.680 | And so partly, this is like my cunning plan
01:46:35.800 | is to, like, if I do one, maybe other people
01:46:38.000 | will feel some social pressure to do the same thing.
01:46:40.400 | And I will then get to learn more about their work.
01:46:44.800 | And Alexis and I have had a similar conversation.
01:46:46.840 | And he's promised to teach me about some of his work
01:46:50.680 | soon as well.
01:46:51.520 | So hopefully that will be happening soon.
01:46:55.320 | And also, I think, in general, answer.ai
01:47:03.840 | is a public benefit corporation.
01:47:06.280 | And hopefully this is something that provides some level
01:47:08.720 | of public benefit to at least some people is--
01:47:11.800 | there's no real-- like in a normal company,
01:47:13.560 | this is probably something that would be a private, secret,
01:47:16.560 | password-protected internal series.
01:47:19.680 | And here, it's like, no, there's no need to do that.
01:47:23.920 | Other people can benefit, too, if they want to.
01:47:26.120 | Did any of you have anything else to add to that?
01:47:35.200 | My one thought is I think these are a good idea.
01:47:38.040 | And I think we should also explore
01:47:39.480 | the full range of durations.
01:47:41.320 | So it's good to do deep dives on something that has depth to it.
01:47:45.000 | It's also good to do shallow dives on something
01:47:48.400 | that is quick and might be trivial to the person
01:47:52.680 | explaining it, but is totally unfamiliar and, therefore,
01:47:55.800 | high value to the person who hasn't seen it before.
01:47:57.880 | And we can put those ones on TikTok, too.
01:48:00.720 | Yes, there we go.
01:48:01.680 | Let's see who achieves the first TikTok-length explainer.
01:48:07.200 | Awesome.
01:48:08.200 | Thanks, all.
01:48:09.440 | Thanks, everybody.
01:48:10.640 | Well done.
01:48:12.120 | Sure.
01:48:13.520 | Thanks.