
Claudette source walk-thru - Answer.AI dev chat #1


Whisper Transcript

00:00:00.000 | OK, hi, everybody.
00:00:01.400 | I'm Jeremy.
00:00:02.360 | And this is the first of our Answer.ai developer chats,
00:00:09.480 | where I guess we have two audiences.
00:00:12.320 | One is our fellow R&D folks at Answer.ai,
00:00:17.040 | who hopefully this will be a useful little summary of what
00:00:19.600 | we've been working on for them.
00:00:20.840 | But we thought we'd also make it public because why not?
00:00:23.960 | That way everybody can see it.
00:00:26.040 | So we've got Jono here.
00:00:27.960 | We've got Alexis here.
00:00:29.080 | We've got Griffin here.
00:00:30.560 | Hi, all.
00:00:33.400 | And we're going to be talking about a new library I've
00:00:37.720 | been working on called Claudette, which is Claude's
00:00:43.480 | friend.
00:00:45.440 | And Jono has been helping me a bit with the library,
00:00:49.360 | but he's going to, I think, largely feign ignorance
00:00:51.640 | about it today in order to be an interviewer to attempt
00:00:56.800 | to extract all its secrets out of my head.
00:00:59.320 | Does that sound about right, Jono?
00:01:01.840 | I think so, yeah.
00:01:04.440 | I'm ready when you are.
00:01:06.520 | Cool.
00:01:07.000 | Well, maybe we should start with--
00:01:08.960 | do you want to pull up the landing page?
00:01:11.520 | And then I think there's a few different directions
00:01:14.640 | that I'd love to hear from you.
00:01:16.040 | One is the specifics of this library, how does it work?
00:01:20.000 | But maybe also, since especially this is the first developer
00:01:22.640 | chat and preview, we can also go into some of the meta questions
00:01:25.720 | like, when does something become a library like this?
00:01:29.720 | How is it built? What's the motivation?
00:01:32.200 | Et cetera, et cetera.
00:01:33.400 | Sounds good.
00:01:34.360 | All right.
00:01:34.840 | So here we are, the landing page.
00:01:40.720 | So it's a GitHub repo.
00:01:43.480 | It's a public repo.
00:01:45.640 | And in the top right is a link to the documentation.
00:01:51.160 | And the documentation that you see here, the index,
00:01:53.600 | is identical to the README.
00:01:57.880 | But it's better to read it here because it looks a bit better.
00:02:02.800 | Cool.
00:02:03.360 | Fantastic.
00:02:03.840 | And this is a library that people can pip install?
00:02:07.040 | Yep, exactly.
00:02:08.240 | Here it is, pip install Claudette.
00:02:10.920 | And you can just follow along.
00:02:14.960 | Hopefully, when you type the things in here,
00:02:18.080 | you'll get the same thing.
00:02:20.280 | Or you could-- so that main page is actually
00:02:28.840 | also an index.ipynb.
00:02:31.760 | So you could also open that up in Colab, for example.
00:02:36.400 | And if you don't want to install it locally,
00:02:38.760 | it should all work fine.
00:02:40.680 | Cool.
00:02:41.200 | So I'll start with the big question, which
00:02:42.560 | is, why does this exist?
00:02:43.800 | What is the point of this library in a nutshell?
00:02:47.360 | OK, so by way of background, I started working on it
00:02:56.480 | for a couple of reasons.
00:02:57.880 | One is just I feel like--
00:03:02.440 | felt like-- still feel a bit like Claude is a bit underrated
00:03:07.160 | and underappreciated.
00:03:08.160 | I think most people use OpenAI because that's
00:03:12.360 | kind of what we're used to.
00:03:13.560 | And it's pretty good.
00:03:15.720 | And with GPT-4o, it's just got better.
00:03:18.920 | But Claude's also pretty good.
00:03:20.560 | And the nice thing about some of these models
00:03:24.160 | now, with also Google there, is they all
00:03:26.320 | have their things they're better at and things
00:03:29.080 | that they're worse at.
00:03:30.000 | So for example, I'm pretty interested in Haiku
00:03:34.120 | and in Google's Flash.
00:03:36.440 | They both seem like pretty capable models
00:03:42.200 | that maybe don't know as much,
00:03:44.480 | but they're pretty good at doing stuff.
00:03:47.520 | And so they might be good with retrieval,
00:03:49.320 | which is where you can help it with not knowing stuff.
00:03:52.840 | So yeah, I was pretty interested, particularly
00:03:54.800 | in playing with Haiku.
00:03:57.520 | And then a second reason is just I
00:04:01.840 | did this video called A Hacker's Guide to LLMs last year.
00:04:05.600 | And I just recorded it for a conference,
00:04:08.760 | really just to help out a friend, to be honest.
00:04:11.960 | And I put it up online publicly because I
00:04:14.120 | do that for everything.
00:04:15.600 | And it became, kind of to my surprise,
00:04:17.520 | my most popular video ever.
00:04:18.840 | It's about to hit 500,000 views.
00:04:22.000 | And one of the things I did in that was to say, like, oh, look,
00:04:24.720 | you can create something that has the kind of behavior
00:04:32.000 | you're used to seeing in things like Instructor and LangChain
00:04:35.280 | and whatever else in a dozen lines of code.
00:04:38.800 | So you don't have to always use big, complex frameworks.
00:04:42.320 | And a lot of people said to me, like, oh, I
00:04:44.640 | would love to be able to use a library that's that small
00:04:47.240 | so I don't have to copy and paste yours.
00:04:49.440 | And so I thought, like, yeah, OK, I'll
00:04:51.200 | try and build something that's super simple, very
00:04:57.040 | transparent, minimum number of abstractions
00:05:03.160 | that people can use.
00:05:04.080 | And that way, they still don't have to write their own.
00:05:06.400 | But they also don't have to feel like it's a mysterious thing.
00:05:09.080 | So yeah, so Claudette is designed
00:05:11.240 | to be this fairly minimal thing, with really very few abstractions
00:05:19.920 | or weird new things to learn, taking
00:05:22.080 | advantage of just plain Python stuff for Claude,
00:05:27.680 | but also to be pretty capable and pretty convenient.
00:05:34.960 | Cool.
00:05:35.640 | You think it would be fair to say that this is more
00:05:38.040 | replacing the maybe more verbose code that we've copied
00:05:40.960 | and pasted from our own implementations a few times
00:05:43.200 | versus introducing too many completely new abstractions?
00:05:46.080 | Is that the kind of level that it's in?
00:05:47.720 | Yeah, I think a lot of people got their start with LLMs
00:05:52.200 | using stuff like LangChain, which I think
00:05:55.840 | is a really good way to get started in some ways
00:05:58.120 | and that you can--
00:05:59.080 | it's got good documentation and good demos.
00:06:01.480 | But a lot of people kind of come away feeling like,
00:06:05.400 | I don't really know what it's doing.
00:06:07.480 | I don't really know how to improve it.
00:06:09.360 | And I don't feel like I'm really learning at this point.
00:06:12.440 | And also, I don't really know how
00:06:13.960 | to use all my knowledge of Python
00:06:15.520 | to build on top of this because it's
00:06:17.840 | a whole new set of abstractions.
00:06:21.320 | So partly, it's kind of for those folks to be like, OK,
00:06:24.160 | here's how you can do things a bit closer to the bone
00:06:28.160 | without doing everything yourself from scratch.
00:06:30.160 | And for people who are already reasonably capable Python
00:06:35.160 | programmers who feel like, OK, I want to
00:06:37.760 | jump into LLMs and leverage my existing programming knowledge.
00:06:41.600 | This is a path that doesn't involve first learning
00:06:44.880 | some big new framework full of lots of abstractions
00:06:47.120 | and tens of thousands of lines of code to something with,
00:06:50.760 | I don't know what it is, maybe a couple of hundred lines of code,
00:06:54.480 | all of which is super clear and documented.
00:06:58.240 | And you can see step-by-step exactly what it's doing.
00:07:02.080 | Cool.
00:07:02.580 | Well, that sounds good.
00:07:04.040 | Do you want to start with a demo of what it does,
00:07:05.520 | or do you want to start straight with those hundred lines of code
00:07:08.240 | and step us through it?
00:07:09.680 | You know what?
00:07:10.440 | I'm inclined-- normally, I'd say do the demo,
00:07:12.760 | but I'm actually inclined to step through the code
00:07:15.040 | because the code's a bit, as you know, weird in that the code is--
00:07:23.800 | so if you click on Claudette's source,
00:07:25.800 | we can read the source code.
00:07:29.160 | This is the source code.
00:07:31.560 | It doesn't look like most source code.
00:07:33.800 | And that's because I tried something slightly different
00:07:38.400 | to what I've usually done in the past, which I've tried to create
00:07:40.800 | a truly literate program.
00:07:44.040 | So the source code of this is something that we can
00:07:46.120 | and will read top to bottom.
00:07:47.920 | And you'll see the entire implementation,
00:07:52.160 | but it also is designed to teach you about the API
00:07:57.400 | it's building on top of and the things
00:07:59.800 | that it's doing to build on top of that and so forth.
00:08:02.000 | So I think the best way to show you what it does
00:08:05.480 | is to also show you how it does it.
00:08:08.480 | So I'm here in a notebook.
00:08:11.520 | And so that source code we were viewing
00:08:15.240 | was just the thing called Quarto, which
00:08:18.440 | is a blogging platform that, amongst other things,
00:08:20.760 | can render notebooks.
00:08:21.640 | So we're just seeing the rendered version of this notebook.
00:08:25.280 | And so the bits that you see in gray
00:08:32.680 | are code cells in the notebook.
00:08:37.920 | Here's this code cell.
00:08:40.360 | Here's this code cell.
00:08:41.360 | And then you'll see some bits that
00:08:51.560 | have this little exported source thing here,
00:08:54.840 | which you can close and open.
00:08:56.200 | You can close them all at once from this menu here,
00:08:58.600 | hide all code.
00:09:00.000 | And that basically will get rid of all the bits that
00:09:02.320 | are actually the source code of the module.
00:09:04.320 | And all you'll be left with is the examples.
00:09:07.920 | And if we say show all code, then you can see, yeah,
00:09:11.080 | this is actually part of the source code.
00:09:12.920 | And so the way that works is these things
00:09:17.920 | that say exports.
00:09:18.880 | So these are the bits that actually become
00:09:20.760 | part of the library itself.
00:09:23.200 | OK, so the idea of this notebook is, as I said,
00:09:30.120 | as well as being the entire source code of the library,
00:09:32.400 | it's also by stepping through it, we'll see how Claude works.
00:09:37.240 | And so Claude has three models, Opus, Sonnet, and Haiku.
00:09:43.320 | So we just chuck them into a list
00:09:45.640 | so that anybody now who's using Claudette
00:09:48.400 | can see the models in it.
00:09:51.160 | And so the way we can see how to use it
00:09:53.280 | is that the readme and the home page,
00:10:01.720 | again, is actually a rendered version of a notebook.
00:10:04.920 | And it's called index.ipynb.
00:10:07.080 | And so we can import from Claudette.
00:10:09.920 | And so you can see, for example, if I say models,
00:10:12.560 | it shows me the same models that came from here.
00:10:17.200 | OK, so that's how these work.
00:10:18.640 | It ends up as part of this Claudette notebook.
00:10:21.160 | So that's the best Claude model, middle, worst.
00:10:28.000 | But I like using this one, Haiku,
00:10:29.880 | because it's really fast and really cheap.
00:10:31.680 | And I think it's interesting to experiment
00:10:33.440 | with how much more you can do with these fast, cheap models
00:10:36.280 | So that's the one I thought we'd try out.
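That models list can be sketched like this (a minimal sketch; the dated ids below are the Claude 3 names current around the time of recording and may have changed since):

```python
# Full model names in best -> worst order, so you never have to
# remember the dated ids yourself.
models = ("claude-3-opus-20240229",
          "claude-3-sonnet-20240229",
          "claude-3-haiku-20240307")
model = models[-1]  # the "worst" one: Haiku, fast and cheap
```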
00:10:43.880 | Any questions or comments so far from anybody?
00:10:48.320 | I guess, so far, this is--
00:10:50.680 | like, there's no reason you couldn't just
00:10:52.400 | write the full name of the model every time,
00:10:54.240 | like a lot of people do.
00:10:55.680 | And if a new one comes out, you can do that.
00:10:58.800 | But this is just trying to make it as smooth as possible,
00:11:01.560 | even to these tiny little details, right?
00:11:03.800 | Yeah, I don't want to have to remember these things.
00:11:05.960 | And obviously, I wouldn't remember those dates
00:11:07.480 | or whatever.
00:11:08.000 | So otherwise, I could copy and paste them.
00:11:10.480 | But yeah, I find I've been enjoying--
00:11:12.200 | I've been using this for a few weeks now.
00:11:15.680 | Once it got to a reasonably usable point--
00:11:17.680 | and definitely, this tiny minor thing is something I found nice
00:11:20.720 | is to not have to think about model names ever again,
00:11:23.400 | and also know that it goes, like, best, middle, worst.
00:11:26.240 | So I don't even have to--
00:11:27.480 | I can just go straight to, like, OK, worst one.
00:11:31.000 | Cool.
00:11:32.520 | So they provide an SDK, Anthropic.
00:11:36.640 | So their SDK gives you this Anthropic class
00:11:42.600 | you could import from.
00:11:43.960 | So you can pip install it.
00:11:45.040 | If you pip install Claudette, you'll get this for free.
00:11:49.600 | So I think it's nice if you're going to show somebody
00:11:54.760 | how to use your code, you should, first of all,
00:11:56.800 | show how to use the things that your code uses.
00:11:59.120 | So in this case, basically, the thing we use
00:12:00.920 | is the Anthropic SDK.
00:12:03.160 | So let's use it, right?
00:12:04.960 | So the way it works is that you create the client.
00:12:10.520 | And then you call messages.create.
00:12:13.160 | And then you pass in some messages.
00:12:16.080 | So I'm going to pass in a message, I'm Jeremy.
00:12:19.440 | Each message has a role of either user or assistant.
00:12:25.160 | And in fact, they always--
00:12:26.800 | this is, like, if you think about it,
00:12:28.360 | it's actually unnecessary, because they always
00:12:30.600 | have to go user assistant, user assistant, user assistant.
00:12:35.960 | So if you pass in the wrong one, you get an error.
00:12:39.160 | So strictly speaking, they're kind of redundant.
00:12:42.160 | So in this case, and they're just dictionaries, right?
00:12:44.520 | So I'm going to pass in a list of messages.
00:12:47.080 | It contains one message.
00:12:48.080 | It's a user message.
00:12:49.200 | So this is something I've said, whereas the assistant is
00:12:52.120 | something the model said.
00:12:53.480 | And it says, I'm Jeremy.
00:12:55.040 | And then you tell it what model to use.
00:12:57.400 | And then you can pass in various other things.
00:12:59.160 | As you can see, there's a number of other things
00:13:05.560 | that you can pass in, like a system prompt, stop sequences,
00:13:08.720 | and so forth.
00:13:10.440 | And you can see here, they actually
00:13:12.000 | check for what kind of model you want.
00:13:17.280 | So if I go ahead and run that, I get back a message.
00:13:22.280 | And so messages can be dictionaries,
00:13:25.400 | or they can also be certain types of object.
00:13:28.880 | And on the whole, it doesn't really matter which you choose.
00:13:32.360 | When you build them, it's easier just to make them dictionaries.
00:13:35.920 | So a message has an ID.
00:13:37.160 | I haven't used that for anything, really.
00:13:38.960 | And it tells you what model you used.
00:13:40.960 | Now this one's got a role of assistant.
00:13:42.880 | And it's a message.
00:13:47.680 | And it tells you how many tokens we used.
00:13:51.920 | If you're not sure what tokens are and basics like that,
00:13:57.440 | then check out this Hacker's Guide to Language Models,
00:14:01.080 | where I explain all those kinds of basics.
00:14:05.920 | But the main thing is the content.
00:14:09.000 | And the content is text.
00:14:10.520 | It could also reply with images, for instance.
00:14:12.800 | So this is text.
00:14:14.160 | And the text is, that's what it has to say for me.
00:14:19.480 | So that's basically how it works.
00:14:22.600 | It's a nice, simple API design.
00:14:25.400 | I really like it.
00:14:26.920 | The OpenAI one is more complicated to work with,
00:14:31.200 | because they didn't decide on this basic idea of like, oh,
00:14:33.880 | user assistant, user assistant.
00:14:36.560 | OK, so one thing I really like--
00:14:39.080 | Can I ask a question?
00:14:39.960 | Yeah, hit me.
00:14:41.760 | So one thing I know, and I'm sure lots of other people
00:14:45.120 | do as well, is that often when you interact with an assistant,
00:14:47.640 | you provide a system message or guidance
00:14:50.920 | about how the assistant should see their role.
00:14:54.680 | Here, you didn't.
00:14:55.360 | You just started right off with a role from yourself as a user.
00:14:58.760 | Is that because the API or this library
00:15:02.080 | already starts with the default guidance to the assistants?
00:15:05.480 | There's a system prompt here.
00:15:07.600 | And the default is not given.
00:15:10.680 | So yeah, language models are perfectly
00:15:14.800 | happy to talk to you without a system prompt.
00:15:17.520 | Just means they have no extra information.
00:15:22.200 | But when they went through instruction fine-tuning
00:15:25.320 | and RLHF, some of those examples would
00:15:28.080 | have had no system prompt.
00:15:29.320 | So they know how to have some kind of default personality,
00:15:33.080 | if you like.
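In other words, the system prompt is just an optional keyword on the same call; a sketch (the prompt text here is made up for illustration):

```python
# Keyword arguments for messages.create; `system` is optional, and
# omitting it leaves the model with its default personality.
kwargs = dict(
    messages=[{"role": "user", "content": "I'm Jeremy"}],
    model="claude-3-haiku-20240307",
    max_tokens=1024,
)
kwargs["system"] = "You are a concise, helpful assistant."  # optional
```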
00:15:34.800 | Yeah.
00:15:35.440 | Cool, thanks.
00:15:38.160 | OK, so I always think, like, my notebook
00:15:42.880 | is both where I'm working.
00:15:44.000 | So I want it to be clear and simple to use.
00:15:45.840 | And I also know it's going to end up rendered
00:15:47.800 | as our documentation and as our kind of rendered source.
00:15:51.920 | So I don't really want things to look like this.
00:15:56.040 | So the first thing I did was to format the output.
00:15:57.960 | So here is part of the API.
00:16:02.520 | This is exported.
00:16:03.960 | So the first thing I wanted to do
00:16:05.760 | was just, like, find the content in here.
00:16:11.400 | And so there's a number-- this is an array,
00:16:15.280 | as you can see, of blocks.
00:16:16.960 | So the content is the text block.
00:16:19.880 | So this is just something that finds the first text block.
00:16:23.880 | I mean, it's tiny.
00:16:25.080 | And so that means that now at least I've
00:16:29.600 | kind of got down to the bit that has--
00:16:32.800 | the bit I normally care about, because I don't normally
00:16:35.000 | care about the ID.
00:16:36.200 | I already know what the model is.
00:16:37.600 | I know what the role's going to be, et cetera.
00:16:41.640 | And then so from the text block, I want to pull out the text.
00:16:46.120 | So this is just something that pulls out the text.
00:16:48.760 | And so now from now on, I can always just say contents
00:16:52.800 | and get what I care about.
00:16:55.040 | So something I really like, though, is like, OK,
00:16:57.200 | this is good, but sometimes I want
00:16:58.920 | to know the extra information, like the stop sequence
00:17:01.360 | or the usage.
00:17:03.000 | So in Jupyter, if you create this particular named method,
00:17:08.800 | _repr_markdown_, for an object,
00:17:14.000 | then it displays that object using that Markdown.
00:17:18.200 | So in this case, I'm going to put
00:17:20.560 | the contents of the object followed
00:17:24.080 | by the details as a list.
00:17:28.520 | And so you can see what that looks like here.
00:17:31.280 | There's the contents, and there's the details.
00:17:35.880 | And if you're wondering, like, OK,
00:17:38.160 | how did Jeremy add this behavior to Anthropic's class?
00:17:45.520 | Now, this is a nice little fast core thing called patch,
00:17:48.800 | where if you define a function, and you say patch,
00:17:51.160 | and you give it one or more types,
00:17:53.400 | it changes those existing types to give it this behavior.
00:17:56.920 | So this is now, if we look at ToolsBetaMessage
00:18:02.920 | dot _repr_markdown_, there we go.
00:18:05.600 | We just put it in there.
00:18:07.720 | So that's nice.
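The idea behind fastcore's `patch` can be shown with plain Python on a toy class rather than the real SDK type (a sketch; `@patch` just automates this attachment using the function's type annotation):

```python
class Msg:  # toy stand-in for the SDK's message class
    def __init__(self, text, toks): self.text, self.toks = text, toks

def _repr_markdown_(self):
    # Jupyter renders this as Markdown: contents first, details after.
    return f"{self.text}\n\n- tokens: {self.toks}"

# What @patch does for you: attach the function to the existing class,
# so every instance (even ones created earlier) gains the method.
Msg._repr_markdown_ = _repr_markdown_
```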
00:18:08.480 | And so the other--
00:18:09.320 | yeah.
00:18:10.560 | - I was going to say, there's like a trade-off
00:18:13.200 | in terms of time here, where if you only ever
00:18:15.400 | had to look at something once, you just manually type out
00:18:18.760 | response dot messages zero dot block whatever dot choices
00:18:23.840 | dot text, right?
00:18:24.720 | You type all that up.
00:18:26.080 | You have to do it a million times.
00:18:27.520 | It's very nice to have these conveniences.
00:18:29.440 | - Yeah.
00:18:30.000 | Also, for the docs, right?
00:18:31.440 | Like, every time I want to show what the response is,
00:18:35.440 | this is now free.
00:18:37.560 | I think that's nice.
00:18:39.800 | Yeah.
00:18:41.720 | So I don't-- yeah.
00:18:43.080 | So I actually-- what I'm describing here
00:18:47.120 | is not the exact order it happened in in my head.
00:18:50.040 | Because, yeah, it wasn't until I did this a couple of times
00:18:52.800 | and was trying to find the contents and blah, blah, blah,
00:18:55.800 | that I was like, oh, this is annoying me.
00:18:57.480 | And I went back, and I added it.
00:18:59.080 | This is probably like 15 minutes later, I went back.
00:19:01.920 | And it's like, yeah, I wish that existed.
00:19:05.800 | I did know that usage tracking was
00:19:07.240 | going to be important, like how much money you're spending
00:19:09.880 | depends on input and output tokens.
00:19:12.360 | So I decided to make it easy to keep track of that.
00:19:16.360 | So I created a little constructor for usage.
00:19:19.720 | I just added a property to the usage class
00:19:24.240 | that adds those together.
00:19:26.400 | And then I added a representation.
00:19:28.360 | This one is used for strings as well.
00:19:30.600 | This is part of Python itself.
00:19:33.040 | If you add this in--
00:19:34.280 | so now if I say usage, I can see all that information.
00:19:36.760 | So that was nice.
00:19:37.960 | And then since we want to be able to track usage,
00:19:40.000 | we have to be able to add usage things together.
00:19:42.720 | So if you override dunder add in Python, it lets you use plus.
00:19:47.720 | So that's something else I decided to do.
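Those three conveniences together can be sketched as (a toy class, not the SDK's Usage type):

```python
class Usage:
    def __init__(self, input_tokens=0, output_tokens=0):
        self.input_tokens, self.output_tokens = input_tokens, output_tokens
    @property
    def total(self):  # the property that adds those together
        return self.input_tokens + self.output_tokens
    def __repr__(self):  # readable display in the notebook
        return f"In: {self.input_tokens}; Out: {self.output_tokens}; Total: {self.total}"
    def __add__(self, other):  # lets you sum usages with `+`
        return Usage(self.input_tokens + other.input_tokens,
                     self.output_tokens + other.output_tokens)
```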
00:19:52.720 | And so at this point, yeah, I felt
00:19:54.200 | like all these basic things I'm working with all the time,
00:19:58.000 | I should be able to use them conveniently.
00:20:00.200 | And it only took a few minutes to add those.
00:20:04.720 | And then ditto, I noticed a lot of people--
00:20:06.760 | in fact, nearly everybody, including
00:20:08.600 | the Anthropic documentation, manually writes these.
00:20:12.760 | I mean, again, it doesn't take long.
00:20:14.720 | But it doesn't take very long to write this once either.
00:20:17.760 | And now if you just-- something as simple as defaulting
00:20:21.480 | the role, then it's just a bit shorter.
00:20:23.880 | I can now say makeMessage.
00:20:25.560 | And it's just creating that dictionary.
00:20:28.680 | So now I can do exactly the same thing.
00:20:34.960 | And so then since it always goes user, assistant, user,
00:20:38.800 | assistant, user, assistant, I thought, OK,
00:20:41.000 | you should be able to just send in a list of strings.
00:20:43.440 | And it just figures that out.
00:20:44.880 | So this is just using i % 2 to jump
00:20:48.480 | between user and assistant.
00:20:50.160 | And makeMessage, I then realized, OK,
00:20:57.160 | we should change it slightly.
00:20:58.520 | So if it's already a dictionary, we
00:21:00.240 | shouldn't change it and stuff like that.
00:21:02.120 | But basically, as you can see here,
00:21:04.920 | I can now pass in a list of messages a bit more easily.
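A hypothetical sketch of the two helpers described (simplified; the real ones handle more cases):

```python
def mk_msg(content, role="user"):
    # Pass dicts through unchanged; grab .content from response-like
    # objects; otherwise wrap the string with a defaulted role.
    if isinstance(content, dict): return content
    content = getattr(content, "content", content)
    return {"role": role, "content": content}

def mk_msgs(msgs):
    # Alternate user/assistant roles using index parity (i % 2).
    return [mk_msg(m, ("user", "assistant")[i % 2])
            for i, m in enumerate(msgs)]
```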
00:21:10.480 | So my prompt was I'm Jeremy.
00:21:15.720 | R is a response.
00:21:19.200 | And then I've got another string.
00:21:24.520 | So if I pass in something which has a content attribute,
00:21:34.920 | then I use that.
00:21:38.560 | And so that way, you can see the messages now.
00:21:40.720 | I've got I'm Jeremy.
00:21:41.640 | And then the assistant contains the response
00:21:45.280 | from the assistant.
00:21:46.360 | So it's happy with that as well.
00:21:48.560 | It doesn't have to be a string.
00:21:50.240 | And so this is how--
00:21:52.400 | and OK, again, from people, if you've
00:21:54.160 | watched my LLM Hacker's Guide, you know this.
00:21:58.800 | Language models currently have no state.
00:22:01.440 | Like when you chat with ChatGPT, it looks like it has state.
00:22:06.360 | You can ask follow-up questions.
00:22:08.720 | But actually, the entire previous dialogue
00:22:10.600 | gets sent each time.
00:22:12.320 | So when I say I forgot my name, can you remind me, please?
00:22:14.720 | I also have to pass it all of my previous questions
00:22:17.720 | and all of its previous answers.
00:22:19.840 | And that's how it knows what's happened.
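Concretely, the follow-up call sends something like this (a sketch; the assistant text is invented):

```python
# The model is stateless: every turn re-sends the whole dialogue,
# roles alternating, with the new question appended at the end.
history = [
    {"role": "user",      "content": "I'm Jeremy"},
    {"role": "assistant", "content": "Hello Jeremy, nice to meet you."},
    {"role": "user",      "content": "I forgot my name. Can you remind me, please?"},
]
```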
00:22:22.880 | And so then hopefully--
00:22:26.760 | OK, I don't know your name.
00:22:27.960 | I referred to you as Jeremy.
00:22:29.320 | All right, well, so you do know my name.
00:22:31.000 | Thank you.
00:22:31.520 | So it turns out my name's Jeremy.
00:22:33.320 | OK, so I feel like something as simple as this
00:22:36.360 | is already useful for experimenting and playing
00:22:40.000 | around.
00:22:40.520 | And for me, I would rather generally use something
00:22:45.200 | like a notebook interface for interacting with a model
00:22:48.080 | than the kind of default ChatGPT thing or Claude thing.
00:22:54.240 | This is where I can save my notebooks.
00:22:56.400 | I can go back and experiment.
00:22:57.920 | I can do things programmatically.
00:23:00.880 | So this was a big kind of thing for me is like, OK, I want to--
00:23:05.600 | I want to make a notebook at least as ergonomic as chatGPT
00:23:11.680 | plus all of the additional usability of a notebook.
00:23:14.560 | So these are the little things that I think help.
00:23:17.280 | Passing the model each time seems weird
00:23:22.760 | because I generally pick the model once per session.
00:23:26.240 | So I just created this tiny class
00:23:28.200 | to remember what model I'm working with.
00:23:30.840 | So that's what this client class does.
00:23:32.640 | And the second thing it's going to do
00:23:34.200 | is it's going to keep track of my usage.
00:23:37.440 | So maybe the transition that I see happening right now
00:23:40.680 | is everything up until this point was like housekeeping
00:23:44.080 | of I'm doing exactly the same things
00:23:45.520 | as the official API can do, but I'm just
00:23:48.000 | making my own convenience functions for that.
00:23:50.880 | But then the official API doesn't
00:23:52.280 | give you tracking usage over multiple conversations,
00:23:54.520 | keeping track of the history and all of that.
00:23:56.400 | So it seems like now we're shifting to like, OK,
00:23:58.560 | I can do the same things that the API allows me to do,
00:24:01.400 | but now I don't have to type as much.
00:24:03.560 | I've got my convenience functions.
00:24:05.280 | But now it's like, OK, what else would I like to do?
00:24:06.880 | I'd like to start tracking usage--
00:24:08.440 | Yeah, exactly.
00:24:09.040 | --of the persistent model setting.
00:24:10.520 | And this is kind of important to me
00:24:12.000 | because I don't want to spend all my money unknowingly.
00:24:17.000 | So I want it to be really easy.
00:24:18.280 | And so what I used to do was to always go back
00:24:20.560 | to the OpenAI or whatever web page
00:24:22.800 | and check the billing because you can actually
00:24:25.800 | blow out things pretty quickly.
00:24:28.000 | So this way, it's just me saying like, OK, well,
00:24:31.040 | let's just start with a use of 0.
00:24:35.040 | And then I just wrote this tiny little private thing here.
00:24:38.640 | We'll ignore prefill for now, which just stores
00:24:42.320 | what the last result was and adds the usage.
00:24:46.760 | So now when I call it a few times, each time I call it,
00:24:51.520 | it's just going to remember the usage.
00:24:54.600 | And so again, I was going to ignore stream for a moment.
00:24:59.400 | So then I define dunder call.
00:25:01.960 | So dunder call is the thing that lets you
00:25:05.440 | create an object, and then
00:25:07.240 | pretend it's a function.
00:25:08.920 | So it's the thing that makes it callable.
00:25:10.840 | And so when I call this function,
00:25:13.800 | I'll come back to some of the details in a moment.
00:25:15.920 | But the main thing it does is it calls make messages
00:25:20.880 | on your messages.
00:25:22.560 | And then it calls the messages.create.
00:25:27.480 | And then it remembers the result and keeps track of the usage.
00:25:33.040 | So basically, the key behavior now
00:25:34.760 | is that when I start, it's got zero usage.
00:25:38.240 | I do something, and I've now tracked the usage.
00:25:43.360 | And so if I call it again, that 20 should be higher, now 40.
00:25:48.320 | So it's still not remembering my chat history or anything.
00:25:53.440 | It's just my usage history.
00:25:56.080 | So I like to do very little at a time.
00:26:00.320 | So you'll see this is like a large function by my standards.
00:26:05.120 | It's like 1, 2, 3, 4, 5, 6, 7, 8 whole lines.
00:26:08.680 | I don't want to get much bigger than that
00:26:10.400 | because my brain's very small.
00:26:11.960 | So I can't keep it all in my head.
00:26:14.480 | So that's just a small amount of stuff.
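A sketch of that tiny client: it remembers the model, is callable via dunder call, and accumulates usage across calls. The fake messages object below stands in for the real `anthropic` SDK so the sketch runs on its own; the names are assumptions, not the library's actual ones.

```python
from types import SimpleNamespace

class FakeMessages:
    "Stand-in for the SDK's client.messages; returns canned usage."
    def create(self, messages, model, **kw):
        return SimpleNamespace(usage=SimpleNamespace(input_tokens=10,
                                                     output_tokens=10))

class Client:
    def __init__(self, model, messages_api):
        self.model, self.api = model, messages_api
        self.inp = self.out = 0              # running usage totals
    def _r(self, r):
        # Store the last result and accumulate its usage.
        self.result = r
        self.inp += r.usage.input_tokens
        self.out += r.usage.output_tokens
        return r
    def __call__(self, msgs, **kw):          # makes the object callable
        return self._r(self.api.create(messages=msgs,
                                       model=self.model, **kw))
```

With the canned 10/10 usage above, calling the client twice leaves the running totals at 20 input and 20 output tokens, mirroring the "now 40" behavior shown in the demo.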
00:26:16.600 | So there's a couple of other things we do here.
00:26:19.720 | One is we do something which Anthropic
00:26:23.960 | is one of the few companies to officially support,
00:26:27.440 | which is called prefill, which is where you can say
00:26:31.760 | to Anthropic, OK, this is my question.
00:26:34.240 | What's the meaning of life?
00:26:36.240 | And you answered with this starting point.
00:26:41.480 | You don't say, please answer with this.
00:26:44.320 | It literally has to start its answer with this.
00:26:47.080 | That's called prefill.
00:26:48.440 | So if I call it, that's my object.
00:26:53.080 | With this question, with this prefill,
00:26:55.400 | it forces it to start with that answer.
00:26:59.400 | So yeah, so basically, when you call this little tracking
00:27:04.640 | thing, which keeps track of the usage,
00:27:07.240 | this is where you also pass in the prefill.
00:27:09.400 | And so if you want some prefill, then as you can see,
00:27:11.600 | it just adds it in.
00:27:15.000 | And it also adds the prefill to the answer,
00:27:17.080 | because Anthropic doesn't put it in the answer itself.
00:27:19.240 | And the way Anthropic actually implements
00:27:22.680 | this is that the messages, it gets
00:27:32.800 | appended as an additional assistant message.
00:27:35.360 | So it's the messages plus the prefill.
00:27:37.480 | So basically, you pass in an assistant message at the end.
00:27:40.200 | And then the assistant's like, oh, that's
00:27:41.920 | what I started my answer with.
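The mechanism can be sketched as a pair of hypothetical helpers (the real code folds this into the call path):

```python
def add_prefill(msgs, prefill=""):
    # Prefill rides along as a trailing assistant message.
    return msgs + [{"role": "assistant", "content": prefill}] if prefill else msgs

def full_answer(prefill, reply_text):
    # The reply continues from the prefill rather than repeating it,
    # so prepend the prefill to get the complete answer.
    return prefill + reply_text

msgs = add_prefill([{"role": "user", "content": "What's the meaning of life?"}],
                   prefill="According to Douglas Adams,")
```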
00:27:44.320 | This isn't documented necessarily in their API
00:27:46.640 | because it's like, oh, this is how you send a user message
00:27:49.600 | and we'll respond with an assistant message.
00:27:51.400 | And you have to kind of dig a little bit more to say, oh,
00:27:53.760 | if I send you an assistant message as the last message
00:27:56.040 | in the conversation, this is how we'll interpret it.
00:27:58.360 | We'll continue on in that message and like--
00:28:00.200 | Yeah, they've got it here.
00:28:01.440 | So they've got-- Anthropic's good with this.
00:28:04.360 | They actually understand that prefill is incredibly powerful.
00:28:12.040 | Particularly, Claude loves it.
00:28:14.240 | Claude does not listen to system prompts much at all.
00:28:17.280 | And this is why each different model,
00:28:21.160 | you have to learn about its quirks.
00:28:23.440 | So Claude ignores system prompts.
00:28:25.840 | But if you tell it, oh, this is how
00:28:28.040 | you answered the last three questions,
00:28:30.360 | it just jumps into that role now.
00:28:32.240 | It's like, oh, this is how I behave.
00:28:34.560 | And it'll keep doing that.
00:28:36.800 | And you can maintain character consistency.
00:28:40.160 | So I use this a lot.
00:28:45.680 | And here's a good example.
00:28:47.080 | Start your assistant response with an open curly brace.
00:28:50.040 | I mean, they support tool calling or whatever.
00:28:52.200 | But this is a simple way.
00:28:53.960 | So sometimes I will start my response
00:28:57.680 | with backtick, backtick, backtick Python.
00:29:00.320 | That forces it to start answering the Python code.
00:29:04.000 | So yeah, lots of useful things you can do with prefill.
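[Editor's note: the prefill mechanism described above can be sketched in a few lines. This is a hypothetical illustration, not Claudette's actual API; the helper names `make_msgs` and `join_reply` are invented for this example. The two ideas are: the prefill is sent as a trailing assistant message, and since Claude's completion omits the prefill, the client glues it back onto the front of the reply.]

```python
def make_msgs(prompt, prefill=""):
    "Build an Anthropic-style messages list, appending prefill as an assistant turn."
    msgs = [{"role": "user", "content": prompt}]
    if prefill:
        # The model treats this trailing assistant message as the start of
        # its own answer, and continues from it.
        msgs.append({"role": "assistant", "content": prefill})
    return msgs

def join_reply(prefill, completion):
    "Claude's completion doesn't repeat the prefill, so prepend it ourselves."
    return prefill + completion

msgs = make_msgs("What's the meaning of life?", prefill="According to Douglas Adams,")
full = join_reply("According to Douglas Adams,", " the answer is 42.")
```

For example, prefilling with `{` is the trick mentioned above for forcing a JSON answer, and prefilling with a ```` ```python ```` fence forces the reply to start with code.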
00:29:08.240 | Have you noticed personally that the improvement is significant
00:29:12.400 | when you use prefill?
00:29:13.600 | I mean, I see they're recommending it.
00:29:15.280 | But I'm just curious what your anecdotal impression is.
00:29:17.520 | I mean, it answers it with that start.
00:29:21.520 | So yeah, if you want it to answer with that start,
00:29:23.840 | then it's perfect.
00:29:25.080 | Unfortunately, GPT-4o doesn't generally do it properly.
00:29:33.280 | Google's Gemini Pro does.
00:29:35.400 | And Google's Gemini Flash doesn't.
00:29:38.080 | So they're a bit all over the place at the moment.
00:29:40.320 | So then, yeah, the other thing you can do
00:29:45.200 | is streaming.
00:29:46.120 | Now, streaming is a bit hard to see with a short question.
00:29:49.880 | So we'll make a longer one.
00:29:56.000 | And so you can see it generally comes out, bop, bop, bop, bop,
00:29:59.600 | I don't have any pre-written funny-- that's terrible.
00:30:02.800 | Oh, you know why?
00:30:04.480 | Because we're using Haiku.
00:30:06.440 | And Haiku is not at all creative.
00:30:11.960 | So if we go c.model equals model 0,
00:30:17.880 | we can upscale to Opus.
00:30:21.720 | Try again.
00:30:22.360 | All right.
00:30:28.120 | Slow enough.
00:30:28.720 | I don't think I want to wait to do the whole thing anyway.
00:30:32.700 | It's just going to keep going, isn't it?
00:30:38.320 | Stop.
00:30:39.560 | All right.
00:30:43.020 | Let's go back to this one.
00:30:46.180 | Yeah.
00:30:46.820 | So streaming, it's one of these things
00:30:52.900 | I get a bit confused about.
00:30:55.780 | It's as simple as calling messages.stream
00:31:01.940 | instead of messages.create.
00:31:04.740 | But the thing you get back is an iterator,
00:31:07.540 | which you have to yield.
00:31:09.580 | And then once it's finished doing that,
00:31:12.060 | it stores the final message in here.
00:31:15.820 | And that's also got the usage in it.
00:31:17.900 | So anyway, this is some little things
00:31:19.940 | which, without the framework, would be annoying.
00:31:25.020 | So with this little tiny framework, it's automatic.
00:31:28.020 | And you see this in the notebook in its final form, right?
00:31:30.400 | But this call method was first written without any streaming.
00:31:34.020 | Like, get it working on the regular case first.
00:31:38.620 | No prefill first.
00:31:39.860 | So then it's three lines.
00:31:41.660 | And then it's OK.
00:31:43.020 | The original version is that.
00:31:48.260 | Yeah.
00:31:50.220 | Yeah.
00:31:52.140 | And then we can test the stream function by itself
00:31:55.020 | and test it out with the smaller primitives first
00:31:58.180 | and then put it into a function that then finally
00:32:00.980 | gets integrated in.
00:32:03.260 | Yeah.
00:32:04.580 | Exactly.
00:32:05.940 | And this is like one of these little weird complexities
00:32:10.220 | of Python is--
00:32:13.780 | John was asking me yesterday.
00:32:15.060 | It's like, oh, we could just refactor this and move this
00:32:18.500 | into here.
00:32:19.060 | And then we don't need a whole separate method.
00:32:21.260 | But actually, we can't.
00:32:22.380 | As soon as there's a yield anywhere in a function,
00:32:24.900 | the function is no longer a function.
00:32:28.020 | And it's now a generator function.
00:32:28.020 | And it behaves differently.
00:32:29.420 | So this is kind of a weird thing where
00:32:31.700 | you have to pull your yields out into separate methods.
00:32:35.540 | Yeah, it's a minor digression.
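[Editor's note: the Python quirk Jeremy mentions can be shown concretely. A single `yield` anywhere in a function body changes what calling the function means, which is why the yielding code has to be pulled out into its own method. The `Client` class below is an invented stand-in, not Claudette's actual class; it only illustrates the delegate-to-a-generator pattern.]

```python
import types

def gen():
    # The mere presence of `yield` means calling gen() runs no body code at
    # all; it just constructs a generator object.
    for x in [1, 2, 3]:
        yield x

g = gen()                              # nothing has executed yet
assert isinstance(g, types.GeneratorType)
assert list(g) == [1, 2, 3]            # the body runs lazily as we iterate

# So a method that only *sometimes* streams can't inline a yield; the common
# pattern is to delegate the yielding to a separate helper method:
class Client:
    def call(self, stream=False):
        if stream:
            return self._stream()      # returns a generator
        return "full answer"           # normal, eager return path

    def _stream(self):
        for piece in ["full", " ", "answer"]:
            yield piece

c = Client()
```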
00:32:37.700 | So yeah, you can see it's nice that it's now
00:32:44.660 | tracking our usage across all these things.
00:32:49.180 | And we can add the two together, prefill and streaming.
00:32:55.740 | Yeah, any questions or comments so far?
00:32:59.500 | Mm-hmm.
00:33:00.500 | Yeah.
00:33:02.500 | And is there a way to try to reset the counter if you wanted
00:33:06.300 | just to be able to start over at some point?
00:33:10.020 | I mean, the way I would do it was I would just
00:33:13.060 | create a new client, C equals client.
00:33:17.020 | But you could certainly go C.use equals usage 0, 0.
00:33:22.260 | In fact, 0, 0 is the default. So actually, now I
00:33:25.540 | think about it, we could slightly improve our code
00:33:28.540 | to remove three characters, which
00:33:32.300 | would be a big benefit because we don't like characters.
00:33:34.700 | We could get rid of those.
00:33:36.460 | Well, there you go.
00:33:39.460 | So yeah, you could just say C.use equals usage.
00:33:47.420 | And in general, I think people don't
00:33:54.580 | do this manipulating the attributes of objects
00:33:59.900 | directly enough.
00:34:01.420 | Why not?
00:34:02.420 | You don't have to do everything through--
00:34:04.940 | people would often create some kind of set usage method
00:34:07.980 | or something.
00:34:08.740 | No, don't do that.
00:34:09.860 | Paranoid Java style that--
00:34:11.380 | Exactly.
00:34:12.340 | --emerged like a decade ago for some reason.
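[Editor's note: the point about directly assigning to attributes, rather than writing setter methods, can be sketched like this. `Usage` and `Client` here are invented stand-ins, not Claudette's actual classes; they just show the reset-the-counter idea.]

```python
from dataclasses import dataclass

@dataclass
class Usage:
    input_tokens: int = 0
    output_tokens: int = 0

class Client:
    def __init__(self):
        self.use = Usage()

c = Client()
c.use.input_tokens += 120    # pretend a few calls accumulated some usage
c.use.output_tokens += 45

# No set_usage() method needed: just assign a fresh default-constructed Usage.
# (And since 0, 0 are the defaults, `Usage()` is all it takes.)
c.use = Usage()
```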
00:34:14.460 | Oh, more stuff.
00:34:14.980 | Speaking of--
00:34:15.660 | Yeah, oh, yeah?
00:34:17.020 | Yeah, it was longer.
00:34:18.740 | Speaking of directly manipulating
00:34:22.180 | properties of objects, so you showed
00:34:24.780 | how we can use prefill to predefine
00:34:27.780 | the beginning of the assistance response.
00:34:30.500 | If I've had a multi-turn exchange with an assistant,
00:34:33.140 | can I just go in there and clobber
00:34:35.940 | one of the earlier assistant messages
00:34:38.220 | to convince the assistant that it said something it didn't?
00:34:40.940 | Because sometimes that's actually useful.
00:34:43.140 | Yeah, because we don't have any state in our class, right?
00:34:49.180 | So we're passing in--
00:34:51.980 | so here, we're passing in a single string, right?
00:34:58.140 | But we could absolutely pass in a list.
00:35:02.740 | So I said hi, and the model said hi to you too.
00:35:15.580 | I am Plato.
00:35:16.740 | I am Socrates.
00:35:26.180 | Tell me about yourself.
00:35:29.340 | I don't know what will happen here,
00:35:30.820 | but we're just convincing it that this is a conversation
00:35:33.180 | that it's occurred.
00:35:34.740 | So now Claude is probably going to be slightly confused
00:35:37.780 | by the fact that it reported itself not to be Claude.
00:35:42.340 | No, I mean, I don't-- we haven't set a system message
00:35:44.540 | to say it's Claude.
00:35:45.460 | So there you go.
00:35:48.780 | No, not at all confused.
00:35:50.100 | I am Socrates.
00:35:53.500 | So as I said, Claude's very happy to be told what it said,
00:36:00.260 | and it will go along with it.
00:36:02.060 | I'm very fond of Claude.
00:36:03.060 | Claude has good vibes.
00:36:05.620 | Oh, you may be surprised to hear that I am actually Australian.
00:36:16.420 | This is the point of the video where
00:36:21.900 | we get sidetracked and talk to Rupes for a good long time.
00:36:24.540 | [LAUGHTER]
00:36:27.100 | Oh, not very interesting.
00:36:32.300 | What do you say, mate?
00:36:33.460 | [LAUGHTER]
00:36:35.460 | Fair enough.
00:36:36.780 | So yeah, you can tell it anything you like,
00:36:38.980 | is the conversation, because it's got no state.
00:36:42.220 | Now it's forgotten everything it's just said.
00:36:44.300 | The only thing it remembers is its use.
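[Editor's note: because the client holds no conversation state, the history is just a plain list you construct yourself, so fabricating or clobbering earlier assistant turns is ordinary list manipulation. The helper names below are invented for this sketch.]

```python
def user(txt):      return {"role": "user", "content": txt}
def assistant(txt): return {"role": "assistant", "content": txt}

# A hand-crafted history: the model never actually said the assistant line,
# but since the full history is sent on every call, it can't tell.
history = [
    user("Hi."),
    assistant("Hi to you too. I am Socrates."),
    user("Tell me about yourself."),
]

# Clobbering an earlier assistant message is just a list assignment:
history[1] = assistant("Hi to you too. I am Plato.")
```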
00:36:46.700 | So that's what we've done so far.
00:36:49.940 | So remembering-- oh, actually, so before we do that,
00:36:55.260 | we'll talk about tool use.
00:36:57.740 | So yeah, basically, I wanted to, before we
00:37:01.740 | got into multi-turn dialogue automatic stuff,
00:37:04.100 | I wanted to have the basic behavior that Anthropic's SDK
00:37:12.700 | provides.
00:37:13.460 | I wanted to have it conveniently wrapped.
00:37:17.580 | So tool use is officially still in beta,
00:37:24.580 | but I'm sure it won't be for long.
00:37:27.860 | Can I ask one more pre-tool use case
00:37:29.860 | that I think occurs to me right away?
00:37:31.540 | And so maybe it'll occur to other people
00:37:33.220 | if they're curious.
00:37:35.580 | One thing you often find yourself
00:37:37.340 | doing when you're experimenting with prompts
00:37:39.380 | is going through a lot of variations of the same thing.
00:37:42.300 | So you have your template, and then you
00:37:43.980 | want very different parts of it.
00:37:46.180 | And before you write code to churn out variations,
00:37:50.580 | you're usually doing it a bit ad hoc.
00:37:52.500 | So using this API the way it is now,
00:37:54.780 | if I had a client and a bit of an exchange already built up,
00:37:59.180 | and then I wanted to fork that and create five of them
00:38:02.020 | and then continue them in five different ways,
00:38:04.700 | can I just duplicate--
00:38:08.020 | would the right way to do that would
00:38:09.540 | be to duplicate the client?
00:38:10.620 | Would the right way to do that just
00:38:12.080 | to be to extract the list that represents
00:38:14.020 | the exchange and create new clients?
00:38:15.540 | I'm just getting a sense for what
00:38:16.900 | would be the fluid way of doing it with this API?
00:38:19.460 | Let's do it.
00:38:32.200 | Options equals-- how do you spell Zimbabwean, Jono?
00:38:38.880 | Zimbab--
00:38:41.480 | E-W-E, yep.
00:38:44.560 | Print contents-- oh, you know what?
00:39:00.160 | This is boring again, because I think we've gone back
00:39:02.360 | to our old model.
00:39:04.560 | C.model equals models 0.
00:39:09.440 | No wonder it's got so dull.
00:39:12.240 | Haiku is just really doesn't like pretending.
00:39:16.400 | Oh, come on.
00:39:21.360 | Look what it's doing.
00:39:22.200 | All right, anyway, Claude is being a total disappointment.
00:39:37.240 | So the fact that it's reasonable to do this just
00:39:39.280 | by slamming a new list into the C function
00:39:42.600 | is an indication of what you just said,
00:39:44.960 | which is that there's no state hiding inside the C function
00:39:47.640 | that we need to worry about mangling when we do that.
00:39:49.880 | That's right.
00:39:50.380 | There's no state at all.
00:39:52.160 | Got it.
00:39:54.240 | So when we recall it, it knows nothing
00:39:56.160 | about being Socrates whatsoever.
00:39:57.960 | Everyone is a totally independent REST call to a--
00:40:01.040 | there's no session ID.
00:40:02.480 | There's no nothing to tie these things together.
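[Editor's note: the forking question answered above can be sketched directly. Since there is no session ID and no hidden state, forking an exchange into five variations is just copying the messages list five times and extending each copy differently. This is a generic illustration, not Claudette code.]

```python
import copy

# A shared prefix of conversation, built up however you like:
base = [
    {"role": "user", "content": "Write a limerick."},
    {"role": "assistant", "content": "There once was a model named Claude..."},
]

# Fork: one deep copy per variation, each continued a different way.
variants = ["Zimbabwean", "French", "Chilean"]
forks = []
for v in variants:
    h = copy.deepcopy(base)   # deep copy so forks can't share message dicts
    h.append({"role": "user", "content": f"Now make it {v}."})
    forks.append(h)
```

Each fork can then be passed to the client as an independent call; nothing ties them together on the server side.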
00:40:07.680 | All right, we probably just spent cents on that question.
00:40:12.480 | It's so funny, they're like a few dollars per million tokens
00:40:19.480 | or something.
00:40:20.200 | I look at this and like, whoa, all those tokens.
00:40:22.200 | I'm like, oh, yeah, it's probably
00:40:23.620 | like $0.01 or something less.
00:40:25.440 | I've got to get used to not being too scared.
00:40:27.680 | OK, so tool use refers to things like this,
00:40:37.920 | get the weather in the given location.
00:40:39.880 | So there's a tool called getWeather.
00:40:41.640 | And then how would it work?
00:40:43.080 | I don't know.
00:40:43.760 | It would call some other weather API or something.
00:40:48.120 | So both in OpenAI and in Lord, tools
00:40:53.160 | are specified using this particular format, which
00:40:56.720 | is called JSON schema.
00:40:57.960 | And my goal is that you should never have to think about that.
00:41:06.760 | For some reason, nearly everything
00:41:08.240 | you see in all of their documentation
00:41:10.680 | writes this all out manually, which
00:41:12.440 | I think is pretty horrible.
00:41:14.160 | So instead, we're going to create a very complicated tool.
00:41:17.240 | It's something that adds two things together.
00:41:20.160 | And so I think the right way to create a tool
00:41:22.240 | is to just create a Python function.
00:41:26.600 | So the thing about these tools, as you see from their example,
00:41:33.280 | is they really care a lot about descriptions of the tool,
00:41:38.240 | descriptions of each parameter.
00:41:40.520 | And they say quite a lot in their documentation
00:41:44.720 | about how important all these details are to provide.
00:41:48.640 | So luckily, I wrote a thing a couple of years
00:41:56.040 | ago called docments that makes it really easy to add
00:41:59.640 | information to your functions.
00:42:02.520 | And it basically uses built-in Python stuff.
00:42:05.480 | So the names of each parameter is just
00:42:07.280 | the name of the parameter.
00:42:08.360 | The type of the parameter is the type.
00:42:10.400 | The default of the parameter is the default.
00:42:12.320 | And the description of the parameter
00:42:13.820 | is the comment that you put after it.
00:42:15.920 | Or if you want more room, you can put the comment before it.
00:42:20.400 | Docments is happy with either.
00:42:23.600 | And you can also put a description of the result.
00:42:27.280 | You can also put a description of the function.
00:42:29.760 | And so if you do all those things, then you can see here.
00:42:36.480 | I said tools equals getSchema.
00:42:38.040 | So this is the thing that creates the JSON schema.
00:42:40.120 | So if I say tools, there you go.
00:42:42.680 | You can see it's created the JSON schema from that,
00:42:46.480 | including the comments have all appeared.
00:42:50.760 | And the return comment ends up in this returns.
00:42:56.600 | Yeah.
00:42:57.240 | And if you didn't do any of that,
00:42:58.720 | like if you just wrote a function sums that
00:43:00.520 | took two untyped variables A and B,
00:43:02.920 | you would still get something functional.
00:43:04.880 | The model would probably still be able to use it.
00:43:07.240 | But it just wouldn't be recommended.
00:43:08.880 | Is that right?
00:43:11.480 | I think-- well, I mean, my understanding
00:43:14.560 | is you have to pass in a JSON schema.
00:43:18.880 | So if you don't pass in a JSON schema,
00:43:21.120 | so you would have to somehow create that JSON schema.
00:43:24.480 | I don't know if it's got some default thing that
00:43:26.520 | auto-generates one for you.
00:43:28.360 | Oh, I'm more thinking like if we don't follow the docments
00:43:33.880 | format, for example.
00:43:34.840 | Oh, yeah.
00:43:35.360 | So if we got rid of these, so if we got rid of the docments,
00:43:42.240 | yep, you could get rid of the doc string.
00:43:44.440 | You could get rid of the types and the defaults.
00:43:49.480 | You could do that, in which case the--
00:43:52.160 | OK, so it does at least need types.
00:43:58.600 | So let's add types.
00:44:01.320 | Ah, well, that's a bit of a bug in my--
00:44:03.480 | You need to annotate it.
00:44:06.560 | No, yeah.
00:44:07.720 | It appears like you have to annotate it there.
00:44:10.360 | Well, I'll fix that.
00:44:11.680 | It shouldn't be necessary.
00:44:14.200 | Description.
00:44:15.080 | Oh, OK.
00:44:16.520 | It at least wants a doc string.
00:44:17.960 | OK, so currently, that's the minimum that it needs.
00:44:25.600 | And I don't know if it actually requires a description.
00:44:28.560 | I suspect it probably does, because otherwise, maybe
00:44:31.360 | I guess it could guess what it's for from the name.
00:44:34.560 | But yeah, it wouldn't be particularly useful.
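[Editor's note: the schema-generation idea described above can be sketched with a toy version. The real `get_schema` builds on docments and also picks up the per-parameter comments; this minimal sketch uses only what's in the standard library (annotations, defaults, and the docstring), so names and output shape here are illustrative, not Claudette's exact output.]

```python
import inspect

# Map Python annotations to JSON-schema type names.
PY2JSON = {int: "integer", float: "number", str: "string", bool: "boolean"}

def get_schema(f):
    "Toy sketch: derive a JSON-schema tool definition from a plain function."
    sig = inspect.signature(f)
    props, required = {}, []
    for name, p in sig.parameters.items():
        props[name] = {"type": PY2JSON.get(p.annotation, "string")}
        if p.default is inspect.Parameter.empty:
            required.append(name)          # no default means required
        else:
            props[name]["default"] = p.default
    return {
        "name": f.__name__,
        "description": inspect.getdoc(f) or "",
        "input_schema": {"type": "object", "properties": props,
                         "required": required},
    }

def sums(a: int, b: int = 1) -> int:
    "Adds a + b."
    return a + b

schema = get_schema(sums)
```

This is why, as seen above, at minimum the annotations and docstring matter: without them there is nothing for the schema builder to read.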
00:44:37.640 | So OK, so now that we've got a tool, when we call Claude,
00:44:48.240 | we can tell it what tools there are.
00:44:50.800 | And now we're also going to add a system prompt.
00:44:53.560 | And I'm just going to use that system prompt.
00:44:57.920 | You don't have to, right?
00:44:59.280 | If you don't say you have to use it,
00:45:01.160 | then sometimes it'll try to do the addition itself,
00:45:03.640 | but it's not very good at adding.
00:45:05.120 | So I would like to--
00:45:08.240 | I also think user-facing, I think it's weird the way
00:45:11.880 | Claude tends to say, OK, I will use the sum tool
00:45:15.000 | to calculate that sum.
00:45:16.200 | It loves doing that.
00:45:17.400 | OpenAI doesn't.
00:45:19.360 | I think this is because Anthropic's a bit--
00:45:21.160 | like, they haven't got as much user-facing stuff.
00:45:23.160 | They don't have any user-facing tool use yet.
00:45:25.680 | So yeah, I don't think their tool use is quite
00:45:28.080 | as nicely described.
00:45:31.320 | So if we pass in this prompt, what is that plus that?
00:45:39.080 | We get back this.
00:45:44.200 | So we don't get back an answer.
00:45:46.800 | Instead, we get back a tool use message.
00:45:52.040 | The tool use says what tool to use
00:45:55.400 | and what parameters to pass it.
00:45:59.400 | So I then just wrote this little thing
00:46:05.480 | that you pass in your tool use block.
00:46:10.320 | So that's this thing.
00:46:12.800 | And it grabs the name of the function to call.
00:46:16.280 | And it grabs that function from your symbol table.
00:46:21.320 | And it calls that function with the input that was requested.
00:46:26.720 | So when I said the symbol table or the namespace,
00:46:29.760 | basically, this is just a dictionary
00:46:32.000 | from the name of the tool to the definition of the tool.
00:46:38.320 | So if you don't pass one, it uses globals, which,
00:46:40.720 | in other words, is every Python function
00:46:42.560 | you currently have available.
00:46:44.480 | You probably don't want to do that if it's
00:46:47.160 | like os.unlink or something.
00:46:51.960 | So this little make namespace thing
00:46:54.720 | is just something that you just pass
00:46:56.880 | in a bunch of functions to.
00:46:59.520 | And it just creates a mapping from the name to the function.
00:47:03.040 | So that way, this way, I'm just--
00:47:04.800 | yeah.
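[Editor's note: the dispatch just described can be sketched in a few lines. The model sends back a tool name and an input dict; we look the name up in a namespace, which is just a plain dictionary, and call the function. Note that the string is only ever used as a dictionary key; nothing is eval'd. Helper names here are illustrative, not Claudette's exact signatures.]

```python
def mk_ns(*funcs):
    "Map each function's name to the function itself."
    return {f.__name__: f for f in funcs}

def call_func(name, inputs, ns):
    "Look the tool up by name and call it with the model's requested inputs."
    return ns[name](**inputs)

def sums(a, b=1):  return a + b
def mults(a, b=1): return a * b

ns = mk_ns(sums, mults)     # {'sums': sums, 'mults': mults}
result = call_func("sums", {"a": 604542, "b": 6458932}, ns)
```

Restricting the namespace to an explicit dict like this is the safe alternative to falling back on `globals()`, which would expose every function in scope.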
00:47:06.800 | I'm just going to say, if I'm a somewhat beginner,
00:47:09.240 | I'm approaching LLMs.
00:47:11.120 | I've seen your Hacker's Guide.
00:47:13.040 | This screen full of code is quite a lot
00:47:14.880 | of fairly deep Python stuff.
00:47:16.560 | We've got some typing going on.
00:47:18.120 | And I might not know what mappings are or callables.
00:47:20.840 | There's namespaces and getattr and dicts and isinstance.
00:47:25.520 | How should I approach this code versus maybe the examples that
00:47:28.560 | are being interleaved?
00:47:29.560 | Because this is the source code of this library.
00:47:32.480 | But you're not writing this with lots of comments or explanations.
00:47:35.440 | It's more like the usage.
00:47:37.080 | So what should I-- like, if I come to this library
00:47:39.120 | and I'm reading the source code, how much
00:47:41.120 | should I be focusing on the deep Python internals
00:47:44.280 | versus the usage versus like the big picture?
00:47:46.880 | That's a good question.
00:47:47.840 | So for someone who doesn't particularly
00:47:54.480 | want to learn new Python things but just
00:47:56.680 | wants to use this library, this probably
00:48:00.320 | isn't the video for you.
00:48:01.560 | Instead, just read the docs.
00:48:05.520 | And none of that-- like, you can see in the docs,
00:48:07.960 | there's nothing weird, right?
00:48:09.280 | The docs just use it.
00:48:11.040 | And you don't need this video.
00:48:17.280 | It's really easy to use.
00:48:20.280 | So yeah, the purpose of this discussion
00:48:23.400 | is for people who want to go deeper.
00:48:27.880 | And yeah, the fact that I'm skipping over these details
00:48:33.880 | isn't because either they're easy
00:48:36.000 | or that everybody should understand them or any of that.
00:48:39.520 | It's just that they're all things
00:48:44.880 | that Google or ChatGPT or whatever
00:48:47.240 | will perfectly happily teach you about.
00:48:49.240 | So these are all things that are built into Python.
00:48:54.440 | But yeah, that'd probably be part of something
00:48:56.600 | called Python Advanced Course or something.
00:48:59.080 | So one of the things a lot of intermediate Python programmers
00:49:01.720 | tell me is that they like reading my code
00:49:05.440 | to learn about bits of Python they didn't know about.
00:49:07.960 | And then they use it as jumping off points to study.
00:49:13.280 | And that's also why, like, OK, why do I not have many comments?
00:49:17.240 | So my view is that comments should describe
00:49:21.760 | why you're doing something, not what you're doing.
00:49:25.880 | So for something that you could answer, like, oh,
00:49:29.040 | what does isinstance(x, abc.Mapping) do?
00:49:32.400 | You don't need a comment to tell you that.
00:49:34.160 | You can just Google it.
00:49:35.200 | And so in this case, all of the things I'm doing,
00:49:38.760 | once you know what the code does, why is it doing it
00:49:41.920 | is actually obvious.
00:49:43.760 | Like, why do we get the name of the function from the object?
00:49:48.720 | Or why do we pass the input to the function?
00:49:51.880 | I mean, that's literally what functions are.
00:49:53.720 | They're things you call them, and you pass in the input.
00:49:57.000 | Yeah.
00:49:57.840 | So I think that's a good question.
00:50:00.400 | Let's say, like, yeah, don't be--
00:50:04.320 | you actually don't need to know any of these details.
00:50:07.240 | But if you want to learn about them,
00:50:11.720 | yeah, the reason I'm using these features of the language
00:50:14.040 | is because I think they're useful features
00:50:15.800 | of the language.
00:50:16.640 | And if I haven't got a comment on them,
00:50:19.240 | it's because I'm using them in a really normal, idiomatic way
00:50:21.880 | that isn't worthy of a comment.
00:50:23.640 | So that means if you learn about how
00:50:25.160 | to use this thing for this reason,
00:50:27.640 | that's a perfectly useful thing to learn about.
00:50:30.800 | And you can experiment with it.
00:50:32.120 | And I'll add that, like, I'm learning this stuff
00:50:34.080 | as we code together on this as well, right?
00:50:35.960 | Like, you don't have to know any of this to be a good programmer,
00:50:39.000 | but it's really fun as well.
00:50:40.200 | And I think, like, some of these things we wrote multiple ways,
00:50:43.360 | maybe one that was more verbose first, and then we say,
00:50:45.600 | oh, I think we can do this in this more clever way
00:50:47.760 | if we condense it down.
00:50:48.800 | So if you are watching this and you are wanting to learn
00:50:51.080 | and you're still like, oh, I still
00:50:52.520 | don't know what some of these things are,
00:50:54.000 | I can't remember what the double--
00:50:55.400 | like, yeah, dig in and find out.
00:50:57.120 | But it's also, like, it's totally OK if you're not,
00:50:59.240 | like, comfortable at this.
00:51:01.120 | Yeah, the other thing I would say
00:51:02.880 | is the way I write all of my code, pretty much,
00:51:07.560 | is I don't write it in a function.
00:51:10.000 | I write nearly all of it outside of a function in cells.
00:51:16.360 | So you can do the same thing, right?
00:51:19.240 | So, like, let's set ns to none, so then I can run this.
00:51:24.680 | It's like, oh, what the hell is globals?
00:51:27.400 | It's like, oh, wow, everything in Python is a dictionary.
00:51:31.240 | Now, this is a really powerful thing, which
00:51:33.560 | is well worth knowing about.
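[Editor's note: the "everything in Python is a dictionary" point is easy to verify yourself. Module-level names live in the dict returned by `globals()`, so a function can be fetched by its string name, exactly as the tool lookup does with its namespace.]

```python
def greet():
    return "hello"

ns = globals()              # the module's symbol table, a real dict
assert "greet" in ns        # the function's name is just a key
same = ns["greet"]          # look the function up by string
assert same is greet        # it's the very same object
assert same() == "hello"
```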
00:51:38.240 | If I could offer just--
00:51:39.960 | yeah, sorry.
00:51:41.120 | But just offer one perspective to maybe make a little bridge
00:51:43.600 | from the kind of user point of view
00:51:45.320 | to the why these internals might be unfamiliar point of view,
00:51:49.000 | just to recap and make sure I understand it right.
00:51:51.800 | From the user point of view here, when we use tools,
00:51:54.520 | we get a response back from Claude,
00:51:59.000 | in the way we're doing it now, that describes a function
00:52:02.160 | that we now want to execute.
00:52:04.160 | Correct?
00:52:04.800 | That's the function to execute, and that's
00:52:06.560 | the input to provide to it.
00:52:09.600 | So with this library, I can write a function in Python
00:52:13.600 | and then tell Claude to call the function that's
00:52:16.600 | sitting there on my system, right?
00:52:19.560 | Yeah, if it wants to.
00:52:21.000 | For that to work, if it wants to, if it chooses to.
00:52:23.640 | But for that to work, this library
00:52:25.800 | needs to do the magic of reading a text string that
00:52:29.680 | is Claude's response, and then in Python,
00:52:33.680 | having that not be a text string,
00:52:35.240 | but having that become Python code that runs in Python.
00:52:38.720 | And that's a somewhat unfamiliar thing to do in Python.
00:52:41.800 | And that's what's called eval in JavaScript,
00:52:44.520 | or back in Lisp, where a lot of this stuff got started.
00:52:47.560 | And because that's not that sort of--
00:52:50.000 | Well, it's actually not.
00:52:51.640 | We're not actually doing an eval, right?
00:52:54.160 | OK, that's interesting.
00:52:55.560 | Yeah, we're definitely not doing eval.
00:52:59.400 | So in the end, this is the function we want to call.
00:53:04.200 | So I can call that, and there's the answer.
00:53:07.920 | In Python, this is just syntax sugar, basically,
00:53:13.160 | for passing in a dictionary and dereferencing it.
00:53:19.080 | So those are the same.
00:53:20.520 | Those are literally the same thing,
00:53:22.840 | as far as Python is concerned.
00:53:27.280 | So we were never passed a string of code to eval or execute.
00:53:36.200 | We were just told, call this tool
00:53:39.040 | and pass in these inputs.
00:53:41.600 | So to find the tool by string, we look it up
00:53:46.080 | in the symbol table.
00:53:47.440 | So let's just change fc.name to fc_name.
00:53:49.760 | And the name it's giving us is the one
00:53:55.880 | that we provided earlier.
00:53:57.120 | Yeah, it's the name that came from our schema, which
00:54:03.120 | is this name.
00:54:04.880 | Yeah, so if you look back at our tool schema,
00:54:09.520 | this tool has a name.
00:54:10.680 | And you can give it lots of tools.
00:54:12.640 | So later on, we might see one where we've
00:54:14.280 | got both sums and multiply.
00:54:15.960 | And it can pick.
00:54:19.840 | We'll see this later, can pick and choose.
00:54:22.160 | So the flow is, we write our function in Python.
00:54:26.760 | The library automatically knows how to interpret the Python
00:54:29.760 | and turn it into a structured representation,
00:54:31.640 | the JSON schema, that is then fed to Claude.
00:54:35.040 | It's fed to Claude.
00:54:35.880 | We're also feeding it the name for the function
00:54:38.880 | that it's going to use when it wants to come back to us
00:54:41.400 | and say, hey, now call the function.
00:54:43.200 | When it comes back to us and says, hey, call the function,
00:54:45.640 | it uses that name.
00:54:46.440 | We look up the original function, and then we execute.
00:54:48.600 | Yeah, and so it decides.
00:54:50.360 | It knows it's got a function that can do this
00:54:53.640 | and that it can return this.
00:54:56.360 | And so then if it gets a request that can use that tool,
00:55:04.240 | then it will decide of its own accord, OK,
00:55:06.720 | I'm going to call the function that Jeremy provided,
00:55:09.640 | the tool that Jeremy provided.
00:55:11.960 | Yeah, so we'll see a bunch of examples of this.
00:55:14.400 | And this is generally part of what's called the ReAct
00:55:17.560 | framework, nothing to do with React, the JavaScript GUI
00:55:21.280 | thing, but ReAct was a paper that basically said like, hey,
00:55:24.520 | you can have language models use tools.
00:55:28.720 | And again, my LLM Hackers video is the best place
00:55:32.640 | to go to learn about the ReAct pattern.
00:55:36.360 | And so here we're implementing the ReAct pattern,
00:55:38.400 | or at least we're implementing the things necessary for Claude
00:55:41.160 | to implement the ReAct pattern using
00:55:43.320 | what it calls tool calling.
00:55:45.320 | So we look up the function, which
00:55:47.280 | is a string into this dictionary,
00:55:50.640 | and we get back the function.
00:55:54.320 | And so we can now call the function.
00:55:59.040 | So that's what we're doing.
00:56:00.160 | And so I think the key thing here
00:56:05.480 | is this idea that all this is in a notebook.
00:56:08.760 | The source code here to this whole thing
00:56:11.600 | is in a notebook, which means you can play with it, which
00:56:16.320 | I think is fantastically powerful because you never
00:56:19.880 | have to guess what something does.
00:56:21.320 | You literally can copy and paste it into a cell and experiment.
00:56:25.400 | And it's also worth learning these keyboard shortcuts
00:56:28.200 | like C and V to copy and paste the cell, and like Cmd-A,
00:56:35.200 | Cmd-left square bracket, Ctrl-Shift-hyphen.
00:56:39.640 | There's all these nice things worth learning,
00:56:42.360 | all these keyboard shortcuts to be able to use this Jupyter
00:56:46.520 | tool quickly.
00:56:49.480 | Anyway, the main thing to know is
00:56:50.880 | we've now got this thing called call function, which
00:56:54.800 | can take the tool use request from Claude, this function call
00:56:58.600 | request, and call it.
00:57:01.440 | And it passes back a dictionary with the result of the tool
00:57:08.000 | call, and when it asked us to make this call,
00:57:16.040 | it included an ID.
00:57:18.680 | So we have to pass back the same ID
00:57:20.800 | to say this is the answer to this question.
00:57:24.320 | And that's the bit that says this is the answer
00:57:28.120 | to this question.
00:57:30.120 | That's the answer.
00:57:32.120 | And so we can now pass that back to Claude,
00:57:37.800 | and Claude will say, oh, great, I got the answer,
00:57:40.160 | and then it will respond with text.
00:57:44.040 | So I put all that together here and make a tool response
00:57:47.640 | where you pass in the tool response request from Claude,
00:57:52.360 | the namespace to search for tools,
00:57:54.400 | or an object to search for tools,
00:57:58.200 | and we create the message from Claude.
00:58:03.240 | We call that call function for every tool use request.
00:58:07.360 | There can be more than one, and we add that to their response.
00:58:12.880 | And so if you have a look now here, when we call that,
00:58:18.800 | it calculates the sum, and it's going to pass back the--
00:58:24.720 | going to add in the tool use request and the response
00:58:27.960 | to that request.
00:58:30.800 | So we can now go ahead and do that,
00:58:36.840 | and you can see Claude returns the string, the response.
00:58:43.240 | So it's turned the result of the tool request into a response.
00:58:48.240 | And so this is how stuff like Code Interpreter in ChatGPT
00:58:52.520 | works.
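[Editor's note: the round trip just described can be sketched end to end. The model's tool_use block carries an id, and the tool_result we send back must echo that same id so Claude can match answer to request. The shapes below are simplified from Anthropic's tool-use message format, and the id `toolu_123` and helper name `mk_toolres` are made up for the example.]

```python
def mk_toolres(tool_use, ns):
    "Run the requested tool and wrap its output as a tool_result message."
    out = ns[tool_use["name"]](**tool_use["input"])
    return {"role": "user",
            "content": [{"type": "tool_result",
                         "tool_use_id": tool_use["id"],   # echo the same id back
                         "content": str(out)}]}

def sums(a, b=1):
    return a + b

# What a tool_use block from the model roughly looks like:
req = {"type": "tool_use", "id": "toolu_123", "name": "sums",
       "input": {"a": 604542, "b": 6458932}}

res = mk_toolres(req, {"sums": sums})
```

Appending `res` to the message history and calling the model again is what turns the numeric result back into a natural-language answer.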
00:58:53.020 | So it might be easier to see it all in one place,
00:58:58.400 | and this is like another demo of how we can use it.
00:59:00.920 | Instead of calling functions, we can also call methods.
00:59:03.600 | So here's sums again.
00:59:05.560 | But this time it's a method of a class.
00:59:08.440 | So we can do the same thing, get schema dummy dot sums.
00:59:13.320 | Yeah, so we make the message containing our prompt.
00:59:16.680 | So that's the question, what's this plus this?
00:59:20.320 | We pass that along to Claude.
00:59:22.320 | Claude decides that it wants us to create a tool request.
00:59:25.720 | We make the tool request, calculate the answer,
00:59:29.800 | add that to the messages, and put it all together.
00:59:33.960 | Oops, crazy.
00:59:34.800 | And there we go.
00:59:38.960 | OK, anything worth adding to that?
00:59:54.120 | So if you're not comfortable and familiar with the ReAct
00:59:58.160 | framework, this will feel pretty weird.
01:00:04.520 | Definitely worth spending time learning about,
01:00:07.560 | because it's an incredibly powerful technique
01:00:12.360 | and opens up a lot of opportunities to--
01:00:19.440 | because I think a lot of people, I certainly
01:00:22.160 | feel this way, that there's so many things that language
01:00:26.520 | models aren't very good at.
01:00:28.480 | But they're very good at recognizing
01:00:30.960 | when they need to use some tool.
01:00:33.440 | If you tell it like, oh, you've got access to this proof
01:00:37.960 | checking tool, or you've got access to this account creation
01:00:42.480 | tool, or whatever, it's good at using those.
01:00:45.240 | And those tools could be things like reroute this call
01:00:53.400 | to a customer service representative.
01:00:55.560 | They don't have to be text generating tools.
01:00:59.400 | They can be anything.
01:01:01.800 | And there's also no reason--
01:01:03.000 | you're not under obligation to send the response back
01:01:05.280 | to the model, right?
01:01:05.880 | It can actually be a useful endpoint.
01:01:07.400 | It's like, oh, I tell the model to look
01:01:10.280 | at this query from a customer and then respond appropriately.
01:01:13.280 | And one of the tools is like escalate.
01:01:15.560 | Well, if it sends a tool use request
01:01:19.160 | for that function,
01:01:21.320 | that could be like, oh, I should exit this block,
01:01:24.360 | forget about it, throw away the history
01:01:25.960 | because now I need to bump this up
01:01:27.640 | to some actual human in the loop,
01:01:29.360 | or store the result somewhere.
01:01:31.120 | It's just a very convenient way to get--
01:01:33.600 | Yeah, we're going to see a bunch more examples
01:01:36.520 | in the next section because there's a whole module called
01:01:40.560 | tool loop, which has a really nice example, actually,
01:01:43.520 | that came from the Anthropic examples of how
01:01:46.520 | to use this for customer service interaction.
01:01:51.640 | But for now, yeah, you can put that aside.
01:01:54.200 | Don't worry about it because we're
01:01:56.960 | going to go on to something much more familiar to everybody,
01:01:59.360 | which is chat.
01:02:01.880 | So chat is just a class which is going
01:02:08.560 | to keep track of the history.
01:02:10.120 | So self.h is the history.
01:02:11.720 | And it's going to start out as an empty list.
01:02:13.680 | There's no history.
01:02:14.480 | And it's also going to contain the client, which
01:02:23.680 | is the thing we just made.
01:02:25.720 | And so if you ask the chat for its use,
01:02:28.200 | it'll just pass it along to the client to get its use.
01:02:30.520 | You can give it some tools.
01:02:35.920 | And you can give it a system prompt.
01:02:37.440 | OK, so the system prompt, pass it in, no tools, no usage,
01:02:46.320 | no history.
01:02:49.080 | Again, there's a stream version and a non-stream version.
01:02:53.080 | So you can pass in stream as true or false.
01:02:55.520 | If you pass in stream, it'll use the stream version.
01:02:58.840 | Otherwise, it won't.
01:03:01.360 | So again, we patch in dunder call.
01:03:10.960 | Now, of course, we don't need to use patch.
01:03:13.320 | We could have put these methods directly in inside here.
01:03:17.440 | But I feel like I really prefer to do things much more
01:03:21.080 | interactively and step by step.
01:03:22.680 | So this way, I can create my class.
01:03:25.280 | And then I can just gradually add a little bit to it
01:03:27.480 | at a time as I need it.
01:03:28.920 | And I can also document it a little bit at a time,
01:03:31.840 | rather than having a big wall of code, which
01:03:34.000 | is just, I find, overwhelming.
01:03:35.480 | So all right, so there's a prompt.
01:03:45.160 | So if you pass in the prompt, then we
01:03:46.840 | add that to the history as a message.
01:03:50.200 | Now, get our tools.
01:03:57.560 | So I just call get schema for you automatically.
01:03:59.800 | And then at the end, we'll add to the history
01:04:06.000 | the results, which may include tool use.
01:04:08.640 | So now I can just call chat.
01:04:10.560 | And then I can call chat again.
01:04:17.640 | And as you can see, it's now got state.
01:04:20.240 | It knows my name.
01:04:21.040 | And the reason why is because each time it calls the client,
01:04:28.000 | it's passing in the entire history.
01:04:29.640 | So again, we can also add pre-fill, just like before.
01:04:36.520 | We can add streaming, just like before.
01:04:40.640 | And that's it, right?
01:04:41.560 | So you can see adding chat required almost no code.
01:04:47.680 | Really, it's just a case of adding everything
01:04:50.600 | to the history, and every time you call the client,
01:04:52.720 | passing in the whole history.
01:04:54.320 | So that's all a stateful-seeming language model is.
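The whole idea can be sketched in a few lines. This is illustrative only, with a stand-in client rather than Claudette's real one: statefulness is just replaying the history on every call.

```python
class Chat:
    "Sketch of a stateful-seeming chat: the history `h` is replayed each call."
    def __init__(self, client, sp=""):
        self.client, self.sp = client, sp
        self.h = []  # the history starts out as an empty list

    def __call__(self, prompt):
        self.h.append({"role": "user", "content": prompt})
        reply = self.client(self.h, system=self.sp)  # full history every time
        self.h.append({"role": "assistant", "content": reply})
        return reply

# Stand-in client that just reports how much history it received.
def fake_client(history, system=""):
    return f"seen {len(history)} message(s)"

chat = Chat(fake_client)
chat("Hi, I'm Jeremy")
out = chat("What's my name?")
print(out)  # -> seen 3 message(s)
```

The second call sees three messages because the first exchange is replayed along with the new prompt.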
01:05:00.320 | So I don't actually have to write anything
01:05:05.960 | to get tool use to work, which is nice.
01:05:08.400 | I can just pass in my tools.
01:05:11.160 | And the nice thing, the kind of interesting thing
01:05:13.160 | here is that because the tool use request and response are
01:05:19.680 | both added to the history, to do the next step,
01:05:23.920 | I don't pass in anything at all.
01:05:25.360 | That's already in the history.
01:05:27.000 | So I just run it, and it goes ahead and tells me the answer.
01:05:37.120 | Anything to say about that?
01:05:39.200 | So I know in ChatGPT, it sometimes
01:05:41.760 | asks, would you like to go ahead with this tool activation?
01:05:44.920 | And here, the model is responding
01:05:46.720 | with the tool use block, like it would like to use this tool.
01:05:49.400 | Do you have a way of interrupting
01:05:51.440 | before it actually runs the code that you gave it?
01:05:53.720 | Maybe you want to check the inputs or something like that?
01:05:57.720 | So you would need to put that into your function.
01:06:03.200 | So I've certainly done that before.
01:06:06.280 | So one of the things-- in fact, we'll see it shortly.
01:06:09.440 | I've got a code interpreter.
01:06:11.320 | And you don't want to run arbitrary code,
01:06:13.640 | so it asks you if you want to complete it.
01:06:15.800 | And part of the definition of the tool
01:06:19.040 | will be, what is the response that you
01:06:21.000 | get, Claude, if the user's declined
01:06:23.280 | your request to run the tool?
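That confirmation pattern might look something like this. Everything here is a toy sketch for illustration: the function name, the prompt text, and the `eval`-based executor are all made up, and a real version would use a proper sandbox.

```python
def run_code(code: str, ask=input) -> str:
    "Execute `code` if the user approves; return 'Declined' otherwise."
    if ask(f"Run this code?\n{code}\n(y/n) ").strip().lower() != "y":
        # The tool's docstring would tell Claude what "Declined" means.
        return "Declined"
    return str(eval(code))  # toy executor; real code would use a sandbox

print(run_code("1 + 1", ask=lambda _: "y"))  # -> 2
print(run_code("1 + 1", ask=lambda _: "n"))  # -> Declined
```

Passing `ask` as a parameter makes the gate easy to test or replace with a GUI prompt.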
01:06:26.080 | Cool.
01:06:28.960 | So I think people might imagine--
01:06:31.960 | Good question.
01:06:32.600 | Yeah.
01:06:33.800 | So I had my earlier question before we introduced chat,
01:06:36.720 | where we forked a conversation, as it were,
01:06:40.360 | just by forcing stuff into earlier exchanges.
01:06:43.400 | And at that point, we were talking about how
01:06:45.360 | the interface was stateless, because there
01:06:47.120 | were no session IDs.
01:06:49.240 | Now that we have these tool interactions with tool IDs,
01:06:52.600 | does that change the story?
01:06:54.600 | Like, let's say I had a sequence of interactions
01:06:57.960 | that involved tool use, and now I
01:07:00.960 | want to create three variations to explore
01:07:03.480 | different ways I might respond.
01:07:06.120 | Is that problematic?
01:07:09.000 | No, not at all.
01:07:10.080 | I mean, let's do it.
01:07:10.840 | So actually--
01:07:11.600 | OK, let's try it.
01:07:12.400 | Actually, I'm not Jeremy.
01:07:13.440 | I'm actually Alexis.
01:07:17.640 | You might want to zero index there.
01:07:23.520 | Thanks.
01:07:24.020 | So at this point, it's now going to be very confused,
01:07:35.880 | because I'm Alexis.
01:07:39.720 | It's nice to meet you, Jeremy.
01:07:41.760 | What's my name?
01:07:42.440 | Your name is Jeremy.
01:07:43.960 | So yeah, let's try.
01:07:46.680 | Poor thing.
01:07:50.600 | Claude, really?
01:08:04.600 | Does that answer your question, Alexis,
01:08:06.480 | if that is your real name?
01:08:08.120 | This is abuse of Claude.
01:08:09.240 | One day, this will be illegal.
01:08:13.920 | Yeah.
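Since nothing server-side tracks a session, forking a conversation is just copying and editing the history list, roughly like this (the message dicts are illustrative):

```python
import copy

history = [{"role": "user", "content": "Hi, I'm Jeremy"},
           {"role": "assistant", "content": "Nice to meet you, Jeremy!"}]

# Fork the conversation: deep-copy the history and rewrite the past in the copy.
fork = copy.deepcopy(history)
fork[0]["content"] = "Hi, I'm Alexis"

print(history[0]["content"])  # -> Hi, I'm Jeremy  (original branch untouched)
print(fork[0]["content"])     # -> Hi, I'm Alexis
```

Each branch can then be sent to the API independently; tool IDs are only matched within the messages of a single request, so they don't pin you to one branch.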
01:08:18.880 | And I also had a question, too.
01:08:24.600 | If Claude returns a tool block,
01:08:29.920 | is that added as a tool block to the history?
01:08:35.480 | Does it have to be converted to a string?
01:08:39.480 | No, no, no.
01:08:42.080 | It's just the tool block as part of the history.
01:08:44.040 | The history is perfectly-- the messages
01:08:46.440 | can be those message objects.
01:08:49.040 | They don't have to be dictionaries.
01:08:50.760 | The contents don't have to be strings.
01:08:54.280 | OK, cool.
01:08:55.320 | Yeah.
01:08:57.160 | All right.
01:08:58.160 | So I was delighted to discover how straightforward images are
01:09:05.840 | to deal with.
01:09:07.960 | So yeah, I mean, we can read in our image,
01:09:16.800 | and it's just a bunch of bytes.
01:09:21.440 | And Anthropic's documentation describes
01:09:26.480 | how they expect images to come in.
01:09:31.040 | Here we are.
01:09:35.280 | So yeah, basically, this is just something
01:09:43.960 | which takes in the bytes of an image
01:09:46.520 | and creates the message that they expect,
01:09:49.720 | which is type image.
01:09:51.400 | And the source is a dictionary containing base64,
01:09:56.360 | a MIME type, and the data.
01:09:57.760 | Anyway, you don't have to worry about any of that
01:09:59.760 | because it does it all for you.
01:10:01.960 | And so if we have a look at that,
01:10:09.480 | that's what it looks like.
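That block can be built with a few lines of standard library code. This is a sketch — the helper name is illustrative — but the block shape matches Anthropic's documented base64 image format.

```python
import base64

def img_msg(data: bytes, media_type: str = "image/jpeg") -> dict:
    "Wrap raw image bytes in the content block shape Anthropic expects."
    return {"type": "image",
            "source": {"type": "base64",
                       "media_type": media_type,
                       "data": base64.b64encode(data).decode()}}

raw = b"\x89PNG\r\n\x1a\n"  # stand-in for real image bytes
msg = img_msg(raw, media_type="image/png")
print(msg["type"], msg["source"]["media_type"])  # -> image image/png
```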
01:10:12.160 | And so they're kind of composable like this.
01:10:16.040 | You can have multiple images and multiple pieces of text
01:10:18.960 | in a request.
01:10:19.840 | They can be interleaved, whatever.
01:10:21.920 | So to do that, it means you can't just pass in strings.
01:10:25.440 | You have to pass in little dictionaries with type
01:10:28.560 | text and the string.
01:10:30.680 | So here we can say, all right, let's create--
01:10:33.560 | this is a single message containing multiple parts.
01:10:37.760 | So maybe these functions should be called image part and text
01:10:40.800 | part.
01:10:41.280 | I don't know.
01:10:42.560 | But they're not.
01:10:44.240 | A single message contains an image and this prompt.
01:10:51.040 | And so then we pass that in.
01:10:52.560 | And you see I'm passing in a list
01:10:54.120 | because it's a list of messages.
01:10:55.880 | And the first message contains a list of parts.
01:10:58.520 | And yep, it does contain purple flowers.
01:11:01.880 | And then it's like, OK, well, there's
01:11:07.120 | no particular reason to have to manually do these things.
01:11:09.920 | We can perfectly well just look and see, oh, it's a string.
01:11:13.080 | We should make it a text message, or it's bytes.
01:11:15.120 | We should make it an image message.
01:11:16.600 | So I just have a little private helper.
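A helper like that just dispatches on the Python type of each part, roughly as below. The names are illustrative, and this sketch hardcodes JPEG for brevity.

```python
import base64

def mk_content(part):
    "Turn a plain string or raw bytes into the right content-block dict."
    if isinstance(part, bytes):  # bytes -> image block (assume JPEG here)
        return {"type": "image",
                "source": {"type": "base64", "media_type": "image/jpeg",
                           "data": base64.b64encode(part).decode()}}
    return {"type": "text", "text": part}  # strings -> text block

parts = [mk_content(p) for p in
         ("In brief, what color flowers are in this image?", b"\xff\xd8\xff")]
print([p["type"] for p in parts])  # -> ['text', 'image']
```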
01:11:20.080 | And then finally, I've changed makeMessage.
01:11:23.160 | This is something I remember Jono and I talked about.
01:11:25.360 | Jono was saying--
01:11:26.320 | I think you said you feel like this is kind of like part
01:11:29.440 | of the Jeremy way of coding is I don't go back and refactor
01:11:33.280 | things, but I just redefine them later in my notebook.
01:11:36.960 | And so I previously hadn't exported makeMessage.
01:11:39.560 | I don't export it till now.
01:11:42.480 | And so here's my final version that's
01:11:44.160 | now going to actually call makeContent to automatically
01:11:49.640 | handle images as well.
01:11:52.160 | And so now we can just pass in-- we can call our client.
01:11:54.840 | We can pass in a list of one message.
01:11:58.120 | The list of one message contains a list of parts,
01:12:02.640 | as you can see.
01:12:06.400 | So behind the scenes, when we then run the last cell,
01:12:15.040 | it actually generates a Python file containing
01:12:23.000 | all of the exported code.
01:12:24.960 | So it's 229 lines, which isn't much,
01:12:29.240 | particularly when you look at how much empty space there is.
01:12:32.160 | And all these things say which cell it comes from, and so forth.
01:12:34.800 | So in terms of actual code, it'll
01:12:36.560 | be well under 200 lines of code.
01:12:37.960 | OK, so that is the first of two modules to look at.
01:12:47.200 | Any thoughts or questions before we move on to the tool loop?
01:12:50.600 | I think it's coming through.
01:12:55.760 | Maybe do you want to go into your objective
01:12:59.680 | when you started this, if it was beyond what you've already
01:13:02.840 | shown, like what was the goal always
01:13:07.000 | to keep this a simple self-contained thing?
01:13:08.920 | Is there plans for this to grow into a fully stateful chat
01:13:13.680 | thing that can offer up different functionality?
01:13:15.880 | What's the journey of, oh, I should write this thing that's
01:13:20.440 | going into this?
01:13:21.720 | I mean, I imagine I must be a very frustrating person
01:13:24.800 | to work with because I'd never have any plans, really.
01:13:28.840 | I just have this vague, intuitive feeling
01:13:32.360 | that maybe I should do something in this general direction.
01:13:34.920 | And then people ask me, like, oh, why are you doing that?
01:13:37.600 | So it's like, I don't know.
01:13:41.160 | Just seems like, why not?
01:13:43.120 | Seems like a good idea.
01:13:44.120 | So yeah, I don't think I had any particular plans
01:13:47.560 | to where it would end up.
01:13:48.600 | Just a sense of like--
01:13:49.840 | the way I saw things being written,
01:13:57.800 | including in the Anthropic documentation for Claude,
01:14:00.080 | seemed unfairly difficult. I didn't
01:14:04.400 | think people should have to write stuff like that.
01:14:07.840 | And then when I started to write my own thing using
01:14:11.240 | the Anthropic client, I didn't find
01:14:13.640 | it very ergonomic and nice.
01:14:16.400 | I looked at some of the things that are out there,
01:14:20.040 | kind of general LLM toolkits, APIs, libraries.
01:14:25.120 | And on the whole, I found them really complicated, too long,
01:14:31.800 | too many new abstractions, not really
01:14:34.200 | taking advantage of my existing Python knowledge.
01:14:36.800 | So I guess that was my high-level hope.
01:14:40.560 | Simon Willison has a nice library
01:14:42.160 | called LLM, which Jono and I started looking at together.
01:14:47.200 | But it was missing a lot of the features that we wanted.
01:14:50.800 | And we did end up adding one as a PR.
01:14:54.160 | Not that it's been merged yet.
01:14:55.400 | But yeah, in the end, I guess the other thing about--
01:15:00.680 | so the interesting thing about Simon's approach with LLM
01:15:03.840 | is it's a general front end to dozens of different LLM
01:15:09.760 | backends, open source, and proprietary,
01:15:12.560 | and inference services.
01:15:15.160 | And as a result, he kind of has to have this lowest common
01:15:20.280 | denominator API of like, oh, they all support this.
01:15:24.040 | So that's kind of all we support.
01:15:26.800 | So this was a bit of an experiment in being like,
01:15:28.840 | OK, I'm going to make this as Claude-friendly as possible.
01:15:32.440 | Which is why I even gave it a name based on Claude.
01:15:35.720 | Because I was like, I want this to be--
01:15:38.440 | you know, that's why I said this is Claude's friend.
01:15:41.720 | I wanted to make it like something that
01:15:43.400 | worked really well with Claude.
01:15:44.800 | And I didn't know ahead of time whether that would turn out
01:15:47.200 | to be something I could then use elsewhere
01:15:49.520 | with slight differences or not.
01:15:52.880 | So that was kind of the goal.
01:15:54.160 | So where it's got to--
01:16:00.120 | I think what's happened in the few weeks
01:16:05.560 | since I started writing it is there's
01:16:07.680 | been a continuing standardization.
01:16:12.600 | Like, the platforms are getting-- which is nice,
01:16:14.960 | more and more similar.
01:16:16.680 | So the plan now, I think, is that there
01:16:20.840 | will be GPT's friend and Gemini's friend as well.
01:16:28.560 | GPT's friend is nearly done, actually.
01:16:30.240 | Maybe they'll have an entirely consistent API.
01:16:35.520 | We'll see, you know, or not.
01:16:39.440 | But again, I'm kind of writing each of them
01:16:41.680 | to be as good as possible to work with that LLM.
01:16:44.800 | And then I'll worry about like, OK,
01:16:46.480 | is it possible to make them compatible with each other
01:16:49.200 | later?
01:16:51.320 | And I think that's something--
01:16:52.560 | I mean, I'd be interested to hear your thoughts, Jono.
01:16:54.760 | But like, when we wrote the GPT version together,
01:17:01.360 | and we literally just duplicated the original Claudette notebook,
01:17:06.640 | started at the top cell, and just changed the--
01:17:10.880 | and did a search and replace of Anthropic with OpenAI,
01:17:15.080 | and of Claude with GPT, and then just went through each cell one
01:17:18.760 | at a time to see how do you port that to the OpenAI API.
01:17:24.200 | And I found that it took us, what, a couple of hours?
01:17:27.280 | It felt like a very--
01:17:28.200 | It was very quick, yeah.
01:17:29.440 | Simple.
01:17:30.080 | Didn't have to use my brain much.
01:17:32.840 | Yeah.
01:17:33.360 | I mean, that's maybe worth highlighting,
01:17:35.040 | is that this is not the full and only output of the AnswerAI
01:17:38.600 | organization over the last month.
01:17:40.000 | This is like, oh, you saw things Jeremy is tinkering with just
01:17:43.600 | on the side.
01:17:44.320 | So maybe, yeah, it's good to set expectations appropriately.
01:17:46.800 | But also, yeah, it didn't really feel like it was pretty easy,
01:17:48.680 | especially because I think they've all
01:17:50.520 | been inspired by, is the generous way of saying it,
01:17:52.520 | each other.
01:17:52.960 | And OpenAI, I think, maybe led the way
01:17:54.520 | with some of the API stuff.
01:17:55.640 | So yeah, it's chat.completions.create
01:17:58.240 | versus anthropic.client.messages.create
01:18:01.120 | or something.
01:18:02.200 | In a standard IDE environment, I think
01:18:04.520 | I would have found it a lot harder.
01:18:07.920 | You know, because it's so sequential, in some ways,
01:18:10.800 | it could feel like a bit of a constraint.
01:18:12.480 | But it doesn't mean you can do it from the top
01:18:16.720 | and you go all the way through until you get to the bottom.
01:18:19.120 | You don't have to jump around the place.
01:18:21.040 | Right.
01:18:21.560 | And the only part that was even mildly tricky
01:18:24.040 | then ended up being a change that we
01:18:25.960 | made for the OpenAI one, which instantly got mirrored back
01:18:28.600 | to Claude.
01:18:29.120 | And then, again, because they were built in the same way,
01:18:30.880 | it was like, oh, we've tweaked the way we do.
01:18:32.680 | I think it was streaming.
01:18:33.840 | One of the things that-- yeah.
01:18:35.120 | Like, OK, we've figured out a nice way
01:18:36.320 | to do that in the second rewrite.
01:18:37.960 | It was very easy to just go and find the equivalent function,
01:18:40.520 | because the two are so close together.
01:18:43.120 | Yeah, so I also-- it's quite a nice way
01:18:44.720 | to write software, especially for this kind of like--
01:18:47.080 | it's not going to grow too much in scope
01:18:49.240 | beyond what is one or two notebooks of stuff.
01:18:51.520 | I don't think it's not--
01:18:52.600 | If it did, I would add another project or another notebook.
01:18:56.720 | Like, I wouldn't change these.
01:18:58.560 | These are kind of like the bases which we can build on.
01:19:02.240 | Yeah.
01:19:03.120 | Yeah.
01:19:03.640 | OK, let's keep going then.
01:19:06.320 | So there's just one more notebook.
01:19:08.520 | And this, hopefully, will be a useful review
01:19:14.160 | of the ReAct framework.
01:19:16.400 | So yeah, Anthropic has this nice example
01:19:20.560 | in their documentation of a customer service agent.
01:19:24.200 | And again, there's a lot of this boilerplate.
01:19:30.960 | And then it's all a second time, because it's now the functions.
01:19:40.080 | And so basically, the idea is here,
01:19:41.680 | there's like a little pretend bunch of customers
01:19:46.080 | and a pretend bunch of orders.
01:19:48.400 | And I made this a bit more--
01:19:50.200 | a little bit more sophisticated.
01:19:51.520 | These customers don't have orders.
01:19:53.040 | The orders are not connected to customers.
01:19:54.800 | In my version, I have the orders separately.
01:19:59.760 | And then each customer has a number of orders.
01:20:03.440 | So it's a kind of a relational, but it's more like MongoDB
01:20:07.400 | style or whatever, denormalized.
01:20:10.760 | Not a relational database.
01:20:13.400 | Yeah, so they basically describe this rather long, complex
01:20:19.800 | process.
01:20:20.320 | And as you can see, they do absolutely everything
01:20:22.400 | manually, which maybe that's fine
01:20:25.440 | if you're really trying to show people the very low level
01:20:28.760 | details.
01:20:29.800 | But I thought it'd be fun to do exactly the same thing,
01:20:32.600 | but make it super simple.
01:20:33.920 | And also make it more sophisticated
01:20:35.520 | by adding some really important features
01:20:37.760 | that they never implemented.
01:20:39.720 | So the first feature they implement is getCustomerInfo.
01:20:44.400 | You pass in a customer ID, which is a string,
01:20:48.040 | and you get back the details.
01:20:52.400 | So that's what it is, customers.get.
01:20:55.720 | And so you'll see here we've got the documents,
01:20:59.160 | we've got the doc string, and we've got the type.
01:21:02.960 | So everything necessary to get a schema.
01:21:09.440 | Same thing for order details, orders.get.
01:21:14.400 | And then something that they didn't quite implement
01:21:18.080 | is a proper cancel order.
01:21:20.400 | So if order ID not in orders, so you
01:21:23.760 | can see we're returning a bool.
01:21:27.040 | So if the order ID is not there, we
01:21:28.640 | were not able to cancel it.
01:21:30.360 | If it is there, then we'll set the status to cancelled
01:21:34.520 | and return true.
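Put together, the demo tools look roughly like this. The data and function names here are made up for illustration; the point is that the docstrings and type hints carry everything the schema generator needs.

```python
# Denormalized in-memory "database": orders live separately, and each
# customer holds references to their orders.
orders = {"O1": {"id": "O1", "product": "Widget A", "status": "Shipped"},
          "O2": {"id": "O2", "product": "Widget B", "status": "Processing"}}
customers = {"C1": {"name": "Jeremy", "email": "j@example.com",
                    "orders": [orders["O1"], orders["O2"]]}}

def get_customer_info(customer_id: str) -> dict:
    "Retrieves a customer's info and their orders based on the customer ID."
    return customers.get(customer_id, "Unknown customer")

def cancel_order(order_id: str) -> bool:
    "Cancels an order based on the order ID; returns True if successful."
    if order_id not in orders:
        return False
    orders[order_id]["status"] = "Cancelled"
    return True

print(cancel_order("O2"), cancel_order("O6"))  # -> True False
```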
01:21:36.560 | OK, so this is interesting now.
01:21:40.440 | We've got more than one tool.
01:21:43.640 | And the only reason that Claude can possibly
01:21:48.480 | know what tool to use when, if any,
01:21:51.240 | is from their descriptions in the doc string here.
01:21:55.600 | So if we now go chat.tools, because we passed it in,
01:22:02.920 | you can see all the functions are there.
01:22:05.120 | And so when it calls them, it's going to, behind the scenes,
01:22:08.120 | automatically call getSchema on each one.
01:22:13.560 | But to see what that looks like, we could just do it here.
01:22:16.920 | And so getSchema is actually defined
01:22:24.200 | in a different library, which we created called toolslm.
01:22:34.600 | OK, getSchema, oops, O for O in chat.tools, there you go.
01:22:46.800 | So you basically end up with something
01:22:48.400 | pretty similar to what Anthropic's version had manually.
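Here's a toy version of what a schema generator like getSchema does, reading the signature, type hints, and docstring. This is heavily simplified for illustration; toolslm's real version handles far more cases.

```python
import inspect

# Map Python annotations to JSON-schema type names (tiny subset).
PYTYPES = {int: "integer", str: "string", bool: "boolean", float: "number"}

def get_schema(f):
    "Build an Anthropic-style tool schema from a function's signature."
    sig = inspect.signature(f)
    props = {name: {"type": PYTYPES.get(p.annotation, "string")}
             for name, p in sig.parameters.items()}
    return {"name": f.__name__,
            "description": inspect.getdoc(f),
            "input_schema": {"type": "object", "properties": props,
                             "required": list(props)}}

def sums(a: int, b: int) -> int:
    "Adds a + b."
    return a + b

schema = get_schema(sums)
print(schema["input_schema"]["properties"])
# -> {'a': {'type': 'integer'}, 'b': {'type': 'integer'}}
```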
01:22:54.160 | So yeah, we can say, tell me the email address of customer C1.
01:22:59.080 | And I mean, Claude doesn't know.
01:23:03.400 | So it says, oh, you need to do a tool use.
01:23:08.720 | You need to call this function.
01:23:10.680 | You need to pass in this input.
01:23:13.400 | And so remember, with our thing, that's already now
01:23:17.040 | already got added to the history.
01:23:18.760 | So we just call chat.
01:23:20.120 | And it automatically calls it on our history.
01:23:24.120 | And there it is.
01:23:25.400 | And you can see this retrieving customer C1
01:23:27.480 | is because we added a print here.
01:23:31.240 | So you can see, as soon as we got that request,
01:23:35.640 | we went ahead and retrieved C1.
01:23:37.640 | And so then we call chat.
01:23:38.800 | It just passes it back.
01:23:40.280 | And there we go.
01:23:41.400 | There's our answer.
01:23:43.680 | So can I channel my inner--
01:23:45.600 | our dear friend Hamel has a thing about saying,
01:23:47.720 | you've got to show me the prompt.
01:23:49.080 | I always want to be able to inspect what's going on.
01:23:51.320 | Maybe we could do this at a couple of different levels.
01:23:53.600 | But can we see what was fed to the model?
01:23:55.560 | What was the history?
01:23:56.600 | Or what was the most recent request?
01:23:57.800 | Something like that.
01:23:58.600 | Yeah, so there's our-- here's our history.
01:24:00.800 | So the first message we passed in
01:24:03.800 | was, tell me the email address.
01:24:06.160 | It passed back an assistant message, which
01:24:08.320 | was a tool use block asking for calling this function
01:24:14.080 | with these parameters.
01:24:16.080 | And then we passed back-- and that had a particular ID.
01:24:18.480 | We passed back saying, oh, that tool ID request,
01:24:20.720 | we've answered it for you.
01:24:23.000 | And this was the response we got.
01:24:30.360 | And then it told us--
01:24:33.440 | OK, there's-- it's just telling us what we told it.
01:24:37.880 | Right.
01:24:39.520 | And then if I was really paranoid,
01:24:40.840 | like I wanted to see the actual tool definitions and things,
01:24:43.440 | and the actual requests, is there
01:24:44.920 | a way to dig to that deeper level
01:24:46.960 | beyond just looking at the history?
01:24:48.520 | Yeah, so we can do this.
01:24:52.680 | It's a bit of fun.
01:24:58.320 | So that has to be done before you import Anthropic.
01:25:02.680 | So we'll set it to debug.
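The trick is just an environment variable that the SDK reads when it's first imported — assuming the `ANTHROPIC_LOG` variable described in the SDK's docs — so it has to be set before the import:

```python
import os

# Must be set before `import anthropic`: the SDK configures its logger
# when the module is first imported.
os.environ["ANTHROPIC_LOG"] = "debug"

# import anthropic  # every HTTP request/response would now be logged
```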
01:25:05.880 | And so now if we call that, OK, it
01:25:15.280 | tells us everything that's going on.
01:25:17.440 | And so here is the request method post URL, headers,
01:25:25.000 | JSON data, HTTP request.
01:25:29.240 | Nice.
01:25:33.160 | Yeah.
01:25:34.600 | So if we do that here.
01:25:38.040 | So now this is including all of that same information again,
01:25:44.160 | because the model on Anthropic's side is not stateful.
01:25:47.240 | So we pass the full history.
01:25:48.400 | We can see, OK, we've still got all of the tools,
01:25:52.480 | the definitions in there.
01:25:53.800 | We've still got all the previous messages.
01:25:56.160 | Yeah, so this is like, it's a bit of a pain
01:25:58.320 | to have all this output all of the time.
01:26:00.000 | But if you're playing around with this,
01:26:01.600 | I'd recommend turning this on until you
01:26:03.320 | can trust that the library does what you want.
01:26:07.480 | And it's nice to be able to have--
01:26:08.920 | Thank you, Anthropic, for having that environment variable.
01:26:11.400 | It's very nice.
01:26:13.080 | Yeah, because in the end, if you're stuck on something,
01:26:15.680 | then all that's happening is that those pieces of text
01:26:18.880 | are being passed over an HTTP connection
01:26:21.160 | and passed back again.
01:26:22.120 | So there is nothing else.
01:26:24.400 | So that's a full debug.
01:26:26.920 | Thanks.
01:26:27.400 | That's a good question, Jono.
01:26:28.640 | So yeah, this is an interesting request.
01:26:33.080 | Please cancel all orders for customer C1.
01:26:36.480 | So this is interesting, because it can't be done in one go.
01:26:41.760 | So the answer it gave us was, OK, tell me about customer C1.
01:26:48.600 | But that doesn't finish it.
01:26:50.840 | So what actually happens?
01:26:52.960 | Well, I mean, we could actually show that.
01:26:54.720 | So if we pass it back, then it says, OK, there are two orders.
01:27:06.840 | Let's cancel each one.
01:27:10.000 | And it has a tool use request.
01:27:15.560 | So it's passed back some text and a tool use request
01:27:19.040 | to cancel order A1.
01:27:22.000 | It's not being that smart.
01:27:23.240 | And I think it's because we're using Haiku.
01:27:25.000 | If we are using Opus, it probably
01:27:27.200 | would have had both tool use requests in one go.
01:27:31.040 | Well, let's find out.
01:27:32.000 | So if we change this model to model 0--
01:27:38.760 | Definitely slower.
01:27:45.120 | Definitely slower.
01:27:47.120 | Yeah.
01:27:49.200 | Oh, you can see here.
01:27:50.120 | So this is something interesting that it does.
01:27:52.400 | It has these thinking blocks.
01:27:57.960 | That's something that Opus, in particular, does.
01:28:00.560 | So then-- no, OK, it's still only doing one at a time.
01:28:10.120 | So it's fine.
01:28:10.920 | Does it only use those thinking blocks
01:28:15.480 | when it needs to do tool use?
01:28:16.360 | I haven't seen them before when I do API access.
01:28:19.360 | OK, that's why I haven't seen them.
01:28:20.920 | As far as I know.
01:28:22.560 | OK, so basically, you can see we're
01:28:24.280 | going to have to-- given that it's only doing one at a time,
01:28:26.800 | it's going to take at least three goes--
01:28:28.640 | one to get the information about the customer,
01:28:30.840 | then to cancel order A1, and then to cancel order A2.
01:28:33.920 | But each time that we get back another tool use request,
01:28:38.200 | we should just do it automatically.
01:28:39.680 | There's no need to manually do this.
01:28:42.400 | So we've added a thing here called tool loop.
01:28:46.280 | And for up to 10 steps, it'll check whether it's
01:28:51.480 | asked for more tool use.
01:28:53.680 | And if so, it just calls self again.
01:29:01.880 | That's it.
01:29:03.520 | Just like we just called-- because self is chat, right?
01:29:06.960 | Just keeps doing it again and again.
01:29:09.880 | Optionally, I added a function that you could just
01:29:12.120 | call, for example, trace func equals print.
01:29:14.200 | It'll just print out the request each time.
01:29:16.480 | And I also added a thing called continuation func, which
01:29:19.440 | is whether you want to continue.
01:29:21.600 | So if these are both empty, then nothing happens.
01:29:24.160 | It's just doing that again and again and again.
01:29:28.000 | So super simple function.
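The loop itself can be sketched like this. The names are illustrative stand-ins for Claudette's API, the `stop_reason` values (`tool_use`, `end_turn`) match the Anthropic API, and a fake chat stands in for the real one.

```python
def toolloop(chat, prompt, max_steps=10, trace_func=None):
    "Keep calling `chat` while the model keeps asking for tools."
    r = chat(prompt)
    for _ in range(max_steps):
        if trace_func:
            trace_func(r)
        if r.get("stop_reason") != "tool_use":
            break          # the model gave a final text answer
        r = chat(None)     # tool result is already in history; just continue
    return r

# Fake chat: asks for two tools, then answers.
replies = [{"stop_reason": "tool_use"},
           {"stop_reason": "tool_use"},
           {"stop_reason": "end_turn", "text": "All orders cancelled."}]
def fake_chat(prompt=None):
    return replies.pop(0)

final = toolloop(fake_chat, "Please cancel all orders for customer C1")
print(final["text"])  # -> All orders cancelled.
```

A `cont_func` hook, as mentioned above, would just be one more check inside the loop deciding whether to keep going.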
01:29:30.280 | So now, if we say, can you tell me the email address
01:29:35.200 | for customer C1, we never have to--
01:29:39.520 | we never have to do that.
01:29:43.200 | Just does it for us until it's finished.
01:29:44.960 | And it says, sure, there you go.
01:29:46.840 | It's like, OK, please cancel all orders for customer C1.
01:29:51.000 | Retrieving, canceling.
01:29:52.880 | Now, why did it only do O2?
01:29:57.680 | Oh, I think we already canceled O1.
01:30:00.360 | Let's do that again.
01:30:01.280 | There we go.
01:30:07.200 | So we are agentic, are we not?
01:30:11.320 | Yes, definitely.
01:30:14.880 | Yeah, so when people say I've made an agent,
01:30:16.680 | it's like, oh, congratulations.
01:30:17.960 | You have a for loop that calls the thing 10 times.
01:30:23.400 | It's not very fancy, but it's nice.
01:30:28.360 | It's such a simple thing.
01:30:32.200 | And so now we can ask it, like, OK, how
01:30:35.680 | did we go with O2 again?
01:30:35.680 | And remember, it's got the whole history, right?
01:30:40.400 | So it now can say, like, oh, yeah,
01:30:42.000 | you told me to cancel it.
01:30:43.080 | It is canceled.
01:30:44.360 | Cool, cool.
01:30:46.920 | So something I never tried is--
01:30:58.240 | great, now cancel order O6.
01:31:04.920 | I think it should get back false, and it should know.
01:31:09.880 | Yeah, there we go.
01:31:10.640 | Not successful, that's good.
01:31:17.040 | Nice.
01:31:19.200 | So here's a fun example.
01:31:21.560 | Let's implement Code Interpreter,
01:31:23.640 | just like ChatGPT, because Claude doesn't
01:31:25.360 | have the Code Interpreter.
01:31:26.800 | So now it does.
01:31:29.480 | So I created this little library called ToolsLM.
01:31:32.800 | So we've already used one thing from it,
01:31:40.200 | which is get_schema, which is this little thing here.
01:31:45.840 | And it's actually got a little example of a Python Code
01:31:53.200 | Interpreter there.
01:31:55.760 | Yeah, it's also got this little thing called Shell.
01:32:00.360 | So yeah, we're going to use that.
01:32:09.880 | So get_shell is just a little Python interpreter.
01:32:12.600 | So we're going to create a subclass of chat
01:32:20.400 | called CodeChat.
01:32:23.920 | And CodeChat is going to have one extra method in it,
01:32:28.800 | which is to run code.
01:32:31.520 | So code to execute in a persistent session.
01:32:34.880 | So this is important to tell it all this stuff,
01:32:37.080 | like it's a persistent IPython session.
01:32:39.480 | And the result of the expression on the last line
01:32:41.720 | is what we get back.
01:32:43.360 | If the user declines request to execute, then it's declined.
01:32:51.080 | And so you can see here, I have this little confirmation
01:32:54.320 | message.
01:32:54.880 | So I call input with that message.
01:32:57.560 | And if they say no thank you, then I return declined.
01:32:59.840 | And I try to encourage it to actually do
01:33:06.200 | complex calculations yourself.
01:33:08.800 | And I have a list of imports that I do automatically.
01:33:13.840 | So that's part of the system prompt.
01:33:17.120 | You've already got these imports done.
01:33:20.240 | And just a little reminder, Haiku is not so smart.
01:33:22.960 | So I tend to be a little bit more verbose about reminding it
01:33:26.880 | about things.
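A hedged sketch of the idea behind `run_cell`: one shared namespace persists across calls, and the value of a trailing expression is returned. The real CodeChat uses the IPython shell from toolslm's `get_shell` and asks for confirmation via `input()` first; this stand-in uses plain `exec`/`eval` instead.

```python
import ast

def make_run_cell():
    "Return a 'run code' tool whose namespace persists across calls."
    ns = {}
    def run_cell(code):
        "Execute `code`; return the value of the last expression, if any."
        tree = ast.parse(code)
        last = None
        # If the final statement is a bare expression, eval it separately
        # so we can return its value, like an IPython cell does.
        if tree.body and isinstance(tree.body[-1], ast.Expr):
            last = ast.Expression(tree.body.pop().value)
        exec(compile(tree, "<cell>", "exec"), ns)
        if last is not None:
            return eval(compile(last, "<cell>", "eval"), ns)
    return run_cell

run_cell = make_run_cell()
run_cell("def double(x): return x * 2")  # defines `double` in the session
out = run_cell("double(21)")             # a later call can still see it
```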
01:33:27.920 | And I wanted to see if it could combine the Python
01:33:32.520 | tool with other tools.
01:33:33.600 | So I created a simple little thing here called getUser.
01:33:36.160 | It just returns my name.
01:33:37.280 | So if I do CodeChat--
01:33:44.400 | so I'm going to use Sonnet, which
01:33:47.840 | is less stupid than Haiku.
01:33:50.600 | So in trying to figure out how to get Haiku to work--
01:33:56.720 | in fact, let's use Haiku.
01:33:59.440 | One thing that I found really helped a lot
01:34:01.440 | was to give it more examples of how things ought to work.
01:34:05.240 | So I actually just set the history here.
01:34:08.040 | So I said, oh, I asked you to do this.
01:34:17.040 | And then you gave me this answer.
01:34:18.720 | And then I asked you to do this.
01:34:20.240 | And you gave me this answer.
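Seeding the history looks something like this sketch. The dicts follow the general Anthropic messages shape, but the contents here are made up for illustration:

```python
# Few-shot "priming": plain-text examples of the behavior we want,
# installed directly as prior conversation turns.
# Roles must alternate user / assistant.
history = [
    {"role": "user",
     "content": "Create a 1-line function `hi` that returns 'hello'"},
    {"role": "assistant",
     "content": "hi = lambda: 'hello'"},
    {"role": "user",
     "content": "Now use it to greet me"},
    {"role": "assistant",
     "content": "hi()"},
]
roles = [m["role"] for m in history]
```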
01:34:21.560 | So these aren't the actual messages
01:34:32.160 | that would include the actual tool calling syntax or anything?
01:34:35.000 | That doesn't cause trouble--
01:34:35.840 | No, I didn't bother with that.
01:34:37.120 | --in plain text?
01:34:37.760 | Yeah.
01:34:39.000 | It seems to be enough for it to know what I'm talking about.
01:34:44.080 | Yeah.
01:34:45.000 | If you wanted that full thing, I guess
01:34:46.800 | you could have this conversation with it,
01:34:49.080 | install the history or something like that.
01:34:50.880 | Well, I'm going to add it.
01:34:53.120 | So for the OpenAI one, I just added today, actually,
01:34:56.440 | something called mockToolUse, which
01:34:59.640 | is a function you can call because GPT does care.
01:35:03.560 | So we might add the same thing here, mockToolUse.
01:35:08.440 | And you just pass in the--
01:35:10.480 | yeah, here's the function you're pretending to call.
01:35:12.680 | Here's the result we're pretending that function had.
01:35:14.920 | Yeah.
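A sketch of what such a mock-tool-use helper might do, purely to illustrate the idea being described. The dict shapes loosely follow Anthropic's tool-use message format; none of this is the actual Claudette or Cosette API:

```python
def mock_tooluse(history, func_name, args, result):
    "Append a pretend tool call and its pretend result to the history."
    history.append({"role": "assistant", "content": [
        {"type": "tool_use", "name": func_name, "input": args}]})
    history.append({"role": "user", "content": [
        {"type": "tool_result", "content": str(result)}]})

history = []
mock_tooluse(history, "get_user", {}, "Jeremy")
```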
01:35:15.420 | OK, so create a one-line--
01:35:25.240 | no, must have broken it at some point.
01:35:29.200 | OK, we'll use Sonnet.
01:35:35.400 | Create a one-line function for a string s
01:35:38.040 | that multiplies together the ASCII values of each character
01:35:40.720 | in s using reduce.
01:35:44.480 | Call tool loop with that.
01:35:48.120 | And OK, press Enter to execute or N to skip.
01:35:52.160 | So that's just coming from this input with this message.
01:35:57.120 | So it's actually, if you enter anything at all, it'll stop.
01:36:08.240 | We'll press N. OK, so it responded with a tool use
01:36:17.400 | request to run this code.
01:36:19.920 | And because it's in the tool loop, it did run that code.
01:36:28.480 | And it also responded with some text.
01:36:31.000 | So it's responded with both text as well as a tool use request.
01:36:37.720 | And this doesn't return anything.
01:36:44.240 | The print doesn't return anything.
01:36:45.840 | So all that's happened is that behind the scenes,
01:36:52.840 | we created this interactive Python shell
01:37:00.960 | called a self.shell.
01:37:02.760 | Self.shell ran the code it requested.
01:37:05.880 | So that shell should now have in it
01:37:08.960 | a function called checksum.
01:37:12.040 | So in fact, we can have a look at that.
01:37:14.920 | There's the shell.
01:37:16.920 | And we can even run code in it.
01:37:18.800 | So if I just write checksum, that should show it to me.
01:37:24.600 | Result equals function lambda.
01:37:33.240 | Checksum, there you go.
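For reference, the one-liner being asked for looks something like this (my reconstruction, not necessarily the exact code Claude produced):

```python
from functools import reduce

# Multiply together the ASCII values of each character in s.
checksum = lambda s: reduce(lambda acc, ch: acc * ord(ch), s, 1)

result = checksum("ab")  # 97 * 98
```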
01:37:41.080 | So you can play around with the interpreter yourself.
01:37:43.880 | So you can see it has the interpreter has now
01:37:45.640 | got this function defined.
01:37:46.960 | And so this is where it gets quite interesting.
01:37:48.920 | Use it to get the checksum of the username of this session.
01:37:51.960 | So it knows that one of the tools it's been told exists
01:37:57.640 | is getUser.
01:37:59.240 | And in this CodeChat, I automatically
01:38:01.880 | append self.run_cell to the tools list.
01:38:12.720 | So it now knows that it can get the user and it can run a cell.
01:38:15.640 | Sorry, it can run a cell.
01:38:18.080 | So if I now call that, you can see it's called getUser.
01:38:24.440 | Found out the name's Jeremy.
01:38:25.960 | Then asked to get the checksum of Jeremy.
01:38:28.080 | There it goes.
01:38:28.840 | So you can see this is doing a tool
01:38:31.120 | use with multiple tools, including our code interpreter,
01:38:34.760 | which I think is a pretty powerful concept, actually.
01:38:43.440 | And if you wanted to see the actual code it was writing,
01:38:46.160 | you could change the trace function,
01:38:47.720 | or look at the history, or inspect that in some other way.
01:38:50.440 | Yeah, so we could change the trace function
01:38:53.480 | to print, for example.
01:38:55.960 | So we've used showContents, which is specifically just
01:38:59.760 | trying to find the interesting bit.
01:39:01.440 | If we change it to print, it'll show everything.
01:39:05.120 | Or yeah, you can do whatever you like in that trace function.
01:39:07.620 | You don't really have to show things.
01:39:10.400 | And of course, we could also set the Anthropic library's
01:39:13.920 | debug logging to see all the requests going through.
01:39:16.880 | So yeah, none of this needs to be mysterious.
01:39:22.720 | Yeah, so at the end of all this, we
01:39:25.880 | end up with a pretty convenient wrapper,
01:39:33.480 | where the only thing I bother documenting on the home page
01:39:36.240 | is chat, because that's what 99.9% of people use.
01:39:39.760 | You just call chat, you just pass in a system prompt,
01:39:42.880 | you pass in messages, you can use tools,
01:39:49.760 | and you can use images.
01:39:52.000 | So for the user, there's not much to know, really.
01:39:56.680 | Right.
01:39:57.200 | It's only if you want to mess around making your own code
01:39:59.640 | interpreter, or trying something that you'd even
01:40:02.880 | look a little bit deeper.
01:40:06.000 | Yeah.
01:40:06.920 | Yeah.
01:40:07.560 | Exactly.
01:40:08.060 | I don't know.
01:40:11.680 | I mean, I feel like I'm quite excited about this way
01:40:15.040 | of writing code as being something
01:40:19.200 | like I feel like I can show you guys.
01:40:22.120 | Like, all I did just now was walk through the source
01:40:25.880 | code of the module.
01:40:28.880 | But in the process, hopefully, you
01:40:33.040 | might have done it all already, but if you didn't,
01:40:35.160 | you learned something about Claude,
01:40:38.280 | and the anthropic API, and the React pattern,
01:40:42.360 | and blah, blah, blah.
01:40:45.400 | And I remember I asked you, Jono, when I first
01:40:48.200 | wrote it, actually, and I said, oh,
01:40:50.160 | could you read this notebook and tell me what you think?
01:40:52.480 | And you said the next day, OK, I read the notebook.
01:40:54.600 | I now feel ready that I could both use and work
01:40:58.680 | on the development of this module.
01:41:01.720 | I was like, OK, that's great.
01:41:04.040 | Right, yeah.
01:41:05.120 | And I think it definitely depends
01:41:06.960 | where you're coming from, how comfortable you
01:41:08.760 | are at those different stages.
01:41:10.040 | I think using it, it's very nice, very approachable.
01:41:12.920 | You get the website with the documentation right there.
01:41:15.240 | The examples that you use to develop the pieces
01:41:17.880 | are always useful examples.
01:41:20.440 | Yeah, reading through the source code definitely felt like, OK,
01:41:23.520 | I think I could grasp where I needed to make changes.
01:41:26.320 | I had to add something for a different project
01:41:28.320 | we were working on.
01:41:29.120 | It was, OK, I think I can see the bits that
01:41:30.920 | are important for this.
01:41:32.880 | Yeah, so it's quite a fun way to build stuff.
01:41:35.040 | I think it's quite approachable.
01:41:40.080 | I think the bits that I'd expect people
01:41:44.280 | might find a little tricky if they
01:41:45.680 | haven't seen this sort of thing before.
01:41:47.560 | There's two bits.
01:41:48.320 | One is you have some personal metaprogramming practices that
01:41:54.720 | aren't part of normal Python.
01:41:55.960 | So patching into classes instead of defining in classes,
01:41:59.960 | liberal use of delegates, liberal use of double star
01:42:04.200 | keyword arg unpacking, stuff like that.
01:42:07.480 | And then the second is the not exactly eval,
01:42:12.280 | but eval-ish metaprogramming around interpreting--
01:42:14.960 | I mean, it's using the symbol table as a dictionary.
01:42:18.160 | Yeah, yeah.
01:42:19.080 | I mean, these are all things that you
01:42:20.800 | would have in an advanced Python course.
01:42:24.200 | They're beyond loops and conditionals.
01:42:29.040 | And I think they're all things that
01:42:33.880 | can help people to create stuff that otherwise
01:42:38.680 | might be hard to create or might have otherwise
01:42:40.800 | required a lot of boilerplate.
01:42:42.600 | So in general, my approach to coding
01:42:44.400 | is to not write stuff that the computer should
01:42:50.680 | be able to figure out for me.
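For anyone who hasn't seen fastcore's `patch`, here is a minimal re-implementation of the idea (the real `patch` also copies metadata, supports properties, and so on): it reads the class out of the `self` annotation and attaches the function as a method, much like a Swift extension.

```python
def patch(f):
    "Add `f` as a method of the class named in its `self` annotation."
    cls = f.__annotations__["self"]
    setattr(cls, f.__name__, f)
    return f

class Chat:
    def __init__(self, model): self.model = model

# Defined outside the class body, but callable as a normal method:
@patch
def greet(self: Chat): return f"hello from {self.model}"
```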
01:42:55.320 | And you can take the same approach,
01:42:58.400 | even if you're not quite the Jeremy, where
01:43:00.760 | instead of importing from fastcore and from the toolslm
01:43:04.760 | library that you've written separately,
01:43:06.640 | it's like, oh, often I'll have a utils notebook, right?
01:43:09.480 | Which is like, oh, there's things
01:43:10.880 | that are completely orthogonal to what I'm actually doing,
01:43:14.000 | which is like a tool loop with a chatbot.
01:43:15.760 | It's like, oh, I have a thing for getting a list
01:43:21.200 | into a different format or a thing for reading files
01:43:23.800 | and changing the image type and resizing it to a maximum size.
01:43:27.480 | You can still have the same idea where, like,
01:43:29.360 | here's the main notebook.
01:43:30.400 | This is me implementing the pieces one by one.
01:43:32.360 | Maybe there's somewhere else where, like, oh, there's
01:43:34.560 | a few utility functions that are defined and documented
01:43:37.080 | in their own separate place so they don't clutter things up.
01:43:39.640 | This more general literate notebook-driven small example
01:43:43.880 | of atomic piece-driven development
01:43:45.440 | doesn't require that you've written the FastCore library,
01:43:48.280 | but it's helped along by having those pieces at hand.
01:43:51.440 | Yeah.
01:43:51.960 | And also, in general, FastCore, particularly FastCore.basics,
01:43:56.520 | is designed to be the things I feel like maybe
01:44:00.320 | could have been in Python.
01:44:03.440 | I almost never write any script or notebook
01:44:06.880 | without importing from that and without using stuff from that
01:44:09.600 | because I'm, like, the things that I just
01:44:12.000 | think you use all the time.
01:44:15.000 | I'll say, like, it's something I've definitely noticed.
01:44:17.520 | There's two reactions I see to people reading my code, which,
01:44:26.360 | as you say, it's kind of like it's
01:44:28.720 | got a particular flavor to it.
01:44:30.840 | And it's very intentionally not the same as everybody else's.
01:44:35.040 | Some people get really angry.
01:44:36.960 | And they're like, why don't you just do everything the same way
01:44:41.480 | as everybody else?
01:44:42.880 | And some people go, like, wow, there's a bunch of things here
01:44:47.120 | I haven't seen before.
01:44:48.800 | I'm so excited this is an opportunity
01:44:50.320 | to learn those things.
01:44:51.560 | And so, like, our friend Hamill, in particular,
01:44:54.760 | this happens all the time.
01:44:55.800 | He's just like, oh, I was just reading your source code
01:44:57.560 | to that thing you did, and I learned about this whole thing.
01:45:00.400 | That's really helpful.
01:45:01.720 | So I don't mind.
01:45:03.760 | People can react in either way.
01:45:05.200 | [LAUGHTER]
01:45:07.040 | If they've made it to this stage in the video,
01:45:08.760 | they're likely in the second class.
01:45:10.400 | Probably.
01:45:11.840 | Otherwise, they would have given up long ago.
01:45:14.160 | I think also it helps--
01:45:15.200 | What's this star, star, fuck, dot, args?
01:45:17.460 | [LAUGHTER]
01:45:19.840 | If people can relate it to things in other languages,
01:45:21.640 | it doesn't seem so alien.
01:45:22.600 | Like, every time you do patch, I'm just like, OK,
01:45:24.200 | it's a Swift extension.
01:45:25.520 | Or every time you, you know, a lot of these things
01:45:27.840 | have analogies in other languages.
01:45:28.720 | Yeah, it's exactly a Swift extension.
01:45:30.280 | Or in Ruby, they even call it monkey patching.
01:45:32.640 | Yeah.
01:45:33.360 | But because it's, like, not built
01:45:35.840 | into the standard library, some Python programmers are like,
01:45:38.360 | no, you're not allowed to use the dynamic features
01:45:41.960 | of this dynamic language that were created to make it dynamic.
01:45:47.160 | Anyhow, yeah.
01:45:48.520 | Nice.
01:45:49.200 | Cool.
01:45:49.720 | Well, thank you, Jeremy.
01:45:50.680 | I think this is hopefully--
01:45:52.120 | this is like the ultra in-depth.
01:45:54.280 | If you were reading the source code and you wanted more,
01:45:56.840 | well, I think you got all the more you could want.
01:46:00.480 | Yeah, hopefully we might try and do these for a few other
01:46:02.920 | projects as well and work through the backlog
01:46:04.920 | of the little side things that haven't really been documented.
01:46:08.000 | But yeah, is there anything else you wanted to add for this
01:46:10.680 | or, like, introduce this series, I guess?
01:46:14.360 | I mean, series is probably too grand a word, you know?
01:46:18.280 | I think it's just like, from my point of view,
01:46:22.040 | I wanted an opportunity to--
01:46:23.720 | mainly to hear from other folks at answer.ai
01:46:31.040 | more about their work.
01:46:32.680 | And so partly, this is like my cunning plan
01:46:35.800 | is to, like, if I do one, maybe other people
01:46:38.000 | will feel some social pressure to do the same thing.
01:46:40.400 | And I will then get to learn more about their work.
01:46:44.800 | And Alexis and I have had a similar conversation.
01:46:46.840 | And he's promised to teach me about some of his work
01:46:50.680 | soon as well.
01:46:51.520 | So hopefully that will be happening soon.
01:46:55.320 | And also, I think, in general, answer.ai
01:47:03.840 | is a public benefit corporation.
01:47:06.280 | And hopefully this is something that provides some level
01:47:08.720 | of public benefit to at least some people is--
01:47:11.800 | there's no real-- like in a normal company,
01:47:13.560 | this is probably something that would be a private, secret,
01:47:16.560 | password-protected internal series.
01:47:19.680 | And here, it's like, no, there's no need to do that.
01:47:23.920 | Other people can benefit, too, if they want to.
01:47:26.120 | Did any of you have anything else to add to that?
01:47:35.200 | My one thought is I think these are a good idea.
01:47:38.040 | And I think we should also explore
01:47:39.480 | the full range of durations.
01:47:41.320 | So it's good to do deep dives on something that has depth to it.
01:47:45.000 | It's also good to do shallow dives on something
01:47:48.400 | that is quick and might be trivial to the person
01:47:52.680 | explaining it, but is totally unfamiliar and, therefore,
01:47:55.800 | high value to the person who hasn't seen it before.
01:47:57.880 | And we can put those ones on TikTok, too.
01:48:00.720 | Yes, there we go.
01:48:01.680 | Let's see who achieves the first TikTok-length explainer.
01:48:07.200 | Awesome.
01:48:08.200 | Thanks, all.
01:48:09.440 | Thanks, everybody.
01:48:10.640 | Well done.
01:48:12.120 | Sure.
01:48:13.520 | Thanks.