
Beating GPT-4 with Open Source Models - with Michael Royzen of Phind


Chapters

0:00 Introductions
1:02 Founding SmartLens in High School (2017)
3:44 Shifting to NLP
5:10 Sparking Interest in Long-Form Q&A (HuggingFace Demo)
8:32 Creating a Search Engine (Common Crawl, 2020)
11:29 Early Days: Hello Cognition to Phind
13:35 Phind Launch & In-Depth Look
20:58 Envisioning Phind: Integrating Reasoning with Code & Web
23:26 Exploring the Developer Productivity Landscape
26:28 Phind's Top Use Cases & Early Adoption
30:00 Behind Phind’s Rebranding (Advice from Paul Graham)
39:40 Crafting a Custom Model (Code Llama & Expanded Data)
44:34 Phind's Model: Evaluation Tactics & Metrics
47:00 Enhancing Accuracy with Reinforcement Learning
51:18 Running Models Locally: Interest & Techniques (Quantization)
67:13 Michael’s Autodidact Journey in AI Research
72:00 Lightning Round

Whisper Transcript

00:00:00.000 | (upbeat music)
00:00:02.580 | - Hey everyone, welcome to the Latent Space Podcast.
00:00:10.040 | This is Alessio, Partner and CTO in Residence
00:00:12.040 | at Decibel Partners,
00:00:13.840 | and I'm joined by my co-host, Swyx, founder of Smol.ai.
00:00:17.280 | - Hey, and today we have in the studio
00:00:18.800 | Michael Royzen from Phind, welcome.
00:00:20.600 | - Thank you so much, it's great to be here.
00:00:22.000 | - Yeah, we are recording this in a surprisingly hot October
00:00:25.200 | in San Francisco, and I mean, sometimes the studio works,
00:00:29.600 | but--
00:00:30.440 | - The Blue Angels are flying by right now.
00:00:31.260 | - And the Blue Angels are flying by.
00:00:32.100 | (laughing)
00:00:32.920 | - Sorry about the noise.
00:00:33.760 | - I don't think they can hear it.
00:00:35.080 | We have enough damping.
00:00:36.320 | Anyway, so welcome.
00:00:38.320 | I've seen Phind blow up this year,
00:00:41.640 | mostly I think since your launch in Feb,
00:00:44.000 | and V2, and then your Hacker News post.
00:00:48.800 | We tend to like to introduce our guests,
00:00:50.560 | but then obviously you can fill in the blanks
00:00:52.200 | with the origin story.
00:00:53.880 | So you actually were a high school entrepreneur.
00:00:56.960 | You started SmartLens,
00:00:58.280 | which is a computer vision startup in 2017.
00:01:00.720 | - That's right, yeah.
00:01:01.640 | So I remember when TensorFlow came out
00:01:04.640 | and people started talking about,
00:01:07.680 | oh, obviously at the time, after AlexNet,
00:01:10.720 | the deep learning revolution was already in flow,
00:01:13.480 | and good computer vision models were a thing.
00:01:16.940 | And what really made me interested in deep learning
00:01:19.180 | was I got invited to go to Apple's WWDC conference
00:01:25.440 | as a student scholar,
00:01:26.960 | 'cause I was really into making iOS apps at the time.
00:01:30.000 | And so I go there and I go to this talk
00:01:32.000 | where they add an API that let people
00:01:36.560 | run computer vision models on the device
00:01:41.240 | using far more efficient GPU primitives.
00:01:43.760 | And after seeing that, I was like, oh, this is cool.
00:01:46.680 | This is gonna have a big explosion
00:01:48.600 | of different computer vision models
00:01:50.720 | running locally on the iPhone.
00:01:52.920 | And so I had this crazy idea where it was like,
00:01:57.360 | what if I could just make this model
00:02:02.080 | that could recognize just about anything
00:02:05.200 | and have it run on the device?
00:02:07.520 | And that was the genesis
00:02:08.720 | for what eventually became SmartLens.
00:02:10.980 | I took this data set called ImageNet 22K.
00:02:17.800 | So most people, when they think of ImageNet,
00:02:21.160 | think of ImageNet 1K.
00:02:22.680 | But the full ImageNet actually has,
00:02:24.960 | I think, 22,000 different categories.
00:02:28.200 | Yeah, so I took that, filtered it, pre-processed it,
00:02:32.800 | and then did a massive fine tune on Inception V3,
00:02:37.800 | which was, I think, the state-of-the-art
00:02:40.680 | deep convolutional computer vision model at the time.
00:02:44.420 | And to my surprise, it actually worked insanely well.
00:02:49.320 | I had no idea what would happen
00:02:50.520 | if I gave a single model that many categories.
00:02:53.220 | I think it ended up being approximately 17,000 categories
00:02:57.520 | that I collapsed them into.
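A minimal sketch of what that kind of fine-tune looks like in modern PyTorch (not the original 2017 code; the class count and the torchvision weights enum are illustrative assumptions):

```python
import torch
import torch.nn as nn
from torchvision import models

NUM_CLASSES = 17_000  # roughly the number of collapsed categories mentioned above

# Start from ImageNet-1K pretrained weights, then fine-tune on the filtered 22K subset.
model = models.inception_v3(weights=models.Inception_V3_Weights.IMAGENET1K_V1)
model.fc = nn.Linear(model.fc.in_features, NUM_CLASSES)                      # main classifier head
model.AuxLogits.fc = nn.Linear(model.AuxLogits.fc.in_features, NUM_CLASSES)  # auxiliary head

optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)
criterion = nn.CrossEntropyLoss()
# ...standard training loop over the filtered ImageNet-22K subset goes here...
```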
00:02:59.360 | It actually ended up working so well.
00:03:00.920 | It worked so well that it actually worked better
00:03:04.200 | than Google Lens,
00:03:05.720 | which released its V1 around the same time.
00:03:07.940 | And so, and on top of this, the model ran on the device.
00:03:12.440 | So it didn't need an internet connection.
00:03:14.120 | A big part of the issue with Google Lens at the time
00:03:16.400 | was that connections were slower.
00:03:19.320 | 4G was around, but it wasn't nearly as fast.
00:03:21.480 | And so there was a noticeable lag
00:03:22.480 | having to upload an image to a server and get it back.
00:03:25.720 | But just processing it locally,
00:03:28.000 | even on the iPhones of the day in 2017, much faster.
00:03:31.500 | And so it was a cool little project.
00:03:35.960 | It got some traction.
00:03:36.780 | TechCrunch wrote about it.
00:03:37.620 | And there was kind of one big spike in usage,
00:03:41.000 | and then over time it tapered off.
00:03:42.800 | But people still pay for it, which is wild.
00:03:44.840 | - That's awesome.
00:03:45.680 | Oh, it's like a monthly or annual subscription?
00:03:46.920 | - Yeah, it's like a monthly subscription.
00:03:48.560 | - Even though you don't actually have any servers.
00:03:50.240 | - Even though we don't have any servers.
00:03:52.960 | That's right, I was in high school.
00:03:53.960 | I wanted to make a little bit of money.
00:03:54.960 | I was like, yeah.
00:03:56.520 | - That's awesome.
00:03:57.360 | The modern equivalent is kind of Be My Eyes.
00:03:59.880 | And they actually disclosed
00:04:01.920 | in the GPT-4 Vision system card recently
00:04:04.200 | that the usage was surprisingly not that frequent.
00:04:08.280 | The extent to which all three of us have a sense of sight,
00:04:11.920 | I would think that if I lost my sense of sight,
00:04:13.520 | I would use Be My Eyes all the time.
00:04:15.320 | The average usage of Be My Eyes per day is 1.5 times.
00:04:18.640 | - Exactly.
00:04:19.520 | And I was thinking about this as well,
00:04:21.360 | where I was also looking into image captioning,
00:04:24.240 | where you give a model an image,
00:04:26.480 | and then it tells you what's in the image.
00:04:28.440 | But it turns out that what people want
00:04:29.920 | is the exact opposite.
00:04:30.840 | People want to give you a description,
00:04:32.360 | well, people want to give a description of an image,
00:04:34.200 | and then have the AI generate the image.
00:04:36.600 | - Oh, the other way.
00:04:37.600 | - Exactly.
00:04:38.440 | And so, at the time,
00:04:41.160 | I think there were some GANs,
00:04:43.720 | NVIDIA was working on this back in 2019, 2020.
00:04:46.480 | They had some impressive, I think, face GANs,
00:04:49.320 | where they had this model that would produce
00:04:51.640 | these really high quality portraits.
00:04:54.560 | But it wasn't able to take a natural language description
00:04:58.840 | the way Midjourney or DALL-E 3 can,
00:05:01.320 | and just generate you an image
00:05:04.800 | with exactly what you described in it.
00:05:07.640 | - Awesome.
00:05:08.480 | And how'd that get into NLP?
00:05:10.240 | - I released the SmartLens app,
00:05:11.880 | and that was around the time,
00:05:12.720 | I was a senior in high school,
00:05:13.560 | I was applying to college.
00:05:14.800 | College rolls around,
00:05:17.640 | I'm still sort of working on updating the app in college.
00:05:21.760 | But I start thinking like,
00:05:24.280 | hey, what if I make an enterprise version of this as well?
00:05:28.600 | At the time, there was Clarifai
00:05:30.160 | that provided some computer vision APIs.
00:05:32.680 | But I thought, this massive classification model
00:05:37.000 | works so well, and it's so small, and so fast,
00:05:39.880 | might as well build an enterprise product.
00:05:41.760 | And I didn't even talk to users,
00:05:43.240 | or do any of those things that you're supposed to do.
00:05:44.760 | I was just mainly interested in building a type of backend
00:05:48.800 | I've never built before.
00:05:49.640 | So I was mainly just doing it for myself, just to learn.
00:05:53.040 | And so I built this enterprise classification product,
00:05:56.800 | and as part of it,
00:05:57.880 | I'm also building an invoice processing product,
00:06:03.200 | where using some of the aspects that I built previously,
00:06:07.640 | although obviously it's very different from classification,
00:06:10.720 | I wanted to be able to just extract
00:06:13.640 | a bunch of structured data
00:06:14.880 | from an unstructured invoice through our API.
00:06:18.560 | And that's what led me to HuggingFace for the first time,
00:06:21.800 | 'cause that involves some natural language components.
00:06:23.880 | And so I go to HuggingFace,
00:06:25.280 | and with various encoder models
00:06:29.640 | that were around at the time, I think,
00:06:31.320 | I used the standard BERT, and also LongFormer,
00:06:34.720 | which came out around the same time.
00:06:37.400 | And LongFormer was interesting because it allowed,
00:06:40.120 | it had a much bigger context window
00:06:41.960 | than those models at the time.
00:06:43.000 | Like BERT, all of the first-gen encoder-only models,
00:06:46.640 | they only had a context window of 512 tokens.
00:06:50.800 | And it's fixed.
00:06:51.840 | There's none of this ALiBi or RoPE that we have now,
00:06:54.960 | where we can basically massage it to be longer.
00:06:57.040 | They were fixed, 512 absolute position encodings.
00:07:00.720 | And so LongFormer at the time was the only way
00:07:04.000 | that you can fit, say, a sequence length,
00:07:06.560 | or ask a question about like 4,000 tokens worth of text.
00:07:10.000 | And so I implemented Longformer, and it worked super well.
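For reference, a minimal Hugging Face sketch of long-document extractive QA with Longformer (the checkpoint name is one public example, and the invoice file is a hypothetical placeholder):

```python
from transformers import pipeline

# One public Longformer QA checkpoint; handles ~4,096 tokens of context vs. BERT's fixed 512.
qa = pipeline(
    "question-answering",
    model="allenai/longformer-large-4096-finetuned-triviaqa",
)

long_invoice_text = open("invoice.txt").read()  # hypothetical OCR'd invoice text
result = qa(question="What is the total amount due?", context=long_invoice_text)
print(result["answer"], result["score"])  # extractive answer span plus a confidence score
```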
00:07:14.960 | But nobody really kind of used the enterprise product.
00:07:20.400 | And that's kind of what I expected,
00:07:22.120 | 'cause at the end of the day, it was COVID.
00:07:24.560 | I was building this kind of mostly for me,
00:07:26.440 | mostly just kind of to learn.
00:07:28.960 | And so nobody really used it, and my heart wasn't in it,
00:07:32.440 | and I kind of just shelved it.
00:07:35.200 | But a little later, I went back to Hugging Face,
00:07:38.520 | and I saw this demo that they had,
00:07:40.640 | and this is in the summer of 2020.
00:07:42.440 | They had this demo made by this researcher, Yacine Jernite.
00:07:47.480 | And he called it LongForm Question Answering.
00:07:52.480 | And basically, it was this self-contained notebook demo
00:07:59.400 | where you can ask a model a question,
00:08:04.560 | the way that we do now with ChatGPT.
00:08:06.960 | It would do a lookup into some database,
00:08:10.800 | and it would give you an answer.
00:08:12.040 | And it absolutely blew my mind.
00:08:15.200 | The demo itself, it used, I think, BART as the model.
00:08:18.160 | And in the notebook, it had support
00:08:19.720 | for both an Elasticsearch index of Wikipedia,
00:08:24.720 | as well as a dense index, powered by Facebook's FAISS,
00:08:29.240 | I think that's how you pronounce it.
00:08:32.400 | It had both, and it was very iffy.
00:08:36.760 | But when it worked, I think the question in the demo was,
00:08:41.080 | why are all boats white?
00:08:42.840 | When it worked, it blew my mind
00:08:45.080 | that instead of doing this few-shot thing,
00:08:48.960 | like people were doing GPT-3 at the time,
00:08:50.760 | which is all the rage,
00:08:51.800 | you could just ask a model a question,
00:08:53.880 | provide no extra context,
00:08:56.800 | and it would know what to do and just give you the answer.
00:08:59.640 | It blew my mind to such an extent
00:09:00.920 | that I couldn't stop thinking about that.
00:09:03.040 | And I started thinking about ways to make it better.
00:09:05.920 | I tried training or doing the fine-tune
00:09:09.800 | with a larger BART model.
00:09:12.160 | And this BART model, yeah,
00:09:14.280 | it was fine-tuned on this Reddit dataset called ELI5.
00:09:19.280 | So basically-
00:09:21.200 | - The subreddit.
00:09:22.040 | - Yeah, the subreddit, yeah.
00:09:23.800 | Someone had scraped, I think, I forget who did it,
00:09:26.720 | but someone had scraped a subreddit.
00:09:30.320 | And put it into a well-formatted,
00:09:33.000 | relatively clean dataset of human questions
00:09:35.720 | and human answers.
00:09:36.720 | So we're bootstrapping this model from ELI5,
00:09:39.640 | and that made it pretty good
00:09:42.880 | at at least getting the right format
00:09:44.640 | when doing this rag retrieval from these databases
00:09:49.120 | and then generating the final answer.
00:09:51.280 | And so ELI5 actually turned out to be a good dataset
00:09:54.600 | for training these types of question-answering models
00:09:56.800 | because the question's written by a human.
00:10:00.040 | The answer's written by a human,
00:10:01.360 | and at least helps the model get the format right.
00:10:04.440 | Even if the model is still very small
00:10:06.960 | and it can't really think super well,
00:10:08.080 | at least it gets the format right.
00:10:09.520 | And so it ends up acting
00:10:11.360 | as kind of a glorified summarization model
00:10:13.360 | where if it's fed in high-quality context
00:10:17.240 | from the retrieval system,
00:10:19.800 | it's able to have a reasonably high-quality output.
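A rough sketch of that retrieve-then-generate loop (the index name, field names, and the base BART checkpoint are illustrative; the actual demo used an ELI5-fine-tuned model):

```python
from elasticsearch import Elasticsearch
from transformers import BartForConditionalGeneration, BartTokenizer

es = Elasticsearch("http://localhost:9200")
tokenizer = BartTokenizer.from_pretrained("facebook/bart-large")
model = BartForConditionalGeneration.from_pretrained("facebook/bart-large")

def answer(question: str, index: str = "wiki") -> str:
    # 1) Sparse retrieval: pull the top passages matching the question.
    hits = es.search(index=index, query={"match": {"text": question}}, size=5)
    context = " ".join(h["_source"]["text"] for h in hits["hits"]["hits"])
    # 2) Generation: condition the seq2seq model on question + retrieved context.
    inputs = tokenizer(f"question: {question} context: {context}",
                       return_tensors="pt", truncation=True, max_length=1024)
    out = model.generate(**inputs, num_beams=4, max_new_tokens=256)
    return tokenizer.decode(out[0], skip_special_tokens=True)

print(answer("Why are almost all boats white?"))
```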
00:10:22.400 | And so once I made the model as big as I can,
00:10:24.600 | just fine-tuning on BART large,
00:10:28.280 | I started looking for ways to improve the index.
00:10:33.080 | So in the demo, in the notebook,
00:10:35.640 | there were instructions
00:10:38.560 | for how to make an Elasticsearch index just for Wikipedia.
00:10:42.280 | And I was like, "Why not do all of Common Crawl?"
00:10:45.480 | So I downloaded Common Crawl,
00:10:47.480 | and thankfully I had like $10,000 or $15,000
00:10:50.640 | worth of AWS credits left over from the SmartLens project.
00:10:53.880 | That's what really allowed me to do this
00:10:55.120 | 'cause there's no other funding.
00:10:56.240 | I was still in college.
00:10:58.200 | Not a lot of money.
00:10:59.200 | And so I was able to spin up a bunch of instances
00:11:01.480 | and just process all of Common Crawl, which is massive.
00:11:04.640 | So it's roughly like, it's terabytes of text.
00:11:07.760 | And so I whitelisted.
00:11:12.240 | I went to Alexa to get like the top 1000 websites
00:11:17.240 | or 10,000 websites in the world,
00:11:21.040 | and then filtered only by those websites,
00:11:23.480 | and then indexed those websites
00:11:25.760 | 'cause the webpages were already included in the dump.
00:11:28.280 | So I just-
00:11:29.120 | - You mean to supplement Common Crawl
00:11:30.480 | or to filter Common Crawl?
00:11:31.560 | - Filter Common Crawl.
00:11:32.400 | - Oh, okay. - Yeah.
00:11:33.240 | So we filtered Common Crawl just by,
00:11:36.680 | yeah, the top, I think, 10,000.
00:11:39.680 | Just to limit this,
00:11:40.520 | because obviously there's this massive long tail
00:11:43.120 | of small sites that are really cool, actually.
00:11:45.760 | And there's other projects like,
00:11:47.920 | shout out to Marginalia Nu,
00:11:50.040 | which is a search engine specialized on the long tail.
00:11:53.000 | I think they actually exclude like the top 10,000.
00:11:55.960 | - That's what they do. - 10,000, yeah.
00:11:57.680 | - I've seen them around
00:11:58.520 | and just don't really know what their pitch is.
00:12:00.240 | - Yeah, yeah, yeah.
00:12:01.080 | So they exclude all the top stuff.
00:12:03.960 | So the long tail is cool,
00:12:04.920 | but for this, that was kind of out of the question,
00:12:07.440 | and that was most of the data anyway.
00:12:09.160 | So we've removed that.
00:12:11.920 | And then I indexed the remaining
00:12:16.680 | approximately 350 million webpages through Elasticsearch.
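A hedged sketch of that domain-whitelist filtering over Common Crawl WARC files before bulk-indexing into Elasticsearch (file names, index name, and the whitelist file are placeholders):

```python
from urllib.parse import urlparse
from warcio.archiveiterator import ArchiveIterator
from elasticsearch import Elasticsearch, helpers

es = Elasticsearch("http://localhost:9200")
# Hypothetical whitelist file: one domain per line, from an Alexa-style top-sites list.
top_domains = set(open("top_10k_domains.txt").read().split())

def filtered_docs(warc_path):
    with open(warc_path, "rb") as stream:
        for record in ArchiveIterator(stream):
            if record.rec_type != "response":
                continue
            url = record.rec_headers.get_header("WARC-Target-URI")
            domain = urlparse(url).netloc.removeprefix("www.")
            if domain not in top_domains:
                continue  # keep only pages from the whitelisted domains
            html = record.content_stream().read().decode("utf-8", errors="ignore")
            yield {"_index": "commoncrawl", "_source": {"url": url, "html": html}}

# Bulk-index one (placeholder) Common Crawl segment.
helpers.bulk(es, filtered_docs("CC-MAIN-segment-00000.warc.gz"))
```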
00:12:22.680 | So I built this index running on AWS with these webpages,
00:12:26.600 | and it actually worked quite well.
00:12:28.120 | Like you can ask it like general common knowledge,
00:12:31.360 | history, politics, current events, questions,
00:12:35.320 | and it would be able to do a fast lookup in the index,
00:12:39.120 | feed it into the model,
00:12:40.280 | and it would give like a surprisingly good result.
00:12:43.320 | And so when I saw that,
00:12:45.560 | I thought that this is definitely doable.
00:12:49.680 | And like, it kind of shocked me
00:12:50.720 | that like no one else was doing this.
00:12:52.560 | And so this was now the fall of 2020.
00:12:55.360 | And yeah, I was kind of shocked no one was doing this,
00:13:01.240 | but it costs a lot of money to keep it up.
00:13:03.360 | I was still in college.
00:13:04.200 | There are things going on.
00:13:05.080 | I got bogged down by classes.
00:13:06.360 | And so I ended up shelving this
00:13:08.120 | for almost a full year, actually.
00:13:13.120 | And I returned to it in fall of 2021
00:13:16.720 | when Big Science released T0.
00:13:20.520 | When Big Science released the T0 models,
00:13:22.680 | that was a massive jump
00:13:24.880 | in the reasoning ability of the model.
00:13:28.240 | And it was better at reasoning,
00:13:30.640 | it was better at summarization.
00:13:31.720 | It was still a glorified summarizer, basically.
00:13:33.760 | - Was this a precursor to Bloom?
00:13:35.440 | Because Bloom's the one that I know.
00:13:36.640 | - I think Bloom ended up actually coming out in 2022,
00:13:39.400 | but Bloom had other problems
00:13:43.840 | where I think for whatever reason,
00:13:47.560 | the Bloom models just were never really that good,
00:13:50.320 | which is so sad 'cause I really wanted to use them.
00:13:53.120 | But I think they didn't train on that much data.
00:13:56.040 | I think they used the original,
00:13:57.440 | they were trying to replicate GPT-3.
00:13:59.200 | So they just used those numbers,
00:14:00.360 | which we now know are far below Chinchilla Optimal.
00:14:02.840 | And even Chinchilla Optimal,
00:14:04.200 | which we can talk about later,
00:14:05.800 | what we're currently doing with the Phind model goes,
00:14:07.240 | yeah, it goes way beyond that.
00:14:08.880 | But they weren't training on enough data.
00:14:10.760 | I'm not sure how that data was cleaned,
00:14:11.960 | but it probably wasn't super clean.
00:14:13.120 | And then they didn't really do any fine tuning
00:14:14.800 | until much later.
00:14:16.360 | So T0 worked well because they took the T5 models,
00:14:20.920 | which were closer to Chinchilla Optimal.
00:14:24.840 | 'Cause I think they were trained on
00:14:26.200 | also like 300 something billion tokens,
00:14:28.240 | similar to GPT-3, but the models were much smaller.
00:14:30.840 | So the models, yeah, they were pre-trained better.
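As a rough back-of-the-envelope on why the smaller T5-class models were "closer to Chinchilla optimal" at a similar token count (the ~20 tokens-per-parameter rule of thumb is an approximation, not a figure from the conversation):

```python
# Chinchilla rule of thumb: roughly 20 training tokens per parameter.
CHINCHILLA_TOKENS_PER_PARAM = 20
trained_tokens = 300e9  # both GPT-3 and the T5 family trained on roughly 300B tokens

for name, params in [("GPT-3 (175B)", 175e9), ("T5/T0 (3B)", 3e9)]:
    optimal = CHINCHILLA_TOKENS_PER_PARAM * params
    print(f"{name}: trained on {trained_tokens / optimal:.0%} of the Chinchilla-optimal token count")
# GPT-3: ~9% of optimal; the 3B model: ~500% -> the smaller model is far better trained per parameter.
```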
00:14:35.200 | And then they were fine-tuned on this.
00:14:39.200 | I think T0 is the first model
00:14:44.600 | that did large-scale instruction tuning
00:14:46.720 | from diverse data sources in the fall of 2021.
00:14:51.040 | This is before Instruct GPT.
00:14:52.560 | This is before Flan T5, which came out in 2022.
00:14:56.560 | This is the very, very first,
00:14:58.600 | at least well-known example of that.
00:15:01.680 | And so it came out and then I did,
00:15:04.200 | on top of T0, I also did the Reddit ELI5 fine-tune.
00:15:09.000 | And that was the first model and system
00:15:14.120 | that actually worked well enough
00:15:16.200 | to where I didn't get discouraged like I did previously.
00:15:19.120 | 'Cause the failure cases
00:15:20.480 | of the BART-based system were so egregious.
00:15:23.600 | Sometimes it would just misinterpret your questions
00:15:27.520 | so horribly
00:15:28.880 | that it was just extremely discouraging.
00:15:31.840 | But for the first time, it was working reasonably well.
00:15:34.880 | I'm also using a much bigger model.
00:15:36.640 | I think the BART model is like 800 million parameters,
00:15:39.520 | but T0, we were using 3B.
00:15:41.880 | So it was T0, 3B, bigger model.
00:15:45.200 | And that was the very first iteration of Hello.
00:15:51.280 | So we ended up doing a Show HN on Hacker News
00:15:54.560 | in January, 2022 of that system.
00:15:57.720 | Our fine-tuned T0 model connected to our Elasticsearch index
00:16:02.160 | of those 350 million top 10,000 common crawl websites.
00:16:06.840 | And to the best of my knowledge,
00:16:07.920 | I think that's the first example
00:16:11.360 | that I'm aware of a LLM search engine model
00:16:16.360 | that's effectively connected to like a large enough index
00:16:21.640 | that I would consider like an internet scale.
00:16:23.880 | So I think we were the first to release
00:16:28.360 | like an internet scale LLM powered rag search system
00:16:33.360 | in January, 2022.
00:16:35.320 | And around the time me and my future co-founder, Justin,
00:16:40.680 | we were like, you know, we really,
00:16:42.640 | why not do this full time?
00:16:43.880 | Like this seems like the future.
00:16:45.600 | This is really cool.
00:16:47.240 | I couldn't really sleep even.
00:16:48.400 | Like I was going to bed and I was like,
00:16:50.480 | I was thinking about it.
00:16:51.320 | Like I would stay up until like 2:30 AM,
00:16:53.760 | like reading papers on my phone in bed,
00:16:56.080 | go to sleep, wake up the next morning at like eight
00:16:58.600 | and just be super excited to keep working.
00:17:01.440 | And I was also doing my thesis at the same time,
00:17:04.440 | my senior honors thesis at UT Austin
00:17:06.840 | about something very similar.
00:17:10.320 | We were researching factuality
00:17:13.320 | in abstractive question answering systems.
00:17:17.960 | So a lot of overlap with this project.
00:17:19.880 | And the conclusions of my research actually kind of helped
00:17:24.400 | guide the development path of Hello.
00:17:26.880 | In the research we found that LLMs don't,
00:17:30.720 | they don't know what they don't know.
00:17:34.120 | So the conclusion was,
00:17:35.680 | is that you always have to do a search
00:17:40.240 | to ensure that the model actually knows
00:17:42.120 | what it's talking about.
00:17:43.320 | And my favorite example of this even today
00:17:45.120 | is kind of with ChatGPT browsing,
00:17:47.640 | where you can ask ChatGPT browsing,
00:17:50.440 | how do I run llama.cpp?
00:17:52.920 | And ChatGPT browsing will think that llama.cpp
00:17:55.160 | is some file on your computer
00:17:56.600 | that you can just compile with GCC and you're all good.
00:17:59.720 | It won't even bother doing a lookup,
00:18:02.800 | even though I'm sure somewhere in their internal prompts,
00:18:05.480 | they have something like, if you're not sure, do a lookup.
00:18:07.720 | Like that's not good enough.
00:18:09.480 | So models don't know what they don't know.
00:18:10.960 | You always have to do a search.
00:18:12.520 | And so we approached LLM powered question answering
00:18:17.520 | from the search angle.
00:18:19.440 | We pivoted to make this for programmers in June of 2022,
00:18:25.880 | around the time that we were getting into YC.
00:18:29.280 | We realized that like,
00:18:30.480 | what we're really interested in,
00:18:33.040 | is the case where the models actually have to think.
00:18:36.320 | 'Cause up until then,
00:18:37.160 | the models were kind of more glorified summarization models.
00:18:40.880 | Like we really thought of them like,
00:18:42.520 | the Google featured snippets, but on steroids.
00:18:45.320 | And so we like, we saw a future where,
00:18:48.280 | the simpler questions would get commoditized.
00:18:50.560 | And I still think that's going to happen
00:18:52.040 | with like Google SGE. And like nowadays,
00:18:55.640 | it's really not that hard
00:18:57.560 | to like answer the more basic kind of like summarization,
00:19:03.000 | like current events questions with lightweight models.
00:19:05.320 | That'll only continue to get cheaper over time.
00:19:07.680 | And so we kind of started thinking about this trade-off
00:19:09.760 | where LLM models are going to get both better
00:19:13.760 | and cheaper over time.
00:19:15.360 | And that's going to force people
00:19:17.720 | who run them to make a choice.
00:19:19.680 | Either you can run a model of the same intelligence
00:19:22.960 | that you could previously for cheaper,
00:19:25.480 | or you can run a better model for the same price.
00:19:29.160 | And so someone like Google,
00:19:31.160 | once the price kind of falls low enough,
00:19:33.960 | they're going to deploy, and they're already doing this
00:19:35.560 | with SGE, they're going to deploy a relatively basic
00:19:39.920 | kind of glorified summarizer model
00:19:41.760 | that can answer very basic questions
00:19:43.160 | about like current events, like who won the Superbowl,
00:19:47.040 | like what's going on on Capitol Hill,
00:19:50.240 | like those types of things.
00:19:51.600 | And the flip side of that is like more complex questions
00:19:55.240 | where like you have to reason
00:19:56.160 | and you have to solve problems and like debug code.
00:19:58.760 | And we realized like we were much more interested
00:20:02.360 | in kind of going along the bleeding edge
00:20:05.280 | of that frontier case.
00:20:06.480 | And so we've optimized everything that we do for that.
00:20:10.480 | And that's a big reason why we've built Phind
00:20:13.520 | specifically for programmers,
00:20:15.400 | as opposed to saying like,
00:20:17.240 | we're kind of a search engine for everyone
00:20:18.800 | because as these models get more capable,
00:20:21.200 | we're very interested in seeing kind of
00:20:22.760 | what the emergent properties are in terms of reasoning,
00:20:25.800 | in terms of being able to solve complex multi-step problems.
00:20:30.480 | And I think that some of those emergent capabilities,
00:20:33.840 | like we're starting to see,
00:20:35.720 | but we don't even fully understand.
00:20:37.320 | So as I think there's always an opportunity for us
00:20:42.040 | to become more general if we wanted,
00:20:43.840 | but we've been along this path of like,
00:20:48.080 | what is the best, most advanced reasoning engine
00:20:53.080 | that's connected to your code base,
00:20:55.200 | that's connected to the internet that we can just provide?
00:20:57.880 | - What is Phind today, pragmatically,
00:21:00.160 | from a product perspective?
00:21:01.840 | How do people interact with it?
00:21:03.160 | How does it plug into your workflow?
00:21:04.920 | - Yeah, so Phind is really a system.
00:21:06.640 | Phind is a system for programmers
00:21:10.320 | when they have a question or when they're frustrated
00:21:12.640 | or when something's not working.
00:21:13.960 | - You're frustrated.
00:21:14.800 | - Yeah, for them to get unblocked.
00:21:15.880 | The most abstract pitch for Phind is like,
00:21:18.120 | if you're experiencing really any kind of issue
00:21:22.120 | as a programmer,
00:21:23.280 | we'll solve that issue for you in 15 seconds
00:21:25.520 | as opposed to 15 minutes or longer.
00:21:27.600 | And so, Phind has an interface on the web.
00:21:31.200 | It has an interface in VS Code and more IDEs to come.
00:21:35.280 | But ultimately, it's just a system
00:21:37.280 | where a developer can paste in a question
00:21:39.920 | or paste in code that's not working.
00:21:41.800 | And Phind will do a search on the internet,
00:21:44.560 | or it will find other code in your code base,
00:21:46.880 | perhaps that's relevant.
00:21:48.480 | Phind will find the context that it needs
00:21:50.920 | to answer your question
00:21:52.520 | and then feed it to a reasoning engine
00:21:54.520 | powerful enough to actually answer it.
00:21:56.440 | So, that's really the philosophy behind Phind.
00:21:58.080 | It's a system for getting developers
00:22:00.400 | the answers that they're looking for.
00:22:03.560 | And so, right now from a product perspective,
00:22:06.520 | this means that we're really all about
00:22:09.320 | getting the right context.
00:22:10.800 | So, the VS Code extension that we launched recently
00:22:13.800 | is a big part of this
00:22:15.280 | 'cause you can just ask a question
00:22:17.120 | and it knows where to find the right code context
00:22:22.120 | in your code.
00:22:23.280 | It can do an internet search as well.
00:22:24.280 | So, it's up to date.
00:22:25.960 | And it's not just reliant on what the model knows.
00:22:29.280 | And it's able to figure out what it needs by itself
00:22:33.640 | and answer your question based on that.
00:22:36.360 | And if it needs some help,
00:22:38.400 | you can also, kind of,
00:22:41.160 | there's opportunities for you yourself
00:22:42.760 | to put all that context in.
00:22:44.360 | But the issue is also not everyone wants to use VS Code.
00:22:49.800 | Some people are real Neovim sticklers
00:22:53.240 | or they're using PyCharm or other IDEs, JetBrains.
00:22:57.840 | And so, for those people,
00:23:00.800 | they're actually okay with switching tabs,
00:23:03.960 | at least for now,
00:23:04.920 | if it means them getting their answer.
00:23:07.840 | 'Cause really, there's been an explosion
00:23:11.080 | of all these startups doing code, doing search, et cetera.
00:23:15.320 | But really, who everyone's competing with is ChatGPT,
00:23:18.520 | which only has that one web interface.
00:23:20.600 | And ChatGPT is really the bar.
00:23:23.000 | And so, that's what we're up against.
00:23:26.160 | - And so, your idea,
00:23:27.520 | we have Aman from Cursor on the podcast
00:23:29.840 | and they've gone through the,
00:23:31.720 | we need to own the IDE thing.
00:23:33.480 | Yours is more like,
00:23:35.160 | in order to get the right answer,
00:23:37.280 | people are happy to go somewhere else, basically.
00:23:39.800 | They're happy to get out of their IDE.
00:23:42.440 | - That was a great podcast, by the way.
00:23:44.400 | But yeah, so part of it is that
00:23:46.760 | people sometimes perhaps aren't even in an IDE.
00:23:51.640 | So, the whole task of software engineering
00:23:55.440 | goes way beyond just running code, right?
00:23:56.960 | There's also a design stage.
00:23:58.560 | There's a planning stage.
00:23:59.520 | A lot of this happens on whiteboards.
00:24:01.400 | It happens in notebooks.
00:24:03.080 | And so, the web part of it also exists for that,
00:24:05.680 | where you're not even coding it
00:24:07.720 | and you're just trying to get
00:24:08.680 | a more conceptual understanding
00:24:10.640 | of what you're trying to build first.
00:24:12.480 | But the podcast with Aman was great,
00:24:16.680 | but somewhere where I disagree with him
00:24:18.480 | is that you actually need to own the IDE.
00:24:35.400 | I think he made kind of some good points
00:24:37.640 | about not having platform risk in the longterm,
00:24:42.160 | but some of the features that were mentioned,
00:24:47.920 | like suggesting diffs, for example,
00:24:50.800 | those are all doable with an extension.
00:24:54.600 | We haven't yet seen, with VS Code in particular,
00:24:59.280 | any functionality that we'd like to do yet in the IDE
00:25:05.200 | that we can't either do through
00:25:07.720 | directly supported VS Code functionality
00:25:09.960 | or something that we kind of hack into there,
00:25:12.200 | which we've also done a fair bit of.
00:25:15.440 | And so I think it remains to be seen where that goes.
00:25:20.000 | But I think what we're looking to be
00:25:21.800 | is we're not trying to just be in an IDE or be an IDE.
00:25:26.400 | Phind is a system that goes beyond the IDE
00:25:28.440 | and is really meant to cover the entire lifecycle
00:25:32.960 | of a developer's thought process
00:25:34.760 | in going about like, hey, I have this idea
00:25:37.400 | and I want to get from that idea to a working product.
00:25:40.680 | And so then that's what the longterm vision
00:25:42.600 | of Phind is really about, is starting with that,
00:25:45.520 | where in the future, I think programming
00:25:50.000 | is just going to be really just the problem solving.
00:25:53.920 | Like you come up with an idea,
00:25:55.600 | you come up with the basic design
00:25:57.600 | for the algorithm in your head,
00:25:59.680 | and you just tell the AI, hey, just do it.
00:26:02.240 | Just make it work.
00:26:03.520 | And that's what we're building towards.
00:26:05.280 | - Fantastic.
00:26:06.680 | I think we might want to give people,
00:26:10.880 | some impression about the type of traffic that you have,
00:26:14.360 | because when you present it with a text box,
00:26:18.080 | you could type in anything.
00:26:19.520 | And I don't know if you have some mental categorization
00:26:22.200 | of what are the top three use cases
00:26:25.120 | that people tend to cluster into.
00:26:26.800 | - Yeah, that's a great question.
00:26:28.560 | So the two main types of searches that we see
00:26:32.720 | are how-to questions, like how to do X using Y tool.
00:26:37.840 | And this historically has been our bread and butter,
00:26:41.000 | because with our embeddings,
00:26:43.200 | like we're really, really good
00:26:44.480 | at just going over a bunch of developer documentation
00:26:48.240 | and figuring out exactly the part that's relevant
00:26:50.200 | and just telling you, okay, like you can use this method.
00:26:52.880 | But as LLMs have gotten better,
00:26:54.720 | and as we've really transitioned
00:26:56.760 | to using GPT-4 a lot in our product,
00:26:59.920 | people organically just started pasting in code
00:27:03.520 | that's not working and just said, fix it for me.
00:27:05.760 | - Fix this. - Yeah.
00:27:06.720 | And what really shocks us is that
00:27:09.240 | a lot of the people who do that,
00:27:12.800 | they're coming from ChatGPT.
00:27:14.480 | So they tried it in ChatGPT with GPT-4.
00:27:18.200 | It didn't work.
00:27:19.120 | Maybe it required like some multi-step reasoning.
00:27:21.880 | Maybe it required like some internet context
00:27:25.080 | or something found in either a Stack Overflow post
00:27:28.800 | or some documentation to solve it.
00:27:31.000 | And so then they paste it into Phind and then Phind works.
00:27:33.800 | So those are really those two different cases.
00:27:36.600 | Like, how can I build this conceptually
00:27:39.440 | or like remind me of this one detail
00:27:41.360 | that I need to build this thing,
00:27:43.560 | or just like, here's this code, fix it.
00:27:45.720 | And so that's what a big part of our VS Code extension is,
00:27:49.080 | is like enabling a much smoother,
00:27:51.560 | here, just like fix it for me type of workflow.
00:27:54.200 | That's really its main benefits.
00:27:55.880 | Like it's in your code base, it's in the IDE.
00:27:58.360 | It knows how to find the relevant context
00:28:01.240 | to answer that question.
00:28:02.560 | But at the end of the day, like I said previously,
00:28:05.920 | that's still a relatively, not to say it's a small part,
00:28:10.080 | but it's a limited part of the entire
00:28:13.240 | kind of mental lifecycle of a programmer.
00:28:16.880 | - Yeah.
00:28:17.720 | When you launched in, so you launched in Feb
00:28:20.200 | and then you launched V2 in August,
00:28:22.200 | you had a couple other pretty impactful
00:28:25.240 | posts/feature launches.
00:28:27.320 | The web search one was massive.
00:28:29.440 | And so you were mostly a GPT-4 wrapper.
00:28:36.280 | - We were for a long time.
00:28:37.240 | - For a long time, until recently.
00:28:38.200 | - Yeah, until recently.
00:28:39.640 | - So like people coming over from ChatGPT
00:28:41.280 | were getting the same model.
00:28:43.160 | - Yep.
00:28:44.000 | - "What would be your version of web search?
00:28:46.000 | "Would that be the primary value proposition?"
00:28:47.960 | - Basically, yeah.
00:28:48.800 | And so what we've seen is that any model plus web search
00:28:51.960 | is just significantly better than that model itself.
00:28:54.640 | - Do you think that's what you got right in April?
00:28:55.920 | Like, so you got 1500 points on Hacker News in April,
00:28:59.400 | which is like, if you live on Hacker News a lot,
00:29:02.280 | that is unheard of for someone so early on in your journey.
00:29:06.240 | - Yeah, super, super grateful for that.
00:29:08.360 | Definitely was not expecting it.
00:29:09.920 | So what we've done with Hacker News
00:29:11.040 | is we've just kept launching.
00:29:13.000 | - Yeah.
00:29:13.840 | - Like, what they don't tell you
00:29:14.880 | is like, you can just keep launching.
00:29:16.040 | So that's what we've been doing.
00:29:17.680 | So we launched the very first version of Phind
00:29:20.800 | in its current incarnation
00:29:24.320 | after like the previous demo connected to our own index.
00:29:26.760 | Like once we got into YC, we scrapped our own index
00:29:30.160 | 'cause it was too cumbersome at the time.
00:29:33.600 | We moved over to using Bing as kind of
00:29:36.400 | just the raw source data.
00:29:39.040 | And we launched as Hello Cognition.
00:29:42.200 | And over time, every time we like added some intelligence
00:29:46.040 | to the product, a better model, we just keep launching.
00:29:47.960 | And every additional time we launched,
00:29:51.880 | we got way more traffic.
00:29:52.840 | So we actually silently rebranded to Phind
00:29:55.400 | in late December of last year.
00:29:57.560 | But like, we didn't have that much traffic.
00:29:58.840 | Like nobody really knew who we were.
00:30:00.120 | - How'd you pick the name of it?
00:30:00.960 | - Paul Graham actually picked it for us.
00:30:02.520 | - All right, tell the story.
00:30:03.360 | - Yeah, so, oh boy.
00:30:05.600 | Yeah, where do I start?
00:30:06.440 | So this is a big aside.
00:30:08.160 | Should we go for like the full Paul Graham story
00:30:10.480 | or just the name?
00:30:11.320 | - Do you wanna do it now or you wanna do it later?
00:30:12.400 | I'll give you a choice.
00:30:13.600 | (laughs)
00:30:15.360 | - I think, okay, let's just start with the name for now
00:30:17.520 | and then we can do the full Paul Graham story later.
00:30:20.040 | But basically, Paul Graham,
00:30:23.120 | when we were lucky enough to meet him,
00:30:24.960 | he saw our name and our domain was at the time, sayhello.so.
00:30:29.960 | And he's just like, "Guys, like, come on.
00:30:32.800 | Like, what is this?"
00:30:34.720 | You know, like, and we were like,
00:30:37.880 | "Yeah."
00:30:38.720 | But like when we bought it,
00:30:39.560 | you know, we just kind of broke college students.
00:30:40.920 | Like we didn't have that much money.
00:30:42.000 | And like, we really liked "hello" as a name
00:30:44.120 | because it was the first like conversational search engine.
00:30:48.360 | And that's kind of,
00:30:49.240 | that's the angle that we were approaching it from.
00:30:52.000 | And so we had sayhello.so and he's like,
00:30:54.200 | "There's so many problems with that."
00:30:55.360 | Like the sayhello, like what does that even mean?
00:30:58.520 | And like .so, like, it's gotta be like a .com.
00:31:02.560 | We spent some time just like with Paul Graham in the room.
00:31:05.640 | We just like looked at different domain names,
00:31:07.840 | like different things that like popped into our head.
00:31:10.240 | And one of the things that popped up,
00:31:11.920 | that Paul Graham said, was Phind,
00:31:13.240 | with the P-H-I-N-D spelling in particular.
00:31:15.720 | - Yeah, which is not typical naming advice, right?
00:31:17.960 | - Yes.
00:31:18.800 | - Because it's not, when people hear it,
00:31:19.840 | they don't spell it that way.
00:31:20.680 | - Exactly.
00:31:21.520 | It's hard to spell.
00:31:22.960 | And also it's like very nineties.
00:31:25.080 | And so at first, like we didn't like,
00:31:27.240 | I was like, like, like, I don't know.
00:31:30.040 | But over time, like it kind of, it kept growing on us.
00:31:34.360 | And eventually we're like, okay, you know,
00:31:38.880 | we like the name.
00:31:40.160 | It's owned by this elderly Canadian gentleman
00:31:42.920 | who got to know and he was willing to sell it to us.
00:31:45.760 | And so we bought it and we changed the name.
00:31:49.360 | Yeah.
00:31:50.200 | But anyways, where were you?
00:31:52.240 | - I had to ask.
00:31:53.080 | I mean, you know, everyone who looks at you is wondering.
00:31:55.800 | - A lot of people,
00:31:56.640 | and a lot of people actually pronounce it finned,
00:31:59.160 | which, you know, by now is kind of, you know,
00:32:02.240 | it's part of the game,
00:32:03.880 | but eventually we want to buy F-I-N-D.com
00:32:08.160 | and then just have that redirect to P-H-I-N-D.
00:32:10.920 | So P-H-I-N-D is like definitely the right spelling.
00:32:12.920 | But like, we'll just, yeah,
00:32:14.400 | we'll have all the cases addressed.
00:32:15.880 | - So Bing web search, and then in August you launched V2.
00:32:19.520 | Could you, is V2 the Phind-as-a-system pitch?
00:32:24.520 | Or have you moved, evolved since then?
00:32:26.360 | - Yeah, so I don't, like the V2 moniker,
00:32:29.040 | like I don't really think of it that way in my mind.
00:32:31.120 | There's like, there's the version we launched during,
00:32:33.120 | last summer during YC,
00:32:34.760 | which was the Bing version directed towards programmers.
00:32:39.440 | And that's kind of like,
00:32:40.560 | that's why I call it like the first incarnation
00:32:42.280 | of what we currently are.
00:32:43.120 | 'Cause it was already directed towards programmers.
00:32:44.800 | We had like a code snippet search built in as well.
00:32:47.600 | 'Cause at the time, you know,
00:32:48.840 | the models we were using weren't good enough
00:32:50.600 | to generate code snippets.
00:32:51.640 | Even GPT, like text-davinci-002,
00:32:54.160 | which was available at the time,
00:32:56.120 | wasn't that good at generating code.
00:32:58.240 | And it would generate like very, very short,
00:32:59.880 | very incomplete code snippets.
00:33:03.640 | And so we launched that last summer.
00:33:07.920 | Got some traction, but really like we were only doing like,
00:33:10.720 | I don't know, maybe like 10,000 searches a day.
00:33:13.320 | Like some people knew about it.
00:33:15.520 | Some people use it, which is impressive.
00:33:17.000 | 'Cause looking back, the product like was not that good.
00:33:19.760 | And yeah, every time we've like made an improvement
00:33:24.200 | to the way that we retrieve context
00:33:27.560 | through better embeddings,
00:33:28.920 | more intelligent, like HTML parsers,
00:33:32.640 | and importantly, like better underlying models.
00:33:35.640 | Yeah, I would really consider every kind of iteration
00:33:39.000 | after that when we,
00:33:40.760 | every major version after that was when we introduced
00:33:43.560 | the better underlying answering model.
00:33:45.480 | Like in February, we launched,
00:33:47.400 | we had to swallow a bit of our pride
00:33:51.520 | when we were like, okay, our own models aren't good enough.
00:33:54.720 | We have to go to open AI.
00:33:56.320 | And that actually, that did lead to kind of like our first
00:34:01.160 | like decent bump of traffic in February.
00:34:06.400 | And people kept using it.
00:34:07.440 | Like our retention was way better too.
00:34:09.960 | But we were still kind of running into problems
00:34:12.760 | of like more advanced reasoning.
00:34:14.280 | Some people tried it,
00:34:15.840 | but people were leaving because even like GPT 3.5,
00:34:20.240 | both turbo and non-turbo,
00:34:23.400 | like still not that great at doing like code-related
00:34:27.120 | reasoning beyond like the how do you do X,
00:34:31.760 | like documentation search type of use case.
00:34:34.520 | And so it was really only when GPT-4 came around in April
00:34:39.280 | that we were like, okay, like this is like
00:34:41.600 | our first real opportunity to really make this thing
00:34:44.880 | like the way that it should have been all along.
00:34:47.360 | And having GPT-4 as the brain
00:34:49.800 | is what led to that Hacker News post.
00:34:53.680 | And so what we did was we just let anyone use GPT-4
00:34:58.680 | on Phind for free without a login,
00:35:02.960 | which I actually don't regret.
00:35:07.400 | So it was very expensive obviously,
00:35:09.680 | but like at that stage,
00:35:13.000 | all we needed to do was show like,
00:35:15.000 | we just needed to like show people,
00:35:16.240 | here's what Phind can do.
00:35:17.800 | That was the main thing.
00:35:18.640 | And so that worked, that worked.
00:35:19.960 | Like we got a lot of users.
00:35:22.040 | Do you know Fireship?
00:35:25.840 | - Yeah, the YouTuber Jeff Delaney.
00:35:27.520 | - Yeah, he made a short about Phind.
00:35:30.120 | And that's on top of the Hacker News post.
00:35:33.600 | And that's what like really, really made it blow up.
00:35:35.360 | It got millions of views in days.
00:35:37.280 | And he's just funny.
00:35:39.160 | Like what I love about Fireship is like he,
00:35:41.480 | like you guys, yeah.
00:35:42.640 | Yeah, like humor goes a long way
00:35:46.120 | towards like really grabbing people's attention.
00:35:49.040 | And so that blew up.
00:35:50.200 | - So something I would be anxious about as a founder
00:35:52.920 | during that period.
00:35:53.760 | So obviously we all remember that pretty closely.
00:35:55.400 | There were a couple of people
00:35:56.480 | who had access to the GPT-4 API doing this,
00:35:59.080 | which is unrestricted access to GPT-4.
00:36:01.760 | And I have to imagine YC,
00:36:04.800 | OpenAI wasn't that happy about that.
00:36:08.840 | Because it was like kind of de facto access to GPT-4
00:36:13.080 | before they released it.
00:36:14.200 | - GPT-4 was in ChatGPT from day one, I think.
00:36:16.920 | OpenAI actually came to our support
00:36:20.440 | because what happened was
00:36:23.080 | we had people building unofficial APIs around Find.
00:36:27.840 | Yeah, to try to get free access to it.
00:36:31.880 | And I think OpenAI actually has the right perspective
00:36:35.160 | on this where they're like,
00:36:36.000 | "Okay, people can do whatever they want with the API.
00:36:37.520 | If they're paying for it, they can do whatever they want.
00:36:39.600 | But it's not okay if paying customers
00:36:42.600 | are being exploited by these other actors."
00:36:44.200 | So they actually got in touch with us
00:36:45.160 | and they helped us set up better
00:36:47.880 | Cloudflare bot monitoring controls
00:36:50.240 | to effectively crack down on those unofficial APIs,
00:36:55.000 | which we're very happy about.
00:36:58.960 | But yeah, so we launched GPT-4.
00:37:03.760 | A lot of people come to the product.
00:37:06.040 | And yeah, for a long time we're just,
00:37:09.160 | we're figuring out like,
00:37:10.640 | how do we, like, what do we make of this, right?
00:37:13.880 | Like, how do we, A, make it better,
00:37:17.240 | but also deal with like our costs,
00:37:19.080 | which have just like massively, massively ballooned.
00:37:22.520 | And I think over time it's,
00:37:26.040 | I think it's become more clear
00:37:28.720 | with the release of Llama 2 and Llama 3 on the horizon
00:37:31.520 | that we will once again see a return
00:37:34.120 | to vertical applications running their own models.
00:37:38.000 | As was true last year and before,
00:37:40.360 | I think that GPT-4, my hypothesis is that
00:37:44.800 | the jump from 4 to 4.5 or 4 to 5
00:37:48.280 | will be smaller than the jump from 3 to 4.
00:37:52.240 | And the reason why is because
00:37:54.040 | there were a lot of different things.
00:37:56.280 | Like there was two plus,
00:37:57.880 | effectively two, two and a half years of research
00:38:00.160 | that went into going from 3 to 4.
00:38:02.880 | Like more data, bigger model,
00:38:05.840 | all of like the instruction tuning techniques, RLHF,
00:38:08.640 | all of that is known.
00:38:11.800 | And like Meta, for example,
00:38:13.520 | and now there's all these other startups like Mistral too.
00:38:15.240 | Like there's a bunch of very well-funded open source players
00:38:18.200 | that are now working on just like
00:38:20.000 | taking the recipe that's now known and scaling it up.
00:38:24.520 | So I think that even if a Delta exists in 2024,
00:38:29.440 | the Delta between proprietary and open source
00:38:32.400 | won't be large enough that a startup like us
00:38:36.560 | with a lot of data that we've collected
00:38:40.200 | can take the data that we have,
00:38:42.000 | fine tune an open source model
00:38:44.120 | and like be able to have it be better
00:38:46.920 | than whatever the proprietary model is at the time.
00:38:49.920 | That's my hypothesis.
00:38:51.480 | That we'll once again see a return
00:38:52.920 | to these verticalized models.
00:38:54.720 | And that's something that we're super excited about
00:38:58.200 | 'cause yeah, that brings us to kind of the Phind model
00:39:01.840 | because the plan from kind of the start
00:39:05.280 | was to be able to return to that if that makes sense.
00:39:08.880 | And I think now we're definitely at a point
00:39:10.280 | where it does make sense
00:39:12.040 | because we have requests from users
00:39:14.840 | who like they want longer context in the model basically.
00:39:19.000 | Like they want to be able to ask questions
00:39:22.640 | about their entire code base.
00:39:24.080 | They want that without, you know, chunking and retrieval
00:39:27.920 | taking a chance on what gets included,
00:39:28.760 | like I think it's generally been shown
00:39:31.400 | that if you have the space to just put the raw files
00:39:36.400 | inside of a big context window,
00:39:38.440 | that is still better than chunking and retrieval.
00:39:40.600 | It just is.
00:39:41.640 | So there's various things that we could do
00:39:42.760 | with longer context, faster speed, lower cost.
00:39:45.480 | Super excited about that.
00:39:46.440 | And that's the direction that we're going with the Phind model.
00:39:48.720 | And our big hypothesis there is precisely
00:39:52.520 | that we can take a really good open source model
00:39:55.520 | and then just train it on absolutely
00:40:00.360 | all of the high quality data that we can find.
00:40:03.800 | And there's a lot of various, you know,
00:40:07.640 | interesting ideas for this.
00:40:09.080 | We have our own techniques
00:40:10.280 | that we're kind of playing with internally.
00:40:12.440 | One of the very interesting ideas that I've seen
00:40:14.720 | is Octopack from BigCode.
00:40:18.600 | I don't think that it made that big waves
00:40:20.440 | when it came out, I think in August,
00:40:21.800 | but the idea is that they have this dataset
00:40:25.960 | that maps GitHub commits to a change.
00:40:30.960 | So basically there's all this really high quality,
00:40:36.560 | like human-made, human-written diff data out there
00:40:40.200 | on every time someone makes a commit in some repo.
00:40:42.760 | And you can use that to train models.
00:40:44.560 | You take the file state before
00:40:46.960 | and like given a commit message,
00:40:48.640 | what should that code look like in the future?
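A hypothetical illustration of what one such commit-based training record could look like (field names and the special tokens are illustrative, not the actual OctoPack schema):

```python
# A single illustrative record: before-state + commit message -> after-state.
example = {
    "commit_message": "Fix off-by-one error in page count",
    "old_contents": "def page_count(n, size):\n    return n // size\n",
    "new_contents": "def page_count(n, size):\n    return (n + size - 1) // size\n",
}

# One way to turn it into an instruction-style training pair (formatting is hypothetical).
prompt = (
    "<commit_before>" + example["old_contents"]
    + "<commit_msg>" + example["commit_message"]
    + "<commit_after>"
)
target = example["new_contents"]
```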
00:40:51.080 | - You got it. - You can--
00:40:51.920 | - Your model though, is it any good?
00:40:54.120 | - No, unfortunately.
00:40:55.320 | So we ran this experiment, we trained the Phind model.
00:40:57.360 | And if you go to the BigCode leaderboard
00:40:59.840 | as of today, October 5th,
00:41:02.760 | all of our models are at the top
00:41:07.320 | of the BigCode leaderboard by far, it's not close,
00:41:11.000 | particularly in languages other than Python.
00:41:13.320 | We have a 10 point gap between us and the next best model
00:41:18.320 | on Java, JavaScript, I think C#, multilingual.
00:41:23.400 | And what we kind of learned from that whole experience
00:41:27.560 | releasing those models is that
00:41:29.080 | human eval doesn't really matter.
00:41:31.320 | Not just that, but GPT-4 itself
00:41:34.120 | has been trained on human eval.
00:41:36.360 | And we know this because GPT-4 is able to predict
00:41:39.680 | the exact docstring in many of the problems.
00:41:42.760 | I've seen it predict like the specific example values
00:41:48.280 | in the docstring, which is extremely improbable
00:41:51.520 | for it to just, you know, know.
00:41:53.560 | So I think there's a lot of dataset contamination
00:41:56.920 | and it only captures a very limited subset
00:42:00.400 | like what programmers are actually doing.
00:42:02.360 | What we do internally for evaluations
00:42:04.240 | is we have GPT-4 score answers.
00:42:09.240 | GPT-4 is a really good evaluator.
00:42:12.200 | I mean, obviously it's by really good,
00:42:14.360 | I mean, it's the best that we have.
00:42:15.880 | I'm sure that, you know, a couple of months from now
00:42:17.080 | next year, we'll be like, oh, you know, like GPT-4.5,
00:42:19.800 | GPT-5, it's so much better, like GPT-4 is terrible.
00:42:22.400 | But like right now it's the best
00:42:23.640 | that we have short of humans.
00:42:25.640 | And what we found is that when doing like temperature zero
00:42:29.040 | evals, GPT-4 is actually mostly deterministic
00:42:34.920 | across runs in assigning scores to two different answers.
00:42:38.800 | So we found it to be a very useful tool
00:42:42.000 | in comparing our model to say GPT-4.
00:42:47.000 | Yeah, on our like internal, like real world,
00:42:51.160 | here's what people will be asking this model dataset.
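A hedged sketch of that temperature-zero, GPT-4-as-judge comparison (the prompt wording and JSON output format are illustrative, not Phind's actual eval harness):

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def judge(question: str, answer_a: str, answer_b: str) -> str:
    prompt = (
        "You are grading two answers to a programming question.\n"
        f"Question: {question}\n\nAnswer A:\n{answer_a}\n\nAnswer B:\n{answer_b}\n\n"
        "Score each answer from 1-10 for correctness and completeness, "
        "then say which one is better. Respond as JSON."
    )
    resp = client.chat.completions.create(
        model="gpt-4",
        temperature=0,  # keeps the assigned scores (mostly) deterministic across runs
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content
```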
00:42:54.000 | And the other thing that we're running
00:42:56.680 | is just like releasing the model to our users
00:42:59.080 | and just seeing what they think.
00:43:01.280 | 'Cause that's like the only thing that really matters
00:43:02.880 | is like releasing it for the application
00:43:05.960 | that it's intended for and then seeing how people react.
00:43:09.640 | And for the most part, the incredible thing is
00:43:11.800 | is that people don't notice a difference
00:43:15.000 | between our model and GPT-4
00:43:17.040 | for the vast majority of searches.
00:43:19.200 | There's some reasoning problems
00:43:22.000 | that GPT-4 can still do better.
00:43:23.600 | We're working on addressing that.
00:43:25.640 | But in terms of like the types of questions
00:43:27.320 | that people are asking on Phind,
00:43:28.880 | yeah, like there's not that much difference.
00:43:33.400 | And in fact, like I've been running my own
00:43:35.640 | kind of side-by-side comparisons.
00:43:37.720 | Shout out to Godmode by the way.
00:43:40.520 | And I've like myself,
00:43:42.440 | I've kind of confirmed this to be the case.
00:43:43.640 | And even sometimes it gives a better answer,
00:43:46.000 | perhaps like more concise
00:43:47.040 | or just like better implementation than GPT-4,
00:43:49.560 | which that's what surprises me.
00:43:51.080 | And so by now we kind of have like this
00:43:55.200 | reasoning is all you need kind of hypothesis
00:43:57.400 | where we've seen emerging capabilities in the Phind model
00:44:00.320 | whereby training it on high quality code,
00:44:02.880 | it can actually like reason better.
00:44:04.920 | It went from not being able to solve
00:44:09.840 | like word problems,
00:44:13.240 | like riddles with temporal reasoning
00:44:17.280 | and the placement of objects
00:44:20.400 | and movement and stuff like that,
00:44:22.280 | that GPT-4 can do pretty well.
00:44:23.440 | We went from not being able to do those at all
00:44:25.360 | to being able to do them just by training on more code,
00:44:29.880 | which is wild.
00:44:31.320 | So we're already like starting to see
00:44:32.840 | like these emerging capabilities.
00:44:34.280 | - Yeah, so I just wanted to make sure that we have the,
00:44:38.120 | I guess like the model card in our heads.
00:44:40.800 | So you started from Code Llama?
00:44:42.520 | - Yes.
00:44:43.680 | - 65, 34?
00:44:45.200 | - 34.
00:44:46.040 | So unfortunately there's no Code Llama 70B.
00:44:48.120 | If there was, that would be super cool, but there's not.
00:44:50.280 | - 34, and then, which in itself was Llama 2,
00:44:54.960 | which was trained on two trillion tokens,
00:44:56.080 | and they added 500 billion code tokens.
00:44:58.280 | - Yes.
00:44:59.120 | - And you just added a bunch more.
00:44:59.960 | - Yeah, and they also did a couple of things.
00:45:02.000 | So they did, I think they did 500 billion,
00:45:04.440 | the general pre-training
00:45:05.640 | and then they did an extra 20 billion
00:45:07.280 | long context pre-training.
00:45:08.680 | So they actually increased the max position embeddings
00:45:13.680 | to 16K up from 8K.
00:45:17.080 | And then they changed the theta parameter
00:45:21.800 | for the RoPE embeddings as well
00:45:24.680 | to give it theoretically better long context support
00:45:27.040 | up to 100K tokens.
00:45:29.120 | But yeah, but otherwise it's like basically Llama 2.
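For reference, a minimal sketch of where those long-context knobs live in a Hugging Face Llama-family config; the model ID and values below are illustrative, so check the released Code Llama configs for the real numbers:

```python
# Hypothetical sketch: the long-context settings mentioned above, as exposed by
# a Hugging Face Llama config. Values here are for illustration only.
from transformers import AutoConfig

config = AutoConfig.from_pretrained("codellama/CodeLlama-34b-hf")
print(config.max_position_embeddings)  # context length used in long-context pre-training
print(config.rope_theta)               # RoPE base frequency ("theta"), raised for longer context

# Extending context on a derived model typically means bumping both values,
# then continuing pre-training so the model adapts to the new positions.
config.max_position_embeddings = 16384
config.rope_theta = 1_000_000
```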
00:45:31.080 | - So you just took that and just added data?
00:45:32.960 | - Exactly.
00:45:33.800 | - You didn't do any other fundamental?
00:45:34.760 | - Yeah, so we didn't actually,
00:45:36.120 | we haven't yet done anything with the model architecture
00:45:39.120 | and we just trained it on like many, many more billions
00:45:41.560 | of tokens on our own infrastructure.
00:45:43.960 | And something else that we're taking a look at now
00:45:47.040 | is using reinforcement learning for correctness.
00:45:50.360 | One of the interesting pitfalls that we've noticed
00:45:55.240 | with the Phind model is that in cases
00:45:57.720 | where it gets stuff wrong,
00:45:59.160 | it sometimes is capable of getting the right answer.
00:46:02.160 | It's just, there's a big variance problem.
00:46:03.680 | It's wildly inconsistent.
00:46:05.480 | Like there are cases when it is able
00:46:08.000 | to get the right chain of thought
00:46:09.000 | and able to arrive at the right answer, but not always.
00:46:11.800 | And so one of our hypotheses,
00:46:13.560 | something that we're gonna try, is that
00:46:15.640 | we can actually do reinforcement learning
00:46:19.280 | where, for a given problem, we
00:46:21.760 | generate a bunch of completions
00:46:23.120 | and then like use the correct answer as like a loss
00:46:27.360 | basically to try to get it to be more correct.
00:46:31.600 | And I think there's a high chance I think of this working
00:46:33.760 | because it's very similar to the like RLHF method
00:46:36.880 | where you basically show pairs of completions
00:46:40.560 | for a given question, except the criteria is like,
00:46:43.280 | which one is like, you know, less harmful.
00:46:48.280 | But here, you know, we have a different criteria,
00:46:51.520 | but if the model's already capable
00:46:55.480 | of getting the right answer, which it is,
00:46:57.840 | we just need to cajole it into being more consistent.
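One way to read that idea, sketched under assumptions: sample several completions per problem, score them against a known-correct reference, and turn the contrast into preference pairs that an RLHF-style trainer could consume. The scoring function and data shapes below are illustrative, not Phind's actual method:

```python
# Hypothetical sketch of correctness-based preference data: sample N completions,
# score each against a reference answer, and keep best/worst pairs for training.
from dataclasses import dataclass

@dataclass
class PreferencePair:
    prompt: str
    chosen: str    # completion closest to correct
    rejected: str  # completion judged incorrect

def correctness_score(completion: str, reference: str) -> float:
    # Placeholder: in practice this could run unit tests or execute the code
    # rather than doing a string comparison.
    return 1.0 if reference.strip() in completion else 0.0

def build_pairs(prompt: str, completions: list[str], reference: str) -> list[PreferencePair]:
    ranked = sorted(completions, key=lambda c: correctness_score(c, reference), reverse=True)
    best, worst = ranked[0], ranked[-1]
    if correctness_score(best, reference) > correctness_score(worst, reference):
        return [PreferencePair(prompt, best, worst)]
    return []  # no usable signal if every sample is equally right or wrong
```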
00:47:00.120 | - There were a couple of things that I noticed
00:47:01.960 | in the product that were not strange, but unique.
00:47:05.240 | So first of all, the model can talk multiple times
00:47:08.760 | in a row, whereas most other applications
00:47:10.840 | are like human, model, human, model.
00:47:13.400 | And then you had outside of the thumbs up, thumbs down,
00:47:16.640 | you have things like have the LLM prioritize this message
00:47:20.160 | and its answers, or then continue from this message
00:47:23.040 | to like go back.
00:47:23.960 | How does that change the flow for the user?
00:47:27.320 | And like in terms of like prompting it,
00:47:29.640 | yeah, what are like some tricks or learnings to that?
00:47:32.280 | - Yeah, that's a good question.
00:47:33.800 | So yeah, that's specifically in our pair programmer mode,
00:47:38.320 | which is a more conversational mode
00:47:40.000 | that also like asks you clarifying questions back
00:47:46.240 | if it doesn't fully understand what you're doing
00:47:47.800 | and it kind of, it holds your hand a bit more.
00:47:50.240 | And so from user feedback, we had requests
00:47:55.240 | to make more of an auto GPT, where you can kind of give it
00:47:58.320 | this problem that might take multiple searches
00:48:00.040 | or multiple different steps,
00:48:01.280 | like multiple reasoning steps to solve.
00:48:03.120 | And so that's the impetus behind building that product,
00:48:08.120 | being able to do multiple steps
00:48:11.400 | and also be able to handle really long conversations.
00:48:14.040 | Like people are really trying to use the pair programmer
00:48:16.320 | to go from like, sometimes really from like basic idea
00:48:18.880 | to like complete working code.
00:48:20.760 | And so what we noticed is that we were having
00:48:23.320 | like these very, very long threads,
00:48:25.600 | sometimes with like 60 messages, like a hundred messages.
00:48:29.040 | And like those become really, really challenging
00:48:31.560 | to manage like the appropriate context window
00:48:34.360 | of what should go inside of the context
00:48:37.520 | and how to preserve the context
00:48:40.800 | so that the model can continue
00:48:42.520 | or the product can continue giving good responses,
00:48:45.120 | even if you're like 60 messages deep in a conversation.
00:48:47.880 | So that's where the prioritized user messages
00:48:50.000 | feature comes from: people have asked us
00:48:54.000 | to just like let them pin messages
00:48:56.720 | that they want to be left in the conversation.
00:48:59.920 | And yeah, and then that seems to have like
00:49:04.160 | really gone a long way towards solving that problem.
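A minimal sketch of how pinned ("prioritized") messages could survive context trimming in a long thread; the token counting and the `pinned` / `index` fields are assumptions for illustration, not the actual product logic:

```python
# Hypothetical sketch: always keep pinned messages, then fill the remaining
# token budget with the most recent unpinned messages.
def build_context(messages, max_tokens=8000, count_tokens=lambda m: len(m["content"]) // 4):
    pinned = [m for m in messages if m.get("pinned")]
    others = [m for m in messages if not m.get("pinned")]

    budget = max_tokens - sum(count_tokens(m) for m in pinned)
    kept = []
    for m in reversed(others):          # newest unpinned messages first
        cost = count_tokens(m)
        if cost > budget:
            break
        kept.append(m)
        budget -= cost

    # Restore chronological order; pinned messages are always included.
    return sorted(pinned + kept, key=lambda m: m["index"])
```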
00:49:07.080 | - Yeah, and then you have a run and replit thing.
00:49:09.640 | Are you planning to build your own REPL,
00:49:11.600 | knowing some people will try to run the wrong code,
00:49:15.000 | unsafe code?
00:49:15.880 | - Yes, yes.
00:49:16.840 | So I think like in the long-term vision
00:49:19.280 | of like being a place where people can go from like idea
00:49:23.120 | to like fully working code, having a code sandbox,
00:49:28.120 | like a natively integrated code sandbox
00:49:30.560 | makes a lot of sense.
00:49:31.680 | And replit is great and people use that feature.
00:49:35.360 | But yeah, I think there's more we can do
00:49:37.200 | in terms of like having something
00:49:39.520 | a bit closer to code interpreter
00:49:40.760 | where it's able to run the code
00:49:43.160 | and then like recursively iterate on it, exactly.
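A rough sketch of that "run, then recursively iterate" loop, in the spirit of a code-interpreter feature; the `fix_code` step stands in for an LLM call and is an assumption, not an existing Phind or Replit API:

```python
# Hypothetical sketch: execute generated code, and on failure feed the
# traceback back to the model to repair it, up to a few rounds.
import os
import subprocess
import tempfile

def run_python(code: str) -> tuple[bool, str]:
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(code)
        path = f.name
    try:
        result = subprocess.run(["python", path], capture_output=True, text=True, timeout=10)
        return result.returncode == 0, result.stdout + result.stderr
    finally:
        os.unlink(path)

def iterate_until_passing(code: str, fix_code, max_rounds: int = 3) -> str:
    for _ in range(max_rounds):
        ok, output = run_python(code)
        if ok:
            return code
        code = fix_code(code, output)  # ask the model to repair, given the traceback
    return code
```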
00:49:45.920 | - I think replit is working on APIs
00:49:48.320 | to enable you to do that.
00:49:49.400 | - Yep.
00:49:50.240 | - So Amjad has specifically told me in person
00:49:51.880 | that he wants to enable that for people.
00:49:53.760 | At the same time, he's also working on his own models.
00:49:55.720 | - Right.
00:49:56.560 | - And Ghostwriter and all the other stuff.
00:49:57.800 | - Yeah.
00:49:58.640 | - So it's gonna get interesting.
00:49:59.600 | Like he wants to power you, but also compete with you.
00:50:02.200 | - Yeah.
00:50:03.040 | And like, and we love replit.
00:50:04.680 | I think that a lot of these,
00:50:06.680 | like a lot of the companies in our space,
00:50:09.120 | like we're all going to converge
00:50:11.880 | to solving a very similar problem,
00:50:16.240 | but from a different angle.
00:50:17.720 | So like replit approaches this problem from the IDE side.
00:50:20.680 | Like they started as like this IDE
00:50:22.920 | that you can run in the browser.
00:50:24.880 | And they started for like from that side,
00:50:26.360 | making coding just like more accessible.
00:50:28.240 | And we're approaching it from the side of like an LLM
00:50:32.440 | that's just like connected to everything
00:50:34.760 | that it needs to be connected to,
00:50:36.320 | which includes your code context.
00:50:37.960 | So that's why like we're kind of making,
00:50:39.640 | you know, inroads into IDEs.
00:50:42.080 | But we're kind of, we're approaching this problem
00:50:43.200 | from different sides.
00:50:44.040 | And I think it will be interesting
00:50:45.400 | to see where things end up.
00:50:48.960 | But I think that, you know, in the long, long term,
00:50:52.360 | we have an opportunity to also just have like
00:50:56.480 | this general kind of like technical reasoning engine product
00:51:00.680 | that's, you know, potentially also not just for programmers
00:51:06.440 | and it's also powered in this web interface.
00:51:07.760 | Like where there's potential,
00:51:11.400 | I think other things that we will build
00:51:14.280 | that eventually might go beyond like our current scope.
00:51:18.120 | - Exciting, we'll look forward to that.
00:51:19.840 | - Thank you.
00:51:20.680 | - We're gonna zoom out a little bit
00:51:22.120 | into sort of AI ecosystem stories,
00:51:25.560 | but first we gotta get the Paul Graham, Ron Conway story.
00:51:28.520 | - Yeah, so flashback to last summer,
00:51:32.160 | we're in the YC batch.
00:51:33.520 | And we're doing the summer batch, summer 22.
00:51:39.480 | So the summer batch runs from June to September,
00:51:43.040 | approximately.
00:51:44.360 | This was late July, early August,
00:51:47.520 | right around the time that many like YC startups
00:51:50.840 | start like going out, like gearing up,
00:51:52.640 | here's how we're gonna pitch investors and everything.
00:51:55.320 | And at the same time, me and my co-founder, Justin,
00:51:58.320 | we were planning on moving to New York.
00:52:00.800 | So for a long time, actually,
00:52:03.240 | we were thinking about building this company in New York,
00:52:06.920 | mainly for personal reasons, actually.
00:52:08.640 | 'Cause like during the pandemic,
00:52:10.720 | pre-ChatGPT, pre last year, pre the AI boom,
00:52:13.960 | SF unfortunately really kind of like--
00:52:17.640 | - So did.
00:52:18.480 | - Lost its luster, yeah, like no one was here.
00:52:20.880 | It was far from clear, like if there would be an AI boom,
00:52:26.280 | if like SF would be like the AI--
00:52:28.240 | - Back.
00:52:29.080 | - Yeah, exactly.
00:52:29.920 | If SF would be so back, as everyone is saying these days,
00:52:34.920 | it was far from clear.
00:52:36.160 | And so, and all of our friends, we were graduating college,
00:52:39.760 | 'cause like we happened to just graduate college
00:52:42.280 | and immediately start YC.
00:52:43.400 | Like we didn't even have, I think we had a week in between.
00:52:47.000 | So it was just--
00:52:47.840 | - You didn't bother looking for jobs,
00:52:48.680 | you were just like, this is all good.
00:52:50.080 | - Well, actually, both me and my co-founder,
00:52:51.240 | we had jobs that we secured in 2021
00:52:54.760 | from previous internships, but we both, like we,
00:52:57.520 | funny enough, when I spoke to my boss's boss
00:53:02.520 | at the company at which, like where I reneged my offer,
00:53:06.840 | I told him we got into YC.
00:53:08.920 | They actually said, yeah, you should do YC.
00:53:10.320 | - Wow, that's very selfless, that's great.
00:53:12.480 | - Yeah, that was really great that they did that.
00:53:14.240 | - In San Francisco, they would have offered
00:53:15.400 | to invest as well.
00:53:16.240 | - Yes, yes, they would have.
00:53:18.880 | But yeah, we were both planning to be in New York.
00:53:21.240 | And all of our friends were there from college.
00:53:24.160 | And so like at this point, like we have this whole plan,
00:53:28.360 | we're like on August 1st, we're gonna move to New York.
00:53:30.960 | And we had like this Airbnb for the month of New York,
00:53:33.400 | we're gonna stay there and we're gonna work
00:53:34.520 | and like all of that.
00:53:37.040 | The day before we go to New York, I call Justin
00:53:40.840 | and I just, I tell him like, why are we doing this?
00:53:44.000 | Like, why are we doing this?
00:53:45.400 | 'Cause in our batch, like by the time
00:53:48.720 | that August 1st rolled around, all of our mentors
00:53:51.120 | at YC were saying like, hey, like,
00:53:52.880 | you should really consider staying in SF.
00:53:54.400 | - It's the hybrid batch, right?
00:53:55.680 | - Yeah, it was the hybrid batch.
00:53:57.720 | But like there were already signs that like something
00:54:00.520 | was kind of like afoot in SF,
00:54:02.520 | even if like we didn't fully wanna admit it yet.
00:54:05.360 | And so we were like, no, I don't know.
00:54:08.080 | And so the day before, like, I don't know,
00:54:12.000 | something kind of clicked when the rubber met the road
00:54:14.760 | and it was time to go to New York.
00:54:16.280 | We were like, why are we doing this?
00:54:18.080 | And like, we didn't have any good reasons
00:54:20.520 | for staying in New York at that point
00:54:22.040 | beyond like our friends are there.
00:54:24.920 | So we still go to New York 'cause like we have the Airbnb,
00:54:28.520 | like we don't have any other kind of place to go
00:54:30.080 | for the next few weeks.
00:54:31.240 | We're in New York.
00:54:32.920 | And New York is just unfortunately too much fun.
00:54:36.720 | Like all of my other friends from college
00:54:39.480 | who are just, you know, like basically starting their jobs,
00:54:42.080 | starting their lives as adults, you know,
00:54:45.160 | they just got stuck into these jobs.
00:54:46.960 | They're making all this money and they're like partying
00:54:48.640 | and like all these things are happening.
00:54:50.400 | And like, yeah, it's just a very distracting place to be.
00:54:52.720 | And so we were just like sitting in this like small,
00:54:55.040 | you know, like cramped apartment, terrible posture,
00:54:57.920 | trying to get as much work done as we can.
00:55:00.400 | Too many distractions.
00:55:01.920 | And then we get this email from YC saying
00:55:06.160 | that Paul Graham is in town, in SF,
00:55:08.880 | and he is doing office hours with a certain number
00:55:13.880 | of startups in the current batch.
00:55:16.560 | And whoever signs up first gets it.
00:55:20.240 | And I happened to be super lucky.
00:55:21.480 | I was about to go for a run,
00:55:22.920 | but I just, I saw the email notification
00:55:24.480 | come across the screen.
00:55:25.320 | I immediately clicked on the link.
00:55:26.600 | And like immediately, like half the spots were gone,
00:55:30.440 | but somehow the very last spot was still available.
00:55:35.240 | And so I picked the very, very last time slot
00:55:37.520 | at 7 p.m. semi-strategically, you know,
00:55:40.920 | so we would have like time to go over.
00:55:42.760 | And also because like, I didn't really know
00:55:47.080 | how we're going to get to SF yet.
00:55:48.320 | And so we made a plan that we're going to fly
00:55:50.880 | from New York to SF and back to New York in one day
00:55:54.720 | and do like the full round trip.
00:55:55.880 | And we're going to meet with PG
00:55:58.640 | at the YC Mountain View office.
00:56:00.320 | And so we go there, we do that.
00:56:03.840 | We meet PG, you know, we tell him about the startup.
00:56:08.040 | And one thing I love about PG is that he gets like,
00:56:11.200 | he gets so excited.
00:56:12.640 | Like when he gets excited about something,
00:56:14.120 | like you can see his eyes like really light up.
00:56:17.080 | And he'll just start asking you questions.
00:56:19.360 | In fact, it's a little challenging sometimes
00:56:20.640 | to like finish kind of like the rest of like
00:56:22.360 | the description of your pitch.
00:56:23.480 | 'Cause like, he'll just like start like, you know,
00:56:26.960 | asking all these questions about how it works
00:56:29.360 | and like, you know, what's going on.
00:56:31.160 | - And what was the most challenging question
00:56:33.640 | that he asked you?
00:56:34.520 | - I think that like, he was asking us a lot of questions
00:56:38.680 | about like, like really how it worked.
00:56:41.240 | 'Cause like, as soon as like we told him like,
00:56:42.480 | hey, like we think that the future of search
00:56:45.280 | is answers, not links.
00:56:47.360 | Like, we could really see like the gears turning
00:56:51.680 | in his head.
00:56:52.800 | I think we were like the first demo of that,
00:56:55.640 | that he saw.
00:56:56.480 | - And you had like 10 minutes with him, right?
00:56:57.400 | - We had like 45, yeah, we had a decent chunk of time.
00:57:02.200 | Yeah.
00:57:03.040 | And so we tell him how it works.
00:57:04.720 | Like, he's very excited about it.
00:57:07.000 | And I just like, I just blurt it out.
00:57:08.440 | I just like ask him to invest.
00:57:10.200 | And he hasn't even seen the product yet.
00:57:12.160 | Oh, we just asked him to invest.
00:57:14.160 | And he says, yeah.
00:57:15.040 | And like, we're super excited about that.
00:57:18.960 | - And you, like, hadn't started your batch.
00:57:21.440 | - No, no, no, this is like...
00:57:23.400 | - This is after your batch.
00:57:24.240 | - Yeah, this is about halfway through the batch.
00:57:27.440 | Or two, two, no, two thirds of the batch.
00:57:29.040 | - Which when you're like not technically fundraising yet.
00:57:31.320 | - Or about to start fundraising.
00:57:32.920 | Yeah, so we have like this demo
00:57:34.120 | and like we showed him and like,
00:57:35.240 | there was still a lot of issues with the product.
00:57:37.800 | But I think like, it must have like still kind of
00:57:40.920 | like blown his mind in some way.
00:57:42.520 | And so, yeah, so like we're having fun.
00:57:46.680 | He's having fun.
00:57:48.720 | We have this dinner planned with this other friend
00:57:53.320 | that we had in SF.
00:57:54.160 | 'Cause we were only there for that one day.
00:57:55.480 | So we thought, okay, after an hour, we'll be done.
00:57:59.040 | We'll grab dinner with our friend
00:58:00.120 | and we'll fly back to New York.
00:58:01.320 | But PG was like, I'm having so much fun.
00:58:03.400 | Like, do you wanna...
00:58:05.080 | - Have dinner?
00:58:05.920 | - Yeah, come to my house.
00:58:07.280 | Or he's like, I gotta go have dinner with my wife, Jessica.
00:58:12.280 | Who's also awesome, by the way.
00:58:14.160 | - She's like the heart of YC.
00:58:15.720 | - Yeah, yeah, like Jessica does not get enough credit
00:58:19.120 | as an aside for her role.
00:58:20.880 | - He tries, he tries.
00:58:21.880 | - He tries, but like, yeah,
00:58:23.120 | Jessica really deserves a lot of credit.
00:58:25.560 | 'Cause she like, he understands like the technical side
00:58:29.600 | and she understands people and together,
00:58:30.960 | they're just like a phenomenal team.
00:58:32.480 | But he's like, yeah, I gotta go see Jessica.
00:58:34.640 | But you guys are welcome to come with.
00:58:36.480 | Do you wanna come with?
00:58:37.320 | And we're like, we have this friend
00:58:39.240 | who's like right now outside of,
00:58:42.680 | like we're literally outside the door
00:58:44.520 | who like we also promised to get dinner with.
00:58:47.440 | So like, we'd love to, but like, I don't know if we can.
00:58:49.000 | He's like, oh, he's welcome to come too.
00:58:51.000 | So like, yeah, so all of us just like hop in his car
00:58:54.720 | and we go to his house and then we just like have this,
00:58:57.360 | like we have dinner and we have this,
00:59:01.400 | like just chat about the future of search.
00:59:03.320 | Like I remember him telling Jessica distinctly,
00:59:05.880 | like our kids and our kids' kids
00:59:10.880 | are like, are not gonna know what like a search result is.
00:59:13.760 | Like they're just gonna like have answers.
00:59:16.000 | So that was really like a mind blowing,
00:59:18.360 | like inflection point moment for sure.
00:59:21.320 | - Wow, that email changed your life.
00:59:22.880 | - Absolutely.
00:59:23.720 | - And you also just spoiled the booking system for PG.
00:59:27.160 | 'Cause now everyone's just gonna go after the last slot.
00:59:30.320 | - Oh man, yeah, but like,
00:59:31.520 | I don't know if he even does that anymore.
00:59:33.440 | - He does, he does.
00:59:34.280 | Yeah, I've met other founders that he did it this year.
00:59:36.160 | - This year, gotcha.
00:59:37.000 | But when we told him about how we did it,
00:59:39.160 | he was like, I am like frankly shocked
00:59:41.120 | that like YC just did like a random like scheduling system.
00:59:44.080 | They didn't like do anything else, but.
00:59:46.480 | - Okay, and then he introduces you to Ron Conway.
00:59:48.360 | - Yes.
00:59:49.200 | - Who is one of the most legendary angels in Silicon Valley.
00:59:52.600 | - Yes, so after PG invested,
00:59:54.960 | the rest of our round came together pretty quickly.
00:59:58.080 | And so like--
00:59:58.920 | - By the way, I'm surprised,
01:00:00.480 | like it might feel like playing favorites, right?
01:00:03.440 | Within the current batch to be like,
01:00:05.360 | yo, PG invested in this one.
01:00:07.080 | - Right, and like, yes.
01:00:12.080 | - Too bad for the others.
01:00:15.400 | - Too bad for the others, I guess.
01:00:17.040 | I think this is a bigger point about YC
01:00:20.160 | and like these accelerators in general is like,
01:00:21.920 | YC gets like a lot of criticism from founders
01:00:23.880 | who feel like they didn't get value out of it.
01:00:26.680 | But like, in my view, YC is what you make of it.
01:00:30.120 | Like, and the YC tells you this,
01:00:31.800 | they're like, you really got to grab this opportunity,
01:00:33.800 | like by the balls and make the most of it.
01:00:36.800 | And if you do, then it could be the best thing in the world.
01:00:39.400 | And if you don't, and if you're just kind of like a passive,
01:00:41.800 | even like an average founder in YC, you're still gonna fail.
01:00:43.720 | And they tell you that, they're like,
01:00:45.560 | if you're average in your batch, you're gonna fail.
01:00:48.760 | Like you have to just be exceptional in every way.
01:00:51.240 | And so yeah, after PG invested,
01:00:52.600 | the rest of our round came together pretty quickly,
01:00:54.680 | which I'm very fortunate for.
01:00:55.520 | And yeah, he introduces us to Ron.
01:00:57.040 | And after he did, I get a call from Ron.
01:01:00.200 | And Ron says like, hey, like, you know,
01:01:02.520 | PG tells me what you're working on,
01:01:03.600 | I'd love to come meet you guys.
01:01:05.080 | And I'm like, wait, no way.
01:01:06.760 | And we're just holed up in this like little house
01:01:10.080 | in San Mateo, which is a little small,
01:01:13.080 | but you know, it had a nice patio.
01:01:14.160 | In fact, we had like our monitors set up
01:01:16.440 | outside on the deck out there.
01:01:18.680 | And so Ron Conway comes over,
01:01:20.680 | we go over to the patio, where like our workstation is.
01:01:24.200 | And Ron Conway, he's known for having like this notebook
01:01:27.840 | that he goes around with, where he like sits down
01:01:30.960 | with the notebook and like takes very, very detailed notes.
01:01:33.680 | So he never like forgets anything.
01:01:35.440 | So he sits down with his notebook and he asks us like,
01:01:38.400 | hey guys, like, what do you need?
01:01:40.640 | And we're like, oh, we need GPUs.
01:01:42.400 | Like back then the GPU shortage
01:01:45.280 | wasn't even nearly as bad as it is now.
01:01:47.120 | But like, even then it was still challenging
01:01:49.840 | to get like the quota that we needed.
01:01:51.600 | And he's like, okay, no problem.
01:01:53.640 | And then like he leaves a couple hours later,
01:01:56.280 | we get an email and we're CC'd on an email
01:01:59.160 | that Ron wrote to Jensen, the CEO of NVIDIA,
01:02:03.400 | saying like, hey, like these guys need GPUs.
01:02:06.840 | - You didn't say how much?
01:02:07.680 | It was just like, just give them GPUs.
01:02:09.000 | - Basically, yeah.
01:02:10.120 | Ron is known for writing these like one-liner emails
01:02:13.720 | that are like very short, but very to the point.
01:02:16.400 | And I think that's why like everyone responds to Ron.
01:02:20.120 | Everyone loves Ron.
01:02:21.640 | And so Jensen responds.
01:02:23.160 | He responds quickly, like tagging this VP of AI at NVIDIA.
01:02:26.440 | And we start working with NVIDIA, which is great.
01:02:29.360 | And something that I love about NVIDIA, by the way,
01:02:31.240 | is that after that intro,
01:02:32.800 | we got matched with like a dedicated team.
01:02:35.360 | And at NVIDIA, they know that they're gonna win regardless.
01:02:40.360 | So they don't care where you get the GPUs from.
01:02:43.680 | They're like, they're truly neutral,
01:02:45.280 | unlike various sales reps that you might encounter
01:02:47.840 | at various like clouds and, you know,
01:02:50.040 | hardware companies, et cetera.
01:02:51.880 | Like they actually just wanna help you
01:02:53.160 | 'cause they know they don't care.
01:02:54.240 | Like regardless, they know that
01:02:55.640 | if you're getting NVIDIA GPUs, they're still winning.
01:02:58.280 | So I guess that's a tip is that like,
01:03:01.840 | if you're looking for GPUs, like NVIDIA,
01:03:04.200 | yeah, they'll help you do it.
01:03:05.440 | - So like, so, okay, and then just to tie up this thing,
01:03:08.600 | because it, so first of all, that's a fantastic story.
01:03:10.960 | And like, you know, I just wanted to let you tell that
01:03:13.200 | 'cause it's special.
01:03:14.600 | That is a strategic shift, right?
01:03:17.840 | That you already decided to make by the time you met Ron,
01:03:20.040 | which is we are going to have our own hardware.
01:03:22.240 | We're gonna rack them in a data center somewhere.
01:03:24.800 | - Not even that we need our own hardware
01:03:26.840 | 'cause actually we don't, but we just need GPUs period.
01:03:31.360 | And like every cloud loves,
01:03:33.760 | like they have their own sales tactics
01:03:35.400 | and like they wanna make you commit to long terms
01:03:38.480 | and like very non-flexible terms.
01:03:40.800 | And like, there's all these,
01:03:43.280 | there's a web of different things
01:03:44.600 | that you kind of have to navigate.
01:03:45.680 | NVIDIA will kind of be to the point and be like,
01:03:47.280 | okay, you can do this on this cloud,
01:03:49.480 | this on this cloud.
01:03:51.120 | If like, this is your budget,
01:03:52.640 | maybe you wanna consider buying as well.
01:03:53.800 | Like they'll help you walk through what the options are.
01:03:56.960 | And the reason why they're helpful is
01:04:00.000 | 'cause like they look at the full picture.
01:04:03.680 | So they'll help you with the hardware.
01:04:06.480 | And in terms of software,
01:04:07.960 | they actually implemented a custom feature for us
01:04:10.520 | in Faster Transformer,
01:04:12.120 | which is one of their libraries.
01:04:13.520 | - For you?
01:04:14.360 | - For us, yeah.
01:04:15.560 | Which is wild.
01:04:16.400 | Yeah, I don't think they would have done it otherwise.
01:04:18.640 | They implemented streaming generation for T5 based models.
01:04:23.960 | Which we were running at the time,
01:04:25.560 | up until we switched to GPT in February, March of this year.
01:04:30.560 | So they implemented that just for us actually,
01:04:34.120 | in Faster Transformer.
01:04:34.120 | And so like, they'll help you look at the complete picture
01:04:36.720 | and then just help you get done what you need to get done.
01:04:40.400 | - And I know one of your interests is also local models,
01:04:44.760 | open source models and hardware kind of goes hand in hand.
01:04:48.120 | Any fun projects, explorations in the space
01:04:51.520 | that you wanna share with local Llamas?
01:04:54.200 | - Yeah, so it's something that we're very interested in.
01:04:59.200 | Because something that kind of we're hearing a lot about
01:05:04.400 | is like people want something like find,
01:05:08.200 | especially companies,
01:05:09.000 | but they wanna have it like within like their own sandbox.
01:05:11.840 | They wanna have it like on hardware that they control.
01:05:14.840 | And so I'm super, super interested
01:05:16.040 | in how we can get big models to run efficiently
01:05:19.880 | on local hardware.
01:05:22.280 | And so like, Llama is great.
01:05:25.120 | Llama.cpp is great.
01:05:26.480 | Very interested in like where the whole quantization thing
01:05:31.840 | is going.
01:05:32.680 | 'Cause like, obviously there are all these like great
01:05:34.080 | quantization libraries now that go to 4-bit, 8-bit,
01:05:37.960 | but specifically int8 and int4.
01:05:42.800 | - Which is the lowest it can go, right?
01:05:45.320 | - Right, but with int8,
01:05:48.000 | there's not necessarily a speed increase.
01:05:51.560 | It's just a storage optimization.
01:05:52.800 | Yeah, so we have these great quantization libraries
01:05:55.280 | that for the most part are able to get the size down
01:05:59.360 | with not that much quality loss.
01:06:01.560 | But there is some,
01:06:02.520 | like the quantized models currently are actually worse
01:06:05.080 | than the non-quantized ones.
01:06:06.520 | And so I'm very curious if the future is something like
01:06:08.760 | what NVIDIA is doing with their implementation of FP8,
01:06:13.000 | which they're implementing
01:06:13.880 | in their transformer engine library.
01:06:15.760 | Where basically once FP8 support is kind of more widespread
01:06:20.760 | and hardware can support it efficiently,
01:06:26.520 | you can kind of switch between the two different FP8 formats,
01:06:31.240 | one with greater precision, one with greater range,
01:06:34.600 | and then combine that with not doing FP8 on every layer
01:06:39.600 | and doing like a mixed precision
01:06:41.680 | with like FP32 on some layers.
01:06:43.720 | And like NVIDIA claims that this strategy
01:06:45.840 | that they're kind of demoing with the H100
01:06:50.040 | has no degradation.
01:06:52.160 | And so it remains to be seen
01:06:55.200 | whether that is really true in practice,
01:06:57.280 | but that's something that we're excited about
01:06:58.800 | and whether that can be applied to like Macs
01:07:02.760 | and other hardware once they get FP8 support as well.
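As a concrete example of the local-inference direction being discussed, here is a minimal sketch of loading a large model with 4-bit weights via bitsandbytes in transformers; the model ID and settings are illustrative assumptions, and llama.cpp's GGUF route is another common option:

```python
# Hypothetical sketch: 4-bit (NF4) weights with fp16 compute via bitsandbytes.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,  # storage is 4-bit; matmuls run in fp16
)

model_id = "codellama/CodeLlama-34b-hf"  # illustrative choice
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, quantization_config=quant_config, device_map="auto"
)

prompt = "Write a Python function that reverses a linked list."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=128)[0]))
```

As noted above, shrinking storage this way mainly buys memory, not necessarily speed, and some quality loss versus the unquantized weights is expected.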
01:07:05.720 | - Oh, we should also talk about hiring.
01:07:07.600 | How do you get your info, right?
01:07:08.960 | Like you seem to know, you seem self-taught.
01:07:13.040 | - Yeah, so I've always just,
01:07:16.960 | well, I'm fortunate to have like a decent systems background
01:07:20.000 | from UT Austin and somewhat of a research background,
01:07:23.120 | even though like I didn't publish any papers,
01:07:26.360 | but like I went through all the motions.
01:07:28.200 | Like I didn't publish the thesis that I wrote
01:07:30.600 | mainly out of a lack of time, because I was doing both of that
01:07:33.160 | and the startup at the same time.
01:07:34.280 | And then I graduated and then it was YC
01:07:35.480 | and then everything was kind of one after another.
01:07:38.080 | But like I'm very fortunate to kind of have like the systems
01:07:40.200 | and like a bit of like a research background.
01:07:41.960 | But yeah, for the most part, outside of that foundation,
01:07:45.600 | like I've always just,
01:07:46.880 | whenever I've been interested in something,
01:07:47.920 | I just like, I go deep.
01:07:49.720 | - Give people tips, right?
01:07:50.720 | Like where do you, what fire hose do you drink from?
01:07:53.160 | - Yeah, exactly.
01:07:54.000 | So like whenever I see something that blows my mind,
01:07:56.560 | the way that that initial Hugging Face demo did,
01:07:59.000 | that was like the start of everything.
01:08:00.600 | I'll just, yeah, I'll just like,
01:08:02.760 | I'll start from the beginning.
01:08:03.720 | I'll like, if I don't know anything,
01:08:05.600 | then like I'll just, I'll start by
01:08:10.360 | just trying to get a mental model of what is happening.
01:08:12.880 | Like first I need to understand the what,
01:08:15.360 | so I can understand the how and the why.
01:08:19.120 | And once I can understand that,
01:08:20.760 | then I can make my own hypotheses about like,
01:08:24.320 | okay, here are the assumptions
01:08:25.440 | that the authors of this made.
01:08:27.880 | And here's why maybe they're correct, maybe they're wrong.
01:08:30.360 | And here's how like I can improve on it and iterate on it.
01:08:33.600 | And I guess that's the mindset that I approach it from
01:08:36.040 | is like once I understand something,
01:08:37.360 | like how can it be better?
01:08:39.120 | How can it be faster?
01:08:39.960 | How can it be like more accurate?
01:08:42.080 | And so I guess for anyone starting now,
01:08:43.400 | like I would have used Phind.
01:08:44.960 | If I was starting now,
01:08:46.680 | 'cause like I would have loved to just have been able
01:08:48.560 | to say like, hey, like I have no idea what I'm doing.
01:08:51.080 | Can you just like be this like technical research assistant
01:08:54.360 | and kind of hold my hand and like ask me clarifying questions
01:08:57.320 | and like help me like formalize my assumptions
01:08:59.280 | like along the way.
01:09:00.360 | I would have loved that.
01:09:01.400 | But yeah, I just kind of did that myself.
01:09:03.320 | - Yeah.
01:09:04.160 | Recording Looms of yourself using Phind
01:09:05.400 | actually would be pretty interesting.
01:09:06.640 | - Yeah.
01:09:07.480 | - Because I think you would use Phind differently
01:09:08.840 | than people would by themselves.
01:09:11.280 | - I think so, yeah.
01:09:12.120 | - Unprompted.
01:09:12.960 | - I generally use Phind for everything,
01:09:16.680 | which is definitely, yeah.
01:09:17.640 | It's like, no, no, even like non-technical questions as well.
01:09:20.360 | 'Cause that's just something I'm curious about.
01:09:23.160 | But that's generally like,
01:09:24.480 | that's less of a usage pattern nowadays.
01:09:26.200 | Like most people generally for the most part
01:09:29.200 | do technical questions on Phind.
01:09:31.680 | And that is completely understandable
01:09:34.160 | because of very deliberate decisions that we've made
01:09:36.640 | in how we've optimized the product.
01:09:38.440 | Like we've optimized the product
01:09:39.920 | very much in a quality first manner
01:09:43.080 | as opposed to a like speed first
01:09:46.040 | or like some balance of the two matters.
01:09:47.880 | So we're like, we have to run GPT-4
01:09:50.160 | or some GPT-4 equivalent by default.
01:09:52.560 | And it has to give like a good answer
01:09:54.920 | to like a very demanding technical audience
01:09:56.840 | or people will leave.
01:09:57.760 | So that's just the trade off.
01:09:59.520 | So like sometimes it's slower for like simple questions,
01:10:04.240 | but like we did that on purpose, so.
01:10:07.160 | - Awesome.
01:10:08.000 | Before we do a lightning round,
01:10:10.160 | call for hiring any roles you're looking for.
01:10:13.640 | What should people know about working at Phind?
01:10:15.400 | - Yeah.
01:10:16.240 | So we really straddled the line
01:10:18.080 | between product and research at Phind.
01:10:20.760 | Like for the past little while,
01:10:25.400 | a lot of the work that we've done has been solely product.
01:10:28.320 | But we also do, especially now with the Phind model,
01:10:31.480 | a very particular kind of applied research
01:10:34.800 | in trying to apply the very latest techniques
01:10:39.040 | and techniques that
01:10:40.280 | have not even been proven yet
01:10:42.320 | to training the very, very best model for our vertical.
01:10:46.800 | And the two go hand in hand because the product,
01:10:51.320 | the UI, the UX is kind of model agnostic,
01:10:54.080 | but when it has a better kind of kernel,
01:10:57.200 | as Andrej Karpathy put it, plugged into it,
01:11:00.240 | it gets so much better.
01:11:01.080 | So we're doing really kind of both at the same time.
01:11:03.560 | And so someone who like enjoys seeing both of those sides,
01:11:06.960 | like doing something very tangible that affects the user,
01:11:11.080 | high quality, reliable code that runs in production,
01:11:15.480 | but also having that chance to experiment
01:11:18.240 | with like building these models.
01:11:20.760 | Yeah, we'd love to talk to you.
01:11:23.640 | - And the title is Applied AI Engineer.
01:11:25.480 | - I don't know what the title is.
01:11:26.520 | Like that is one title,
01:11:30.600 | but I don't know if like this really exists
01:11:33.760 | 'cause I feel like we're too rigid
01:11:35.520 | about like bucketing people into categories.
01:11:37.560 | - Yeah, founding engineer is fine.
01:11:39.120 | - Yeah, well, we already have a founding engineer technically.
01:11:42.240 | - Well, for what it's worth,
01:11:43.920 | OpenAI is adopting Applied AI Engineer.
01:11:45.880 | - Really?
01:11:46.720 | - So it's becoming a thing.
01:11:48.000 | - It's becoming a thing.
01:11:49.200 | - All right. - We'll see.
01:11:51.480 | - We'll see.
01:11:52.760 | Lightning round.
01:11:53.880 | - Lightning round.
01:11:54.720 | - Yeah, we have two questions,
01:11:56.000 | acceleration, exploration, and then a takeaway.
01:11:58.440 | So the acceleration one is what's something
01:12:00.760 | that already happened in AI
01:12:02.520 | that you thought would take much longer?
01:12:04.320 | - Yeah, the jump from these like models
01:12:08.320 | being glorified summarization models
01:12:10.920 | to actual powerful reasoning engines
01:12:13.480 | happened much faster than we thought.
01:12:16.680 | 'Cause like our product itself transitioned
01:12:18.560 | from being kind of, you know,
01:12:19.840 | this glorified summarization product
01:12:21.480 | to now like mostly a reasoning heavy product.
01:12:24.320 | And we had no idea that this would happen this fast.
01:12:26.040 | Like we thought that like there'd be a lot more time
01:12:29.720 | and like many more things that needed to happen
01:12:32.600 | before we could do some level of like intelligent reasoning
01:12:37.600 | on a low level about people's code.
01:12:40.160 | But it's already happened
01:12:41.000 | and it happened much faster than we could have thought.
01:12:45.200 | But I think that leads into your next point.
01:12:49.880 | - Which is exploration. - Exploration, yeah.
01:12:52.080 | - What do you think is the most interesting
01:12:53.480 | unsolved question in AI? - Yes.
01:12:55.360 | I think solving hallucinations,
01:12:58.440 | being able to guarantee that the answer will be correct
01:13:03.960 | is I think super interesting.
01:13:05.160 | And it's particularly relevant to us
01:13:07.720 | 'cause like we operate in a space
01:13:09.880 | where like everything needs to be correct.
01:13:11.720 | Like the code, like not just the logic,
01:13:14.400 | but like the implementation,
01:13:15.840 | everything has to be completely correct.
01:13:18.320 | And there's a lot of very interesting work
01:13:19.720 | that's going on in this space.
01:13:21.320 | Some of it is approaching it
01:13:22.440 | from the angle of formal grammars.
01:13:25.600 | There's a very interesting paper that came out recently.
01:13:28.400 | I forget where it came out of,
01:13:30.320 | but it came, the paper is basically,
01:13:33.200 | you can define a grammar that restricts
01:13:37.080 | and modifies the models.
01:13:39.440 | - Logprobs.
01:13:40.280 | - Exactly, like decoding strategy
01:13:42.360 | to only conform to that grammar.
01:13:46.400 | And that helps it.
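A toy sketch of that grammar-constrained decoding idea: at each step, mask out any token whose text would break the allowed grammar, here reduced to a "digits and spaces only" rule. Real implementations track a full parser state; the class below is an assumption for illustration:

```python
# Hypothetical sketch: a logits processor that zeroes out tokens whose decoded
# text falls outside a toy character whitelist.
import torch
from transformers import LogitsProcessor

class ToyGrammarProcessor(LogitsProcessor):
    def __init__(self, tokenizer, allowed_chars="0123456789 "):
        self.allowed_ids = [
            tid for tid in range(len(tokenizer))
            if tokenizer.decode([tid])
            and all(c in allowed_chars for c in tokenizer.decode([tid]))
        ]

    def __call__(self, input_ids, scores):
        mask = torch.full_like(scores, float("-inf"))
        mask[:, self.allowed_ids] = 0.0
        return scores + mask  # disallowed tokens get ~zero probability
```

Wrapped in a `LogitsProcessorList`, something like this can be passed to `model.generate(logits_processor=...)` so sampling only ever emits grammar-conforming tokens.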
01:13:49.400 | - Is this LMQL?
01:13:52.160 | Because I feel like LMQL is a little bit too structured
01:13:54.520 | for if the goal is avoiding hallucination.
01:13:57.640 | That's such a vague goal.
01:13:59.360 | - Yeah.
01:14:00.440 | - Yeah, I haven't seen it.
01:14:01.280 | - This is only something we've begun to take a look at.
01:14:03.960 | I haven't fully read the paper yet.
01:14:05.680 | Like I've only kind of skimmed the abstract,
01:14:07.600 | but it's something that like,
01:14:08.680 | we're definitely interested in exploring further.
01:14:11.000 | But something that we are like a bit further along on
01:14:12.720 | is also like exploring reinforcement learning
01:14:15.520 | for correctness, as opposed to only harmfulness
01:14:19.080 | the way it has typically been used in my research.
01:14:21.080 | - We just did a CEO paper on that.
01:14:22.320 | - Yeah.
01:14:23.160 | - Just a quick follow-up.
01:14:24.000 | Do you have internal evals for what hallucination rate is
01:14:27.920 | on stock GPT-4 and then maybe what yours is
01:14:31.720 | after fine tuning?
01:14:32.720 | - We, yeah.
01:14:33.600 | So we don't measure hallucination directly
01:14:38.600 | in our internal benchmarks.
01:14:42.120 | We more measure like was the answer right or was it wrong?
01:14:45.360 | We measure hallucination indirectly
01:14:47.560 | by evaluating the context,
01:14:51.240 | like the RAG context fed into the model as well.
01:14:54.200 | So basically, if the context was bad and the answer was bad,
01:14:58.320 | then chances are like, it's the context,
01:15:00.920 | but if the context was good,
01:15:02.360 | and it just like misinterpreted that
01:15:05.320 | or had the wrong conclusion,
01:15:07.480 | then like we can take different steps there.
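That indirect measurement can be summarized in a small sketch that grades the retrieved context and the final answer separately, then buckets the failure; the grading calls themselves (GPT-4 judge, exact checks, or test execution) are assumed:

```python
# Hypothetical sketch: attribute a bad answer to retrieval vs. generation,
# given separate pass/fail grades for the RAG context and the final answer.
def attribute_error(context_ok: bool, answer_ok: bool) -> str:
    if answer_ok:
        return "answer correct (context was " + ("good" if context_ok else "weak, but the model recovered") + ")"
    if not context_ok:
        return "retrieval failure: improve search and ranking before blaming the model"
    return "generation failure: good context, wrong conclusion; adjust prompting or fine-tune"
```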
01:15:10.160 | - Harrison from LangChain has been talking about
01:15:11.640 | this sort of two by two matrix with the RAGs people.
01:15:15.080 | It's pretty simple concept.
01:15:16.680 | What's the source of error?
01:15:17.800 | - Exactly.
01:15:18.640 | And I've been talking to Harrison actually
01:15:19.960 | about like a more like structured way,
01:15:22.240 | perhaps within LangChain to do evals.
01:15:24.600 | 'Cause I think that's a massive problem.
01:15:26.040 | Like every single eval is different
01:15:28.760 | for these big large language models
01:15:31.160 | and doing them in a quantitative way is really hard,
01:15:34.320 | but it's possible with like a platform
01:15:36.600 | that I think harnesses GPT-4 in the right way.
01:15:39.520 | That, and also perhaps a stricter
01:15:45.360 | prompting language,
01:15:47.760 | like a prompting markup language for prompting models
01:15:50.960 | is something I'm also very interested in.
01:15:53.120 | 'Cause we've written some very, very complex prompts,
01:15:56.280 | particularly for a VS code extension
01:15:58.080 | to like do like very fancy things with people's code.
01:16:02.200 | And like, I wish there was a way
01:16:04.720 | that you could have like a more formal way,
01:16:06.600 | like a Python for LLM prompting
01:16:10.240 | that you could activate desired things
01:16:14.480 | within like the model's execution flow
01:16:17.120 | through some other abstraction above language
01:16:22.120 | that has been like tested to do that some of the time,
01:16:26.440 | perhaps like combined with like formal grammar limitations
01:16:30.760 | and stuff like that.
01:16:32.560 | - Interesting.
01:16:33.400 | I have no idea what that looks like.
01:16:35.040 | - These are all things that have kind of emerged directly
01:16:38.080 | from the issues we're facing ourselves.
01:16:40.840 | But yeah, definitely very abstract so far.
01:16:43.760 | - Awesome.
01:16:44.600 | And yeah, just to wrap,
01:16:45.640 | what's one message, idea you want people to remember
01:16:49.880 | and think about?
01:16:50.720 | - Yeah, I think pay attention to those moments
01:16:55.640 | that like really jump out at you.
01:16:56.760 | Like when you see like a crazy demo
01:16:58.880 | that you can't like forget about
01:17:00.520 | or like something that you just think
01:17:02.920 | is really, really cool.
01:17:04.520 | Yeah, don't let that go.
01:17:07.800 | 'Cause I see a lot of people trying to start startups
01:17:11.120 | from the angle of like,
01:17:12.240 | hey, I just wanna start a startup
01:17:13.760 | or I'm just like bored at my job
01:17:15.560 | or like I'm like generally interested in the space.
01:17:18.880 | And I personally disagree with that.
01:17:20.640 | My take is that like, it's much easier,
01:17:25.360 | having been on both sides of that coin now,
01:17:27.120 | it's much easier to stay like obsessed every single day
01:17:32.120 | when the genesis of your startup
01:17:34.360 | is like something that really spoke to you
01:17:38.160 | in an incredibly meaningful way
01:17:41.200 | beyond just being kind of some insight
01:17:42.920 | that you've noticed.
01:17:43.920 | And I guess that's,
01:17:46.200 | I think like what we're discovering now
01:17:50.240 | is that like in the long, long term,
01:17:53.960 | like what you're really building
01:17:55.600 | is like you're building a group of people
01:17:57.680 | that believe this thing,
01:18:00.160 | that believe that like the future of solving problems
01:18:03.960 | and making things will be just like focused more
01:18:08.280 | on like the human thought process
01:18:10.240 | as opposed to the implementation part.
01:18:12.160 | And it's like, it's that belief that I think
01:18:18.760 | is what really gets you through the tough times
01:18:21.160 | and hopefully gets you to the other side someday.
01:18:25.720 | - Awesome.
01:18:26.560 | I kinda wanna play "Lose Yourself" as the outro music.
01:18:29.960 | - Then we'll get a DMCA strike.
01:18:31.560 | - That'd be great though.
01:18:34.120 | - Thank you so much for coming on.
01:18:35.400 | - Yeah, thank you so much for having me.
01:18:37.000 | This was really fun.
01:18:38.600 | (upbeat music)