
Latent Space LIVE! - Best of 2024: Startups, Vision, Open Src, Reasoning, & The Great Scaling Debate


Chapters

0:00
20:52 Conviction Startups Overview
78:44 Best of Vision 2024 (Roboflow x Moondream)
322:59 Loubna (HF) synthetic data and smol models
457:40 The scaling/wall debate w/ Dylan Patel
465:26 Opening statements

Whisper Transcript

00:03:35.700 | i need to share audio yeah because because i'm not sharing my screen right
00:03:42.700 | but so for the mic she's going to do it oh you don't need to
00:04:08.620 | so all the mics and like the audio from this room we're going to zoom yeah
00:04:26.620 | set up okay um can you take them yeah you just have to mute your yeah
00:04:37.260 | i might need to share your audio like if i present her yeah you can go but i'm just
00:04:47.340 | muting the music yeah we just need that yeah okay over there yeah yes um
00:05:15.980 | actually i don't know what else um i guess um yeah
00:06:45.820 | i mean the same pattern we're gonna we're gonna sleep
00:07:00.700 | you gonna do um
00:07:13.100 | um um yeah no you can't make it either two investors yeah
00:07:43.420 | but uh yeah that's what the north is um they have a great terms um i didn't know
00:08:10.300 | yeah we got in two days ago
00:08:35.740 | he doesn't know
00:09:21.740 | what uh yes there is no
00:09:44.220 | i'm sure
00:09:58.220 | right
00:10:26.700 | my question um
00:10:53.180 | yeah um
00:11:07.900 | made plans last night it's great yeah i actually realized we should probably hire a designer
00:11:12.620 | the weird thing is you have no idea like how many people are having trouble finding this place
00:11:25.820 | versus so many people like this like waking up late yes well it's okay but we're recording the
00:11:31.980 | whole thing when you said 500 i was imagining exactly just while they're going uh
00:12:04.060 | let me know if you can find the spot
00:12:07.820 | okay um you can just plug in here and i'll drop you the zoom link
00:12:21.820 | so we stream from zoom straight to youtube but we're also recording separately for the podcast
00:12:29.260 | and subsequent editing what is that
00:12:34.700 | it's the guest network here yeah
00:12:39.580 | to log in
00:12:48.700 | no that's not the link ignore the thing i just said
00:12:57.020 | yeah we need we need to show them a little harder ah
00:12:59.420 | okay that's the zoom link
00:13:04.780 | um so yeah it should be good for zoom yeah um do we need to send you a laptop there no that's
00:13:13.340 | great i i use my boots from pisa okay awesome
00:13:25.580 | we think so can you can you hear anything i'm not sure
00:13:29.180 | well there's like a slight delay but if i'm talking here it should show up there in like
00:13:36.940 | 10 seconds yeah okay oh one more thing for these mics yeah just make sure they use it so that it
00:13:44.860 | goes into zoom yeah this is on and then yeah we're using this manually but also
00:13:53.020 | simon fraser
00:13:55.020 | right
00:14:01.840 | you take it as well yeah uh all right i'm gonna wire you both up and now by the way oh i'm sorry
00:14:09.820 | hi sarah um and so there's a
00:14:13.580 | i feel like for sarah we need to give her a laugh
00:14:21.660 | no it's fine oh uh yes
00:14:24.780 | so plug in screw in
00:14:39.580 | looks like way about half of it
00:14:49.420 | oh is that it that's it there's a puppy thing but we're indoors so we don't need that
00:14:55.900 | wait that's so good i like your shirt oh thank you
00:15:01.100 | now yeah it's molly white
00:15:06.060 | yes he sent me a photo of this place and i was like we have to do this
00:15:17.340 | nice this is like the oxford union style
00:15:20.060 | i wanted to talk where the way they set up the conference there's a rotating platform
00:15:33.260 | the center it's like a stadium like thing it's like i don't know look at this is not intense
00:15:38.860 | yeah that's a really good bit yeah it's terrible but you should think about doing it next time
00:15:45.900 | yeah i mean it's funny to watch
00:15:48.460 | yeah you must be having a dinner office we just need a platform and then like
00:16:01.340 | that would be so amazing
00:16:02.620 | like we have a lot of people on youtube i don't know how many people
00:16:29.740 | i i just have all the um openai jokes that i've warmed up in my head
00:16:37.180 | um like how does uh how does rudolf update like yeah exactly i know i know i like that one too
00:16:53.580 | with 40 people online all right i am ready to transfer over to you
00:17:15.260 | yes yeah
00:17:33.500 | but this will never
00:17:46.540 | no this forever
00:17:49.580 | on it oh it's in your shirt now
00:17:57.420 | this sounds good i think that looks good actually yeah um and then on the top of it there's a button
00:18:06.780 | if you press it now um then it'll start flashing red and that's the report and it's not broadcasting
00:18:11.820 | it's just recording like that so if you push then okay there you go
00:18:19.180 | okay they're figuring out the suit okay so you want me to dial it into the sim uh yeah i think
00:18:41.100 | i sent it to you yeah i'll check text or email i'll text
00:18:45.180 | it's the second one i got that thank you
00:19:10.140 | screen so that plugs in yeah oh share screen too
00:19:14.060 | so right now it's just pinning wow not camera yeah so we can tweak it
00:19:18.780 | okay um and then i typically hide this
00:19:27.980 | messing with your computer settings there we go no it's just you know
00:19:36.300 | standard presenter issues uh this goes into stream and you're also mic'd up
00:19:42.780 | do you have the is it on yeah mine's recording yeah nice
00:19:46.700 | you want to check if
00:20:03.820 | do we get computer audio as well and do we get audio from the computer too
00:20:08.540 | okay uh we just have the hn demo but you can do a really good impression of
00:20:16.220 | i mean well um i think we can just run it in worst case we'll um
00:20:25.740 | we'll put it in the show notes it's fine yeah okay
00:20:32.940 | yeah do you do you want to start by saying anything yeah i think you should probably yeah
00:20:37.740 | okay i've been so busy with logistics and stuff that um i haven't done okay um i think we're
00:20:46.460 | going to kick this off um thanks to everyone who made it early morning um it's like really
00:20:51.580 | weird experiments that we wanted to try because one we saw this space uh and but two also i've
00:20:56.620 | been to a number of these things now and um i always felt like there was not enough like
00:21:01.100 | industry content for for people and we wanted an opportunity while everyone is in town in like one
00:21:06.780 | central spot to get everyone together um to talk about the best stuff of the year review the year
00:21:11.980 | it's very nice that new york is always the end of the year um and so i'm very honored that uh
00:21:17.420 | sarah and pranav have agreed to help us kick this off um sarah i've known for i was actually
00:21:23.100 | counting 17 years um and but she's she's gone she's uh been enormously successful as an ai
00:21:32.060 | investor um even uh even when you were doing your greylock days i was tracking your your investing
00:21:37.020 | and it's uh it's come a long way since then um and pranav uh i i've known i've known uh shorter
00:21:42.940 | but he's also starting to write uh really incredible posts and opinions about what he's
00:21:47.100 | seeing as an investor so i wanted to kick this off at the industry session um we have a great day of
00:21:52.380 | sort of like best of year recaps uh lined up i think vik is here as well um and uh and the
00:21:59.580 | roboflow guys so uh i would just let you kick it off thank you hi everyone uh my name is
00:22:09.180 | sarah guo and thanks to uh sean and friends here for having me and pranav so um i'd start by just
00:22:16.860 | giving 30 seconds of intro i promise this isn't an ad uh we started a venture fund called conviction
00:22:22.140 | about two years ago here is a set of the investments we've made uh they range from
00:22:27.580 | companies at the infrastructure level in terms of feeding the revolution to foundation model
00:22:34.380 | companies alternative architectures domain specific training efforts and of course applications
00:22:39.420 | and the premise of the fund sean mentioned i worked at greylock for about a decade before
00:22:45.100 | that and came from the product engineering side was that uh we we thought that there was a really
00:22:50.700 | interesting technical revolution happening uh that it would probably be the biggest change in
00:22:55.580 | how people use technology in our lifetimes and that represented huge economic opportunity
00:23:00.380 | and and maybe that there would be an advantage versus the incumbent venture firms in that when
00:23:06.060 | the floor is lava the dynamics of the markets change the types of products and founders that
00:23:11.020 | you back change uh it's a lot for existing firms to ingest and a lot of their mental models may not
00:23:17.580 | apply in the same way uh and so there was an opportunity for first principles thinking and
00:23:22.060 | if we were right would we do really well and get to work with amazing people and so we are
00:23:25.980 | two years into that journey and we can share some of the opinions and predictions we have with all
00:23:29.820 | of you um sorry i'm just making sure that isn't actually blocking the whole presentation uh i'm
00:23:38.380 | proud it's going to start us off um so quick agenda for today we'll cover some of the model
00:23:43.580 | landscapes and themes that we've seen in 2024 uh what we think is happening in ai startups and then
00:23:48.220 | some of our latent priors uh on what we think is working in investing so the um i thought it'd be
00:23:54.780 | useful to start from like what was happening at neurips last year in december 2023 so in october
00:24:00.540 | 2023 openai had just launched the ability to upload images to chat gpt which means up until
00:24:05.100 | that moment it's hard to believe but like roughly a year ago you could only input text and get text
00:24:08.940 | out of chat gpt um the mistral folks had just launched the mixtral model right before the
00:24:14.380 | beginning of neurips google had just announced gemini i very genuinely forgot about the existence
00:24:19.260 | of bard before making these slides and europe had just announced that they were doing their
00:24:24.140 | first round of ai regulation but not to be their last and when we were thinking about like what's
00:24:29.500 | changed in 2024 there's at least five themes that we could come up with that feel like they
00:24:33.980 | were descriptive of of what 2024 has meant for ai and for startups and so we'd start with um first
00:24:40.540 | it's a much closer race on the foundation model side than it was in 2023 so this is lm arena
00:24:45.900 | they ask users to rate uh generations from specific prompts so you
00:24:52.140 | get two responses from two language models answer which one of them is better the way to interpret
00:24:55.820 | this is like roughly 100 elo difference means that you're preferred two-thirds of the time
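[Note: a minimal sketch of the elo arithmetic behind the "two-thirds" claim, using the standard elo expected-score formula; the ratings below are made-up illustrative numbers, not actual arena scores.]

```python
# Standard Elo expected score: P(A preferred over B) = 1 / (1 + 10 ** ((R_B - R_A) / 400))
def preference_probability(rating_a: float, rating_b: float) -> float:
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400))

# A 100-point gap (e.g. hypothetical ratings of 1300 vs 1200) gives ~0.64,
# i.e. the higher-rated model wins the pairwise vote roughly two-thirds of the time.
print(preference_probability(1300, 1200))  # ~0.64
```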
00:25:00.140 | and a year ago every open ai model was like more than 100 points better than anything else
00:25:05.020 | and the view from the ground was roughly like open ai is the ibm there is no point in competing
00:25:09.740 | everyone should just give up go work at open ai or attempt to use open ai models and i think the
00:25:15.020 | story today is not that i think it would have been unbelievable a year ago if you told people (a) that
00:25:20.780 | the best model today on this at least on this eval is not open ai and (b) that it was google
00:25:26.220 | would have been pretty unimaginable to the majority of researchers but actually there are a variety of
00:25:32.060 | of proprietary language model options and some set of open source options that are increasingly
00:25:36.060 | competitive and this seems true not just on the eval side but also in actual spend so this is
00:25:41.260 | ramp data there's a bunch of colors but it's actually just open ai and anthropic spend and the
00:25:45.900 | open ai spend at the beginning at the end of last year in november of 23 was close to 90 percent of
00:25:50.780 | total volume and today less than a year later it's closer to 60 percent of total volume which i think
00:25:56.700 | is indicative both that language models are pretty easy apis to switch out and people are trialing
00:26:01.420 | a variety of different options to figure out what works best for them related second trend that
00:26:06.700 | we've noticed is that open source is increasingly competitive so this is from the scale leader
00:26:11.740 | boards which is a set of independent evals that are not contaminated and on a number of topics
00:26:17.660 | that actually the the foundation models clearly care a great deal about open source models are
00:26:21.740 | pretty good on math instruction following and adversarial robustness the llama model is amongst
00:26:26.620 | the top three of evaluated models i included the agentic tool use here just to point out that this
00:26:32.060 | isn't true across the board there are clearly some areas where foundation model companies have
00:26:36.140 | had more data or more expertise in training against these use cases but open source models are
00:26:40.940 | surprisingly increasingly effective this feels true across
00:26:45.100 | evals this is the mmlu eval i want to call out two things here one is that it's pretty remarkable
00:26:51.420 | that the ninth best model is only two points behind uh the the best state-of-the-art models and is
00:26:56.540 | actually a 70 billion parameter model i think this would have been surprising to a bunch of people
00:27:00.940 | for whom the belief was largely that most intelligence is just an emergent property
00:27:04.860 | and there's a limit to how much intelligence you can push into smaller form factors in fact a year
00:27:09.340 | ago the the best small model or under 10 billion parameter model would have been mistral 7b which
00:27:14.140 | on this eval if memory serves is somewhere around 60 and today that's the llama 8b model which is
00:27:19.660 | more than 10 points better the the gap between what is state-of-the-art and what you can fit
00:27:23.980 | into a fairly small uh form factor is actually actually shrinking um and again related the we
00:27:31.340 | think the price of intelligence has come down substantially this is this is a graph of flagship
00:27:35.180 | open ai model costs where the cost of the api has come down roughly 80 to 85 percent in call it the last year
00:27:42.300 | year and a half which is pretty remarkable this isn't just open ai too this is also like the full
00:27:47.100 | set of models this is from artificial analysis which tracks cost per token across a variety of
00:27:51.340 | different apis and public inference options and like we were doing some math on this if you wanted
00:27:56.140 | to recreate like the kind of data that a text editor has or that something like notion
00:28:01.260 | or coda that's somewhere in the volume of a couple thousand dollars to create that volume of tokens
00:28:06.060 | that's pretty remarkable and impressive it's clearly not the same distribution of data but
00:28:10.940 | just as like a sense of scope the there's an enormous volume of data that you can create
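[Note: a back-of-envelope sketch of the "couple thousand dollars" claim above; the per-token price and corpus size are illustrative assumptions, not figures from the talk or from artificial analysis.]

```python
# Rough cost to generate a large text corpus at an assumed flagship API price.
price_per_million_output_tokens = 10.00  # assumed USD price, for illustration only
corpus_tokens = 200_000_000              # assumed corpus size, a few hundred million tokens

cost = corpus_tokens / 1_000_000 * price_per_million_output_tokens
print(f"~${cost:,.0f} to generate {corpus_tokens:,} tokens")  # ~$2,000
```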
00:28:14.940 | and then fourth we think new modalities are beginning to work start quickly with biology
00:28:21.180 | we're lucky to work with the folks at chai discovery who just released chai-1 which is
00:28:25.420 | an open source model that outperforms alphafold 3 it's impressive that this is like roughly a year
00:28:29.980 | of work with a pretty specific data set and then pretty specific technical beliefs but
00:28:33.980 | models in domains like biology are beginning to work we think that's true on the voice side as
00:28:38.300 | well i'll point out that there were voice models before things like eleven labs have existed for a while but
00:28:43.420 | we think low latency voice is more than just a feature it's actually a net new experience
00:28:47.900 | interaction using voice mode feels very different than the historical transcription first models
00:28:52.780 | same thing with many of the cartesia models and then a new nascent use case is execution so claude
00:28:59.340 | launched computer use openai launched code execution inside of canvas yesterday and then i think devin
00:29:03.980 | just announced that you can all try it for 500 dollars a month which is pretty remarkable it's a set of
00:29:09.100 | capabilities that have historically never been available to vast majority of population and i
00:29:13.020 | think we're still in early innings cognition the company was founded under a year ago first product
00:29:17.340 | was roughly nine months ago which is pretty impressive if you recall like a year ago the
00:29:23.020 | point of view on swe-bench was like it was impossible to surpass what 15 percent or so
00:29:28.780 | and i think the the whole industry now considers that if not trivial then at least accessible yeah
00:29:34.460 | um last new modality we wanted to call out although there are many more is video um i took
00:29:40.860 | the liberty i got early access to sora and managed to sign up before they cut off access so um here
00:29:46.220 | is my favorite joke in the form of a video hopefully someone here can guess it
00:29:49.740 | yeah you're telling me a shrimp fried this rice it's a pretty bad joke but i really like it
00:29:58.940 | and i think this one the next video here is uh one of our portfolio companies heygen that
00:30:05.180 | translated and does the lip sync and dubbing for live speeches so this is javier
00:30:12.460 | milei who speaks in spanish but here you will hear him in english if this if this plays um
00:30:18.460 | and you can see that you can capture the original tonality of of his speech and performance i think
00:30:23.500 | audio here doesn't work but we'll we'll push something publicly sure um let's give it a shot
00:30:29.260 | yeah excellent of the western world yeah and you can hear that this captures like his original
00:30:36.700 | tone uh and like the emotion in his speech which is definitely new and pretty impressive
00:30:41.900 | from from new models um so the last uh the yeah that makes sense um the last point that we wanted
00:30:50.860 | to call out is uh the much purported end of scaling i think there is a great debate happening
00:30:55.180 | here later today on the question of this but we think at minimum it's hard to deny that there are
00:30:59.820 | at least some limits to the the clear benefits to increasing scale um but there also seems like
00:31:06.220 | there are new scaling paradigms so the question of test time compute scaling is a pretty interesting
00:31:10.220 | one it seems like openai has cracked a version of this that works and we think (a) foundation model
00:31:14.540 | labs will come up with better ways of doing this and (b) so far it largely works for very verifiable
00:31:20.700 | domains things that look like math and physics and maybe secondarily software engineering where
00:31:24.300 | we can get an objective value function and i think an open question for the next year is going to be
00:31:28.380 | how we generate those value functions for spaces that are not as well constrained or well defined
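[Note: a minimal sketch of what an "objective value function" can look like in a verifiable domain, and why open-ended domains are harder; the function names and setup are hypothetical illustrations, not anything described in the talk.]

```python
import subprocess
import sys
import tempfile

# Verifiable domain (software engineering): the value function can literally be "do the tests pass?"
def code_value(candidate_solution: str, test_code: str) -> float:
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(candidate_solution + "\n\n" + test_code)
        path = f.name
    result = subprocess.run([sys.executable, path], capture_output=True)
    return 1.0 if result.returncode == 0 else 0.0

# Test-time compute as best-of-n: sample many candidates, keep the one the verifier scores highest.
def best_of_n(candidates: list[str], test_code: str) -> str:
    return max(candidates, key=lambda c: code_value(c, test_code))

# e.g. best_of_n(["def add(a, b): return a - b", "def add(a, b): return a + b"],
#                "assert add(2, 2) == 4")
#
# For open-ended domains ("draft a good legal memo") there is no such objective check, which is
# the open question raised here: the scoring signal has to come from somewhere else, such as
# human preferences or a learned reward model.
```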
00:31:32.220 | so the question that this leaves us in is like well what does that mean for startups
00:31:37.180 | and i think a prevailing view has been that we live in an ai bubble there's an enormous amount
00:31:43.260 | of funding that goes towards ai companies and startups that is largely unjustified based on
00:31:47.100 | outcomes and what's actually working on the ground and startups are largely raising money
00:31:51.820 | on hype and so we pulled some pitch book data and the 2024 number is like probably incomplete since
00:31:56.940 | not all rounds are being reported and largely suggests like actually there is a substantial
00:32:01.100 | recovery in funding and maybe 2025 looks something like 2021 but if you break out the numbers here a
00:32:07.260 | bit more the red is actually just a small number of foundation model labs like what you would think
00:32:11.900 | of as the largest labs raising money which is upwards of 30 to 40 billion dollars this year
00:32:16.700 | and so the reality of the funding environment actually seems like much more sane and rational
00:32:21.260 | it doesn't look like we're headed to a version of 2021 in fact the the foundation model labs
00:32:25.420 | account for an outsized amount of money being raised but the the set of money going to companies
00:32:31.260 | that are working seems much more rational and we wanted to give you we can't share numbers for
00:32:35.900 | every company but this is one of our portfolio companies growing really really quickly um we
00:32:41.100 | think 0 to 20 million with just plg style spending is pretty impressive if any of you are doing better than that
00:32:45.980 | you should come find us we'd love to chat and so what we wanted to try and center a discussion on
00:32:53.740 | this is certainly not all of the companies that are making 10 million more or revenue and growing
00:32:57.500 | but we took a selection of them and wanted to give you a couple ideas of patterns that we've noticed
00:33:02.300 | that seem to be working across the board um the first one that we've noticed is like first wave
00:33:07.180 | service automation so we think there's a large amount of work that doesn't get done at companies
00:33:12.460 | today either because it is too expensive to hire someone to do it it's too expensive to provide
00:33:17.020 | them context and enable them to be successful uh at uh at whatever the specific role is or
00:33:22.300 | it's too hard to manage um those set of people so prescribing it's too expensive to hire those
00:33:26.860 | specific set of people for sierra and decagon for customer support style companies it's really
00:33:30.620 | useful to do like next level automation and then there's obviously growth in that and for harvey
00:33:34.380 | and even up the story is um you can do first wave professional services and then grow beyond that
00:33:41.740 | second trend that we've noticed is better search, new friends so we think that there is a it's pretty
00:33:47.340 | impressive like how effective text modalities have been so character and replica have been
00:33:51.180 | remarkably successful companies and there's a whole host of not safe for work chatbots as well
00:33:55.100 | that are pretty effective at just text generation they're pretty compelling mechanisms and on the
00:33:59.980 | productivity side perplexity and glean have demonstrated this as well i worked at a search
00:34:03.020 | company for a while i think the changing paradigms of how people capture and learn information is
00:34:08.220 | pretty interesting we think it's likely text isn't the last medium there are infographics or sets of
00:34:13.020 | information that seem more useful or sets of engagement that are more engaging um but this
00:34:16.940 | feels like a pretty interesting place to start oh yeah okay mike so one thing that i've worked on
00:34:26.940 | investing in in a long time is democratization of different skills be they creative or technical
00:34:32.140 | this has been an amazing few years for that across different modalities audio video general image
00:34:39.900 | media text and now code and and really fully functioning applications um one thing that's
00:34:46.620 | really interesting about the growth driver for all of these companies is the the end users in
00:34:52.220 | large part are not people that we thought of as we the venture industry you know the royal we
00:34:57.980 | thought of as important markets before um and so a premise we have as a fund is that there's
00:35:03.660 | actually much more instinct for creativity visual creativity audio creativity technical creativity
00:35:09.660 | than we assumed like there's latent demand for it and ai applications can really serve that i think in
00:35:15.980 | particular mid journey was a company that is in the vanguard here and nobody understood for a long
00:35:20.380 | time because the perhaps outside view is like how many people want to generate images that are not
00:35:27.260 | easily you know they're raster they're not easily editable they can't be used in these professional
00:35:31.180 | context in a complete way and the answer is like an awful lot right for a whole range of use cases
00:35:36.220 | and i think we'll continue to find that especially as the capabilities improve and we think the the
00:35:41.420 | range of um uh quality and uh controllability that you can get in these different domains is still
00:35:49.020 | it's very deep and we're still very early um and then i i think as if if we're in the first or
00:35:55.180 | second inning of this ai wave one obvious place to go invest and to go build companies is the
00:36:02.220 | enabling layers right um shorthand for this is obviously compute and data i think the the needs
00:36:08.540 | for uh data are largely changed now as well you need more expert data you need different forms
00:36:15.340 | of data we'll talk about that later in terms of who has like let's say reasoning traces in different
00:36:20.380 | domains that are interesting to companies doing their own training but this is this is an area
00:36:25.980 | that has seen explosive growth and we continue to invest here um okay so maybe time for some opinions
00:36:32.860 | there was a prevailing narrative that um you know some part from companies some part from investors
00:36:42.700 | it's a fun debate uh as to where is the value in the ecosystem and can there be
00:36:47.180 | opportunities for startups um if you guys remember the phrase gpt rapper it was like the dominant
00:36:52.620 | phrase in the tech ecosystem for a while of and what it what it represented with this idea that
00:36:58.540 | there was no value at the application layer you had to do pre-training and then like nobody's
00:37:02.940 | going to catch open ai and pre-training and you know this isn't this isn't like a a knock on
00:37:08.620 | open ai at all these these labs have done amazing work enabling the ecosystem and we continue to
00:37:13.420 | partner with them and and others but um but it's simply untrue as a narrative right the odds are
00:37:21.500 | clearly in favor of a very rich ecosystem of innovation you have a bunch of choices of models
00:37:27.420 | that are good at different things you have price competition you have open source uh i think an
00:37:33.340 | underappreciated impact of test time scaling is you're going to better match user value with your
00:37:39.420 | spend on compute and so if you are a new company that can figure out how to make these models
00:37:44.460 | useful to somebody the customer can pay for the compute instead of you taking as a as a startup
00:37:49.420 | the capex for pre-training or um or rl up front uh and um uh as pranav mentioned you know small
00:37:58.540 | models especially if you know the domain can be unreasonably effective uh and the product layer
00:38:03.500 | has if we look at the sort of cluster of companies that we described shown that it is creating and
00:38:09.100 | capturing value and that it's actually a pretty hard thing to build great products that leverage
00:38:13.020 | ai um so so broadly like we have a point of view that i think is actually shared by many of the
00:38:19.180 | labs that the world is full of problems in the last mile to go take even agi into all of those
00:38:26.220 | use cases is quite long okay another prevailing belief is that um or you know another great debate
00:38:34.060 | that sean could host is like does the value go to startups or incumbents uh we must admit some
00:38:38.700 | bias here even though we have you know friends and portfolio former portfolio companies that would
00:38:42.940 | be considered incumbents now but um uh oh sorry swap swap uh swap views sorry uh you know there
00:38:51.740 | are there are markets in venture that have been considered traditionally like too hard right like
00:38:57.740 | just bad markets for the the venture capital spec which is capital efficient rapid growth that's a
00:39:03.900 | venture backable company um where the end output is a you know a tens of billions of dollars of
00:39:09.900 | enterprise value company um and and these included areas like legal health care defense pharma
00:39:16.140 | education um you know any traditional venture firm would say like bad market nobody makes money
00:39:22.300 | there it's really hard to sell there's no budget etc and and one of the things that's interesting
00:39:26.460 | is if you look at the cluster of companies that has actually been effective over the past year
00:39:30.700 | some of them are in these markets that were traditionally non-obvious right and so perhaps
00:39:35.340 | one of our more optimistic views is that ai is really useful and if you make a capability that
00:39:42.300 | is novel that is several magnitudes um orders of magnitude cheaper then actually you can change the
00:39:48.620 | buying pattern and the structure of these markets and maybe the legal industry didn't buy anything
00:39:53.500 | because it wasn't anything worth buying for a really long time that's one example um we we
00:39:57.660 | also think that like what was the last great consumer company um maybe it was discord or
00:40:02.620 | roblox in terms of things that started that have just like really um enormous user basis and
00:40:07.820 | engagement uh until you know we had these consumer chatbots of different kinds and and like the next
00:40:13.900 | perhaps the next generation of search as Pranav mentioned we think that the um opportunity for
00:40:20.860 | social and media generation and games is uh large and new in a totally different way um and and
00:40:27.900 | finally uh in terms of the markets that we look at uh i think there's broad recognition now that
00:40:33.980 | you can sell against outcomes and services rather than software spend with ai because you're doing
00:40:39.740 | work versus just giving people the ability to do a workflow but um if you take that one step further
00:40:45.340 | we think there's elastic demand for many services right uh our classic example is um there's on
00:40:52.780 | order of 20 to 25 million professional software developers in the world uh you know i imagine much
00:40:58.700 | of this audience is technical uh demand for software is not being met right if we take the
00:41:05.740 | cost of software and high quality software down two orders of magnitude we're just going to end
00:41:10.540 | up with more software in the world we're not going to end up with fewer people doing development
00:41:14.940 | at least that's what we would argue um and then finally on the incumbent versus uh startup
00:41:21.820 | question uh the prevailing narrative is incumbents have the distribution the product surfaces and the
00:41:27.180 | data don't bother competing with them they're going to create and capture the value and share
00:41:30.860 | some of it back with their customers i think this is only partially true um they incumbents have the
00:41:35.820 | distribution they have always had the distribution like the point of the startup is you have to go
00:41:40.060 | fight with a better product or a more clever product um and maybe a different business model
00:41:45.340 | to go get new distribution but the specifics around the product surface and the data i think
00:41:50.940 | are actually worth understanding there's a really strong innovators dilemma if you look at the sas
00:41:55.740 | companies that are dominant they sell by seat and if i'm doing the work for you i don't necessarily
00:42:01.020 | want to sell you seats i might actually decrease the number of seats um the tens of the decades of
00:42:07.660 | years and millions of man and woman hours of code that have been written to uh enable a particular
00:42:16.860 | workflow in crm for example may not matter if i don't want people to do that workflow of filling
00:42:21.900 | out the database every friday anymore and so i i do think that this sunk cost or the incumbent
00:42:28.060 | advantage gets highly challenged by new ux and code generation as well and then one disappointing
00:42:34.620 | learning that we found in our own portfolio is no one has the data we want in many cases
00:42:40.540 | right so imagine you are trying to automate a specific type of knowledge work uh and what you
00:42:48.380 | want is the reasoning trace um all of the inputs and the output decision um like that sounds like
00:42:56.220 | a very useful set of data and the incumbent companies in any given domain they never save
00:43:00.620 | that data right like they have a database with the outputs some of the time and so i i would say uh
00:43:06.700 | one of the things that is worth thinking through as a startup is um when an incumbent says they
00:43:12.540 | have the data like what is the data you actually need to make your product higher quality
00:43:15.660 | okay so in in summary um you know our shorthand for the set of changes that are happening is
00:43:23.180 | software 3.0 we think it is a full stack rethinking and it enables um in a a new generation of
00:43:29.660 | companies to have a huge advantage the speed of change um favors startups if the floor is lava
00:43:35.500 | it's really hard to turn a really big ship uh i think that some of the ceos of large companies
00:43:40.460 | now are incredibly capable but they're still trying to make a hundred thousand people move
00:43:44.380 | very quickly in a new paradigm um the market opportunities are different right these markets
00:43:49.260 | that we think are interesting and very large like represent a trillion dollars of value
00:43:53.420 | are not just the replacement software markets of the last two decades um it's not clear what
00:43:59.500 | the business model for many of these companies should be uh sierra just started talking about
00:44:03.500 | charging for outcomes um outcomes based pricing has been this holy grail idea in software and
00:44:08.780 | it's been very hard but now we do more work um uh there are other business model challenges um
00:44:15.660 | and so you know our companies they spend a lot more on compute than they have in the past they
00:44:21.020 | spend a lot with the foundation model providers they think about gross margin uh they think about
00:44:25.660 | where to get the data uh it's a time where you need to be really creative about product um
00:44:30.220 | versus just replace the workflows of the past uh and it might require ripping out those workflows
00:44:36.140 | entirely it's a different development cycle i bet most of the people in this room have written
00:44:41.260 | evals um and like compared to you know the academic benchmark to a real world eval and said like
00:44:46.860 | you know that's not it and how do i make a user um understand uh the um non-deterministic nature
00:44:55.580 | of these outputs or gracefully fail i think that's like a different way to think about product than
00:45:00.220 | in the past um and we we need to think about infrastructure again right um there was this
00:45:05.420 | middle period where the cloud providers the hyperscalers took this problem away from software
00:45:10.380 | developers and it was all just going to be like i don't front end people at some point and it's
00:45:14.060 | like we are not there anymore we're back in the hardware era where people are um acquiring and
00:45:18.780 | managing and optimizing compute and i think that will really matter in terms of capability and
00:45:22.380 | companies um so uh i guess we'll end with a call to action here and and encourage all of you to
00:45:30.140 | seize the opportunity um it is the greatest technical and economic opportunity that we've
00:45:35.340 | ever seen like we made a decade plus career type bet on it and um uh we do a lot of work
00:45:43.580 | with the foundation model companies uh we think they are doing amazing work and they're great
00:45:48.540 | partners and even co-investors in some of our efforts but uh i think all of the focus on their
00:45:54.620 | interesting missions around agi and safety um do not mean that there are not opportunities in other
00:46:00.940 | parts of the economy the world is very large and we think much of the value will be distributed in
00:46:05.820 | the world through an unbundling and eventually a re-bundling uh as often happens in technology
00:46:10.940 | cycles um so we think this is a market that is structurally supportive of startups we're really
00:46:16.060 | excited to try to work with the more ambitious ones and the theme of 2024 um to us has been like
00:46:23.100 | well thank goodness this is a this is an ecosystem that is much friendlier to startups than 2023 it
00:46:29.500 | is what we hoped um and and so uh you know please uh ask those questions and take advantage of the
00:46:35.420 | opportunity do those things work yeah hello they do work i can kick us off okay so if some of these
00:46:56.860 | companies um can go from you know 1 to 20 in such a short amount of time do you think that they can
00:47:02.300 | also disappear in a short amount of time uh i can i can take this one i mean uh i think you've seen
00:47:10.140 | companies go from zero to 80 million and stall out pretty badly actually um so your data is correct
00:47:17.100 | um there's gonna be uh there's a set of challenges that um are just the challenges of scale right
00:47:26.060 | like i think sometimes the revenue numbers in these companies can overstate the maturity of
00:47:30.140 | the businesses themselves right they need to figure out how to serve customers they need to
00:47:33.580 | scale their leadership um they need to uh prepare to uh service these customers um with the right
00:47:41.820 | quality level and you know like the company that we showed that went zero to 20 that company has
00:47:46.540 | 20 people right and they have you know x hundred thousand users is yeah it's very challenging um
00:47:52.300 | and and so i think there there's a set of good hard problems that these companies will have
00:47:57.340 | i think part of the like most catchphrases or memes they don't catch on unless there's some
00:48:03.660 | seed of truth and so there was a set of companies that were described by this term gpt wrapper that
00:48:09.500 | were not more than a somewhat trivial set of prompts and seo pages that directed people to
00:48:17.660 | a particular use case and i think that's not uh that's like likely not a durable position as a
00:48:24.140 | technology company um and and so it's not a very clean answer for you it's a it's a nuanced one but
00:48:30.700 | some of the value that is represented by this um i'm going to scroll back to it some of this value
00:48:37.660 | that is represented by this cluster is durable and that's the thing that we are interested in
00:48:42.300 | um uh the the zero to 20 and the zero to 80 and then collapse it's actually valuable it's just
00:48:50.140 | not durable right users are voting for it and other people can compete and so you know we kind
00:48:54.780 | of separate these two questions of like you know which of these companies is defensible um and
00:49:00.220 | where is the revenue or the usage not a novelty but something that's really important to like
00:49:05.660 | work or play or communication sean do you want me to take questions or do you want to do it
00:49:14.060 | yeah well yeah you can do it hi hi um i think my mic oh here it goes so if all of these companies
00:49:22.460 | need a lot more money and this is the greatest economic opportunity ever uh don't we need much
00:49:28.860 | bigger venture funds like orders of magnitude bigger and won't the economics of those funds
00:49:33.900 | be really broken if they're still raising 40 million dollar like gonna invest in a bunch
00:49:37.820 | of seed company funds okay uh this is a bit of a triggering question for me because i take a
00:49:43.820 | particular point of view on it um uh hopefully without arrogance we've chosen to raise
00:49:48.540 | funds that are relatively small um as early stage investors uh and part of it is the the view of um
00:49:55.980 | like this company that you know this company uh i think they've spent like maybe seven million
00:50:04.460 | dollars to date right um and so the view that all ai product companies or all ai companies in general
00:50:12.140 | are very expensive is not true objectively we have we have several companies that are
00:50:16.940 | um expensive in the traditional sense of sass like we got to go hire a lot of go-to-market people
00:50:22.460 | and we have to pay them and there's a j curve of that investment before it comes back in
00:50:26.540 | repeatable sass revenue um uh and you know i think um inference revenue uh we have companies that are
00:50:35.100 | profitable or break even and have been incredibly efficient and we have companies that spend a lot
00:50:39.580 | up front and so i think there's a an entire range um our view as a firm is uh that you know very
00:50:48.060 | early on um my friend elad has a a funny phrase here which is um no gpu before product market fit
00:50:56.060 | i think that is not always true we have given people gpus before anything right but but there's
00:51:01.980 | there's a a shred of truth in this which is you can experiment like thank you to the open ai and
00:51:09.180 | anthropics and um other companies of the world that allow uh great product people to experiment
00:51:14.620 | at very low cost very incrementally and so i i think much of our portfolio looks like those
00:51:20.060 | companies where you're going to see what kind of value you can bring to users without spending a
00:51:24.940 | ton up front um as one example like we just saw um uh new fine tuning interfaces for o1 come out
00:51:33.260 | the amount of data that you need to in theory improve um those models for a particular domain
00:51:40.300 | is very small if that pans out like that's incredibly encouraging as well so so i would
00:51:46.780 | say like i our goal is to work with the most important companies in ai with a relatively
00:51:52.860 | small fund and i think that um most companies don't actually they don't benefit from a huge
00:51:59.100 | amount of capital up front um the only thing i would add to that is uh i i think an interesting
00:52:05.740 | trend is that we work with a number of second time founders whose point of view this time around is
00:52:09.740 | like we're never going to make the company that big again i think it's not a surprise actually i
00:52:14.540 | was doing the math in my head and um this rough ratio of a million dollars of revenue per
00:52:19.340 | employee for an early stage company holds true for like a remarkable number of our companies like
00:52:23.420 | a number of our companies have more millions in revenue than they do employees and the point of
00:52:28.060 | view of a bunch of this is like we're going to keep it that way like we're we're not going to
00:52:31.020 | grow into a giant team uh ai will make us much more efficient and if you believe in the grand
00:52:35.660 | vision of much of the intellectual labor that we do should actually just be captured by some
00:52:39.980 | number of models and we can build much more long-term efficient businesses than we have been
00:52:44.060 | able to historically i do think it's an interesting question because um if we think
00:52:49.180 | there is this much opportunity like your opportunity doesn't come evenly right and so
00:52:54.460 | i'd say our investment pacing is higher than i guess mine has been traditionally and uh another
00:53:01.420 | part of our view is like okay well we want to offer and we want to offer founders a certain
00:53:05.980 | service level um and you know founders can decide if they want that or not but it is it's very time
00:53:12.140 | expensive to us we can only work with that many companies we think many more are really interesting
00:53:17.580 | and that is one of the reasons that pranav and i did this program for the ecosystem called embed
00:53:21.980 | where we can work with a larger set of companies we own less but we give them you know uh a network
00:53:27.340 | and some guidance and and it is genuinely because there are more interesting things that we think
00:53:31.420 | are going to work than we can work on in a traditional um like artisanal venture sense
00:53:36.620 | and shameless plug applications will open in january
00:53:38.940 | i think if i press a button so fast oh so fancy cool uh hi thanks for the talk it was awesome
00:53:53.500 | so i work for a series c enterprise focused company called writer and one of the interesting
00:53:58.380 | things about the multi-modality thing that we're seeing in the enterprises beyond vision we're not
00:54:03.500 | actually seeing a lot of like demand for multi-modality like we'll get asked about um audio
00:54:09.500 | and video stuff but then when we ask like sort of what's the use case it's sort of like i don't know
00:54:14.860 | and so i'm curious if if you and your um like portfolio companies are are seeing that in the
00:54:21.020 | enterprise space and if so like what use cases it seems very focused like the multi-modality stuff
00:54:25.260 | seems great for the consumer level i'm curious if you're seeing anything on the enterprise side
00:54:30.300 | i think it's a good call out um enterprises the data they have is mostly like it's text it's like
00:54:36.700 | structured data and some sql data like it's uh um i don't think your average enterprise has that much
00:54:43.020 | vision video audio data that is that interesting um but i think that will change um like
00:54:50.940 | maybe it's because i'm like lazy and disorganized but humans are very unstructured like they don't
00:54:57.260 | want they don't necessarily think in terms of like relational database schema and like hierarchical
00:55:02.700 | management of their own information uh and i i think there's a future where we take that away
00:55:07.900 | from people um and um the capture of information that you're going to use for different enterprise
00:55:13.260 | workflows um uh enables more multi-modal use if that makes sense and so like the sort of obvious
00:55:20.060 | example would be there are companies from like perhaps a half generation ago like the gongs of
00:55:24.940 | the world that captured video and found some um keywords and initial insights uh for sales reps
00:55:31.820 | but the communications within an organization the decisions made um the uh things that people
00:55:40.460 | create i think there will be much more capture especially of video but um uh making use of it
00:55:48.460 | requires companies to do that capture um so we kind of require this intermediate step i think
00:55:54.140 | there's a company in our uh and this is still a prosumer company today as well to your point of
00:55:59.100 | like you know the consumer prosumer side is ahead of the enterprise but there's a company in our
00:56:03.340 | last embed batch called highlight that kind of has this premise that like okay well you know
00:56:08.060 | we're going to use the multi-modality by using on-screen capture that's what this little like
00:56:12.460 | bubble is on screen and audio capture and i think that um i think it's a powerful idea
00:56:21.500 | uh hi
00:56:22.140 | by the way just a quick check uh peter isaac are you here
00:56:28.860 | uh hi thanks yeah there's sort of like a meme going around that the the price of intelligence
00:56:38.940 | is going to go to zero um and you can kind of see this with gpt-4o and and with gemini flash
00:56:45.100 | you can get a million tokens a day which is probably enough for a small company right like
00:56:51.260 | so i'm curious how as these large companies lose tons of money for market share like how are
00:56:58.540 | startups going to respond to this like how does that change the market um i think it is impossible
00:57:04.300 | for anything to be too cheap so i'll start with that um i would also say this company
00:57:09.020 | with this like awesome revenue chart like i'm pretty sure we paid like five to seven million
00:57:14.460 | dollars to a uh foundation model provider in this period of time right and so um uh demand is
00:57:21.900 | like if there was like a secondary theme to this talk demand is elastic in so many ways especially
00:57:26.540 | for technology and when you make things cheaper we want things to be more intelligent right um and so
00:57:32.940 | if you make hundreds of calls in order to deliver an output um then suddenly like the fact that the
00:57:39.660 | cost of calls come down 85% doesn't do you enough uh and so yes it's like an incredibly compelling
00:57:46.380 | idea of like having intelligence too cheap to meter i'm like maybe this is really old school
00:57:51.340 | of me but for the last two decades like the internet and compute and software and data
00:57:56.300 | pipeline like they it still hasn't been cheap enough actually we would do more if it was free
00:58:02.140 | so uh the other like uh physical barrier that we've run into is um when models are really large
00:58:11.420 | if you're not going to like quantize and distill and do domain specific things like it's hard to
00:58:16.300 | run you need a lot of compute just to state the very basics and even with the foundation model
00:58:21.420 | providers we are seeing people run into inference capacity issues and so um i do not know if this
00:58:27.420 | is true but uh like one way to read anthropic pricing change is there's not enough capacity
00:58:34.300 | right uh and so i think like um incredible kudos to the open source ecosystem incredible kudos to
00:58:40.860 | open ai for like staying on this drumbeat of offering cheaper and cheaper intelligence in
00:58:45.980 | every generation but uh like we have a companies that are spending a lot of money on um you know
00:58:53.980 | let's say um search and validation systems with many calls and we think that will continue
00:58:58.940 | i think you can see that as well in like the the price charts that we had before
00:59:03.580 | the o1 pricing is still absurd um it it seems like it actually is gpt-3 pricing
00:59:11.180 | right yeah but i mean volume of tokens i think um like it is really interesting that
00:59:18.940 | if you believe like the i mean the the other part of this is like if you look at the test
00:59:22.700 | time compute scaling um this is it's a log scale like uh it's easy to forget that like that's a lot
00:59:30.060 | of like historically um like as a result of overtraining a small set of companies took on
00:59:35.180 | the majority of financial burden for generating high quality models which is you just overtrain
00:59:39.580 | the shit out of your model and then it's useful for everyone else um if the customer has to pay
00:59:43.740 | this like that's a lot of money um if you want high quality generation and that means that i pay
00:59:48.940 | on the order of like thousands of attempts um that's that ends up being pretty expensive
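[Note: a quick illustrative calculation of why "thousands of attempts" per query gets expensive even at cheap per-token prices; every number below is an assumption for illustration.]

```python
# Test-time compute cost scales linearly with the number of attempts per query.
price_per_million_tokens = 2.00  # assumed blended USD price
tokens_per_attempt = 2_000       # assumed generation length per attempt
attempts_per_query = 1_000       # "on the order of thousands of attempts"

cost_per_query = attempts_per_query * tokens_per_attempt / 1_000_000 * price_per_million_tokens
print(f"~${cost_per_query:.2f} per query")  # ~$4 per query: cheap tokens, expensive queries
```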
00:59:53.660 | um question from youtube uh so hi to the youtube audience
00:59:59.100 | um so we you know you talked about price right price going down uh there's also the other
01:00:05.420 | dimension of capabilities going up and people always getting steamrolled by open ai so the
01:00:10.380 | question is what are some specific ways that you've seen companies build to prepare for better models
01:00:14.860 | like gpt5 or o2 like how do you future proof that um so i i think the like the most common refrain
01:00:22.940 | from at least opening i but i think the the model companies is you should build a company where
01:00:27.340 | you're excited when you hear that a new model is coming out not anxious um i would have like one
01:00:33.260 | edit to this which is like in the limit it seems like the majority of things that are worth building
01:00:37.100 | today are actually i don't know should you hire a sales team at all if if you think that models
01:00:40.540 | would be perfectly capable um like one framing that i've thought about on this is um you should
01:00:45.500 | decide like uh how much you believe uh foundation models will improve on like some core learning or
01:00:53.100 | intelligence capability um and then build your company imagining that on that prediction so
01:00:59.020 | the like an example here would be um like if you take like i think there's a generation of these
01:01:04.300 | like copywriting companies that uh were largely subsumed by chat gpt and the the story for many
01:01:10.300 | of them was the original usage was they understood better than other people how to get the model to
01:01:16.060 | like learn what my intent was in generating some piece of content some piece of seo content or they
01:01:19.980 | understood how to ingest information about my business and it's not hard to imagine like the
01:01:23.900 | next generation of models are just natively better at this like the context length gets longer you can
01:01:28.140 | stuff more into the context length you can crawl and like learn more about external websites like
01:01:32.860 | all that is like relatively cheap and so if the the core thesis of the company looks like we don't
01:01:37.580 | think models will be capable of doing that that feels uh likely short-sighted on the other hand
01:01:42.940 | like there are a number of delivery mechanisms that are like far out of range of what what models
01:01:48.380 | will do like sarah had a a good example of this which is like there are some businesses where the
01:01:52.940 | limiting factor is like not actually intelligence like the the limiting factor for a number of
01:01:57.100 | businesses is like access to a specific set of people or um like i don't know we work with a
01:02:01.740 | pharmacy services company where like a core question is like long term can you negotiate
01:02:05.340 | pricing contracts the core issue there is on intelligence you need some amount of scale and
01:02:08.940 | then the ability to negotiate contracts um so i think i think many businesses are not exactly
01:02:13.820 | just a function of your ability to efficiently compute some small set of things i gave this
01:02:18.780 | presentation um with pranav and i'm like oh i'm so biased it just sounds like startups are gonna
01:02:22.860 | win everything and i'm um we still there i like to play this game which is what investment decision
01:02:28.860 | do you regret from the past year it's a really fun game i'm super fun yes um but one of the one of
01:02:34.140 | the decisions that i regretted was actually um a company that operates in uh uh a space that feels
01:02:43.420 | very core to perhaps foundation model companies and to hyper scale software players where there's
01:02:50.860 | tons of ecosystem risk around the company and by the way the people are amazing the metrics were
01:02:56.060 | amazing we're just like oh they're gonna get crushed and so with everything i said i still
01:03:00.780 | like overestimated the incumbents like ability to compete and make aggressive strategic decisions
01:03:07.020 | and so um i i think it's like really hard to overstate how important it is to understand um
01:03:14.460 | somebody can steamroll you if they focused all of their effort and all their best people
01:03:21.500 | on a particular area um are they going to right the copywriting example is illustrative because
01:03:28.700 | it's just not hard to see that understanding the context of a business from its website and from a
01:03:36.460 | couple documents and by making prompting a little bit easier and adding like some buttons that
01:03:40.540 | replace some prompts or doing suggested queries like it's just not a lot of work right but there
01:03:46.460 | are things that are a lot of work like having taste in developer products and distributing
01:03:51.340 | something amazing. and so i actually think, if you ask me, we have to make
01:04:00.300 | predictions in this business, and i worry more about under-projecting capability than i worry about
01:04:05.500 | over-projecting, at least in the short term, and then i worry more about expecting too much
01:04:11.820 | from the incumbents and being too afraid of them than being not afraid enough. maybe it's just
01:04:18.940 | one investment regret, either one of you? yeah, we have one more from online. oh okay, you can do the
01:04:28.700 | online one. how do you see ai changing hardware, and in what ways? for example, do you see a new
01:04:39.100 | Apple coming out, transforming hardware to that level? not specifically the Humane situation,
01:04:45.900 | they're asking very generally how ai changes hardware. okay, i'd approach this from
01:04:55.980 | two dimensions um uh everybody every investor wants a like a new consumer hardware platform
01:05:04.700 | to exist because it's so valuable and the question is like why why should it um i can think of two
01:05:10.380 | very good reasons one is that the usage pattern that you can imagine for ai applications actually
01:05:16.460 | requires you to um like the specs you'd want are different right like what if i want to capture
01:05:22.060 | image or video 100 of the time and um that's like a determinant of my battery life of my
01:05:29.740 | sensors of how i manage my network etc what if i want to run local models all the time like maybe
01:05:35.500 | like most of the phone should be a gpu right um i don't uh i i think that the usage patterns are
01:05:42.700 | perhaps very different for the next generation of you know the the intelligence in your hand
01:05:48.460 | um i think it's a hard thing to pull off another reason that you could believe in a new hardware
01:05:54.700 | device is that the advantages of the existing consumer platforms go away right and so at the
01:06:01.260 | extreme like should you have individual applications that track a single habit like drink water today
01:06:11.740 | sarah like i don't know like i can generate that pretty easily now and like maybe the single
01:06:17.420 | function applications that live in the mobile phone ecosystems are um part of uh a more general
01:06:24.780 | intelligence and um they like that ecosystem is less important um and so i i think there are
01:06:30.700 | different arguments for this uh and like we continually look for uh opportunities to invest
01:06:37.660 | here i don't think this is exactly what you asked but i also think the um like there are
01:06:43.500 | we invested in a company this past year that is doing robotics. for many years at Greylock,
01:06:52.700 | my prior firm, i thought of robotics as an easy way to lose a lot of money over a long period of
01:06:57.180 | time, and i think that is true when you look at the outcome set for classical robotics,
01:07:03.100 | even for the companies that got to scale of distribution for an industrial robot or a single
01:07:07.740 | use consumer robot um but like it's really cool that algorithms and generalization from um the
01:07:15.180 | broader machine learning field seem to apply here as well uh and so i think being imaginative about
01:07:22.220 | what physical intelligence looks like is also something we're excited about
01:07:26.460 | yeah okay okay okay so related to agents i think everyone has been chatting about agents you're
01:07:39.900 | seeing more like agent usefulness and production but i'm more curious like at the infrastructure
01:07:44.860 | layer what agent what infrastructure primitives do you think are required for agents to actually work
01:07:50.620 | and continue to work in production um okay i uh i don't know we talked about this a little bit i'm
01:07:59.980 | not sure if our points of view in this are the same i think it is um i think it's really hard
01:08:03.580 | to tell um my suspicion is that um like if you look at the number of like true agents that work
01:08:11.420 | like the number roughly rounds to zero maybe it's like low single digits or low double digits now
01:08:17.180 | um double double yeah and uh like they're all like relatively recent i would say like beginning
01:08:21.900 | of this year um we saw like a bunch of agent framework companies and um like i uh like i
01:08:27.820 | empathize with like the the root of the question which is it's just really hard to tell what any
01:08:31.340 | of these companies need especially when like this set of companies that works really well is unclear
01:08:34.780 | and um i i think there's a lot of valid anxiety on what foundation model companies want the
01:08:39.900 | interface to be. the computer-use interface is a pretty low-level one: the Anthropic
01:08:44.700 | version is actually just "make specific clicks," and rumors of other interfaces are
01:08:49.820 | much more general, like "take actions on a specific web page" or entire browser
01:08:54.620 | environments. and so at a high level i imagine there's
01:08:59.660 | the full scope of tools; i worked on a search engine for a while, so: crawl seems
01:09:02.940 | pretty useful, live data seems pretty useful, an api that looks something like "here's a url, give
01:09:08.140 | me the set of data that's available" or "here's a url and a user login, let me take some action
01:09:13.420 | on this page" seems pretty useful. i don't know yet what the right place to operationalize this and commercially develop a product is.
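as a rough illustration of the kind of tool surface being described, here is a minimal sketch of that url-centric agent api; every function name, parameter, and return shape below is a hypothetical placeholder, not an existing product's interface.

```python
from dataclasses import dataclass
from typing import Any

@dataclass
class PageData:
    url: str
    fields: dict[str, Any]   # structured data a crawler could extract from the page

def get_available_data(url: str) -> PageData:
    """Hypothetical: crawl a URL and return the structured data exposed on it."""
    raise NotImplementedError("sketch only; backed by a crawler + extractor in practice")

def take_action(url: str, user_token: str, action: str, args: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical: act on a page (click, fill a form, submit) on behalf of a logged-in user."""
    raise NotImplementedError("sketch only; backed by a browser environment in practice")

# an agent loop would compose the two primitives, e.g.:
#   data = get_available_data("https://example.com/pricing")
#   take_action("https://example.com/checkout", user_token, "fill_form", {"plan": "pro"})
```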
01:09:18.300 | this and commercially develop a product are um if i had like uh if i was building a company here
01:09:23.980 | one thing i think is useful is to just remain agile: the core set of infrastructure
01:09:28.940 | is consistently useful, a crawler is consistently useful, and then one day you can
01:09:33.020 | figure out how to expose this better. but i empathize with the difficulty, it's
01:09:39.500 | really hard to know what works for a bunch of agent companies and my suspicion is like the
01:09:43.980 | most successful agent frameworks will come from the most successful of these agent companies that
01:09:48.220 | solve these problems in-house for themselves and then operationalize this externally like it's
01:09:52.220 | some version of: React is really useful because React was well adopted at Facebook for a
01:09:56.540 | while. i think we can say that there are missing components in the ecosystem where,
01:10:05.180 | if there was a default, lots of agent developers would use it. and so identity and
01:10:13.020 | access management is a big problem, and if you could make agent development feel more like
01:10:21.500 | traditional software development i think a lot of people would use that and be like, oh,
01:10:25.260 | it magically retries until it gets something and then it gives me data back
01:10:29.180 | about how well it's working. i think it's pretty easy to actually imagine
01:10:33.420 | the utilities in the abstract that would be useful to the ecosystem.
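a minimal sketch of the kind of "feels like normal software development" utility being described, a retry wrapper that also reports back how well the step is working; the names and the validator signature are assumptions for illustration only.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")

def run_with_retries(
    step: Callable[[], T],
    validate: Callable[[T], bool],
    max_attempts: int = 5,
    backoff_s: float = 1.0,
) -> tuple[T, dict]:
    """Retry an agent step until its output validates, returning telemetry alongside the result."""
    telemetry = {"attempts": 0, "failures": 0, "elapsed_s": 0.0}
    start = time.time()
    for attempt in range(1, max_attempts + 1):
        telemetry["attempts"] = attempt
        result = step()
        if validate(result):
            telemetry["elapsed_s"] = time.time() - start
            return result, telemetry
        telemetry["failures"] += 1
        time.sleep(backoff_s * attempt)   # simple linear backoff between attempts
    raise RuntimeError(f"step never validated after {max_attempts} attempts: {telemetry}")
```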
01:10:39.980 | and then the entire environment is fluid, right? and so, if you think about other things in infrastructure,
01:10:46.300 | will more workloads need vector indices? yes. what is the shape of company that gets to be
01:10:52.140 | durable here? we don't know yet, and we'll keep looking at it. but as Pranav said, i think we
01:10:57.260 | look to the handful of companies in our portfolio that are agents working at some scale and
01:11:05.260 | look for the patterns there, versus try to intuit it right now. my cached answer was wrong, i should
01:11:10.860 | have updated, it's a dozen, not a small number. it's been a long six months, guys.
01:11:17.260 | uh i think one last question and there's a whole bunch of online stuff you won't get to but um yeah
01:11:23.340 | mark okay um it seems like there should be more consumer companies
01:11:29.180 | like why why aren't there or is it just a matter of time
01:11:33.500 | i think simply matter of time like we uh
01:11:38.860 | we keep bringing people into embed we keep looking i i think the uh i genuinely this is not
01:11:46.060 | a um a a knock on the research community or the really young set of founders that like
01:11:52.460 | i think focused on ai companies um first but the diffusion of innovation curve that applies to
01:11:58.940 | customers i think also applies to entrepreneurs um researchers saw the capability first and they're
01:12:06.300 | like like we should do something with this this is going to be amazing and it's like that will
01:12:10.540 | continue to happen like our portfolio is heavily overrepresented with with people from the research
01:12:14.860 | community pushing the pushing the state of the art with creative technical ideas um uh i think young
01:12:21.180 | very young people also were quite early to ai because they're like oh of course like this
01:12:25.900 | makes sense, i've never seen other technology, like ChatGPT all the way. and their opportunity
01:12:31.660 | cost is lower than like you're the best product person at an amazing product organization like
01:12:37.740 | you have to leave your job to start a new company uh and it's been a really long two years like i
01:12:43.900 | feel like that's just started to happen, where some of the talent that has that is showing up. and, you know, maybe
01:12:51.660 | it's just like the next Zuck, some dropout that figures out the
01:12:55.900 | pattern of social interaction and is really ai native about this stuff. i also think there's a
01:13:00.460 | chance that um some of the people who have built intuition for um consumer adoption and consumer
01:13:08.780 | interfaces they're just taking a little bit to also build intuition for ai products and now
01:13:13.180 | they're showing up and starting companies and experimenting and so um we have a lot of confidence
01:13:18.540 | like it is going to happen over the next few years and just a matter of time okay i think we're i
01:13:24.860 | think we're out of time, i'm just trying to defer to Sean here, but thank you so much. you know,
01:13:29.420 | please call us. yeah, i'm sure Sarah and Pranav will be sticking around, so you can sort
01:13:38.140 | of ask some questions outside or whatever you want to do networking-wise, but we're going
01:13:42.860 | to move on in our schedule. we have a ton of papers that we want to cover, this is basically
01:13:47.740 | paper club live, and i think Isaac and Peter, you guys are up. so when people
01:14:00.380 | signed up, we actually asked people what you wanted to cover, and the top votes were vision,
01:14:06.300 | open models, post-transformers, and all the other stuff that's coming later. we also added reasoning
01:14:12.780 | because i didn't even have the option there and i was like, what am i doing doing a sort of paper
01:14:17.980 | review session this year without talking about reasoning and, you know, test-time compute?
01:14:24.220 | so but first we're gonna have vision. Roboflow has been really great friends with
01:14:29.340 | Latent Space, we've had Joseph Nelson on twice, with Facebook, talking about all the
01:14:35.820 | innovations in vision, but it's not only about segmentation, there's a lot of foundation
01:14:41.100 | model progress that happened this year in both the large space and the very small space. so we're
01:14:45.820 | also very proud to have Vik to update us on Moondream, which he's been hacking away on
01:14:51.820 | for the past year, yeah, a very very short amount of time. are you guys ready, are you plugged in?
01:14:57.020 | sarah, pranav, do you guys want to take questions, like
01:15:05.580 | i don't know if people want to like there are people that want to talk to you
01:15:07.980 | what's what's your availability are you
01:15:11.260 | okay good
01:15:18.380 | awesome yeah just plug in on yeah the white thing exactly do you have sound the stuff
01:15:25.900 | no sound listen do you have any audio things all right cool cool
01:15:29.500 | stay close to the mic uh hi hey are they mic'd up nice yeah
01:15:36.060 | oh okay
01:15:44.700 | uh man i was hoping to use speaker notes that's not gonna work
01:15:51.580 | um you could do like a mirroring thing yeah yeah uh so there's settings display yeah yeah
01:16:04.380 | yep there you go thank you sweet
01:16:07.340 | are you on zoom
01:16:11.260 | okay um
01:16:26.300 | i'm sending an email
01:16:40.060 | super pumped to be here this is so cool
01:16:53.420 | email yeah both of us relied on your vision capabilities um yeah so this is for the screen
01:17:01.020 | share is for the live stream and also the editing or recording that we're doing later
01:17:04.780 | okay so you just share your screen and mute yourself um we got we got the audio you just
01:17:12.140 | want to capture the screen video and share the um share the green share share the screen that
01:17:17.980 | you're actually want people to see yeah that one the the the one with with the image that one but
01:17:25.900 | this is the speaker view yeah you don't want to share the speaker yeah so so you want to share
01:17:30.460 | this out too that's right double click on it you're good okay all right all right figuring
01:17:37.900 | things out like yeah now where'd the presentation go uh you can you can do the triple yeah triple
01:17:49.260 | slide there you go let's pick pick the thing and it is it up there are you just struggling no
01:17:54.140 | let's uh kill this how do i exit out of this apologies technical difficulties
01:18:02.140 | nice okay let's okay we're going to drag this up yeah perfect
01:18:22.860 | see uh
01:18:30.300 | we're just gonna make this full screen and call it good
01:18:32.460 | okay, hi, we're Isaac and Peter from Roboflow and we're going to talk about the best
01:18:46.700 | papers of 2024 in computer vision. so for us we define best as what made the biggest shifts
01:18:56.780 | in the space, and to determine that we looked at what are some major trends that happened
01:19:03.740 | and what papers most contributed to those trends. so i'm going to talk about a couple
01:19:06.620 | trends, Peter's going to talk about a trend, and then we're going to hand it off to Moondream.
01:19:10.860 | so the trends that i'm interested in talking about are a major transition from models that
01:19:19.260 | run on a per-image basis to models that run using the same basic ideas on video, and then also how
01:19:26.700 | DETRs are starting to take over the real-time object detection scene from
01:19:33.420 | the YOLOs, which have been dominant for years. so as a highlight we're going to talk about
01:19:40.540 | Sora, which from my perspective is the biggest paper of 2024 even though it came out in february.
01:19:45.980 | Sora is just a blog post, so i'm going to fill it in
01:19:55.900 | with details from replication efforts including Open-Sora and related work such as Stable
01:20:01.020 | Video Diffusion, and then we're also going to talk about SAM2, which applies the SAM strategy to
01:20:08.780 | video, and then the improvements in 2024 to DETRs that are making them a Pareto
01:20:15.580 | improvement over YOLO-based models. so to start this off we're going to talk about the state
01:20:23.260 | of the art of video generation at the end of 2023: MAGVIT. MAGVIT is a discrete token video
01:20:32.940 | tokenizer akin to VQGAN but applied to video sequences, and it actually outperforms state
01:20:40.940 | of the art handcrafted video compression frameworks in terms of the bit rate
01:20:48.380 | versus human preference for quality, and video is generated by autoregressing on these discrete
01:20:53.020 | tokens. it generates some pretty nice stuff, but up to like five seconds in length and, you know, not
01:20:59.100 | super detailed. and then suddenly a few months later we have this, which when i saw it was totally
01:21:06.700 | mind-blowing to me: 1080p, a whole minute long, we've got light reflecting in puddles,
01:21:13.020 | reminds me of those RTX demonstrations for next-generation video games such as Cyberpunk
01:21:20.860 | but with better graphics. you can see some issues in the background if you look closely, but
01:21:26.300 | as with a lot of these models the issues tend to be things that people aren't
01:21:31.660 | going to pay attention to unless they're looking for them, in the same way that six fingers on a
01:21:35.340 | hand, you're not going to notice is a giveaway unless you're looking for it. so yeah, as we
01:21:43.500 | said, Sora does not have a paper, so we're going to be filling it in with context from the rest of
01:21:48.140 | the computer vision scene attempting to replicate these efforts. so the first step:
01:21:56.300 | you have an llm caption a huge amount of videos. this is a trick that they introduced in
01:22:04.220 | DALL-E 3, where they train an image captioning model to just generate very high quality captions
01:22:10.940 | for a huge corpus and then train a diffusion model on that. Sora and the replication
01:22:18.700 | efforts also show a bunch of other steps that are necessary for good video generation, including
01:22:24.860 | filtering by aesthetic score and filtering by making sure the videos have enough motion, so
01:22:30.060 | the generator is not learning to just generate static frames.
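as a rough sketch of that curation step, here is what the filtering described above could look like in code; the scoring functions, thresholds, and the motion proxy are placeholders for illustration, not the actual Sora or Open-Sora pipeline.

```python
# hypothetical curation pass: caption every clip, then filter by aesthetics and motion
def curate_videos(videos, captioner, aesthetic_model, min_aesthetic=5.0, min_motion=0.02):
    """Keep only videos that look good and actually move, and attach a synthetic caption."""
    curated = []
    for video in videos:                        # video: array of frames, shape [T, H, W, C]
        caption = captioner(video)              # dense synthetic caption (the DALL-E 3 style trick)
        aesthetic = aesthetic_model(video[0])   # score a representative frame
        # crude motion proxy: mean absolute difference between consecutive frames
        motion = float(abs(video[1:] - video[:-1]).mean())
        if aesthetic >= min_aesthetic and motion >= min_motion:
            curated.append({"video": video, "caption": caption})
    return curated
```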
01:22:35.580 | so then we encode our video into a series of space-time latents. once again, this was very
01:22:45.100 | sparse in details, so the replication-related works, Open-Sora, actually use a MAGVIT-v2
01:22:52.540 | itself to do this but swap out the discretization step with a classic VAE
01:23:00.140 | autoencoder framework, and they show that there's a lot of benefit from getting the temporal
01:23:07.740 | compression, which makes a lot of sense, as sequential frames in videos have mostly
01:23:13.500 | redundant information. so by compressing in the temporal space you allow the
01:23:21.500 | latent to hold a lot more semantic information while avoiding that duplication.
01:23:28.300 | so we've got our space-time latents, possibly via some 3D VAE, presumably a MAGVIT-v2,
01:23:39.020 | and then you throw it into a diffusion transformer. i think it's personally
01:23:47.740 | interesting to note that Open-Sora is using a MAGVIT-v2, which originally used an autoregressive
01:23:53.980 | transformer decoder to model the latent space but is now using a diffusion
01:24:01.740 | transformer, so it's still a transformer happening, just the question is: is it parameterizing
01:24:06.060 | the stochastic differential equation, or is it parameterizing a conditional distribution
01:24:11.100 | via autoregression? it's also worth noting that most diffusion models today,
01:24:21.100 | the very high performance ones, are switching away from the classic DDPM
01:24:24.380 | denoising diffusion probabilistic modeling framework to rectified flows. rectified
01:24:31.260 | flows have a very interesting property that as they converge they actually get closer to
01:24:36.940 | being able to be sampled with a single step, which means that in practice you can actually
01:24:42.460 | generate high quality samples much faster. a major problem of DDPM and related models for
01:24:50.380 | the past four years is just that they require many, many steps to generate high quality samples.
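to make the rectified-flow idea concrete, here is the standard formulation from the literature (a sketch, not taken from the Sora blog post): you train a velocity field along straight-line interpolations between data and noise, and as the learned flow straightens, one or a few Euler steps suffice at sampling time.

```latex
% straight-line interpolation between data x_0 and noise x_1 \sim \mathcal{N}(0, I)
x_t = (1 - t)\, x_0 + t\, x_1, \qquad t \in [0, 1]

% train v_\theta to predict the constant velocity along that line
\mathcal{L}(\theta) = \mathbb{E}_{x_0,\, x_1,\, t}\,\big\| v_\theta(x_t, t) - (x_1 - x_0) \big\|^2

% sampling: integrate dx/dt = v_\theta(x, t) from t = 1 down to t = 0;
% the straighter the flow, the fewer Euler steps are needed (ideally one).
```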
01:24:56.380 | so and naturally the third step is throwing lots of compute at the problem
01:25:02.540 | so uh i didn't i never figured out how to manage to get this video to loop
01:25:08.620 | but we see very little compute, medium compute, lots of compute. this is so interesting because
01:25:17.500 | the original diffusion transformer paper from Facebook actually showed that in fact the specific
01:25:22.460 | hyperparameters of the transformer didn't really matter that much, what mattered was that you were
01:25:27.660 | just increasing the amount of compute that the model had. so i love how in the, you know, once again,
01:25:35.340 | little blog post they don't even talk about the specific hyperparameters, they say we're using
01:25:38.540 | a diffusion transformer and we're just throwing more compute at it, and this is what happens.
01:25:41.900 | Open-Sora shows similar results. the primary issue i think here is that
01:25:49.660 | no one else has a 32x compute budget, so we end up in the middle of the
01:25:58.620 | domain in most of the related work, which is still super cool, it's just a little
01:26:05.260 | disappointing considering the context. so i think this is a beautiful extension of the
01:26:11.660 | framework that was introduced in '22 and '23 for these very high quality per-image generation models,
01:26:19.900 | and then extending that to videos. it's awesome, and it's GA as of monday, except no one can seem
01:26:27.020 | to get access to it because they keep shutting down the login. the next paper i wanted
01:26:33.900 | to talk about is SAM. so we at Roboflow allow users to label data and train models on that data. SAM
01:26:41.980 | for us has saved our users 75 years of labeling time, and we are, to the best of my knowledge, the
01:26:48.620 | largest SAM API that exists. SAM also allows us to have our users train just pure
01:26:57.660 | bounding box regression models and use those to generate high quality masks, which has the great
01:27:05.660 | side effect of requiring less training data to have a meaningful convergence. most people are
01:27:11.020 | data limited in the real world, so anything that requires less data to get to a useful thing is
01:27:15.100 | super useful. many of our users actually run their per-frame object detectors on
01:27:22.860 | every frame in a video, and so SAM2 falls into this category of
01:27:31.900 | taking something that really, really works and applying
01:27:36.620 | it to video, which has the wonderful benefit of being plug and play with many of our
01:27:43.420 | users' use cases. we're still building out a sufficiently mature pipeline to take advantage
01:27:49.980 | of that, but it's in the works. so here we've got a great example: we can click on cells
01:27:58.780 | and then follow them, you even notice the cell goes away and comes back and we can still
01:28:02.940 | keep track of it, which is very challenging for existing object trackers. high level
01:28:15.580 | overview of how SAM2 works: there's a simple pipeline here where we
01:28:24.460 | can provide some type of prompt and it fills out the rest of the likely masks for that object
01:28:33.260 | throughout the rest of the video. so here we're giving a bounding box in the first frame, a set
01:28:37.500 | of positive/negative points, or even just a simple mask. i'm gonna assume people are somewhat
01:28:45.580 | familiar with SAM, so i'm gonna just give a high level overview of how SAM works. you have an image
01:28:51.820 | encoder that runs on every frame. SAM2 can be used on a single image, in which case the only
01:28:58.780 | difference between SAM2 and SAM is the image encoder: SAM used a standard ViT, SAM2
01:29:08.940 | replaced that with a Hiera hierarchical encoder which gets approximately the same
01:29:15.580 | results but leads to six times faster inference, which is excellent, especially considering how
01:29:22.460 | a trend of '23 was replacing the ViT with more efficient backbones. in the case where you're
01:29:31.180 | doing video segmentation, the difference is that you actually create a memory bank and you
01:29:35.820 | cross-attend the features from the image encoder with the memory bank. so the feature set
01:29:44.780 | that is created is essentially, well, i'll go more into it in a couple slides, but we take the
01:29:52.860 | features from the past couple frames plus a set of object pointers and the set of prompts and
01:30:01.500 | use that to generate our new masks. we then fuse the new masks for this frame
01:30:07.660 | with the image features and add that to the memory bank. i'll say more in a minute.
01:30:16.620 | just like SAM, SAM2 actually uses a data engine to create its data set, in that
01:30:23.180 | they assembled a huge amount of reference data, used people to label some of it
01:30:28.780 | and train the model, used the model to label more of it and asked people to refine the predictions of
01:30:35.820 | the model, and then ultimately the data set is just created from the final output of the model
01:30:41.660 | on the reference data. this paradigm is so interesting to me because it
01:30:47.100 | unifies a model and a data set in a way that is very unique, it seems unlikely that another
01:30:55.180 | model could come in and have such a tight relationship with the training set. yeah,
01:31:02.460 | so, brief overview of how the memory bank works. the paper did not have a great visual, so i'm just
01:31:11.020 | going to fill in a bit more. so we take the last couple frames from our video and
01:31:20.940 | attend to that, along with the set of prompts
01:31:29.500 | that we provided, they could come from the future, they could come from anywhere in the video,
01:31:34.780 | as well as reference object pointers saying, by the way, here's what we've found so far.
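here is a minimal sketch of that memory-conditioned step as i understand it from the paper; tensor shapes, module names, and the queue length are illustrative, not the actual SAM2 code.

```python
import torch

def segment_frame(frame_feats, memory_bank, prompts, memory_attention, mask_decoder, memory_encoder):
    """One SAM2-style video step: condition current-frame features on a small memory bank.

    frame_feats: [HW, C] features from the per-frame image encoder
    memory_bank: list of dicts with 'feats' (fused features of recent frames) and 'obj_ptr' tokens
    prompts:     encoded clicks / boxes / masks (may reference any frame in the video)
    """
    # cross-attend current features to the recent frames' memories, object pointers, and prompts
    memory_tokens = torch.cat(
        [m["feats"] for m in memory_bank] + [m["obj_ptr"] for m in memory_bank] + [prompts], dim=0
    )
    conditioned = memory_attention(query=frame_feats, keys=memory_tokens)   # [HW, C]

    # decode a mask (and an object pointer token) for this frame from the conditioned features
    mask, obj_ptr = mask_decoder(conditioned)

    # fuse the new mask with the image features and push it into the bounded memory bank
    memory_bank.append({"feats": memory_encoder(frame_feats, mask), "obj_ptr": obj_ptr})
    if len(memory_bank) > 7:        # keep only the last few frames so inference stays real-time
        memory_bank.pop(0)
    return mask, memory_bank
```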
01:31:40.620 | attending to the last few frames has the interesting benefit of
01:31:44.460 | allowing it to model complex object motion, and
01:31:50.300 | by limiting the amount of frames that you attend to you manage to keep the model running in
01:31:58.300 | real time. this is such an interesting topic for me, because one would assume that attending to
01:32:04.620 | all of the frames is super essential, having some type of summarization of all the frames
01:32:08.540 | is super essential for high performance, but we see in their later ablation that that actually is
01:32:14.700 | not the case. so here, just to make sure that there is some benchmarking happening, we just compare to
01:32:22.380 | some of the stuff that came out prior, and indeed the SAM2 strategy does improve on the state of the
01:32:29.980 | art. this ablation deep in their appendices was super interesting to me, we see in section
01:32:40.140 | c the number of memories. one would assume that increasing the count of memories would
01:32:45.580 | meaningfully increase performance, and we see that it has some impact but not the type that
01:32:50.780 | you'd expect, and that it meaningfully decreases speed, which justifies in my mind just having this
01:32:56.540 | FIFO queue of memories. although in the future i'm super interested to see a more dedicated
01:33:05.980 | summarization of all of the past video, not just a stacking of the last frames.
01:33:13.660 | so that's another extension of beautiful per-frame work into the video domain. the next trend i'm
01:33:25.180 | interested in talking about is this: at Roboflow we're super interested in training
01:33:31.260 | real-time object detectors, those are our bread and butter, and so we're doing a lot to keep track of
01:33:35.660 | what is actually happening in that space. we are finally starting to see something change.
01:33:42.940 | so for years YOLOs have been the dominant way of doing real-time object detection, and we can see
01:33:50.300 | here that they've essentially stagnated, the performance between v10 and v11 is not meaningfully
01:33:56.700 | different, at least, you know, in this type of high-level chart, and even from the last couple
01:34:03.100 | series there's not a major change. so YOLOs hit a plateau; DETRs have not. so we can look here
01:34:14.700 | and see the YOLO series has this plateau, and then these are RT-DETR, LW-DETR, and D-FINE, which have
01:34:22.300 | meaningfully changed that plateau, so that in fact the best D-FINE models are +4.6 AP on COCO at
01:34:29.580 | the same latency. so, three major steps to accomplish this. the first, RT-DETR, which is technically
01:34:38.460 | a 2023 paper preprint but published officially in '24, so i'm going to include that, i hope that's
01:34:44.460 | okay, showed that we could actually match or out-speed YOLOs.
01:34:50.940 | then LW-DETR showed that pre-training is hugely effective on DETRs and much less so
01:34:58.060 | on YOLOs, and then D-FINE added the types of bells and whistles that we expect from
01:35:02.060 | this arena. so the major improvement that RT-DETR showed was taking
01:35:11.820 | the multi-scale features that DETRs typically pass into their encoder and decoupling them into
01:35:17.820 | a much more efficient transformer encoder. the transformer is of course quadratic complexity,
01:35:25.180 | so decreasing the amount of stuff that you pass in at once is super helpful for increasing your
01:35:31.580 | runtime, or increasing your throughput. so that change basically brought us up to YOLO speed,
01:35:38.700 | and then they do a hardcore analysis on benchmarking YOLOs, including the NMS step:
01:35:46.620 | once you include NMS in the latency calculation, you see that in fact these
01:35:52.380 | DETRs are outperforming, at least at the time, the YOLOs that existed.
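the benchmarking point is easy to miss, so here is a small sketch of what an end-to-end latency measurement that includes NMS could look like; the model call is a placeholder (torchvision's `nms` is used purely for illustration), not RT-DETR's actual benchmark harness.

```python
import time
import torch
from torchvision.ops import nms

@torch.no_grad()
def end_to_end_latency_ms(model, images, iou_thresh=0.7, needs_nms=True, warmup=10, iters=100):
    """Measure latency including post-processing, since NMS is part of what users actually wait for.

    Assumes a CUDA device; `model` is expected to return ([N, 4] boxes, [N] scores).
    """
    for _ in range(warmup):
        model(images)
    torch.cuda.synchronize()
    start = time.perf_counter()
    for _ in range(iters):
        boxes, scores = model(images)
        if needs_nms:                          # YOLO-style heads need NMS; DETR-style heads do not
            keep = nms(boxes, scores, iou_thresh)
            boxes, scores = boxes[keep], scores[keep]
    torch.cuda.synchronize()
    return (time.perf_counter() - start) / iters * 1000
```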
01:35:59.420 | then LW-DETR goes in and suggests that in fact the huge boost here is from
01:36:09.980 | pre-training. so this is the D-FINE line, and this is the D-FINE line without pre-training:
01:36:16.860 | it's within range, it's still an improvement over the YOLOs, but the really huge boost comes
01:36:21.980 | from the benefit of pre-training. when YOLOX came out in 2021, they showed that they got much
01:36:29.820 | better results by having a much, much longer training time, but they found that when they
01:36:36.780 | did that they actually did not benefit from pre-training. so you see in this graph from LW-DETR
01:36:43.180 | that in fact YOLOs do have a real benefit from pre-training, but it goes away as we increase the
01:36:49.180 | training time, whereas the DETRs converge much faster. LW-DETR trains for only 50 epochs,
01:36:55.420 | RT-DETR 60 epochs. so one could assume that in fact the entire extra gain from pre-training is that
01:37:03.820 | you're not destroying your original weights by relying on this long training cycle.
01:37:07.820 | and then LW-DETR also shows superior performance on our favorite data set, Roboflow 100,
01:37:17.420 | which means that they do better on the real world, not just on COCO.
01:37:20.380 | then D-FINE throws all the bells and whistles at it. YOLO models tend to have a lot of
01:37:29.340 | very specific, complicated loss functions; D-FINE brings that into the DETR
01:37:36.300 | world and shows consistent improvement on a variety of DETR-based frameworks.
01:37:41.100 | so bring these all together and we see that suddenly we have almost 60 AP on COCO while
01:37:47.900 | running in like 10 milliseconds. huge, huge stuff. so we're spending a lot of time trying to build
01:37:56.220 | models that work better with less data, and DETRs are clearly becoming a promising step in that
01:38:01.660 | direction. what we're interested in seeing from the DETRs in this trend next is: Co-DETR and
01:38:11.660 | the models that are currently sitting on the top of the leaderboard for large-scale
01:38:17.820 | inference scale really well as you switch out the backbone. we're very interested in seeing, and
01:38:25.020 | having people publish a paper, potentially us, on what happens if you take these real-time ones
01:38:29.980 | and then throw a Swin-G at it, like do we have a Pareto curve that extends from the real-time
01:38:34.780 | domain all the way up to the super, super slow but high-performance domain. we also want
01:38:41.260 | to see people benchmarking on RF100 more, because that type of data is what's relevant for most
01:38:46.860 | users, and we want to see more pre-training, because pre-training works now. it's super cool.
01:38:57.500 | all right, so yeah, in that theme, one of the big things that we're focusing on
01:39:03.180 | is how do we get more out of our pre-trained models, and one of the lenses to look at this
01:39:08.540 | is through sort of this new requirement for fine-grained visual details in the
01:39:14.860 | representations that are extracted from your foundation model. so that's sort of a hook for this.
01:39:19.820 | oh yeah, this is just a list of all the papers that i'm going to mention, i just want to
01:39:24.940 | make sure i cite the actual papers so you can find them later. yeah, so sort of the big hook here is
01:39:30.620 | that i make the claim that llms can't see. if you go to Claude or ChatGPT, you ask
01:39:38.860 | it to look at this watch and tell me what time it is, it fails, right? and so you could say,
01:39:45.820 | maybe, like, this is a very classic test of an llm, but you could say, okay,
01:39:53.260 | maybe this image is too zoomed out and it'll do better if we increase the
01:39:58.700 | resolution and it has an easier time finding these fine-grained features, like where the watch
01:40:02.780 | hands are pointing. no dice. and you can say, okay, well maybe the model just doesn't know how to
01:40:07.660 | tell time from knowing the position of the hands, but if you actually prompt it textually it's very
01:40:12.220 | easy for it to tell the time. so this to me is proof that these llms literally cannot see the
01:40:17.180 | position of the watch hands, they can't see those details. so the question is sort of why, and for
01:40:22.380 | you Anthropic heads out there, Claude fails too. so my first pick for best paper of 2024
01:40:30.620 | in vision is this MMVP paper, which tries to investigate why llms do not have the ability
01:40:35.900 | to see fine-grained details. and so, for instance, it comes up with a lot of images like this,
01:40:40.860 | where you ask it a question that seems very visually apparent to us, like which way is the
01:40:44.540 | school bus facing, and it gets it wrong, and then of course it makes up details to support its wrong
01:40:48.620 | claim. and so the process by which it finds these images is sort of contained in its hypothesis for
01:40:55.740 | why it can't see these details: it hypothesizes that models that have been initialized with
01:41:03.260 | CLIP as their vision encoder don't have fine-grained details in the features extracted
01:41:09.180 | using CLIP, because CLIP sort of doesn't need to find these fine-grained details to do its job
01:41:15.100 | correctly, which is just to match captions and images, right? and sort of at a high level, even
01:41:21.340 | if ChatGPT wasn't initialized with CLIP, and the vision encoder
01:41:26.780 | wasn't trained contrastively at all, still, in order to do its job of captioning the image, it could
01:41:32.140 | do a pretty good job without actually finding the exact position of all the objects and visual
01:41:37.020 | features in the image, right? so this paper finds a set of difficult images for these types of models,
01:41:44.540 | and the way it does it is it looks for embeddings that are similar in CLIP space but far
01:41:48.540 | in DINOv2 space. so DINOv2 is a foundation model that was trained self-supervised purely
01:41:55.020 | on image data, and it kind of uses some complex student-teacher framework, but essentially
01:42:01.340 | it patches out certain areas of the image, or crops certain areas of
01:42:06.220 | the image, and tries to make sure that those have consistent representations, which is a way for it
01:42:09.740 | to learn very fine-grained visual features. and so if you take things that are very close in CLIP
01:42:15.660 | space and very far in DINOv2 space, you get basically pairs of images that
01:42:22.620 | are hard for ChatGPT and other big language models to distinguish.
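a minimal sketch of that mining step as described, score image pairs by high CLIP similarity and low DINOv2 similarity; the encoder calls and thresholds are illustrative rather than the paper's exact procedure.

```python
import torch
import torch.nn.functional as F

def find_clip_blind_pairs(images, clip_encoder, dinov2_encoder, clip_min=0.95, dino_max=0.6):
    """Return index pairs that CLIP thinks are near-identical but DINOv2 thinks are different."""
    clip_emb = F.normalize(clip_encoder(images), dim=-1)     # [N, D1]
    dino_emb = F.normalize(dinov2_encoder(images), dim=-1)   # [N, D2]
    clip_sim = clip_emb @ clip_emb.T                         # cosine similarity matrices
    dino_sim = dino_emb @ dino_emb.T
    pairs = []
    n = clip_emb.shape[0]
    for i in range(n):
        for j in range(i + 1, n):
            if clip_sim[i, j] > clip_min and dino_sim[i, j] < dino_max:
                pairs.append((i, j))                          # a "CLIP-blind" pair
    return pairs
```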
01:42:27.900 | so if you then ask it questions about this image, as you can see from this chart, it's going to answer the same way
01:42:33.420 | for both images, right? because from the perspective of the vision encoder they're the same
01:42:38.780 | image. and so if you ask a question like how many eyes does this animal have, it answers the same for
01:42:43.340 | both, and all these other models including LLaVA do the same thing, right? and so this is
01:42:49.260 | the benchmark that they create, which is finding CLIP-blind pairs, which is pairs
01:42:54.860 | of images that are similar in CLIP space, and creating a data set of multiple choice questions
01:42:59.820 | based off of those. and so how do these models do? well, really bad. ChatGPT
01:43:08.620 | and Gemini do a little bit better than random guessing, but like half of the performance of
01:43:12.460 | humans, who find these problems to be very easy. LLaVA is interestingly extremely negatively
01:43:19.740 | correlated with this data set, it does much, much worse than random guessing, which means
01:43:24.780 | that this process has done a very good job of identifying hard images for LLaVA specifically,
01:43:30.780 | and that's because LLaVA is basically not trained for very long and is initialized from CLIP, and so
01:43:37.020 | you would expect it to do poorly on this data set. so one of the proposed solutions that this paper
01:43:44.140 | attempts is basically saying, okay, well, if CLIP features aren't enough, what if we train
01:43:48.380 | the visual encoder of the language model also on DINO features? and so it proposes two different
01:43:54.540 | ways of doing this: one is additively, which is basically interpolating between the two features,
01:44:00.460 | and then one is interleaving, which is just kind of training on the combination of
01:44:05.340 | both features. so there's this really interesting trend when you do the additive mixture of features,
01:44:10.620 | where zero is all CLIP features and one is all DINOv2 features. so i think it's
01:44:21.100 | helpful to look at the rightmost chart first, which is: as you increase the proportion of DINOv2 features,
01:44:25.500 | your model does worse and worse on the actual language modeling task, and that's
01:44:29.420 | because DINOv2 features were trained completely in a self-supervised manner and completely in
01:44:34.620 | image space, it knows nothing about text, these features aren't really compatible with these text
01:44:38.940 | models, and so you can train an adapter all you want, but it seems that it's in such an alien
01:44:43.580 | language that it's a very hard optimization for these models to solve. and so that kind
01:44:49.420 | of supports what's happening on the left, which is that, yeah, it gets better at answering these
01:44:55.260 | questions as you include more DINOv2 features, up to a point, but then when you oversaturate, it
01:45:01.500 | completely loses its ability to answer language and do language tasks. so
01:45:10.140 | you can also see with the interleaving, they essentially double the number of tokens that are
01:45:14.860 | going into these models and just train on both, and it still doesn't really solve the MMVP task,
01:45:20.620 | it gets LLaVA 1.5 above random guessing by a little bit, but still not close to ChatGPT or
01:45:28.460 | human performance, obviously. so clearly this proposed solution of just using DINOv2
01:45:34.460 | features directly isn't going to work, and basically what that means is that as a
01:45:39.660 | vision foundation model, DINOv2 is going to be insufficient for language tasks, right?
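for what the additive "mixture of features" ablation above amounts to, here is a tiny sketch; the adapter shape and the single alpha knob are illustrative, not the paper's implementation.

```python
import torch
import torch.nn as nn

class MixedVisionAdapter(nn.Module):
    """Project an alpha-blend of CLIP and DINOv2 patch features into the LLM's embedding space."""

    def __init__(self, clip_dim: int, dino_dim: int, llm_dim: int, alpha: float):
        super().__init__()
        self.alpha = alpha                      # 0.0 = all CLIP features, 1.0 = all DINOv2 features
        self.proj_clip = nn.Linear(clip_dim, llm_dim)
        self.proj_dino = nn.Linear(dino_dim, llm_dim)

    def forward(self, clip_feats: torch.Tensor, dino_feats: torch.Tensor) -> torch.Tensor:
        # additive mixture: interpolate the two projected feature streams
        return (1 - self.alpha) * self.proj_clip(clip_feats) + self.alpha * self.proj_dino(dino_feats)
```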
01:45:45.340 | so my next pick for best paper of 2024 would be Florence-2, which tries to solve this problem
01:45:52.700 | by incorporating not only this dimension of spatial hierarchy, which is to say pixel-level
01:45:58.940 | understanding, but also making sure to include what they call semantic granularity.
01:46:05.020 | the goal is basically to have features that are sufficient for finding objects in the image, so
01:46:10.860 | they have enough pixel information, but also can be talked about and can be reasoned
01:46:16.780 | about, and that's on the semantic granularity axis. so here's an example of basically three
01:46:25.500 | different paradigms of labeling that they do. so they create a big data set. one is text,
01:46:32.060 | which is just captioning, and you would expect a model that's trained only on captioning to
01:46:35.900 | have similar performance to ChatGPT: not have spatial hierarchy, not have
01:46:41.660 | features that are meaningful at the pixel level. and so they add another type, which is
01:46:46.220 | region-text pairs, which is essentially either classifying a region, or
01:46:51.900 | doing object detection, or doing instance segmentation on that region, or captioning that
01:46:59.500 | region. and then they have text-phrase-region annotations, which is essentially a triple,
01:47:05.580 | and basically not only do you have a region that you've described, you also find its
01:47:10.860 | place in a descriptive paragraph about the image, which is basically trying to introduce even
01:47:16.700 | more semantic understanding of these regions. and so, for instance, if you're saying a woman
01:47:21.260 | riding on the road, right, you have to know what a woman is and what the road is and that she's on
01:47:25.340 | top of it, and that's basically composing a bunch of objects in this visual space but also
01:47:30.300 | thinking about it semantically, right? and so the way that they do this is they
01:47:36.860 | just dump features from a vision encoder straight into an encoder-decoder transformer
01:47:44.860 | and then they train a bunch of different tasks like object detection and so on as a language
01:47:52.540 | task, and i think that's one of the big things that we saw in 2024 is these vision
01:47:59.260 | language models operating on pixel space linguistically. so they introduce a bunch of
01:48:04.380 | new tokens to point to locations in pixel space. so how does it actually do?
01:48:13.180 | we can see, if you look at the graph on the right, which is using the DINO
01:48:20.300 | framework, your pre-trained Florence-2 models transfer very, very well, they get 60
01:48:28.540 | percent mAP on COCO, which is approaching state-of-the-art,
01:48:34.540 | and they train much more efficiently, so they converge a lot faster,
01:48:41.020 | and both of these things are pointing to the fact that they're actually leveraging
01:48:44.940 | their pre-trained weights effectively. so where is it falling short? so these models, i forgot to
01:48:52.380 | mention, Florence is a 0.2 billion and a 0.7 billion parameter count, so they're very, very
01:48:57.820 | small in terms of being a language model, and i think with this framework you can see saturation.
01:49:04.460 | so what this graph is showing is that if you train a Florence-2 model purely on the image-
01:49:10.460 | level and region-level annotations, and not including the pixel-level annotations like
01:49:14.860 | segmentation, it actually performs better as an object detector, and what that means is that
01:49:21.660 | it's not able to actually learn all the visual tasks that it's trying to learn because it doesn't
01:49:26.940 | have enough capacity. so i'd like to see this paper explore larger model sizes, which brings us
01:49:31.660 | to our next big paper of 2024, or two papers. so PaliGemma came out earlier this year, PaliGemma 2 was
01:49:39.580 | released i think like a week or two ago. oh, i forgot to mention, you can actually
01:49:45.340 | label text data sets on Roboflow and you can train a Florence-2 model, and you can actually
01:49:49.980 | train a PaliGemma 2 model on Roboflow, which we got into the platform within like 14 hours of release,
01:49:54.780 | which i was really excited about. so anyway, PaliGemma is essentially doing
01:50:00.620 | the same thing, but instead of doing an encoder-decoder it just dumps everything into a decoder-
01:50:04.460 | only transformer model, but it also introduced the concept of location tokens to point to
01:50:08.940 | objects in pixel space. so PaliGemma uses Gemma as the language encoder, and it uses
01:50:15.820 | Gemma 2B; PaliGemma 2 introduces using multiple different sizes of language encoders. so the
01:50:23.260 | way that they sort of get around having to do encoder-decoder is they use the concept of prefix
01:50:28.460 | loss, which basically means that when it's generating tokens autoregressively,
01:50:35.660 | all those tokens in the prefix, which is the image that it's looking at and a
01:50:40.540 | description of the task that it's trying to do, they're attending to each other fully, full attention,
01:50:45.420 | which means that, you know, it's easier for the prefix to
01:50:52.060 | color the output of the suffix and also to just find features easily.
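here is a small sketch of that prefix attention pattern, full (bidirectional) attention over the image-plus-task prefix and causal attention over the generated suffix; the mask construction is a generic prefix-LM illustration rather than PaliGemma's actual code.

```python
import torch

def prefix_lm_mask(prefix_len: int, total_len: int) -> torch.Tensor:
    """Boolean attention mask [total_len, total_len]; True = position may be attended to."""
    # start from a standard causal (lower-triangular) mask for autoregressive decoding
    mask = torch.ones(total_len, total_len).tril().bool()
    # let every token in the prefix (image tokens + task description) see the whole prefix
    mask[:prefix_len, :prefix_len] = True
    return mask

# e.g. 256 image/task tokens fully attend to each other; the generated suffix stays causal
print(prefix_lm_mask(prefix_len=256, total_len=300).shape)  # torch.Size([300, 300])
```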
01:51:00.460 | this is sort of an example of one of the tasks that it was trained on, which is: you
01:51:04.700 | describe the task in english, and then you're asking for it to segment
01:51:12.860 | these two classes of objects, and then it finds their locations using these loc tokens, and
01:51:19.740 | it finds their masks using some encoding of the masks into tokens. and yeah, so one of my critiques,
01:51:30.780 | i guess, of PaliGemma 1 at least, is that you find that performance saturates as a pre-trained
01:51:36.380 | model after only 300 million examples seen. so what this graph is representing is each blue dot
01:51:43.660 | is performance on some downstream task; you can see that after seeing 300 million examples
01:51:49.260 | it sort of does equally well on all of the downstream tasks that they tried it on (which
01:51:55.340 | was a lot) as it does at 1 billion examples, which to me also kind of suggests a lack of capacity for this model.
01:52:02.060 | PaliGemma 2: you can see the results on object detection, so these were transferred
01:52:10.460 | to COCO, and you can see that this sort of also points to an increase in capacity being
01:52:17.180 | helpful to the model. you can see, as both the resolution increases and the parameter count
01:52:23.020 | of the language model increases, performance increases. so resolution makes sense, obviously
01:52:26.780 | it helps to find small objects in the image, but it also makes sense for another reason,
01:52:31.820 | which is that it kind of gives the model a thinking register and it gives it more tokens to
01:52:35.900 | process when making its predictions. but yeah, you could say, oh, 43.6, that's not
01:52:42.860 | that great, like Florence-2 got 60, but this is not training a DINO or a DETR on top of this
01:52:50.140 | image encoder, it's doing the raw language modeling task on COCO, so it doesn't
01:52:57.660 | have any of the bells and whistles, it doesn't have any of the fancy losses, it doesn't even have
01:53:01.260 | bipartite graph matching or anything like that. okay, the big result, and one of the reasons that
01:53:07.580 | i was really excited about this paper, is that they blow everything else away on MMVP. i mean, 47.3,
01:53:13.980 | sure, that's nowhere near human accuracy, which again is 94, but for a, you know, a two billion
01:53:19.500 | parameter language model to beat ChatGPT, that's quite the achievement.
01:53:23.820 | and that sort of brings us to our final pick for paper of the year, which is AIMv2. so
01:53:34.380 | AIMv2 sort of says, okay, maybe coming up with all these specific
01:53:40.780 | annotations to find features with high fidelity in pixel space isn't actually necessary,
01:53:47.420 | and we can come up with an even simpler, more beautiful idea for combining, you know, image
01:53:53.580 | tokens and text tokens in a way that's interfaceable for language tasks. and this
01:53:59.020 | is nice because it can scale, you can come up with lots more data if you don't have to come up with
01:54:03.260 | all these annotations, right? so the way that it works is it does something very, very similar to
01:54:07.900 | PaliGemma, where you have a vision encoder that dumps image tokens into a decoder-only transformer,
01:54:13.420 | but the interesting thing is that it also autoregressively tries to minimize
01:54:19.580 | the mean squared error of the image tokens. so instead of having to come up with fancy object
01:54:24.940 | detection or semantic segmentation labels, you can just try to reconstruct the image
01:54:30.060 | and have it learn fine-grained features that way, and it does this in, i think, a beautiful
01:54:35.580 | way that's kind of compatible with the PaliGemma line of thinking, which is randomly sampling a
01:54:39.820 | prefix length and using only this number of image tokens as the prefix, and so doing a
01:54:47.580 | similar thing with the causal prefix, so the causal prefix is the attention mask on the right:
01:54:53.340 | it's doing full block attention with some randomly sampled number of image tokens, to then reconstruct
01:54:58.700 | the rest of the image and the downstream caption for that image.
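a rough sketch of that combined objective as i understand the paper, autoregressive MSE on the image patches after the prefix plus cross-entropy on the caption tokens; names and shapes are illustrative, not the paper's implementation, and the exact autoregressive target shift is omitted for brevity.

```python
import torch
import torch.nn.functional as F

def aimv2_style_loss(model, image_patches, caption_ids, prefix_len):
    """image_patches: [B, N, D] raw patch targets; caption_ids: [B, T] text tokens.

    The randomly sampled prefix_len image tokens get full attention; everything after
    them is predicted autoregressively (patches via MSE, caption tokens via cross-entropy).
    """
    pred_patches, caption_logits = model(image_patches, caption_ids, prefix_len=prefix_len)

    # pixel objective: regress every image patch after the prefix (shift details simplified here)
    pixel_loss = F.mse_loss(pred_patches[:, prefix_len:], image_patches[:, prefix_len:])

    # text objective: standard next-token prediction on the caption that follows the image
    text_loss = F.cross_entropy(
        caption_logits[:, :-1].flatten(0, 1), caption_ids[:, 1:].flatten()
    )
    return pixel_loss + text_loss
```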
01:55:06.380 | and so this is the data set that they train on: internet-scale, very high quality data created by the
01:55:11.500 | Data Filtering Networks paper, essentially, which is maybe the best CLIP data that exists.
01:55:18.700 | and we can see that this is finally a model that doesn't saturate: even at the highest
01:55:27.020 | parameter count it appears to be
01:55:34.140 | improving in performance with more and more samples seen. and so you can sort of think that,
01:55:39.100 | you know, if we just keep bumping the parameter count and increasing the examples seen, which is
01:55:44.380 | the line of thinking for language models, then it'll keep getting better. so how does it actually
01:55:49.900 | do? oh, it also improves with resolution, which you would expect.
01:55:57.100 | this is the ImageNet classification accuracy, but yeah, it does better if you increase the
01:56:01.740 | resolution, which means that it's actually leveraging and finding fine-grained visual
01:56:05.820 | features. and so how does it actually do compared to CLIP on COCO? well, you can see that
01:56:12.620 | if you slap a transformer detection head on it and train on COCO it gets 60.2, which is also
01:56:18.780 | within spitting distance of SOTA, which means that it does a very good job of finding visual
01:56:24.300 | features. but you could say, okay, well, wait a second, CLIP got to 59.1, so how does this prove
01:56:33.100 | your claim at all? because doesn't that mean CLIP, which is known to be clip-blind and do badly
01:56:38.300 | on MMVP, is able to achieve a very high performance on this fine-grained visual
01:56:43.660 | features task of object detection? well, they train on tons of data, they train on Objects365,
01:56:49.740 | COCO, Flickr, and everything else, and so i think this benchmark doesn't do a great job of
01:56:56.300 | selling how good of a pre-trained model AIMv2 is, and we would like to see performance on
01:57:02.060 | fewer data examples and not trained to convergence on object detection. so
01:57:07.100 | seeing it in the real world on a data set like Roboflow 100, i think, would be
01:57:11.100 | quite interesting. and our, i guess, final final pick for paper of 2024 would be Moondream, so,
01:57:17.420 | introducing Vik to talk about that.
01:57:21.260 | but overall that was exactly what i was looking for, like, best of 2024, amazing job.
01:57:28.540 | yeah, are there any other questions while Vik gets set up? like, vision stuff.
01:57:42.540 | hi, well, while we're getting set up, hi, over here. thanks for the really awesome talk. one of the
01:57:48.940 | things that's been weird and surprising is that the foundation model companies,
01:57:56.460 | even these multimodal llms, they're just worse than RT-DETR at detection still. like, if you wanted to
01:58:05.180 | pay a bunch of money to auto-label your detection data set, if you gave it to OpenAI
01:58:10.060 | or Claude, that would be like a big waste. so i'm curious, just like, even PaliGemma 2
01:58:16.700 | is worse, so i'm curious to hear your thoughts on how come nobody's cracked the code on
01:58:22.700 | a generalist that really, you know, beats a specialist model in computer vision like they
01:58:30.380 | have in llm land? can you hear me okay? oh yeah. it's a very, very interesting question.
01:58:46.380 | i think it depends on the specific domain. for image classification it's basically there:
01:58:53.260 | AIMv2 showed a simple attentional probe on the pre-trained features gets like 90, which is
01:59:00.380 | as well as anyone does. the bigger question is why isn't it transferring to
01:59:06.860 | object detection, especially real-time object detection. i think in my mind there are
01:59:15.100 | two answers. one is object detection is really, really, the architectures are super
01:59:21.980 | domain-specific, you know, we see all these super complicated things, and it's not
01:59:26.700 | super easy to build something that just transfers naturally like that, whereas for
01:59:31.740 | image classification, you know, CLIP pre-training transfers super
01:59:34.860 | easily. and the other thing is, until recently, the real-time object detectors didn't even really
01:59:43.340 | benefit from pre-training, like you see the YOLOs that are essentially saturated, showing very
01:59:48.540 | little difference with pre-training improvements, with using a pre-trained model at all, so it's not
01:59:54.700 | surprising necessarily that people aren't looking at the effects of better and better pre-training
02:00:01.420 | on real-time detection. maybe that'll change in the next year. does that answer your question?
02:00:05.260 | cool. can you guys hear me? yeah, one thing i want to add, or just to summarize,
02:00:12.860 | basically, is that until 2024, you know, we hadn't really seen a combination of transformer-
02:00:19.340 | based object detectors and fancy losses, and PaliGemma suffers from the same problem, which
02:00:25.900 | is basically to say that the ResNets, like the convolutional models, they have all these
02:00:32.940 | extreme optimizations for doing object detection, but essentially i think it's kind of
02:00:38.940 | been shown now that convolutional models just don't benefit from pre-training and just don't
02:00:42.780 | have the level of intelligence of transformer models. awesome. hi, can you hear me? cool, sure, we
02:00:54.780 | see you are you sharing your screen i might have forgotten to do that let me do that sorry
02:01:09.260 | oh here's your screen uh-oh classic um you might have to quit zoom and restart what um
02:01:18.140 | it's fine yeah it's like we we have we have a capture of your screen i'll just make sure it's
02:01:24.220 | visible so let's get to okay easy now
02:01:33.500 | to make it likely for you
02:01:34.860 | but soon no yeah yeah there you go perfect all right hi everyone my name is vic um i've been
02:01:46.460 | working on moon dream for almost a year now like sean mentioned i just went and looked and it turns
02:01:51.580 | out the first version i released december 29 2023 um it's been a fascinating journey so Moondream
02:01:58.940 | um started off as a tiny vision language model since then we've expanded scope a little bit to
02:02:04.300 | also try and build some tooling client libraries etc to help people really deploy it
02:02:09.020 | um unlike traditional large models that are focused at assistant type use cases we're
02:02:16.700 | laser focused on building um capabilities that developers can sorry it's uh
02:02:27.100 | yeah we're laser focused on building capabilities that developers can use to build vision applications
02:02:32.060 | uh that can run anywhere so in a lot of cases for vision more so than for text you really care about
02:02:37.580 | being able to run on the edge run in real time etc so um it's really important we have um we have
02:02:44.540 | different output modalities that we support there's query where you can ask general english
02:02:48.380 | questions about an image and get back human-like answers there's captioning which
02:02:53.660 | a lot of our users use for generating
02:02:59.340 | synthetic data sets to then train diffusion models and whatnot um we've done a lot of work to minimize
02:03:04.140 | the hallucinations there so that's um used a lot we have open vocabulary object detection built-in
02:03:09.900 | similar to a couple more recent models like PaliGemma etc where rather than having to train a dedicated
02:03:14.540 | model you can just say show me soccer balls in this image or show me there any deer in this image
02:03:19.820 | detected uh more recently earlier this month we released pointing capability where if all
02:03:26.860 | you're interested in is the center of an object um you can just ask it to point out where that
02:03:32.940 | is this is very useful when you're doing ui automation type stuff
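As a rough sketch of how a developer might consume the four modalities just described (query, captioning, open-vocabulary detection, pointing), here is an illustrative wrapper; the class and method names are hypothetical and not the actual Moondream client API.

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class Detection:
    label: str
    box: Tuple[float, float, float, float]  # normalized x_min, y_min, x_max, y_max

class TinyVLM:
    """Illustrative wrapper around a small vision-language model (hypothetical)."""
    def __init__(self, backend):
        self.backend = backend  # assumed object exposing generate(image, prompt, mode)

    def query(self, image, question: str) -> str:
        # free-form english question about the image, human-like answer back
        return self.backend.generate(image, prompt=question, mode="text")

    def caption(self, image) -> str:
        # short description, e.g. for building synthetic caption datasets
        return self.backend.generate(image, prompt="Describe this image.", mode="text")

    def detect(self, image, label: str) -> List[Detection]:
        # open-vocabulary detection: "show me soccer balls in this image"
        return self.backend.generate(image, prompt=f"Detect: {label}", mode="boxes")

    def point(self, image, label: str) -> List[Tuple[float, float]]:
        # object centers only, handy for UI automation
        return self.backend.generate(image, prompt=f"Point: {label}", mode="points")
```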
02:03:38.860 | um let's see we have two models out right now there's a general purpose 2B parameter model which um
02:03:48.300 | runs fair like it's it's uh it's fine if you're running on server it's uh good for our LocalLLaMA
02:03:53.260 | desktop friends and you can run it on flagship mobile phones but it never really
02:03:58.300 | fulfilled the promise of being able to run anywhere uh last week we released a new 0.5B parameter model
02:04:03.500 | which should be seen more as a distillation target as opposed to a general purpose model
02:04:08.780 | uh it's very good if you're running on like older mobile phones or edge devices uses less memory
02:04:15.980 | even with our not yet fully optimized inference client um so the way we built our 0.5b model was
02:04:24.780 | to start with the two billion parameter model um and prune it while doing continual training to
02:04:32.620 | retain performance we our objective during the pruning was to preserve accuracy across a broad
02:04:40.140 | set of benchmarks so the way we went about it was to estimate the importance of different
02:04:44.380 | components of the model like attention heads channels um mlp rows and whatnot um using
02:04:51.500 | basically a technique based on the gradient i'm not sure how much people want to know details
02:04:55.900 | we'll be writing a paper about this but uh feel free to grab me if you have more questions
02:04:59.660 | uh then we iteratively prune a small chunk that will minimize loss in performance uh retrain the
02:05:05.500 | model to recover performance and bring it back um the 0.5b we release is more of a proof of concept
02:05:11.660 | that this is possible i think the thing that's really exciting about this is it makes it possible
02:05:15.180 | for um for developers to build using the 2b parameter model and just explore build their
02:05:24.540 | application and then once they're ready to deploy uh figure out what exactly they need out of the
02:05:28.940 | model and prune those capabilities into a smaller form factor that makes sense for their deployment
02:05:33.100 | target um so yeah very excited about that
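A hedged sketch of the iterative prune-and-recover loop described here, assuming a gradient-based importance score (roughly the magnitude of weight times gradient, summed per structural unit such as an attention head or MLP row); the actual Moondream procedure, to be detailed in their paper, will differ.

```python
import torch

def importance_scores(model, calib_batches, loss_fn, units):
    """units: dict mapping a unit name (e.g. 'layer3.head5') to the list of
    parameters that belong to that structural unit."""
    model.zero_grad()
    for batch in calib_batches:
        loss_fn(model, batch).backward()
    scores = {}
    for name, params in units.items():
        scores[name] = sum((p.detach() * p.grad).abs().sum().item()
                           for p in params if p.grad is not None)
    return scores

def prune_iteratively(model, calib_batches, loss_fn, units,
                      prune_per_round, rounds, retrain_fn):
    for _ in range(rounds):
        scores = importance_scores(model, calib_batches, loss_fn, units)
        # drop the least important units this round
        for name in sorted(scores, key=scores.get)[:prune_per_round]:
            for p in units.pop(name):
                p.data.zero_()          # stand-in for actual structural removal
                p.requires_grad_(False)
        retrain_fn(model)               # continual training to recover accuracy
    return model
```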
02:05:40.540 | let me talk to you folks a little bit about another problem i've been working on recently which is similar to the clocks example we've been talking
02:05:44.140 | about we had a customer reach out who was uh talking about like who had a bunch of gauges
02:05:50.300 | out in the field this is very common in manufacturing and oil and gas where you
02:05:54.140 | have a bunch of analog devices that you need to monitor it's expensive to have humans look at that
02:06:00.620 | and monitor stuff and make sure that uh the system gets shut down when the temperature goes over 80
02:06:06.060 | or something so i was like yeah this seems easy enough happy to happy to help you distill that
02:06:11.020 | uh let's let's get it going turns out our model couldn't do it at all uh i went and looked at
02:06:15.900 | other open source models to see if i could just generate a bunch of data and learn from that that
02:06:20.940 | did not work either so i was like let's look at what the folks with hundreds of billions of dollars
02:06:25.580 | in market cap have to offer and yeah that doesn't work either um my hypothesis is that like the
02:06:35.100 | the way these models are trained are using a large amount of image text data scraped from
02:06:40.220 | the internet and that can be biased in the case of gauges most gauge images aren't gauges in the
02:06:45.740 | wild they're product detail images like these where it's always set to zero it's paired with
02:06:51.420 | an alt text that says something like givto pressure sensor psi zero to 30 or something
02:06:58.620 | and so the models are fairly good at picking up those details it'll tell you that it's a
02:07:01.980 | pressure gauge it'll tell you what the brand is but it doesn't really learn to pay attention to
02:07:05.420 | the needle over there um and so yeah that's a gap we need to address so naturally my mind goes to
02:07:16.220 | like let's use synthetic data to solve this problem um that works but it's problematic because it
02:07:23.180 | turned out we needed millions of synthetic gauge images to get to reasonable performance
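To make "synthetic gauge images" concrete, here is a minimal sketch that renders a dial with a known ground-truth reading; this is an assumed, simplified approach for illustration, not the pipeline used in the talk.

```python
import math, random
from PIL import Image, ImageDraw

def render_gauge(size=256, min_val=0, max_val=100):
    reading = random.uniform(min_val, max_val)
    # map the reading onto a 270-degree sweep starting at 225 degrees
    frac = (reading - min_val) / (max_val - min_val)
    angle = math.radians(225 - 270 * frac)

    img = Image.new("RGB", (size, size), "white")
    draw = ImageDraw.Draw(img)
    c, r = size // 2, size // 2 - 10
    draw.ellipse([c - r, c - r, c + r, c + r], outline="black", width=3)
    # tick marks every 10 units
    for v in range(min_val, max_val + 1, 10):
        a = math.radians(225 - 270 * (v - min_val) / (max_val - min_val))
        draw.line([c + 0.85 * r * math.cos(a), c - 0.85 * r * math.sin(a),
                   c + r * math.cos(a), c - r * math.sin(a)], fill="black", width=2)
    # needle pointing at the sampled reading
    draw.line([c, c, c + 0.8 * r * math.cos(angle), c - 0.8 * r * math.sin(angle)],
              fill="red", width=4)
    return img, round(reading, 1)

# img, label = render_gauge(); img.save("gauge.png")  # label is the ground truth
```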
02:07:27.660 | and thinking about it reading a gauge is not a zero-shot process in our
02:07:33.660 | minds right like if you had to tell me the reading in celsius for this real world gauge
02:07:38.860 | there's two dials on there so first you have to figure out which one you have to be paying
02:07:42.300 | attention to like the inner one or the outer one um you look at the tip of the needle you look at
02:07:48.220 | what labels it's between and you count how many and do some math to figure out what that probably
02:07:55.340 | is so what happens if we just add that as chain of thought um to give the model better understanding
02:08:04.300 | of the different steps to allow the model to better learn the subtasks it needs to perform to accomplish
02:08:09.580 | this goal um so you can see in this example this was actually generated by the latest version of
02:08:15.100 | our model uh it's like okay celsius is the inner scale it's between 50 and 60 there's 10 ticks
02:08:22.060 | it's at the second tick it's a little debatable here like there's a weird shadow situation going
02:08:25.900 | on the dial is off so i i don't know what the ground truth is but it works okay um there's
02:08:33.020 | points on there that the points over there are actually grounded i don't know if this is easy
02:08:38.140 | to see but when i click on those there's a little red dot that moves around on the image the model
02:08:42.780 | actually has to predict where uh those points are i was already trying to do this with bounding boxes
02:08:48.620 | but then Molmo came out with pointing capabilities and it's like pointing is a much better paradigm to
02:08:54.620 | uh to represent this we see pretty good results this one's actually for clock reading i
02:09:01.900 | couldn't find our chart for gauge reading at the last minute so um the light blue chart is
02:09:09.980 | with uh our grounded chain of thought um this measures we have we built a clock reading
02:09:16.620 | benchmark about 500 images this measures accuracy on that um you can see it's a lot more sample
02:09:23.020 | efficient uh when you're using the chain of thought to help the model um yep another big benefit
02:09:34.300 | from this approach is like you can kind of understand how the model is doing it and how
02:09:40.300 | it's failing so in this example the actual correct reading is 54 celsius the model output 56
02:09:46.620 | not too bad um but you can actually go and see where it messed up like it got a lot of these
02:09:53.660 | right except uh instead of saying it was on the seventh tick it actually predicted that it was
02:10:00.300 | the eighth tick and that's why it went with 56 so now that you know that this is failing in
02:10:07.340 | this way you can adjust how you're doing the chain of thought to maybe say like actually count out
02:10:10.940 | each tick from 40 instead of just trying to say it's the eighth tick or you might say like okay
02:10:15.660 | i see that there's that middle thing i'll count from there instead of all the way from 40 um
02:10:20.780 | so helps a ton the other thing i'm excited about is few-shot prompting or test time
02:10:26.540 | training with this like if a customer has a specific gauge that uh like we're seeing minor
02:10:31.340 | errors on they can give us a couple of examples where like if it's misdetecting the needle they
02:10:37.340 | can go in and correct that in the chain of thought and hopefully that works the next time um
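Putting these two ideas together, here is a hedged sketch of what a grounded chain-of-thought training target and a test-time correction might look like; the field names, coordinates, and numbers are illustrative, not Moondream's actual data format.

```python
# Illustrative grounded chain-of-thought target: intermediate steps as text,
# with the grounded points stored as normalized (x, y) coordinates the model
# must also predict.
example = {
    "image": "gauge_00042.png",
    "question": "What is the reading in Celsius?",
    "chain_of_thought": [
        {"step": "Celsius is the inner scale.",          "point": (0.52, 0.58)},
        {"step": "The needle tip is between 50 and 60.", "point": (0.61, 0.44)},
        {"step": "There are 10 ticks between the labels; the tip is on the 4th tick.",
         "point": (0.63, 0.46)},
        {"step": "50 + 4 = 54."},
    ],
    "answer": "54 C",
}

def fewshot_correction(example, step_idx, corrected_step):
    """Test-time correction as described: a customer swaps in a fixed step
    (e.g. the right tick count) and the example is reused as a few-shot prompt."""
    fixed = dict(example)
    fixed["chain_of_thought"] = list(example["chain_of_thought"])
    fixed["chain_of_thought"][step_idx] = corrected_step
    return fixed
```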
02:10:41.820 | now it's an exciting approach but we've only applied it to clocks and gauges the real question is is it going to
02:10:48.380 | generalize um probably like there's some signs from text models that when you train on a broad
02:10:53.500 | number of tasks it does generalize and um i'm seeing some signs with our model as well um
02:10:59.580 | so in addition to the image-based chain of thought stuff i also added some spelling-based
02:11:03.820 | chain of thought uh to help it understand uh better understand ocr i guess um i don't understand
02:11:11.740 | why everyone doesn't do this by the way like it's trivial benchmark question that's very very easy
02:11:16.860 | to nail um but i also wanted to support it for stuff like license plate partial matching like
02:11:23.580 | hey does any license plate in this image start with wha or whatever um so yeah that sort of worked
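A small sketch of the spelling-based chain of thought described here, applied to a license-plate prefix question; the exact prompt and answer format is illustrative.

```python
def spelling_cot(plate_text: str, prefix: str) -> str:
    """Build a spelled-out reasoning string for a prefix-matching question."""
    spelled = " ".join(plate_text.upper())
    match = plate_text.upper().startswith(prefix.upper())
    return (f"The plate reads: {spelled}. "
            f"The first {len(prefix)} characters are "
            f"{' '.join(plate_text.upper()[:len(prefix)])}. "
            f"Answer: {'yes' if match else 'no'}.")

# spelling_cot("WHA1234", "WHA") ->
# "The plate reads: W H A 1 2 3 4. The first 3 characters are W H A. Answer: yes."
```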
02:11:30.700 | um all right that that ends my story about the gauges if you think about what's going on over
02:11:39.020 | here um it's interesting that like llms are showing enormous progress in reasoning especially
02:11:48.540 | with the latest set of models that we've seen but we're not really seeing i i have a feeling that
02:11:54.620 | vlms are lagging behind as we can see with these tasks that should be very simple for a human to
02:12:01.660 | do that are very easy to find um vlms failing at uh my hypothesis on why this is the case is because
02:12:08.460 | on the internet there's a ton of data that talks about how to reason there's books about how to
02:12:14.780 | solve problems there's books critiquing the books about how to solve problems but humans are just so
02:12:19.260 | good at perception that we never really talk about it like maybe in art books where it's like hey to
02:12:24.540 | show that that mountain is further away you need to desaturate it a bit or whatever but um the
02:12:31.740 | actual data on how to like look at images isn't really present also the data we have is kind of
02:12:37.500 | sketchy the best source of data we have is like image alt-text pairs on the internet and that's
02:12:41.500 | pretty low quality um so yeah i i think our solution here is really just we need to teach
02:12:47.180 | them how to operate on individual tasks and figure out how to scale that out um all right yep so
02:12:56.780 | conclusion uh at Moondream we're trying to build amazing VLMs that run everywhere very hard
02:13:02.780 | problem much work ahead but uh we're making a ton of progress and i'm really excited about
02:13:07.340 | um if anyone wants to chat about more um technical details about how we're doing
02:13:12.620 | this or interested in collaborating please please hit me up
02:13:15.260 | yeah like i always when people say when people say multi-modality like you know always think
02:13:26.460 | about vision as the first among equals in all the modalities so i really appreciate
02:13:31.260 | having the experts um okay we are a little bit out of time so we're going to move on to luca
02:13:36.940 | um and talk about open models but if anyone wants to talk to the vision guys i think there's like
02:13:42.700 | coffee and tea outside we're going to have lunch in an hour as well um so you can ask follow-up
02:13:48.620 | questions uh outside if you if you wish but yeah luca you go you get set up with uh your mic okay
02:13:56.860 | we sent you a zoom okay uh it's on it's on the calendar and then
02:14:03.180 | alan can set you up with the respondents
02:14:10.540 | hey i'm just yeah i'm just
02:14:33.740 | they just screen share for here no audio no audio no yeah speecher uh plus plug-in
02:14:39.340 | oh yeah you gotta stick around people you stick around people for sure
02:14:45.340 | are you also presenting i'm backup okay
02:14:50.300 | so i didn't know what you're because you're you're coming later yeah i don't really know either
02:14:59.100 | how was your session yesterday for the tutorial yeah
02:15:03.260 | your master class
02:15:07.580 | yeah it's just good um definitely polish the slides
02:15:27.340 | yeah so share your screen
02:15:28.540 | cool yeah i think you're set um so as you speak into that mic but any of your
02:15:42.700 | nathan's microphone no you want me to be we'll just put this on yeah
02:15:51.340 | i have the same thing yeah so these two mics they're good all right all right cool um yeah
02:16:01.980 | thanks for having me over um i'm luca i'm a research scientist at the Allen Institute for AI
02:16:07.980 | i threw together a few slides on sort of like a recap of like interesting themes in open models
02:16:15.740 | for for 2024 um have about maybe 20-25 minutes of slides and then we can chat if there are any
02:16:22.940 | questions if i can advance to the next slide okay cool um so um i did the quick check of like
02:16:33.340 | to sort of get a sense of like how much 2024 was different from 2023 um so i went on Hugging Face
02:16:39.580 | and sort of tried to get a picture of what kind of models were released in 2023 and like what do
02:16:45.100 | we get in 2024 um 2023 you get we got things like uh both llama one and two we got Mistral we got MPT
02:16:53.020 | falcon models i think the Yi model came at the tail end of the year it was a pretty good year
02:16:58.460 | but then i did the same for 2024 um and it's actually quite stark difference um you have
02:17:08.860 | models that are you know rivaling frontier level performance of what you can get from closed models
02:17:15.420 | from like Qwen from DeepSeek we got llama three we got all sorts of different models um i added
02:17:23.260 | our own uh OLMo at the bottom uh there's this uh growing group of like fully open models that i'm
02:17:29.260 | going to touch on a little bit later um but you know just looking at the slides it feels like
02:17:35.500 | 2024 was just smooth sailing happy news much better than previous year um and you know you
02:17:42.940 | can plot um you can pick your favorite benchmark or least favorite i don't know depending on what
02:17:50.460 | point you're trying to make um and plot you know your closed model your open model um and sort of
02:17:58.220 | spin it in ways that show that oh you know open models are much closer to where closed models
02:18:04.860 | are today versus to versus last year where the gap was fairly significant um so one thing that
02:18:14.860 | i think i don't know if i have to convince people in this room but usually when i give this talks
02:18:21.500 | about like open models there is always like this background question in in in people's mind of like
02:18:27.180 | why should we use open models um is it just use model apis argument you know it's it's
02:18:33.820 | just an HTTP request to get output from one of the best models out there why do i have to set
02:18:39.500 | up infra use local models um and they're really like to answer um there is the more researchy
02:18:47.820 | answer for this which is where my background lays which is um just research if you want to do
02:18:55.180 | research on language models research thrives on on open models there is like a large body of research
02:19:01.580 | on modeling on how these models behave on evaluation and inference on uh mechanistic
02:19:08.300 | interpretability that could not happen at all if you didn't have open models um they're also um
02:19:16.140 | for ai builders there are also like good use cases for using um local models um you know you have
02:19:24.940 | some this is like a very not uh comprehensive slides but you have things like there are some
02:19:29.660 | applications where local models just blow close models out of the water um so like retrieval it's
02:19:37.020 | a very clear example um you might have like constraints like edge ai applications where it
02:19:42.860 | makes sense but even just like in terms of like stability being able to say this model is not
02:19:47.980 | changing under the hood um it's there's plenty of good cases for for um open models um and the
02:19:56.860 | community is not just models um i stole this slide from uh one of the Qwen2 announcement blog
02:20:04.860 | posts uh but it's super cool to see like how much um tech exists around um open models on serving
02:20:13.660 | them on making them efficient and hosting them it's pretty cool um and um it's um if you think
02:20:23.820 | about like where the term open comes from like the open source um really open models
02:20:29.740 | meet the core tenets of open source uh specifically when it comes around
02:20:37.900 | collaboration there is truly a spirit like through these open models you can build on top of others
02:20:44.060 | people innovation um we see a lot of these even in our own work of like you know as we iterate
02:20:50.860 | in the various versions of OLMo um it's not just like every time we collect from scratch all the
02:20:57.900 | data no the the first step is like okay what are the cool data sources and datasets people have put
02:21:04.060 | together for language model for training um or when it comes to like our post-training pipeline
02:21:11.820 | we uh one of uh the steps is um you want to do some dpo and use a lot of uh outputs of other models
02:21:21.100 | uh to improve your your preference model so it's really um having like an open sort of ecosystem
02:21:28.140 | benefits and accelerates the development of open models um one thing that um we got in 2024 which
02:21:37.420 | is not a specific model but i thought it was really significant is we first got uh we got our
02:21:42.780 | first open source ai definition um so this is from the open source initiative um they've been
02:21:50.220 | generally the steward of a lot of the open source licenses when it comes to software
02:21:55.100 | and so they embarked on this journey and trying to figure out okay
02:22:00.060 | how does a license an open source license for a model look like
02:22:03.740 | um majority of the work is very dry because licenses are dry so i'm not gonna walk through
02:22:11.500 | the license step by step but um i'm just gonna pick out uh one aspect that is very good uh and
02:22:19.820 | then one aspect that personally feels like it needs improvement on the good side um this um
02:22:26.780 | this open source ai license actually this is very intuitive if you ever build open source software
02:22:33.420 | and you have some expectation around like what open source uh looks like for software uh for
02:22:41.260 | for ai sort of matches your intuition so the weights need to be freely available uh the code
02:22:49.020 | must be released with an open source license uh and there shouldn't be like license clauses that
02:22:56.380 | block specific use cases so under this definition for example Llama or some of the Qwen models are
02:23:03.580 | not open source because the license says you can't you can't use this this model for this
02:23:09.340 | or it says if you use this model you have to name the output this way or derivative needs to be uh
02:23:15.660 | named that way those clauses don't meet open source definition um and so they will not be
02:23:20.780 | covered the Llama license will not be covered under the open source definition um it's not perfect um
02:23:30.300 | one of the things that um um internally you know in discussion with with osi we were sort of
02:23:38.700 | disappointed is around um the language for data um so you might imagine that an open source
02:23:47.980 | ai model means a model where the data is freely available uh there were discussion around that
02:23:53.420 | but at the end of the day they decided to go with a softer stance where they say um a model is open
02:24:00.860 | source if you provide sufficient detailed information on how to sort of replicate the
02:24:06.780 | data pipeline so you have an equivalent system sufficiently detailed uh it's very
02:24:14.300 | it's very fuzzy i don't like that an equivalent system is also very fuzzy um and this doesn't
02:24:21.500 | take into account the accessibility of the process right it might be that you provide enough
02:24:26.700 | information but this process costs I don't know 10 million dollars to do um now the open source
02:24:33.580 | definition like any open source license has never been about accessibility so that's never factor
02:24:40.140 | in open source software how accessible software is um I can make a piece of open source put it on
02:24:46.540 | my hard drive and never access it that software is still open source the fact that it's not widely
02:24:51.340 | distributed doesn't change the license but practically there is a right expectation of like what
02:24:57.020 | we want good open source to be so it's kind of sad to see that um the the data component
02:25:04.220 | in this license is not as as open as some of us would like uh would like it to be and I linked
02:25:11.500 | the blog post that Nathan wrote on the topic that it's less rambly and easier to follow through
02:25:18.460 | um one thing that in general I think it's fair to say about the state of open models in 2024 is that
02:25:28.780 | we know a lot more than what we knew in in 2023 um like um both on the training data like the
02:25:37.260 | pre-training data you curate um on like how to do like all the post-training especially like on the
02:25:43.580 | RL side um you know 2023 was a lot of like throwing random darts at the board uh I think 2024 we have
02:25:51.900 | clear recipes that okay don't get the same results as a closed lab because there is a cost
02:25:57.260 | in in actually matching what they do um but at least we have a good sense of like okay this is
02:26:03.020 | this is the path to get state-of-the-art language model um I think that one thing that it's a
02:26:09.900 | downside of 2024 is that I think we are more research constrained than 2023 it feels that
02:26:18.220 | like you know the barrier for compute that you need to to move innovation along that's just
02:26:24.940 | rising and rising um so like if you go back to this slide there is now this this
02:26:31.660 | cluster of models that are sort of released by the compute rich club um membership is hotly debated
02:26:39.980 | um you know some people don't want to be called rich because it comes with expectations some people
02:26:45.740 | want to be called rich but I don't know there's debate but like these are players that have you
02:26:50.380 | know 10,000 50,000 GPUs at minimum um and so they can do a lot of work um and a lot of exploration
02:26:58.620 | in improving models that it's not very accessible um to give you a sense of like how I personally
02:27:06.300 | think about research budgets um for each part of the of the language model pipeline is like on the
02:27:15.340 | pre-training side you can maybe do something with a thousand GPUs really you want 10,000 and like if
02:27:21.660 | you want real state of the art you know your DeepSeek at minimum is like 50,000 um and you
02:27:27.180 | can scale to infinity the more you have the better it gets um everyone on that side still complains
02:27:32.140 | that they don't have enough GPUs uh post-training is a super wide um sort of uh spectrum you can do
02:27:40.780 | as little with like eight GPUs um as long as you're able to um run you know a a good version
02:27:51.100 | of say a llama model you can do a lot of work there um you can scale a lot of the methodology
02:27:57.420 | just like scales with compute right if you're interested in um you know your open replication
02:28:05.100 | of what OpenAI's o1 is um you're going to be on the 10k spectrum of GPUs um inference you can
02:28:12.780 | do a lot with very few resources evaluation you can do a lot with well I should say at least one
02:28:19.020 | GPUs if you want to evaluate um open models but um in general like if you are if you care a lot
02:28:27.660 | about intervention to do on this model which is my uh prefer area of research then you know the
02:28:35.500 | resources that you need um are quite quite significant um one of the trends um that has
02:28:43.340 | emerged in 2024 is this cluster of um fully open models um so OLMo the model that we built at AI2
02:28:53.100 | being one of them um and you know it's nice that it's not just us there's like a cluster of other
02:28:59.820 | mostly research um efforts who are working on this um and so it's good to um to give you a primer
02:29:10.860 | of what like fully open means um so fully open the easy way to think about it is instead of just
02:29:18.380 | releasing a model checkpoint that you run you release a full recipe so that um other people
02:29:25.180 | working on it uh working on that space can pick and choose whatever they want from your recipe
02:29:31.660 | and create their own model or improve on top of your model um you're giving out the full pipeline
02:29:37.180 | and all the details there um instead of just like the end output um so I pull up the screenshot from
02:29:44.380 | our recent um MOE model um and like for this model for example we released the model itself
02:29:51.340 | data that was trained on the code both for training and inference um all the logs that
02:29:57.500 | we got through um the training run as well as um every intermediate checkpoint
02:30:03.020 | um and like the fact that you release different part of the pipeline allows others to do really
02:30:10.060 | cool things um so for example this tweet from early this year from uh folks at news research
02:30:17.020 | um they use our pre-training data uh to do a replication of the bitnet paper in the open um
02:30:24.220 | so they took just a really like the initial part of a pipeline um and then did the thing on top of
02:30:31.340 | it um it goes both ways so for example for the OLMo 2 model um a lot of our pre-training data for
02:30:39.820 | the first stage of pre-training um was from this DCLM uh initiative uh that was led by folks uh
02:30:48.220 | at a variety of institutions it was a really nice group effort but um you know for us it was nice
02:30:57.580 | to be able to say okay you know the state of the art in terms of like what is done in the open has
02:31:01.660 | improved we don't have to like do all this work from scratch to catch up the state of the art
02:31:07.740 | we can just take it directly and integrate it and do our own improvements on top of that
02:31:13.660 | um i'm gonna spend a few minutes uh doing like a shameless plug for
02:31:19.420 | some of our fully open recipes
02:31:21.900 | um so indulge me in this um so a few things that we released this year was as i was mentioning
02:31:30.220 | this OLMoE model um which i think still is the state-of-the-art MoE model in its size class
02:31:38.780 | and it's also fully open so every components of of this model are available um we release
02:31:46.060 | a multi-modal model called Molmo um Molmo is not just a model but it's a full recipe of how you go
02:31:52.460 | from a text-only model to a multi-modal model and we apply this recipe on top of
02:31:58.940 | Qwen checkpoints on top of OLMo checkpoints as well as on top of OLMoE um and i think there have
02:32:04.380 | been replications doing that on top of Mistral as well um um on on the post-training side
02:32:14.940 | we recently released Tülu 3 um same story this is a recipe on how you go from a base model
02:32:20.780 | to a state-of-the-art post-trained model we use the Tülu recipe on top of OLMo on top of Llama and
02:32:28.540 | then there's been um open replication effort to do that on top of Qwen as well uh it's really nice
02:32:34.220 | to see like you know when your recipe sort of it's kind of turnkey you can apply it to different
02:32:39.340 | models and it kind of just works um and finally the last thing we released this year was OLMo 2
02:32:45.260 | which so far is the best state-of-the-art fully open language model um it sort of combines aspect
02:32:52.860 | from all three of these previous models um what we learned on the data side from OLMoE
02:32:57.580 | and what we learned on like making models that are easy to adapt from the Molmo project
02:33:02.700 | and the Tülu project um i will close with a little bit of reflection on this
02:33:10.380 | ecosystem of open models um like it's not all roses it's not all happy uh it feels like day
02:33:18.060 | to day it's always in peril um and you know i talked a little bit about like the compute issues
02:33:24.300 | that come with it uh but it's really not just compute um one thing that is on top of my mind
02:33:30.860 | is due to like the environment and how um you know growing feelings about like how AI is treated
02:33:39.020 | it's actually harder to get access to a lot of the data that was used to train a lot of the
02:33:45.020 | models up to last year so this is a screenshot from really fabulous work from Shane Longpre
02:33:50.860 | who i think is in europe um about um just the diminishing access to data
02:34:00.140 | for language model pre-training so what they did is they um went through every snapshot
02:34:07.260 | of common crawl uh common crawl is this publicly available scrape of the of a subset of the
02:34:12.860 | internet and they looked at how um for any given website uh whether the website that was
02:34:19.980 | accessible in say 2017 was accessible or not in 2024 and what they found is
02:34:26.860 | as a reaction to like the existence of closed models like OpenAI or ChatGPT
02:34:36.860 | or Claude a lot of content owners have blanket blocked any type of crawling to their website
02:34:44.380 | and this is something that we see also internally at AI2 um like one project that we started this
02:34:50.620 | year is um we wanted to we want to understand like if you're a good citizen of the internet
02:34:57.980 | and you crawl uh following sort of norms and policy that have been established in the last 25 years
02:35:05.740 | what can you crawl and we found that there's a lot of websites where um the norms of how you
02:35:13.180 | express preference of whether to crawl or not are broken a lot of people would block a lot
02:35:18.220 | of crawling but do not advertise that in robots.txt you can only tell that
02:35:24.060 | they're blocking you from crawling when you try doing it sometimes you can't even crawl their
02:35:28.860 | robots.txt to check whether you're allowed or not
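A minimal sketch of the "good citizen" check being described, using only the Python standard library to ask whether a crawler is allowed to fetch a URL; note that, as the talk points out, even fetching robots.txt can itself be blocked.

```python
from urllib import robotparser
from urllib.parse import urlsplit

def allowed_to_crawl(url: str, user_agent: str = "ResearchCrawler") -> bool:
    parts = urlsplit(url)
    rp = robotparser.RobotFileParser()
    rp.set_url(f"{parts.scheme}://{parts.netloc}/robots.txt")
    try:
        rp.read()            # fetching robots.txt can itself be blocked,
    except Exception:        # which is exactly the failure mode described above
        return False
    return rp.can_fetch(user_agent, url)

# allowed_to_crawl("https://example.com/some/page")
```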
02:35:37.340 | and then for a lot of websites there's like all these technologies that have historically existed to make website serving easier
02:35:42.780 | um such as um Cloudflare or DNS they're now being repurposed for um blocking ai or any type of
02:35:52.300 | crawling in a way that is very opaque to the content owners themselves um so you know you go
02:35:59.420 | to these websites you try to access them and they're not available you get a feeling it's like
02:36:06.220 | oh someone changed something changed on the on the dns side that it's blocking this and likely the
02:36:13.180 | content owner has no idea they're just using uh cloudflare for better you know load balancing and
02:36:19.180 | this is something that was sort of sprung on them uh with very little notice um and i think the
02:36:26.220 | problem is this um blocking really impacts people in different ways um it
02:36:35.100 | disproportionately helps um companies that have a head start which are usually the closed labs
02:36:41.980 | and it hurts uh incoming uh newcomer players um where you either have now to do things in a sketchy
02:36:49.660 | way um or you're never gonna get that content uh that the closed lab might have so there's a lot
02:36:56.620 | there was a lot of coverage i'm gonna plug nathan's blog post again um i think the
02:37:04.140 | title of this one is very succinct uh which is like we're actually not you know before thinking
02:37:09.260 | about running out of training data we're actually running out of open training data and so if we want
02:37:14.540 | better open models um this should be on top of our mind um the other thing that has emerged is that
02:37:23.340 | there's strong lobbying efforts on trying to define any kind of open source ai as like a new um
02:37:34.220 | extremely risky danger um and i want to be precise here the problem is not
02:37:40.380 | that we consider the risks of this technology every technology has risks
02:37:46.380 | that should always be considered the thing that to me is um sorry disingenuous
02:37:52.940 | is like just putting this ai on a pedestal um and calling it like an unknown alien technology
02:38:00.780 | that has like new and undiscovered potentials to destroy um humanity when in reality all the
02:38:09.260 | dangers i think are rooted in dangers that we know from existing software industry or existing
02:38:17.740 | issues that come with when using software on um on a lot of sensitive domains like medical
02:38:25.980 | areas and i also noticed a lot of efforts that have actually been going on and trying to make
02:38:31.500 | these open models safe um i pasted one here uh from ai2 but there's actually like a lot of work
02:38:38.940 | that has been going on on like okay how do you make if you're distributing this model openly
02:38:44.700 | how do you make it safe um how what's the right balance between accessibility on open models and
02:38:50.300 | safety um and then also this annoying uh brushing of um sort of concerns that are then proved to be
02:38:59.820 | unfounded under the rug you know if you remember the beginning of this year it was all about
02:39:04.140 | bio risk of these open models uh the whole thing fizzled out because there's been finally there's
02:39:11.820 | been like rigorous research not just this paper from the Cohere folks but rigorous follow-up
02:39:18.300 | research showing that this is really not a concern that we should be worried about again there is
02:39:23.340 | a lot of dangerous use of ai application but this one was just like a lobbying ploy to just make
02:39:30.860 | things sound scarier uh than they actually are so i gotta preface this part it says this is my
02:39:38.060 | personal opinion it's not my employer's but i look at things like uh SB 1047 from california
02:39:45.500 | and i think we kind of dodged a bullet on this legislation you know the open source
02:39:52.460 | community a lot of the community came together at the last sort of the last minute um and did a
02:39:59.340 | very good effort trying to explain all the negative impact of this bill um but um there's like
02:40:07.260 | i feel like there's a lot of excitement on building these open models uh or like researching on these
02:40:12.860 | open models and lobbying is not sexy uh it's kind of boring uh but um it's sort of necessary to make
02:40:20.940 | sure that this ecosystem can can really thrive um this end of presentation i have some links
02:40:29.500 | emails sort of standard thing in case anybody wants to reach out and if folks have questions
02:40:37.260 | or anything they wanted to discuss it's our open floor
02:40:40.940 | here's sofia um one very important open model that we haven't covered
02:40:52.540 | is Mistral so yeah yeah well it's nice to have the Mistral person yes uh to recap the year of
02:40:59.900 | Mistral but uh while sofia gets set up does anyone have like just thoughts or questions about
02:41:04.460 | the progress in this space do you always have questions always i'm very curious how we should
02:41:10.140 | build incentives to build open models things like François Chollet's uh ARC Prize and other
02:41:16.300 | initiatives like that what is your opinion on how we should better align incentives in the community
02:41:20.940 | so that open models stay open i think you can tap in there nice the incentive bit is like really hard
02:41:32.300 | um like even as something that i actually even we think a lot about it internally um because
02:41:39.660 | like building open models is risky it's very expensive um and so people don't want to take
02:41:45.340 | risky bets um i think the definitely like the challenges um like our challenge i think those
02:41:54.060 | are like very valid approaches for it um and then i think in general promoting building so um any
02:42:03.740 | kind of effort to participate in this challenge in those challenges if we can promoting doing that
02:42:09.180 | on top of open models um and sort of really lean into like this multiplier effect um i think that
02:42:17.580 | is a good way to go um if there were more money for um efforts um like research efforts around
02:42:27.340 | open models there's a lot of i think there's a lot of investments in companies that at the moment
02:42:33.500 | are releasing their model in the open which is really cool um but um it's usually more because
02:42:39.580 | of commercial interest and not wanting to support um this this like open models in the long term
02:42:46.380 | it's a really hard problem because i think everyone is operating sort of in what everyone
02:42:52.700 | is at their local maximum right in ways that really optimize their position on the market
02:42:58.940 | the global maximum is harder to achieve
02:43:02.460 | okay somehow it's not being shared on the screen
02:43:28.140 | uh can i ask one question you know yeah uh so i think one of the gap between the closed and
02:43:34.140 | open source models is the multilinguality so the closed source models like ChatGPT are pretty
02:43:39.660 | good on the low resource languages which is not the same on the open source models right
02:43:45.020 | so is it in your plan to improve on that space um i think in general yes is
02:43:56.220 | here yeah just just use your natural voice yeah um i think if i think we'll see a lot
02:44:02.460 | of improvements there like on the chinese side um there's groups um that are focused on this
02:44:08.700 | and are already working on like better multilingual um support i think what our
02:44:18.140 | challenge there is um you really want the experts who are actually in those countries
02:44:26.620 | that use those languages to participate in the effort to give you like a very easy example
02:44:33.740 | i'm originally from italy i think i'm terribly equipped to build a model that works well in
02:44:42.140 | italy because one of the things you need to be able to do is having that knowledge of like okay
02:44:47.500 | how do i access you know libraries or content that is from this region i've been in
02:44:54.620 | the u.s long enough that i no longer know that um so i think that the efforts that folks
02:45:01.900 | central europe for example are doing around like okay let's let's tap into regional communities
02:45:08.300 | um to get access uh to bring in collaborators from those areas i think it's going to be like
02:45:15.180 | very crucial for getting out of this area yes let me close it up
02:45:37.580 | hello everyone
02:45:52.700 | what's that
02:45:56.060 | it's fine she's not playing any audio that's weird okay okay okay cool
02:46:06.860 | um yeah i'm super excited to be here to talk to you guys uh about mistral uh a really short
02:46:15.260 | and quick recap of what we have done what kind of models and products we have released in the past
02:46:21.900 | a year and a half so um most of you have already known that we are a small startup
02:46:29.420 | founded about a year and a half ago in paris in may 2023 it was founded by three of our co-founders
02:46:36.540 | and in september 2023 we released our first open source model Mistral 7B um yeah how many of you
02:46:44.780 | have used or heard about mistral 7b hey pretty much everyone thank you uh yeah it's our uh
02:46:52.620 | pretty popular and uh our community really loves this model and in december 2023 we
02:46:59.500 | we released another popular model with the moe architecture um Mixtral 8x7B and
02:47:07.100 | oh going into this year you can see we have released a lot of things this year
02:47:12.620 | um first of all in february 2024 we released uh Mistral Small Mistral Large uh Le Chat which is our
02:47:20.140 | chat interface i will show you in a little bit we released a embedding model for you know converting
02:47:28.140 | your text into embedding vectors and all of our models are available um the the big cloud resources
02:47:37.820 | so you can use our model on google cloud aws azure snowflake ibm so very useful for enterprises who
02:47:46.380 | wants to use our model through cloud and in april and may this year we released another powerful
02:47:53.500 | open source um moe model Mixtral 8x22B and we also released our first code model Codestral which is
02:48:01.820 | amazing at 80 plus languages and then we provided another fine tuning service for customization
02:48:09.340 | so because we know the community love to fine tune our models so we provide you a very nice
02:48:15.180 | and easy option for you to fine tune our model on our platform and also we released our fine
02:48:21.020 | tuning code base called mistral-finetune it's open source so feel free to take a look and
02:48:27.180 | more models on july to november this year we released many many other models uh first of all
02:48:37.180 | is the two new best small models we have Ministral 3B great for deploying on edge devices
02:48:45.340 | we have Ministral 8B if you used to use Mistral 7B Ministral 8B is a great replacement with much
02:48:53.900 | stronger performance than Mistral 7B we also collaborated with nvidia and open sourced
02:49:00.140 | another model Mistral NeMo 12B another great model and just a few weeks ago we updated Mistral Large with the
02:49:08.460 | version 2 with updated state of the art features and really great function calling
02:49:14.940 | capabilities it's supporting function calling natively and we released two multi-modal models
02:49:21.180 | Pixtral 12B it's open source and Pixtral Large just amazing models not only for understanding
02:49:29.980 | images but also great at text understanding so yeah a lot of the image models are not so
02:49:36.620 | good at text understanding but Pixtral Large and Pixtral 12B are good at both image understanding
02:49:42.540 | and text understanding and of course we have models for research Codestral Mamba is built on
02:49:49.500 | the Mamba architecture and Mathstral great for working with math problems so yeah those are other
02:49:57.580 | models uh here's another view of our model reference we have several premier models which
02:50:09.820 | means these models are mostly available through our api i mean all of the models are available
02:50:17.020 | throughout our api except for Ministral 3B but for the premier models they have a special license
02:50:25.660 | the Mistral Research License you can use it for free for exploration but if you want to use it for
02:50:30.940 | enterprise for production use you will need to purchase a license from us so on the top row here
02:50:37.580 | we have Ministral 3B and 8B as our premier models Mistral Small for best low latency use cases
02:50:45.820 | Mistral Large is great for your most sophisticated use cases Pixtral Large is the frontier class
02:50:52.300 | multimodal model and we have Codestral great for coding and then again the Mistral embedding model
02:50:58.540 | and at the bottom of the slide here we have several apache 2.0 licensed open weight models
02:51:06.380 | free for the community to use and also if you want to fine tune it use it for customization
02:51:12.460 | production feel free to do so the latest we have Pixtral 12B we also have Mistral NeMo
02:51:21.580 | Codestral Mamba and Mathstral as i mentioned and we have three legacy models that we don't
02:51:28.460 | update anymore so we recommend you to move to our newer models if you are still using them
02:51:35.900 | and then just a few weeks ago we did a lot of improvements to our chat interface Le Chat
02:51:46.300 | how many of you have used Le Chat oh no only a few okay i highly recommend Le Chat it's
02:51:54.060 | chat.mistral.ai it's free to use it has all the amazing capabilities i'm going to show you right now
02:52:01.180 | but before that Le Chat in french means the cat so this is actually a cat logo
02:52:08.860 | yeah if you can tell this is the cat eyes yeah so first of all i want to show you
02:52:17.020 | something maybe let's let's take a look at image understanding
02:52:31.100 | so here i have a receipts and i want to ask i just going to get the prompts
02:52:40.780 | going back
02:52:54.860 | going on
02:52:56.460 | yeah i had an issue with wi-fi here so hopefully it would work
02:53:03.580 | cool so basically i have a receipt and i said i ordered a coffee and a sausage how much do i owe
02:53:17.020 | at a 18 tip so hopefully it was able to get the cost of the coffee and the sausage
02:53:23.820 | and ignore the other things and um yeah i don't really understand this but i think this is coffee
02:53:30.700 | uh it's yeah nine yep and then cost of the sausage we have 22 here
02:53:38.060 | yep and then it was able to add the cost calculate the tip and all that uh great so it's great at
02:53:47.260 | image understanding is great at uh ocr tasks so if you have ocr tasks please use it as free on
02:53:54.140 | lachette it's also available through our api and also i'm going to show you a canvas example
02:54:00.380 | a lot of you may have used canvas with other tools before but uh
02:54:08.620 | with Le Chat it's completely free again here i'm asking it to create a canvas that uses
02:54:15.420 | PyScript to execute python in my browser so oh what's going on
02:54:23.020 | okay let's see if it works import this oh
02:54:30.700 | yep okay so yeah so basically it's executing python uh here exactly what we wanted uh
02:54:43.180 | and the other day i was trying to ask Le Chat to create a game for me let's see if we can
02:54:49.900 | make it work yeah the tetris game uh yeah
02:54:57.660 | let's just get one row maybe
02:55:10.220 | ah oh no
02:55:15.500 | okay all right you get the idea i failed my mission um
02:55:28.620 | okay here we go yay
02:55:31.580 | uh cool yeah so uh as you can see Le Chat can write like code for a simple game pretty
02:55:41.420 | easily and you can ask Le Chat to explain the code make updates however you like um
02:55:49.100 | another example there is a bar here i want to move okay right okay and uh let's go back
02:56:00.780 | another one uh yeah we also have web search capabilities like you can ask what's the latest
02:56:10.540 | ai news uh image generation is pretty cool generate an image about researchers in vancouver
02:56:21.500 | uh yeah it's black forest labs uh Flux Pro uh again this is free so
02:56:31.020 | oh cool i guess researchers here are mostly from university of british columbia
02:56:39.820 | uh that's smart uh yeah so this is Le Chat please feel free to use it uh and let me know
02:56:48.380 | if you have any feedback we're always looking for improvement and we're going to release
02:56:52.460 | a lot more powerful features in the coming years thank you
02:56:55.740 | yeah i think we can open up the questions there's lunch also outside but uh if anyone
02:57:06.300 | thought i don't think we have a youtube entry but if anyone has any thoughts on
02:57:10.700 | mistral or omo or any of the others the open models
02:57:15.340 | um yeah no i think we can just break for lunch and uh have a chat but thanks thanks so much to
02:57:23.020 | the speakers thank you again we'll be back here what we're gonna have like some people presenting
02:57:28.620 | during lunch um i i think i think basically just go grab lunch you can come back in and eat and
02:57:34.060 | chat uh we'll have some people presenting as well right so unless you want to say you see material
02:57:39.580 | okay maybe maybe maybe you get something off now
02:57:45.020 | yeah hi everyone thank you so much for coming today um huge shout out to swyx and the latent
02:57:55.180 | space team i think it's been a great yeah let's just give it up for swyx just real quick um i
02:58:02.220 | did a little bit of in terms of helping with the planning but i work at notable capital some of you
02:58:07.100 | may have heard of ggv which was our former name um on the cloud infrastructure team so basically
02:58:12.300 | anything data dev tools um ai infrastructure as well as ai applications um and so we like to stay
02:58:19.260 | close to those that are smarter than us which is all of you in this room um so if anyone ever wants
02:58:23.580 | to you know brainstorm or thinking about starting a company um we're happy to collaborate we've had
02:58:28.380 | the opportunity to partner with like amazing companies such as HashiCorp Vercel Neon
02:58:32.780 | and many others over the years um and we're based in san francisco and new york so yeah feel free
02:58:38.380 | to find me laura hamilton x linkedin um you know if we become friends instagram yeah um thank you
02:58:45.740 | all for coming and then we'll kick off some of the chats with aws after everyone gets lunch all right
02:59:15.420 | hi these are up here too this is not mine although i did almost take it yeah it's not like everyone's
02:59:21.820 | happy uh
02:59:46.220 | nope i didn't even ask the url
03:34:19.120 | Like in my view, I don't know if I would do a traditional.
03:34:32.120 | Hello. Oh great.
03:34:34.120 | Awesome. Yeah, sure.
03:34:38.120 | Well, hey everyone. Hope you enjoyed lunch. Thanks for thanks for dialing in here.
03:34:44.620 | My name is Aaron wanted to give a quick shout out to the latent space team notable capital swyx for organizing.
03:34:53.120 | I'm with the AWS AI startups team.
03:34:55.620 | I've been in the role for about three years now.
03:34:59.120 | I was a founding product hire at a series a company had a great exit there did machine learning for a while.
03:35:06.620 | Did some strategy consulting with Google for a while and then joined AWS actually got this job on Twitter of all places.
03:35:15.120 | I liked a tweet that was like, hey, I think more more VC meetings should be over surf lessons.
03:35:21.620 | And I got a DM back saying hey, you kind of want to come work at AWS and it was off of the races from there.
03:35:27.620 | So keep your DMS open.
03:35:28.620 | I'll keep I'll keep this short here.
03:35:31.620 | Basically just wanted to kind of chat about how AWS works with founders, right?
03:35:37.620 | I think everyone's aware compute and credits are kind of like the name of the game at this point.
03:35:43.620 | I like to I like to think about ways to go deeper than that and figure out how we can add value beyond just like here's some GPUs.
03:35:51.120 | Here's some credits and run with it, right?
03:35:53.120 | Like that's kind of table stakes at this point.
03:35:55.120 | So I wrote the PR FAQ for an accelerator program that is a 10 week program.
03:36:02.120 | It just wrapped up at reinvent last week where we take a couple companies from around the world and really lean in and try and build co build with them.
03:36:12.620 | We find design partners.
03:36:14.120 | We do like product strategy, help them with fundraising.
03:36:18.120 | We just put them on stage at reinvent.
03:36:21.120 | There's like, you know, 700 people in the audience.
03:36:23.120 | It's a really fun, fun experience.
03:36:25.120 | And that's just kind of like, you know, putting what we do on a day to day on the world stage because our whole team is dedicated to figuring out ways to, again, go beyond beyond credits, beyond compute and support.
03:36:38.620 | Right. So we worked with founders from like day zero, haven't even incorporated.
03:36:43.120 | We're still like bouncing ideas off of off of each other, thinking about ways to go to market.
03:36:48.120 | And then, you know, beyond that, like as you're scaling, finding design partners and then getting you listed on marketplace and really co-selling together.
03:36:57.120 | And we'd love to be a small part of the journey as you're considering entrepreneurship.
03:37:02.620 | So if you want to chat about all things entrepreneurship, please please reach out.
03:37:09.120 | I'm on LinkedIn, Aaron A. Melgar.
03:37:12.120 | If you do just want GPUs and compute and credits, happy to chat about that as well.
03:37:18.120 | But but great to be here. And again, thanks to swyx for hosting and to the notable capital team for having us and organizing.
03:37:25.120 | So thanks, everyone. Enjoy the rest of the talks today.
03:37:31.620 | Also, we have them to thank for lunch. So all the amazing lunch that we got.
03:37:36.120 | This whole thing is like self-funded, community funded. So we're very much flying by the seat of our pants.
03:37:41.120 | And also thank you to Laura for making all this happen.
03:37:44.120 | OK, so we have a couple more presentations from folks, just people like launching things.
03:37:50.120 | We got Drew, you're next, but Ben, I'm going to I'm going to call you up first.
03:37:54.120 | Ben, are you ready? I can get Drew to go first.
03:37:58.620 | Drew, Drew, you got Drew. The amazing thing about the thing that's Drew's demoing is, well, by definition, it works offline.
03:38:06.120 | And it's very, very viral. We we're just so lucky to have I mean, just for me to be friends with him
03:38:15.120 | and to invite him here to to show off the best way you can be reading papers.
03:38:20.120 | So usually we we come here, we do we demo B2B SaaS and infrastructure as a service.
03:38:25.620 | This is none of that. You want consumer hardware. We got consumer hardware. OK, go.
03:38:30.120 | Oh, all right. I have to still hype him up a little bit. What else?
03:38:34.120 | What else can I say about you? Drew's an insane violinist.
03:38:37.120 | If you if you like visit his house, like he lives in the House of Musicians
03:38:42.120 | and they just have classical music live all the time. It's insane. All right.
03:38:48.120 | Cool. Yeah. Sean is a is a very flattering hype man. Really incredible.
03:39:04.620 | Just a quick thanks to to latent space for for hosting this and for Sean, like being in.
03:39:11.620 | I think we met almost two years ago at a replica thing.
03:39:16.120 | And he's just like organized the entire scene in a way that makes it digestible for me
03:39:20.620 | and everyone else. Thanks to latent space. So I work for a company called Daylight Computer
03:39:26.620 | and we're making computers for serious people is one way that I put it.
03:39:32.620 | But we want to make a better reading experience for researchers specifically
03:39:38.620 | and a new surface for A.I. in our real lives.
03:39:44.120 | So how do we we haven't heard a whole lot about consumer applications of A.I. today,
03:39:48.620 | but I just want to show a demo, some demos we've been working on for how to integrate A.I.
03:39:55.620 | more comfortably into research workflows, especially reading papers.
03:40:00.620 | So I'll just quickly go over kind of what Daylight is.
03:40:05.620 | We invented a new screen technology that works just with the light in the room
03:40:11.120 | and has no blue light, better for eye strain, better for focus.
03:40:14.620 | And we wrote an operating system to run this screen on our first product,
03:40:19.620 | this tablet computer, the DC one, and it allows you to read outside
03:40:28.620 | or just read without distractions.
03:40:32.620 | So we've kind of made it impossible to get interrupted by notifications and other distractions.
03:40:39.120 | It's kind of like a Kindle and an iPad had a baby.
03:40:42.620 | So the kinds of things we're doing with A.I. are to kind of integrate directly
03:40:48.620 | with paper reading workflows.
03:40:51.620 | And I just have a quick demo that I can give here.
03:40:56.620 | It looks like we don't have sound, but I'll just narrate.
03:41:03.120 | Here it is.
03:41:11.620 | So the voice is going to be Anjan Kata, our founder, who invented the screen technology.
03:41:21.620 | It's really all right. I can just talk through it.
03:41:24.620 | So this is a poem, and often we want to go deeper into the text.
03:41:32.120 | So let's just -- there we go.
03:41:38.620 | So this is the Daylight reading Rilke's "The Man Watching" poem.
03:41:42.620 | One cool feature we have is we have a little button on the side
03:41:45.620 | that you can press at any time and then you can talk to an A.I.
03:41:48.620 | So I was a little bit confused by certain parts of the poem.
03:41:51.620 | For example, when it said,
03:41:53.620 | "What do they mean by 'we would become strong too' and 'not need names'?"
03:41:56.620 | Like, what does "not need names" mean?
03:42:01.120 | The phrase "not need names" suggests transcending individual identity.
03:42:06.620 | And so we just kind of, as we were going through the entire poem,
03:42:09.620 | we read it once and we kind of went back through.
03:42:11.620 | We were able to ask all these questions,
03:42:13.620 | and came away feeling like we understood it so much more.
03:42:17.620 | Another question we asked was,
03:42:19.620 | "Can you tell us more about what they mean by these wrestlers in the Old Testament
03:42:22.620 | and who the angel is?
03:42:24.620 | What biblical story is this referencing?"
03:42:28.620 | So we can go deeper into it.
03:42:30.620 | It's referencing the biblical story of Jacob wrestling with an angel found in Genesis 32.
03:42:36.120 | And that's just, like, incredibly cool that we're able to do this.
03:42:39.120 | I felt very touched by this poem.
03:42:41.120 | And so I could say something like,
03:42:43.120 | "Could you recommend a few other poems that mirror the themes of this one?"
03:42:51.120 | There we go. It just gave us five poems.
03:42:54.120 | These poems have four themes.
03:42:55.120 | "Resilient, Struggled, Personal, Gravitational, Challenged, and Simultaneous."
03:42:58.120 | This just feels so cool.
03:43:00.120 | I'm gonna go a little bit in here and add these poems to your device and read them.
03:43:03.620 | Yeah. So we want to bring that to research and to the entire device.
03:43:08.620 | So one thing that's an advantage of owning the entire stack,
03:43:12.620 | operating system, hardware, everything,
03:43:15.620 | is we can tailor the workflows across any apps.
03:43:19.620 | So this will work in a web browser, in your messages, email.
03:43:24.620 | No matter what apps you're running,
03:43:27.120 | there's a sort of central AI that is running on your device.
03:43:29.620 | It can capture that, put it in a knowledge graph,
03:43:32.620 | develop a model of you and your interests.
03:43:35.620 | And it's just available everywhere and with a hardware button.
03:43:39.620 | So that's what we're working on.
03:43:41.620 | If you're interested in these things,
03:43:43.620 | if you're interested in knowledge graphs on device or quantized models,
03:43:47.620 | come talk to us.
03:43:49.620 | And I actually have a couple of these here if people want to play with them.
03:43:53.620 | Thank you.
03:43:56.620 | Actually, they're sold out online, right?
03:43:59.120 | Yeah, we are. But we have just a couple.
03:44:02.120 | We're sold out online probably until the beginning of next year,
03:44:06.120 | like March, maybe.
03:44:08.120 | But we have like three here today.
03:44:11.120 | You want to buy them off of it?
03:44:13.120 | Yeah. $729.
03:44:20.120 | Canadian money.
03:44:22.620 | But yeah, try it. I mean, it's 60 FPS E-Ink.
03:44:26.120 | This stuff is not cheap.
03:44:28.120 | Give us a tech talk.
03:44:30.120 | I can do a quick Q&A.
03:44:32.120 | So it's a new twist.
03:44:35.120 | There are like six patents on top of essentially a Game Boy screen.
03:44:40.120 | So this is like 20 years of extra research
03:44:44.120 | and six patents on top of what's called RLCD or TLCD.
03:44:49.620 | So it's transflective or reflective LCD.
03:44:55.120 | So it's liquid crystal, but it has no backlight required.
03:44:59.120 | The sort of innovation is the reflectance and transflectance films
03:45:05.120 | and like stack, you know, black magic to reflect the light back through the LCD.
03:45:11.120 | So it's as fast as any computer monitor.
03:45:15.120 | It can run, you know, 120 frames per second.
03:45:18.620 | And in broad sunlight.
03:45:21.120 | And then we, in order to, the transflective part is
03:45:25.120 | how do you enable a backlight for nighttime use?
03:45:29.120 | And we developed a layer that allows us to put a blue light free LED
03:45:34.120 | that's like safe for your circadian health and suprachiasmatic nucleus and so on.
03:45:38.120 | So you're not like burning your eyes out at midnight reading.
03:45:42.120 | But it can come through similar to a normal computer screen.
03:45:46.620 | So that's more or less the secret sauce here.
03:45:50.120 | Sorry?
03:45:53.120 | No, it's very beta right now.
03:45:56.120 | But we're going to release it with a, yeah, we're building it.
03:46:00.120 | Yeah, it's going to be great. It's fun to play with.
03:46:04.120 | And if you want to, you know, come by and try writing on it or reading on it
03:46:08.120 | or like watching a video on it, just seeing how it feels, looks,
03:46:11.120 | I'm in the back there. So see you.
03:46:15.620 | There will be a phone, you know, laptop, monitor, all those things.
03:46:20.120 | Yeah, so last speaker.
03:46:36.120 | We have Ben from StrongCompute, founder of StrongCompute.
03:46:40.120 | I would say like one of those weird things where even though they're mostly a compute shop,
03:46:44.620 | they also do a fair amount of like deep research.
03:46:48.120 | This year, Ring Attention got a lot of attention from people for, like, scaling,
03:46:53.120 | I guess like distributed training.
03:46:56.120 | And we host a paper club. Like this is basically the in-person version
03:47:00.120 | of the online paper club that we've been running for two years.
03:47:03.120 | And the single best presentation, one of my favorite presentations of the year,
03:47:06.120 | was from StrongCompute. So really grateful for you guys.
03:47:11.120 | All right. Hey, folks.
03:47:14.620 | Let's just get on the screen.
03:47:19.620 | Yeah. Allow. Allow.
03:47:22.620 | Did I get you to zoom? I didn't. I don't think I did.
03:47:25.620 | You didn't get me to zoom. Show me to zoom.
03:47:27.620 | Are you in the Discord?
03:47:29.620 | I can pop that up.
03:47:31.620 | Discord. Go to the Latent Space Discord live.
03:47:34.620 | Yeah. I'll navigate faster.
03:47:38.120 | So where's the zoom here?
03:47:40.620 | Am I going down or up?
03:47:43.620 | I'm going down. There we go.
03:47:46.620 | Zoom. This is mostly just because I want to capture your screen for the recording.
03:47:52.620 | Oh, also I'm wearing this thing.
03:47:55.620 | This is for the swag swap. The swag table is back there.
03:47:59.620 | Yeah. There's a lot more swag people want.
03:48:04.120 | Okay. I think we're good.
03:48:07.620 | That's it.
03:48:09.620 | Cool. All right.
03:48:12.620 | All right. So we'll keep this informal.
03:48:23.620 | So I'll just leave this up here like this.
03:48:25.620 | So what we're trying to do is make clusters a lot easier to use.
03:48:29.620 | So for anyone who's tried accessing clusters for training,
03:48:33.120 | we're trying to be what you'd imagine an elite DevOps team would be.
03:48:36.620 | So here's kind of a feature list of some of the stuff we're going for.
03:48:40.620 | And this is the vibe that we're going for.
03:48:44.620 | So most people, like, we actually started out optimizing,
03:48:48.620 | well, we started out building compute hardware,
03:48:51.620 | cooling, our own power systems, stuff like that.
03:48:54.620 | Then we got into optimizing training time.
03:48:57.620 | So we're messing around with CUDA kernels and data loading and that kind of stuff.
03:49:02.120 | But all right, this is the hard part.
03:49:03.620 | This is the rocket science,
03:49:05.620 | like getting these much greater efficiencies on the GPU.
03:49:08.620 | Surely the easy part is just taking our work once it's done
03:49:11.620 | and just putting it on a cloud platform and having it go.
03:49:14.620 | And it turned out to be the complete opposite.
03:49:16.620 | We got a whole bunch of awesome optimizations done in a few months.
03:49:19.620 | And then it took us a couple of years
03:49:21.620 | to actually build a GPU orchestration platform that we wanted to use.
03:49:25.620 | And what we realized was that there was just a lot of things
03:49:28.620 | that you needed to solve.
03:49:31.620 | What I want to share with you today
03:49:33.120 | is something we've been working on for a year,
03:49:35.120 | which is a new UI for how you work with clusters.
03:49:39.120 | And so this is it here.
03:49:43.120 | So if you've got some compute,
03:49:47.120 | maybe AWS has given you some credits, that's really nice of them.
03:49:51.120 | So you've got a cluster over in US East.
03:49:55.120 | Maybe you've already got some stuff with GCP.
03:49:59.620 | Maybe you've got some on-prem as well.
03:50:03.120 | And in each of those regions, you've got some number of GPUs.
03:50:09.120 | And then you want to go and do things with them.
03:50:12.120 | You want to go train models on them.
03:50:15.120 | So the dataset stuff I'll get to in a sec,
03:50:19.120 | but you go and run a job.
03:50:23.120 | And that job's going to start on some cluster somewhere.
03:50:26.120 | So with our system, it's pretty much that easy.
03:50:29.120 | You don't need to worry about Linux, NVIDIA drivers, anything like that.
03:50:31.620 | You get a Docker container, you get root in the container.
03:50:33.620 | It can just jump on the cluster straight away.
03:50:35.620 | And so then a bunch of other people want to run jobs.
03:50:38.620 | And so then you end up backed up in a Slurm queue.
03:50:41.620 | And you go, "Hang on, don't we have all this other compute
03:50:43.620 | somewhere else in the world?
03:50:45.620 | What's it going to take to actually migrate our workloads
03:50:47.620 | to those other resources?"
03:50:49.620 | Well, what we've built is the ability to migrate
03:50:53.620 | those workloads like this.
03:50:57.120 | And what that means is that we can snapshot
03:51:01.620 | the state of a cluster live.
03:51:04.620 | We can move the entire cluster workload
03:51:06.620 | to another cloud provider or another region
03:51:10.620 | or another cluster on that cloud provider.
03:51:13.620 | We can do that incredibly quickly as well.
03:51:16.620 | I've got a little bit of a video showing some of our speed demos.
03:51:22.120 | We rewrote file transfer, block storage,
03:51:25.620 | like absolutely everything.
03:51:27.620 | That number, that 90 gigabytes a second,
03:51:29.620 | is going to be a lot higher soon.
03:51:31.620 | And we have the world's fastest container registry as well.
03:51:34.620 | So yeah, there's no more slow stuff on the cluster.
03:51:39.620 | It's also a lot cheaper than what you'd be used to paying for egress.
03:51:42.620 | So the vision for this is if you imagine a CPU
03:51:46.620 | and there's a scheduler and it's sending workloads
03:51:48.620 | out to cores on that CPU.
03:51:51.120 | Well, what if the whole world was the CPU?
03:51:52.620 | What if each core was actually a cluster in a data center?
03:51:55.620 | And you just felt like your computer
03:51:57.620 | had incredible amounts of power.
03:52:00.620 | And obviously, theoretically that's possible,
03:52:03.620 | but it's all about the APIs and the interface work.
03:52:05.620 | Normally, if you want to go and start sending workloads
03:52:10.620 | around the world, you've got to go and talk
03:52:12.620 | to the dozen DevOps people and they'll get started
03:52:16.620 | on a multi-week, multi-month project to do that.
03:52:20.120 | These transfers happen incredibly quickly
03:52:21.620 | and without the need for any DevOps work.
03:52:24.620 | We've also got a few other features here as well.
03:52:26.620 | So you'll have some data sets and they might be quite large.
03:52:31.620 | So maybe you've got a 10 terabyte data set
03:52:36.620 | and we'll pop that in the US to start with.
03:52:39.620 | We can't, oh, name's required.
03:52:43.620 | All right, 10.
03:52:47.120 | Just lose the screen.
03:52:48.620 | Open there, I think it's still on.
03:52:52.620 | Yeah, there we go.
03:52:54.620 | So I'll just pop a 10 terabyte data set in.
03:52:56.620 | So I can go pull that down to any cluster I want.
03:53:00.620 | And then I can go and I can do training on that data set,
03:53:05.620 | but I can also set up workstations.
03:53:07.620 | So one of the issues that we've seen people encounter is,
03:53:10.620 | yeah, I can get a GPU reasonably easily,
03:53:12.620 | but it's not going to have high-speed access to my data set.
03:53:17.620 | So here we can just set up as many nodes as we want with workstations.
03:53:21.120 | You can carve off as many GPUs as you'd like.
03:53:23.120 | And that'll be your container that you can work out of
03:53:27.120 | and have high-speed access.
03:53:29.120 | That's that 90 gigabyte a second access to that data set.
03:53:33.120 | And that way you can go, so this is the entire dev cycle.
03:53:38.120 | We're not doing any production inference hosting right now,
03:53:40.120 | but you can have fast access to your data sets from your dev container.
03:53:44.620 | You can train on the clusters very, very easily.
03:53:47.120 | What we want to do is eliminate any developer waiting time
03:53:51.120 | or any work that wasn't PyTorch.
03:53:53.120 | So what does that look like for some other examples?
03:53:57.120 | We can also, because we're able to save the state of a cluster,
03:54:00.120 | we're resilient to things like GPU failure,
03:54:02.120 | but we can also do this trick
03:54:04.120 | where we go and build a time-cycled space.
03:54:06.120 | And here we can actually just go and pack as many jobs as we want
03:54:09.120 | into the same space on a cluster
03:54:12.620 | and choose how often they're going to rotate.
03:54:15.120 | So maybe it's every three hours.
03:54:17.120 | Every job gets an hour every three hours.
03:54:19.120 | Maybe this job's more important.
03:54:21.120 | Maybe this job's so important and our cluster is so backed up
03:54:24.120 | that we just actually want to get this job some dedicated space.
03:54:27.120 | So we hit burst and we spin it up.
03:54:30.120 | And now we've actually gone and found resources on the cloud.
03:54:33.120 | So we're integrated with six clouds and that's growing.
03:54:37.120 | And you can plug your own cloud accounts in.
03:54:39.120 | You can use ours, whatever you like.
03:54:41.620 | It's a great management platform.
03:54:44.120 | And those time-cycled spaces,
03:54:46.120 | one of the really cool things about them
03:54:49.120 | is that what they can let you do
03:54:53.120 | is you can interrupt them anytime you want.
03:54:57.120 | So if you want to launch a cluster-scale workload,
03:55:00.120 | it's usually about 15 seconds to start one.
03:55:02.120 | So you dev on a GPU.
03:55:04.120 | You want a cluster to work on.
03:55:06.120 | We'll get you on in 15 seconds.
03:55:08.120 | You'll see if your stuff works at cluster scale.
03:55:11.120 | You'll get off.
03:55:12.620 | All the other jobs will keep going with that state saving.
03:55:15.620 | So we're trying to make GPUs feel very fluid.
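As a rough illustration of the time-cycled spaces just described, here is a toy sketch of weighted round-robin time-slicing. The job names, weights, and helper function are invented for illustration; this is not StrongCompute's actual API.

```python
# Toy sketch of time-cycled GPU sharing: several jobs share one cluster slot
# and each gets a slice of every rotation period, weighted by priority.
# Purely illustrative; not StrongCompute's API.
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    weight: int  # relative share of each rotation period


def slices(jobs: list[Job], period_hours: float) -> list[tuple[str, float]]:
    """Return (job_name, hours_per_period) for one rotation of the cycle."""
    total = sum(j.weight for j in jobs)
    return [(j.name, period_hours * j.weight / total) for j in jobs]


if __name__ == "__main__":
    jobs = [Job("exp-a", 1), Job("exp-b", 1), Job("important-run", 2)]
    # With a 3-hour rotation, the two small jobs get 45 minutes each and the
    # important one gets 90 minutes, then the cycle repeats.
    for name, hours in slices(jobs, period_hours=3.0):
        print(f"{name}: {hours:.2f}h per 3h cycle")
```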
03:55:18.620 | So this is our platform.
03:55:20.620 | We're running events about every month.
03:55:23.620 | There's one coming up this...
03:55:27.620 | Let me jump to it.
03:55:29.620 | Yeah, this Friday and Saturday
03:55:32.620 | for any folks who are heading home early from Europe.
03:55:35.620 | You get to play with our platform there.
03:55:39.120 | People go and train chess models on it, basically.
03:55:41.620 | Last event, we got up to 1,600 GPUs.
03:55:44.620 | People were really having a lot of fun with it.
03:55:47.620 | And so that's what's happening this Friday
03:55:50.620 | and every month or so.
03:55:52.620 | I don't know if I can...
03:56:01.620 | Yeah, I saw this hackathon and I was wondering
03:56:04.620 | what the expected ELO of a chess bot
03:56:06.620 | that you can train in a day is.
03:56:08.620 | We were curious about that as well.
03:56:10.120 | So this is the sixth time we've run this.
03:56:13.120 | The first time your child
03:56:16.120 | who's just learned the rules of chess
03:56:18.120 | could probably defeat most of the models.
03:56:20.120 | But around the fourth time we'd run it,
03:56:22.120 | I think they were probably getting up
03:56:24.120 | around 1,500 ELO.
03:56:26.120 | People were really getting a lot better
03:56:28.120 | at figuring out how to do this stuff in a weekend.
03:56:31.120 | And yeah, I think...
03:56:33.120 | I'm not the world's greatest chess player
03:56:35.120 | and right now the models will quite easily defeat me.
03:56:39.120 | And yeah, we're excited to see
03:56:41.620 | how much further this will go.
03:56:43.620 | So this is like a short-form way
03:56:45.620 | to access the platform.
03:56:47.620 | We also have research grants.
03:56:49.620 | So if you want access to Compute,
03:56:52.620 | we've given out about a dozen of these so far.
03:56:54.620 | And we'll give you like
03:56:56.620 | 10 to 100k worth of Compute.
03:56:58.620 | And you just have to be doing something cool,
03:57:00.620 | something interesting,
03:57:02.620 | something that SWIX would want to host at an event.
03:57:06.120 | And yeah, you can come and get some Compute.
03:57:08.620 | So we'll give you like about three months
03:57:10.620 | on the systems with that as well.
03:57:12.620 | We have AWS, GCP, Azure, Oracle,
03:57:29.120 | Oblivious, Lambda Labs, our own Metal,
03:57:31.620 | and we've got some more on the way.
03:57:34.620 | It's the same.
03:57:36.120 | Like the way our systems work
03:57:38.120 | is we've just got a Ubuntu image
03:57:40.120 | that has our agent in it.
03:57:42.120 | It's called Agent Smith.
03:57:44.120 | And it has our fast data puller,
03:57:46.120 | our sort of node clustering stuff,
03:57:48.120 | a whole bunch of health monitoring.
03:57:50.120 | Yeah, some of the stuff I didn't show you today.
03:57:52.120 | You get total financial control.
03:57:54.120 | You can set budgets across all providers.
03:57:56.120 | You go, "All right, this developer is allowed
03:57:58.120 | 10k a month or this project's allowed 100k."
03:58:00.120 | It will just tally up all the usage
03:58:02.620 | and then tell them when they need to stop.
03:58:04.120 | So, yeah, it's very, very multi-cloud.
03:58:11.120 | Can you repeat the question?
03:58:16.120 | Yeah, so the question was how we compare this
03:58:18.120 | with something like Modal.
03:58:20.120 | So Modal is a great environment
03:58:22.120 | where you want, you know,
03:58:24.120 | probably not cluster scale,
03:58:26.120 | so a smaller number of GPUs,
03:58:28.120 | and where you're willing to sort of convert
03:58:30.120 | a lot of your workflow to the way
03:58:32.120 | that Modal operates.
03:58:33.620 | What we've tried to do is make it so it's--
03:58:36.620 | we're sort of clusters first and foremost.
03:58:38.620 | Like, yes, you can have workstations,
03:58:40.620 | but it's really about larger clusters
03:58:42.620 | and making the data extremely fast
03:58:44.620 | so that you're not sitting around
03:58:46.620 | waiting for days for that.
03:58:48.620 | The other thing is we've tried to make it
03:58:50.620 | as close to how you operate now as possible.
03:58:52.620 | So, you know, Docker is, you know,
03:58:54.620 | it is the standard.
03:58:56.620 | We didn't use Docker to start with
03:58:58.620 | because we were concerned about the speed.
03:59:01.120 | We haven't applied a bunch of our speed tooling to it,
03:59:02.620 | and it's only about a 5% overhead at the moment.
03:59:05.620 | And so you get on our system,
03:59:07.620 | you get a Docker container,
03:59:09.620 | you get root in the container,
03:59:11.620 | you can do anything except install kernel modules.
03:59:13.620 | So you get a lot of freedom,
03:59:15.620 | and you don't have to convert
03:59:17.620 | to another way of doing things.
03:59:19.620 | Cool.
03:59:21.620 | Yeah.
03:59:23.620 | Cool. Well, thank you so much.
03:59:25.620 | We'll chat a little bit for lunch,
03:59:27.620 | and then we'll come back at 1 p.m.
03:59:30.120 | Thank you.
04:18:56.120 | How you can share your screen.
04:19:02.120 | Okay.
04:19:04.120 | I understand.
04:19:06.120 | And yes, the Corgi extension.
04:19:11.120 | And I am going to ask Gemini real quick.
04:19:16.120 | I'm going to ask Gemini real quick a couple of questions.
04:19:20.120 | So let's go ahead and get started.
04:19:25.120 | Hey, Gemini, could you explain to me --
04:19:28.620 | Could you explain to me how I might go about changing my runtime type in Colab?
04:19:34.620 | Yes, I can see the Colabs.
04:19:38.620 | Okay.
04:19:40.620 | I understand that you can see the Colab interface.
04:19:43.620 | Is there anything specific you would like to discuss or ask about it?
04:19:47.620 | Yep, yep, yep.
04:19:49.620 | So -- yep, yep, yep.
04:19:53.120 | Okay.
04:19:54.620 | Okay.
04:19:56.620 | Okay.
04:19:58.620 | Okay.
04:20:00.620 | Beautiful.
04:20:02.620 | Thank you.
04:20:04.620 | Cool.
04:20:06.620 | So, Gemini, could you explain to me how I would go about changing my runtime type?
04:20:12.620 | Sure, I can help with that.
04:20:14.620 | To change your runtime type in Google Colab,
04:20:16.620 | you need to go to the Runtime menu at the top of the page.
04:20:20.120 | Then you can select Change Runtime Type.
04:20:22.620 | A window will open where you can select the hardware accelerator you want.
04:20:26.620 | Amazing. This is great.
04:20:28.620 | So which one should I select if I wanted to train a model
04:20:32.620 | of the ones that are in white over here?
04:20:37.620 | Okay, I can help you with that.
04:20:39.620 | If you're looking to train a model and the options shown in the hardware accelerator section
04:20:43.620 | of the Change Runtime Type pop-up menu are the ones you have to choose from,
04:20:48.120 | the A4 GPU is a good choice for general model training.
04:20:50.620 | However, if your model training requires a lot of memory,
04:20:53.620 | you might want to try the A100 GPU instead.
04:20:58.620 | Amazing. Thank you.
04:21:00.620 | So, Gemini, how about this key section here off to the left?
04:21:07.620 | How would I go about adding a new key or a new secret?
04:21:13.620 | Certainly, I can assist with that.
04:21:16.120 | [Indiscernible]
04:21:34.620 | Just play the audio from this.
04:21:36.620 | Yeah, you can play the audio from the computer.
04:21:40.120 | [Indiscernible]
04:21:55.620 | So confusing.
04:21:57.620 | Any more questions?
04:21:59.620 | Is it this?
04:22:01.620 | That's just pure recording.
04:22:03.620 | He's here.
04:22:05.620 | Sorry about this.
04:22:08.120 | So when you play, is it coming from the laptop now?
04:22:10.620 | It's coming from the laptop.
04:22:14.620 | [Indiscernible]
04:22:33.620 | This is why we have to rehearse.
04:22:36.120 | [Indiscernible]
04:23:02.620 | Hey, Gemini.
04:23:05.120 | Can you help me understand how to add a new secret?
04:23:08.620 | Sure, I can help with that.
04:23:10.620 | Looking at the screen, you will need to click the add new secret button
04:23:13.620 | on the left side panel of the screen below the other secrets.
04:23:17.620 | After that, you can fill out the information for the new secret.
04:23:21.620 | Beautiful. Thank you so much.
04:23:23.620 | All right. So for Colab, how would I go about changing my runtime type?
04:23:33.120 | To change your runtime type in Colab, you can click on the runtime menu
04:23:35.620 | at the top of the page, and then select change runtime type.
04:23:39.620 | From there, you can choose between different options, such as GPU or TPU.
04:23:45.620 | Excellent. Explain to me what the difference between GPUs and TPUs are.
04:23:51.620 | Certainly. GPUs, or graphics processing units, are designed to handle
04:23:56.620 | parallel processing, which is ideal for tasks like rendering images and videos,
04:24:02.120 | and also machine learning. TPUs, or tensor processing units,
04:24:05.620 | are custom built by Google and specifically designed for deep learning tasks,
04:24:09.620 | making them often more efficient than GPUs for these tasks.
04:24:14.620 | Excellent. Thank you so much.
04:24:16.620 | How would I go about accessing the terminal in Colab?
04:24:21.620 | To access the terminal in your Colab notebook, you can click on the tools menu
04:24:25.620 | at the top of the page, and then select command palette.
04:24:29.120 | Then, type terminal in the search bar, and click on open a terminal.
04:24:33.620 | This will open a terminal within your Colab environment.
04:24:36.620 | Excellent. Very cool.
04:24:38.620 | And what command would I need to type in order to list all of the files
04:24:42.620 | in my current directory?
04:24:46.620 | To list all the files in your current directory, you can use the command ls.
04:24:51.620 | Type ls into the terminal and press enter.
04:24:56.120 | Are there any other config options that I should add for ls
04:24:59.620 | to list all of the files, not just the directories?
04:25:03.620 | Certainly. If you want to see all files, including hidden files,
04:25:07.620 | you can use the command ls -a.
04:25:10.620 | This will show all files in the current directory.
04:25:13.620 | If you would like to list all files in a detailed view,
04:25:16.620 | including their permissions, owner, size, and modification time,
04:25:19.620 | you can use ls -l.
04:25:23.120 | Also, combining both, ls -la will show a detailed list of all files,
04:25:28.620 | including hidden ones.
04:25:32.620 | That's cool.
04:25:34.620 | You're welcome.
04:25:37.620 | Awesome.
04:25:39.620 | Is there anything else I can help you with?
04:25:41.620 | Yeah, you know, Gemini, I would really prefer if you addressed me
04:25:44.620 | and started talking like you're a pirate, then.
04:25:48.620 | Ahoy, matey. What be your next question, then?
04:25:52.120 | Shiver me timbers, I'm ready for it.
04:25:54.620 | Amazing. Help me understand what the tools command palette
04:26:01.620 | helps me do within Colab.
04:26:05.620 | Aye, matey. The command palette be a handy tool for you.
04:26:09.620 | It be a way to quickly access a wide range of commands and actions in Colab.
04:26:13.620 | You can use it to add a Gemini prompt cell, add a code cell,
04:26:17.620 | change your runtime type, and many other things.
04:26:21.120 | Now I want you to talk like a Texan.
04:26:26.620 | Aye, matey. I'll gladly talk about the text.
04:26:29.620 | What be your question about it?
04:26:31.620 | No, I said please talk like a Texan.
04:26:33.620 | I want you to have a Texas accent like John Lee.
04:26:38.620 | Alrighty then, partner. What can I do for ya now?
04:26:41.620 | Let's talk about that there, text like a true Texan.
04:26:43.620 | What do you want to discuss?
04:26:45.620 | Cool. Thank you, Gemini.
04:26:50.120 | You're welcome, partner. Happy to help.
04:26:52.620 | Let me know if you need anything else, you hear?
04:26:55.620 | So in addition to being able to interact with screens,
04:26:59.620 | interact with video, interact with audio,
04:27:02.620 | Gemini is also capable of natively outputting image tokens,
04:27:06.620 | though we haven't released that via AI Studio today.
04:27:10.620 | We can also couple Gemini together with code execution,
04:27:14.620 | function calling, so you can do things like say,
04:27:18.120 | Gemini, watch my screen, and if you see anything
04:27:21.620 | that looks like a particular kind of notification
04:27:25.620 | or a particular kind of flag, set off a call to a specific tool.
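A hedged sketch of that tool-calling pattern using the google-generativeai Python SDK. The flag_notification tool and the model name are assumptions for illustration, and the always-on screen-watching version would go through the separate multimodal Live API rather than this plain chat call.

```python
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")


def flag_notification(kind: str, details: str) -> str:
    """Hypothetical tool the model can call when it spots a matching event."""
    print(f"ALERT [{kind}]: {details}")
    return "acknowledged"


# Plain Python functions can be passed as tools; the SDK derives their schemas.
model = genai.GenerativeModel("gemini-1.5-flash", tools=[flag_notification])
chat = model.start_chat(enable_automatic_function_calling=True)

# In the live demo this instruction would ride along with streamed screen frames;
# here we just send text to show the tool-call wiring.
response = chat.send_message(
    "If anything I describe looks like a failed-build notification, "
    "call flag_notification with what you saw. I just got: 'CI run #42 failed.'"
)
print(response.text)
```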
04:27:32.620 | I also like using it to help me as kind of like a posture corrector.
04:27:38.620 | So if I give Gemini access to my camera, which you can see here,
04:27:46.120 | so it just sees me. Hello.
04:27:48.620 | It can tell me if my posture is correct or incorrect.
04:27:53.620 | So Gemini, I'm taking a look at my posture.
04:27:56.620 | Should I stand up straighter? Do I have good posture?
04:28:00.620 | What's your assessment?
04:28:03.620 | Okay. Looking at the video, your posture does seem slightly slumped.
04:28:08.620 | To improve it, you might try standing up straighter,
04:28:11.620 | pulling your shoulders back gently,
04:28:14.120 | keeping them aligned with your spine.
04:28:16.620 | Well, thank you. I appreciate it.
04:28:19.620 | But anyhow, it's helpful to have that open
04:28:22.620 | as you're kind of sitting at your desk to see.
04:28:26.620 | It supports different kind of system instructions,
04:28:29.620 | so you can add things like the speak like a pirate
04:28:32.620 | or speak like a Texan up there.
04:28:34.620 | And then there are also a few different voice options.
04:28:37.620 | So if I go back to the main menu,
04:28:41.120 | you can see kind of a few different voices
04:28:44.620 | that you can play around with to test out some of your favorites.
04:28:48.620 | We also have kind of an audio-and-text setting.
04:28:52.620 | So if you don't want to have audio out responses,
04:28:54.620 | you can swap to being just text.
04:28:57.620 | We released some starter apps as well
04:29:00.620 | that I encourage you to take a look at
04:29:02.620 | to help experiment with things like bounding boxes.
04:29:05.620 | So you can see Gemini kind of identify bounding box locations
04:29:10.620 | for different things.
04:29:12.120 | So, like, the shadow of an origami fox,
04:29:14.120 | the shadow of an origami armadillo,
04:29:17.120 | and then the armadillo and the fox off to the side.
04:29:21.120 | It's also capable of doing this for things like socks.
04:29:24.120 | So being able to sort and filter between different kinds of socks.
04:29:28.120 | And then also for different kinds of desserts.
04:29:33.120 | So if you want to have bounding boxes natively output,
04:29:36.120 | this is something that's supported not just with images,
04:29:39.620 | but also screen by screen for videos.
04:29:42.120 | Cool. So as I mentioned,
04:29:44.120 | you can get started with it today at aistudio.google.com.
04:29:48.120 | We also have a very generous free tier.
04:29:50.120 | So all of this is freely available for you to use today to try out
04:29:54.120 | and to also use if you want to create API keys
04:29:57.120 | as part of your applications.
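For reference, a minimal sketch of calling the model from Python with an AI Studio API key, in the spirit of the bounding-box starter app mentioned above. The model name, file name, and prompt are just examples, and the exact bounding-box format the model returns may differ.

```python
import google.generativeai as genai
import PIL.Image

genai.configure(api_key="YOUR_AI_STUDIO_KEY")  # key created in AI Studio

model = genai.GenerativeModel("gemini-1.5-flash")  # example model name

# Mixed text + image input, asking for bounding boxes as structured output.
image = PIL.Image.open("desserts.jpg")
response = model.generate_content([
    "List the desserts you see as JSON, each with a label and a 2D bounding box.",
    image,
])
print(response.text)
```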
04:29:59.120 | Awesome. Thank you.
04:30:09.120 | So the question was, can you speak to the agentic research?
04:30:13.620 | And I'm trying to debate now
04:30:15.620 | how much I can speak to without getting fired.
04:30:18.620 | But the agentic research,
04:30:22.620 | we did release a couple of different options,
04:30:24.620 | including a data science agent within Colab,
04:30:27.620 | which is just using kind of the commodity, generally available Gemini APIs.
04:30:31.620 | So you could test them out, use them for your projects today.
04:30:35.120 | We also released something called Project Mariner,
04:30:37.620 | which is focused around computer use,
04:30:39.620 | being able to interact with websites directly within the browser.
04:30:43.620 | Again, strongly encourage you to try out multimodal streaming API
04:30:47.620 | coupled with tools use.
04:30:49.620 | And you'll probably be able to get very close
04:30:51.620 | to the kinds of experiments that you saw released just via video.
04:30:55.620 | But those are the kinds of things that we're focusing on for agents,
04:31:00.120 | not just being able to understand and generate text and code,
04:31:04.620 | but also to be able to interact
04:31:06.620 | with these multimodal and streaming experiences.
04:31:11.620 | Yeah, let's do it.
04:31:22.620 | So let's see.
04:31:30.620 | AWS would be interesting.
04:31:35.620 | I will pull up the cloud interface, though,
04:31:45.620 | and we'll see how well it does.
04:31:47.620 | I've never tried this either.
04:31:49.620 | I've tried Colab before, but I've never tried the cloud interface.
04:31:52.620 | So let's see.
04:31:55.620 | [Inaudible]
04:31:57.120 | Hey, Gemini, could you help me understand
04:32:05.120 | how I would use one of the cloud models on this interface?
04:32:11.120 | Oh, I had switched it to only output just text, I think.
04:32:23.120 | Okay.
04:32:33.620 | Audio.
04:32:38.620 | Hey, Gemini, could you help me understand
04:32:51.120 | how I would use one of the cloud interfaces
04:32:54.620 | or one of the cloud models within the screen?
04:32:59.620 | For some reason, it's not wanting to have the audio outputs anymore.
04:33:13.620 | Let me try again and refresh.
04:33:17.620 | Yeah.
04:33:19.120 | Yeah, definitely yes.
04:33:28.120 | So it's recognizing this.
04:33:31.120 | But, hey, Gemini?
04:33:40.120 | For some reason, the audio isn't wanting to work for me anymore.
04:33:45.620 | And I'm curious why that would be.
04:33:48.120 | I don't think it's because of the website.
04:33:52.120 | Oh, it might be on here.
04:33:55.120 | Yeah, because if you see --
04:34:01.120 | I think that's live demos for us.
04:34:04.120 | No, no, no.
04:34:06.120 | It's saying that, for whatever reason,
04:34:08.120 | it's not wanting to understand the audio anymore.
04:34:11.120 | [Inaudible]
04:34:13.120 | Yeah.
04:34:16.120 | [Inaudible]
04:34:17.620 | Wow, this is very strange, YouTube.
04:34:24.620 | Josh Johnson, the flower's to work.
04:34:27.620 | Yeah, it's not wanting to output the volume.
04:34:30.620 | Let me check.
04:34:32.620 | Do you want me to check?
04:34:34.620 | No, no, I think the next speaker doesn't have audio.
04:34:37.620 | Yeah. Cool, cool.
04:34:39.620 | Try it out.
04:34:41.620 | Thanks for showing. Thanks for the Q&A.
04:34:44.120 | Let me --
04:34:45.620 | And also, I encourage you all to go try it out
04:34:48.620 | with websites that are not Colab.
04:34:51.620 | And do it, like, right now with your laptop.
04:34:55.620 | So it should be fun.
04:34:57.620 | Awesome.
04:34:59.620 | Cool. Thank you.
04:35:01.620 | Now I'm trying to understand --
04:35:07.620 | Oh, yes.
04:35:09.620 | I'm trying to understand why the --
04:35:13.120 | Oh, it's working.
04:35:14.620 | So let me --
04:35:16.620 | I'm going to try it.
04:35:18.620 | Okay. Next speaker session.
04:35:20.620 | So we've got some pretty fun ones.
04:35:23.620 | The next two speakers,
04:35:25.620 | we've got Eugene and Dan
04:35:27.620 | from Together and Featherless.
04:35:29.620 | They're going to talk about
04:35:31.620 | alternative transformer architecture,
04:35:33.620 | so what goes on outside of the transformer.
04:35:36.620 | Eugene is a member of our paper club every week,
04:35:39.620 | core member there.
04:35:42.120 | He's got a whole article about hardware scaling
04:35:44.620 | and how it's VC subsidized.
04:35:46.620 | Hot Take, he's launched, like,
04:35:48.620 | one of the best 34B models.
04:35:50.620 | They're dropping it today.
04:35:52.620 | So their company has just trained
04:35:54.620 | some of the best models.
04:35:56.620 | Dan is with Together, same situation.
04:35:58.620 | Super smart people.
04:36:00.620 | They've also shipped a lot of this stuff.
04:36:02.620 | So we have a little bit of an overview
04:36:04.620 | of state space models,
04:36:06.620 | overview of what's going on
04:36:08.620 | outside of transformers.
04:36:11.120 | They're going to share their hot takes,
04:36:12.620 | do a bit of a Q&A.
04:36:14.620 | So give it up for them.
04:36:16.620 | [no audio]
04:36:39.120 | >> The one that says test only?
04:36:41.120 | >> Yep.
04:36:43.620 | [no audio]
04:37:05.120 | >> All right, cool.
04:37:07.120 | Yeah, so thanks so much for having us.
04:37:09.120 | So this is going to be a little bit
04:37:11.120 | of a two-part presentation.
04:37:13.120 | My name is Dan.
04:37:14.620 | I'm at Together AI,
04:37:16.620 | and I'll be joining UCSD as faculty
04:37:18.620 | in about a year.
04:37:20.620 | Eugene, you want to introduce yourself?
04:37:22.620 | >> Yeah, I'm Eugene.
04:36:24.620 | I lead the RWKV team,
04:37:26.620 | and I'm CEO and co-founder
04:37:28.620 | of Featherless, and we both work
04:37:30.620 | on this new post-transformer
04:37:32.620 | architecture space.
04:37:34.620 | >> Yeah, so today we're really
04:37:36.620 | excited to talk to you a little bit
04:37:38.620 | about that.
04:37:40.620 | So first I'm going to give a broad
04:37:42.620 | overview of post-transformer
04:37:44.120 | architectures, and then afterwards
04:37:46.120 | Eugene will tell us a little bit
04:37:50.120 | about the latest and the greatest
04:37:52.120 | frontier models
04:37:52.120 | in this space.
04:37:54.120 | So the story starts with scaling.
04:37:56.120 | So this is probably a figure
04:37:58.120 | or something like this that you've
04:38:00.120 | seen very recently.
04:38:02.120 | Over the last five to six years,
04:38:04.120 | we've seen models really scale up
04:38:06.120 | in parameter size, and that's
04:38:08.120 | brought with it a bunch of new
04:38:10.120 | capabilities like the ability to
04:38:12.120 | scale your WS screens.
04:38:13.620 | But another place where we've seen
04:38:15.620 | scaling, especially recently, is
04:38:17.620 | scaling in context length.
04:38:19.620 | So this can mean just having more
04:38:21.620 | text inputs for your models, but
04:38:23.620 | it can also mean things like
04:38:25.620 | taking a lot of visual token
04:38:27.620 | inputs, image inputs to your
04:38:29.620 | models, or generating lots of
04:38:31.620 | outputs.
04:38:33.620 | And one thing that's been really
04:38:35.620 | exciting over the last few months
04:38:37.620 | or so is that we're seeing scaling
04:38:39.620 | not only during training time, but
04:38:41.620 | this is one of the -- this is the
04:38:43.120 | iconic image from the OpenAI o1
04:38:45.120 | release. Not only are we starting
04:38:47.120 | to scale train time compute, but
04:38:49.120 | we're also starting to scale test
04:38:51.120 | time compute. Now, if you're
04:38:53.120 | familiar with our attention and
04:38:55.120 | our transformer architectures
04:38:57.120 | today, this graph on the right
04:38:59.120 | might look a little bit scary.
04:39:01.120 | And one of the reasons is that the
04:39:03.120 | implications are a little bit
04:39:05.120 | interesting. So what does it mean
04:39:07.120 | if we want to continue having
04:39:09.120 | smarter and smarter models? Do we
04:39:11.120 | need bigger and bigger data
04:39:12.620 | centers, spending more flops? Is
04:39:14.620 | this -- this little DALL-E 3,
04:39:16.620 | "we need more flops" guy, is this
04:39:18.620 | going to be the future of all of
04:39:20.620 | AI? Or is there a better way,
04:39:23.620 | another path forward? Maybe we
04:39:25.620 | can get the same capabilities that
04:39:27.620 | we've gotten used to, but for a
04:39:29.620 | lot less compute, a lot less
04:39:31.620 | flops. And one of the things that
04:39:33.620 | we're going to talk about today is
04:39:35.620 | specifically looking at that core
04:39:37.620 | attention operator in some of
04:39:40.120 | these examples. The reason is
04:39:41.620 | that -- so this is just some
04:39:43.620 | basic, you know, scaling curves,
04:39:45.620 | but attention has compute that
04:39:47.620 | scales quadratically in the
04:39:49.620 | context length. So that means
04:39:51.620 | that if you're doing something
04:39:53.620 | like test time compute and you
04:39:55.620 | want to spend a bunch of tokens
04:39:57.620 | thinking about what comes next,
04:39:59.620 | the longer that goes, the more
04:40:01.620 | tokens you spend on that, that
04:40:03.620 | compute grows quadratically in
04:40:05.620 | that. One of the questions that
04:40:07.620 | we're interested in is can we
04:40:09.620 | scale from the bottom and get it
04:40:11.120 | to scale better? Can we scale
04:40:13.120 | in, let's say, n to the three
04:40:15.120 | halves or n log n? And so in the
04:40:17.120 | first part of the talk, so we
04:40:19.120 | just went over the introduction,
04:40:21.120 | what I'm going to do over the
04:40:23.120 | next few slides is just talk
04:40:25.120 | about some of the key advances
04:40:27.120 | and ideas that have shown over
04:40:29.120 | the past few years since maybe
04:40:31.120 | early 2020 to now that have shown
04:40:33.120 | promise that this might actually
04:40:35.120 | be possible, that you can
04:40:37.120 | actually get potentially the same
04:40:39.120 | results. So to do that, and
04:40:40.620 | basically the story that we're
04:40:42.620 | going to look is we're going to
04:40:44.620 | start to see how, so this is a
04:40:46.620 | basic graph of just the past
04:40:48.620 | couple years of progress of
04:40:50.620 | perplexity where that blue line,
04:40:52.620 | that dotted blue line is
04:40:54.620 | attention, it's your basic
04:40:56.620 | transformer, full dense
04:40:58.620 | attention, and then the dots
04:41:00.620 | coming down are some of the
04:41:02.620 | methods that you'll see in this
04:41:04.620 | presentation today. We're going
04:41:06.620 | to turn the clock back all the
04:41:08.620 | way around, and we're going to
04:41:10.120 | look at how to make attention
04:41:12.120 | subquadratic. Basically, as soon
04:41:14.120 | as we said attention is all you
04:41:16.120 | need, people started asking this
04:41:18.120 | question. So we have this
04:41:20.120 | quadratic attention operator, can
04:41:22.120 | we do better? I'll briefly talk
04:41:24.120 | about why attention is
04:41:26.120 | quadratic, and the basic thing
04:41:28.120 | that happens, if you're not
04:41:30.120 | familiar, is that you have these
04:41:32.120 | inputs, these keys and queries,
04:41:34.120 | and what you do in this attention
04:41:36.120 | matrix, this S matrix over here,
04:41:38.120 | what happens when you upload, say,
04:41:39.620 | a book -- maybe not to Gemini, because we don't
04:41:41.620 | necessarily know what the architecture
04:41:43.620 | is, but let's say we upload it to
04:41:45.620 | Llama -- what happens behind the
04:41:47.620 | scenes is that it's going to take
04:41:49.620 | every single word in that book and
04:41:51.620 | compare it to every other word.
04:41:53.620 | And this has led to some pretty
04:41:55.620 | impressive things, but it's kind
04:41:57.620 | of a brute forcing of the way that
04:41:59.620 | you would try to interpret
04:42:01.620 | something. What attention does in
04:42:03.620 | particular is the, and then what
04:42:05.620 | attention, sorry, don't want to,
04:42:07.620 | no laser pointer, what attention
04:42:09.120 | does afterwards is that instead of
04:42:11.120 | always operating in this quadratic
04:42:13.120 | thing, it takes a row-wise soft
04:42:15.120 | max over this matrix and then
04:42:17.120 | multiplies it by this values matrix.
04:42:19.120 | So one of the key points to notice
04:42:21.120 | is that the output size is always
04:42:23.120 | going to be the same as the inputs,
04:42:25.120 | at least in standard self-attention.
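To make the quadratic cost concrete, here is a minimal NumPy sketch of standard softmax self-attention (single head, no masking or batching; the names and shapes are illustrative, not taken from any particular codebase):

```python
import numpy as np

def softmax_attention(Q, K, V):
    """Standard self-attention: the score matrix S = Q K^T is n x n,
    so compute and memory grow quadratically in the sequence length n."""
    n, d = Q.shape
    S = Q @ K.T / np.sqrt(d)                    # (n, n) attention scores
    P = np.exp(S - S.max(axis=-1, keepdims=True))
    P = P / P.sum(axis=-1, keepdims=True)       # row-wise softmax
    return P @ V                                # (n, d): same size as the input

n, d = 1024, 64
Q, K, V = (np.random.randn(n, d) for _ in range(3))
Y = softmax_attention(Q, K, V)                  # doubling n quadruples the n x n score matrix
```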
04:42:27.120 | So one of the first things that
04:42:29.120 | folks tried to do around 2020 is
04:42:31.120 | this thing called linear attention,
04:42:33.120 | which is just noticing that this
04:42:35.120 | softmax here sits in the middle of
04:42:37.120 | the attention operation. If we
04:42:38.620 | take it out, then
04:42:40.620 | if you compute the keys and the
04:42:42.620 | values operation first, you
04:42:44.620 | actually never hit this quadratic
04:42:46.620 | bottleneck. So that's potentially
04:42:48.620 | a way to get a lot more
04:42:50.620 | computationally efficient. And
04:42:52.620 | there are various ways to do this
04:42:54.620 | by basically using feature maps or
04:42:56.620 | trying to approximate this overall
04:42:58.620 | attention computation. But some of
04:43:00.620 | this work sort of started to hit a
04:43:02.620 | wall in 2020. And the basic
04:43:04.620 | challenges were two. So one was
04:43:06.620 | actually quality.
04:43:08.120 | Back then, it was kind of hard to
04:43:10.120 | get good quality with these linear
04:43:12.120 | attention operators. The other one
04:43:14.120 | was actually hardware efficiency.
04:43:16.120 | So this feature map that's just
04:43:18.120 | shown simplified here
04:43:20.120 | actually ends up being quite
04:43:22.120 | computationally expensive if you
04:43:24.120 | just implement it naively. So you
04:43:26.120 | started having these operators that
04:43:28.120 | not only were you not
04:43:30.120 | really sure if they had the same
04:43:32.120 | quality, but they were also
04:43:34.120 | just wall clock slower. So that kind of
04:43:36.120 | sets the stage for where we were
04:43:37.620 | four years ago.
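For reference, the reordering that linear attention relies on can be sketched as follows. This is a schematic only: it drops the softmax and uses the common elu+1 feature map purely as an illustration, not as the exact choice of any specific paper mentioned here.

```python
import numpy as np

def feature_map(x):
    # One common choice in the linear-attention literature: elu(x) + 1 (keeps values positive).
    return np.where(x > 0, x + 1.0, np.exp(x))

def linear_attention(Q, K, V):
    """Compute attention as phi(Q) [phi(K)^T V]: building the (d, d) matrix
    phi(K)^T V first means the n x n score matrix never materializes,
    so the cost is O(n * d^2) instead of O(n^2 * d)."""
    Qf, Kf = feature_map(Q), feature_map(K)
    KV = Kf.T @ V                        # (d, d): no n x n matrix anywhere
    Z = Qf @ Kf.sum(axis=0)              # (n,): per-row normalizer
    return (Qf @ KV) / Z[:, None]

n, d = 4096, 64
Q, K, V = (0.1 * np.random.randn(n, d) for _ in range(3))
Y = linear_attention(Q, K, V)            # linear in n, at the cost of approximating softmax
```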
04:43:39.620 | Keep this in mind because linear
04:43:41.620 | attention is actually going to come
04:43:43.620 | back in a few years once we have a
04:43:45.620 | better understanding. But one of
04:43:47.620 | the works that started kicking off
04:43:49.620 | this mini revolution in
04:43:51.620 | post-transformer architectures was
04:43:53.620 | this idea called state-space model.
04:43:55.620 | So here the seminal work is S4,
04:43:57.620 | our work from 2022. And this
04:44:00.620 | piece of work really brought
04:44:02.620 | together a few ideas from some
04:44:05.120 | long-running research lines of
04:44:07.620 | work. The first one was -- and
04:44:12.620 | this is really one of the keys to
04:44:14.620 | closing the gap in quality -- was
04:44:16.620 | just using things that if you talk
04:44:19.620 | to an electrical engineer off the
04:44:22.620 | street, they might know off the
04:44:25.620 | back of their hand. But taking
04:44:27.620 | some of those properties with how
04:44:29.620 | we model dynamical systems in
04:44:31.620 | signal processing and then using
04:44:34.120 | those ideas to model the inputs,
04:44:36.620 | the text tokens in, for example, a
04:44:39.620 | transformer-like next token
04:44:41.620 | prediction architecture. So some
04:44:43.620 | of those early state-space model
04:44:45.620 | papers were looking at this
04:44:47.620 | relatively simple recurrent update
04:44:49.620 | model that comes from maybe chapter
04:44:51.620 | one of a signal processing class,
04:44:53.620 | but then using some principle
04:44:55.620 | theory about how you should do that
04:44:57.620 | recurrent update in order to really
04:45:00.620 | get the most that you can out of
04:45:03.120 | your hidden state out of your
04:45:04.620 | sequence. So that was one key idea
04:45:07.620 | for quality. And when this was
04:45:10.620 | eventually realized, you started
04:45:12.620 | to see a bunch of benchmarks that
04:45:14.620 | were pretty sticky for a few years,
04:45:16.620 | things like long-range arena, some
04:45:18.620 | long-sequence evaluation
04:45:20.620 | benchmarks, there was stuff in
04:45:22.620 | time-series analysis. You started
04:45:26.620 | to see the quality tick up in
04:45:28.620 | meaningful ways. But the other key
04:45:32.120 | thing that was so influential about
04:45:34.620 | these state-space models is that
04:45:36.620 | they also had a key idea about how
04:45:38.620 | you can compute these things
04:45:40.620 | efficiently. So if you go back to
04:45:42.620 | your machine learning 101 class
04:45:44.620 | where you learned about RNNs, one
04:45:46.620 | thing that you may have learned is
04:45:48.620 | that they don't parallelize as well as
04:45:50.620 | attention, because if you just run
04:45:52.620 | them naively, you have to do this
04:45:54.620 | kind of sequential update to
04:45:56.620 | process new tokens, whereas in
04:45:58.620 | attention, you can process all the
04:46:01.120 | tokens in parallel. One of the key insights
04:46:02.620 | behind the S4 paper was that these
04:46:04.620 | recurrent models -- you could take
04:46:06.620 | them, and you could also formulate
04:46:08.620 | them as a convolution. And in
04:46:10.620 | particular, with a convolution,
04:46:12.620 | instead of using a PyTorch conv1D
04:46:14.620 | operation, you can compute that
04:46:16.620 | with the FFT. And that would give
04:46:18.620 | you n log n compute in the
04:46:20.620 | sequence length n with an operator
04:46:22.620 | that was relatively well optimized
04:46:24.620 | for modern hardware. So those are
04:46:26.620 | really, I'd say, the two key ideas
04:46:29.120 | that started allowing these
04:46:30.620 | breakthroughs to happen in these
04:46:32.620 | non-transformer architectures. So
04:46:34.620 | these ideas about how to
04:46:36.620 | principally model -- sorry, how to
04:46:38.620 | model the recurrent updates of a
04:46:40.620 | sequence in a principled way, and
04:46:42.620 | also these key ideas in how you
04:46:44.620 | can compute it efficiently by
04:46:46.620 | turning it into a convolution and
04:46:48.620 | then scaling it up with the FFT.
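The recurrence-as-convolution idea can be sketched in a few lines: unroll a toy linear recurrence into its convolution kernel, then apply it with an FFT in O(n log n). This is a scalar-state toy under simplified assumptions, not the actual S4 parameterization:

```python
import numpy as np

def ssm_kernel(a, b, c, n):
    """Unroll the linear recurrence h_t = a*h_{t-1} + b*x_t, y_t = c*h_t
    into its convolution kernel k = [c*b, c*a*b, c*a^2*b, ...] of length n."""
    return c * b * (a ** np.arange(n))

def causal_conv_fft(u, k):
    """Causal convolution y[t] = sum_{s<=t} k[s] * u[t-s], computed in O(n log n)
    by zero-padding to length 2n and multiplying in the Fourier domain."""
    n = len(u)
    U = np.fft.rfft(u, 2 * n)
    K = np.fft.rfft(k, 2 * n)
    return np.fft.irfft(U * K, 2 * n)[:n]

n = 8
u = np.random.randn(n)
k = ssm_kernel(a=0.9, b=1.0, c=0.5, n=n)
y_fft = causal_conv_fft(u, k)

# Same result from running the recurrence step by step (the "RNN view").
h, y_rec = 0.0, []
for x in u:
    h = 0.9 * h + 1.0 * x
    y_rec.append(0.5 * h)
assert np.allclose(y_fft, np.array(y_rec))
```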
04:46:50.620 | Along those same lines, so
04:46:54.620 | afterwards, we started putting out
04:46:57.120 | some work on specialized kernels.
04:46:58.620 | So just like we have Flash
04:47:00.620 | attention for transformers, we
04:47:02.620 | also have works like Flash FFT
04:47:04.620 | conv. And if you look at these
04:47:06.620 | lines of work, oftentimes whenever
04:47:08.620 | you see a new architecture, you
04:47:10.620 | see a new primitive. One of the
04:47:12.620 | table stakes now is, do you have
04:47:14.620 | an efficient kernel so that you
04:47:16.620 | can actually get wall clock speed
04:47:18.620 | up? So by 2022, 2023, we were
04:47:20.620 | starting to have these models that
04:47:22.620 | had promising quality primitives
04:47:24.620 | and also promising wall clocks. So
04:47:26.620 | we were better than transformers
04:47:28.120 | in meaningful ways. That being
04:47:30.120 | said, there were still sometimes
04:47:32.120 | a quality gap, particularly for
04:47:34.120 | language modeling. And because
04:47:36.120 | language is so core to what we do
04:47:38.120 | in sequence modeling these days,
04:47:40.120 | the next key idea that I'm going
04:47:42.120 | to talk about is this idea of
04:47:44.120 | selection mechanisms. And this is
04:47:46.120 | basically an idea of -- so you
04:47:48.120 | have this recurrent state that
04:47:50.120 | you're keeping around that just
04:47:52.120 | summarizes everything that came
04:47:54.120 | before. And to get a good sequence
04:47:56.120 | model, the best thing you're able
04:47:57.620 | to do is have the model learn
04:47:59.620 | what's the best way to pick out
04:48:01.620 | pieces from that recurrent state.
04:48:03.620 | So one of the major ideas here in
04:48:05.620 | a line of work called H3, Hungry
04:48:07.620 | Hungry Hippos, and also these
04:48:09.620 | hyena models were -- one way you
04:48:11.620 | can do this is by just adding some
04:48:13.620 | simple element-wise gates. So
04:48:15.620 | versions of these ideas have been
04:48:17.620 | around for decades. If you squint
04:48:19.620 | at the LSTM paper, you can
04:48:21.620 | probably find this gating
04:48:24.120 | mechanism. But turns out you can
04:48:25.620 | take those old ideas, add them into
04:48:27.620 | these new state-space models, and
04:48:29.620 | then you can see quality start to
04:48:31.620 | pick up. If you've heard of the
04:48:34.620 | Mamba model, this also takes the
04:48:36.620 | selection to the next level by
04:48:38.620 | actually making some changes in
04:48:40.620 | that fundamental recurrent
04:48:42.620 | state-space update. So it's not only
04:48:44.620 | just this gating that happens
04:48:46.620 | around the SSM layer, but also you
04:48:48.620 | can actually make the A, B, C, D
04:48:50.620 | matrices of your state-space
04:48:53.120 | model, you can make them data
04:48:54.620 | dependent, which will allow you to
04:48:56.620 | even better select out different
04:48:58.620 | pieces from your hidden state
04:49:00.620 | depending on what you're seeing.
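Schematically, the selection idea looks something like the toy recurrence below, where the decay, input, and readout terms are computed from the current token so the fixed-size state is updated in a data-dependent way. This is only a sketch; the real Mamba parameterization, discretization, and kernels are considerably more involved.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def selective_scan(x, Wa, Wb, Wc, Wg):
    """Toy 'selective' recurrence: per-token gates a_t, b_t and readout c_t are
    functions of the input x_t, so the state update is data-dependent."""
    n, d = x.shape
    h = np.zeros(d)
    ys = []
    for t in range(n):
        a_t = sigmoid(x[t] @ Wa)        # how much of the old state to keep
        b_t = sigmoid(x[t] @ Wb)        # how much of the new input to write
        h = a_t * h + b_t * x[t]        # element-wise, data-dependent update
        c_t = x[t] @ Wc                 # data-dependent readout
        g_t = sigmoid(x[t] @ Wg)        # output gate around the SSM layer
        ys.append(g_t * (c_t * h))
    return np.stack(ys)

n, d = 16, 8
x = np.random.randn(n, d)
Wa, Wb, Wc, Wg = (0.1 * np.random.randn(d, d) for _ in range(4))
y = selective_scan(x, Wa, Wb, Wc, Wg)   # fixed-size state h, O(n) in sequence length
```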
04:49:02.620 | I'll also point out if you look at
04:49:04.620 | the bottom right of this figure,
04:49:06.620 | there's this little triangle with
04:49:08.620 | a GPU SRAM, GPU HBM, and this is
04:49:10.620 | just continuing that trend of when
04:49:12.620 | you have a new architecture, you
04:49:14.620 | also release it with a kernel to
04:49:16.620 | show that it is hardware
04:49:18.620 | efficient, that it can be hardware
04:49:21.120 | efficient everywhere. One of the
04:49:26.620 | next cool things that happened is
04:49:28.620 | once we had this understanding of
04:49:30.620 | these are the basic pieces, these
04:49:32.620 | are the basic principles behind
04:49:34.620 | some of the sequence models, linear
04:49:36.620 | attention actually started to come
04:49:38.620 | back. So earlier this year, there
04:49:40.620 | was a model called BASED from
04:49:42.620 | Simran Arora and some other folks
04:49:44.620 | that combined a more principled
04:49:46.620 | version of linear attention that --
04:49:49.120 | the two-second summary is that
04:49:50.620 | it used a Taylor approximation of
04:49:52.620 | the softmax attention, combined
04:49:54.620 | that with a simple sliding window
04:49:56.620 | attention and was starting to be
04:49:58.620 | able to expand the Pareto frontier
04:50:00.620 | of how much data can you recall
04:50:02.620 | from your sequence versus how
04:50:04.620 | small is your recurrent state size.
04:50:06.620 | So those orange dots at the top
04:50:08.620 | there are just showing smaller
04:50:10.620 | sequences that can recall more
04:50:12.620 | memory. And the last major thing
04:50:18.120 | or the last major idea I think
04:50:19.620 | that has been influential in this
04:50:21.620 | line of work and is very
04:50:23.620 | relatively late-breaking just a
04:50:25.620 | few months ago is just the basic
04:50:27.620 | idea that when you have these
04:50:29.620 | models that are fundamentally more
04:50:31.620 | efficient in the sequence length,
04:50:33.620 | you maybe don't want to prompt
04:50:35.620 | them or use them in exactly the
04:50:37.620 | same way. So this was a really
04:50:39.620 | cool paper called Just Read Twice
04:50:41.620 | also from Simran that basically
04:50:43.620 | said, hey, all these efficient
04:50:45.620 | models can process tokens so much
04:50:47.620 | more cheaply that they have unfair
04:50:49.120 | advantages compared to a simple
04:50:51.120 | transformer model. So take, for
04:50:55.120 | example, the standard use case
04:50:57.120 | of you have some long document,
04:50:59.120 | you're going to pass it in as
04:51:01.120 | input, and then you're going to
04:51:03.120 | ask some question about it. One
04:51:05.120 | problem you might imagine for a
04:51:07.120 | recurrent model where you have a
04:51:09.120 | fixed state size is, let's say
04:51:11.120 | that your article is very long
04:51:13.120 | and you're trying to ask about
04:51:15.120 | some really niche thing, it can be
04:51:17.120 | really hard for the model to know
04:51:18.620 | ahead of time what information to
04:51:20.620 | put into the hidden state. But
04:51:22.620 | these models are so much more
04:51:24.620 | efficient that you can do
04:51:26.620 | something really stupid like you
04:51:28.620 | can just put the document, write
04:51:30.620 | down the document, write down the
04:51:32.620 | question, write down the document
04:51:34.620 | again, and then write down the
04:51:36.620 | question again. And then this
04:51:38.620 | time, the second time that you go
04:51:40.620 | over that document, you know
04:51:42.620 | exactly what to look for. And the
04:51:44.620 | cool thing about this is that
04:51:46.620 | it really takes advantage of the
04:51:48.120 | more efficient architectures that
04:51:50.120 | we're having here. So one of the
04:51:52.120 | other, I think, influential ideas
04:51:54.120 | in this line of work is if you
04:51:56.120 | change the fundamental compute
04:51:58.120 | capabilities of your model and the
04:52:00.120 | way that it scales, you can
04:52:02.120 | actually start to query it at test
04:52:04.120 | time differently. And this
04:52:06.120 | actually, of course, goes back to
04:52:08.120 | those slides on test time compute.
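Concretely, the "Just Read Twice" idea mentioned above is just a change in prompt layout, along these lines (a hypothetical template for illustration; the paper's exact wording may differ):

```python
def just_read_twice_prompt(document: str, question: str) -> str:
    """Repeat the document after the question so a fixed-state recurrent model
    knows, on the second pass over the document, which details it needs to keep."""
    return (
        f"{document}\n\n"
        f"Question: {question}\n\n"
        f"{document}\n\n"
        f"Question: {question}\nAnswer:"
    )

print(just_read_twice_prompt("<long article text>", "What was the niche detail?"))
```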
04:52:10.120 | So while everybody's looking at,
04:52:12.120 | say, test time compute for big
04:52:14.120 | transformer models, I think the
04:52:16.120 | question is also how you can
04:52:17.620 | take those ideas and how they change
04:52:19.620 | with this next generation of
04:52:21.620 | models? So I'll just briefly
04:52:23.620 | summarize what some of those key
04:52:25.620 | ideas were and then talk and then
04:52:27.620 | show you briefly kind of what the
04:52:29.620 | state of the art is today. So the
04:52:31.620 | four key ideas are instead of just
04:52:33.620 | doing a simple linear attention
04:52:35.620 | approximation, instead take ideas
04:52:37.620 | that we know from other fields
04:52:39.620 | like signal processing, do a more
04:52:41.620 | principled approach to your
04:52:43.620 | modeling of the sequence. Another
04:52:45.620 | key idea throughout all these
04:52:47.120 | lines of work is you really want
04:52:49.120 | hardware and kernel support from
04:52:51.120 | day one. So even if your model
04:52:53.120 | is theoretically more efficient,
04:52:55.120 | if somebody goes and runs it and
04:52:57.120 | it's two times slower, one of the
04:52:59.120 | things that we've learned is that
04:53:01.120 | if you're in that situation, it's
04:53:03.120 | just going to be dead on arrival.
04:53:05.120 | So you want to be designing your
04:53:07.120 | architectures with the hardware
04:53:09.120 | in mind. One of the key machine
04:53:11.120 | learning ideas that has been
04:53:13.120 | important for the quality is just
04:53:15.120 | making sure that you encode
04:53:16.620 | different ways that you can select
04:53:18.620 | from your hidden state and really
04:53:20.620 | focus on that as a key decider of
04:53:22.620 | quality. And finally, I think one
04:53:24.620 | of the emerging new things for this
04:53:27.620 | line of work and something that's
04:53:29.620 | quite interesting is what are the
04:53:31.620 | right test time paradigms for these
04:53:33.620 | models? How do they change relative
04:53:35.620 | to what you might do for a standard
04:53:38.620 | transformer? I'll briefly end this
04:53:41.620 | section. So I've labeled this slide
04:53:44.120 | where we are yesterday because
04:53:45.620 | Eugene is going to talk about some
04:53:47.620 | new models that he released
04:53:49.620 | literally this morning. But as of
04:53:51.620 | yesterday, some of the really cool
04:53:53.620 | results out of these efficient
04:53:55.620 | alternative models were -- so AI21
04:53:57.620 | trained this hybrid MoE called
04:53:59.620 | Jamba that seems -- that is
04:54:01.620 | currently the state-of-the-art for
04:54:03.620 | these non-transformer
04:54:05.620 | architectures. There's this --
04:54:07.620 | NVIDIA and MIT put out this new
04:54:09.620 | diffusion model called SANA
04:54:12.120 | recently that one of their key
04:54:14.620 | observations is that you can take a
04:54:16.620 | standard diffusion -- transformer
04:54:18.620 | diffusion model, replace the layers
04:54:20.620 | with linear attention, and then that
04:54:22.620 | lets you scale to much larger
04:54:25.620 | images, much larger sequences more
04:54:29.620 | efficiently. And one thing that I
04:54:32.620 | don't think anybody would have
04:54:34.620 | called when -- a few years ago is
04:54:36.620 | that one of those gated SSM, gated
04:54:41.120 | state-space models, ended up on
04:54:42.620 | the cover of Science because a
04:54:44.620 | great group of folks went and
04:54:46.620 | trained some DNA models. So that's
04:54:48.620 | Michael Poli, Eric Nguyen from
04:54:51.620 | Stanford and the Arc Institute.
04:54:53.620 | So it's -- we're really at an
04:54:55.620 | exciting time in 2024 where these
04:54:57.620 | non-transformer, post-transformer
04:54:59.620 | architectures are showing promise
04:55:01.620 | across a wide range -- across a
04:55:04.620 | wide range of modalities, of
04:55:08.120 | applications, and of tasks. And
04:55:10.620 | with that, I'll pass it on to
04:55:11.620 | Eugene who can tell you a little
04:55:13.620 | bit about the latest and greatest
04:55:15.620 | with RWKV.
04:55:16.620 | >> Yeah. So --
04:55:18.620 | >> You're talking to here.
04:55:20.620 | >> Oh, I'm talking to here. Okay.
04:55:22.620 | So, yeah, two streams. Yeah. So I
04:55:24.620 | think one common question that we
04:55:26.620 | tend to get asked is what's the
04:55:28.620 | difference between RWKV and
04:55:30.620 | state space. So I think one of
04:55:32.620 | the key things to really
04:55:33.620 | understand, right, the difference
04:55:35.620 | between the two groups, right, is
04:55:37.620 | that we are actually more like an
04:55:39.120 | open-source, random internet meets
04:55:41.120 | academia kind of situation. Like
04:55:43.120 | most of us never wrote any paper,
04:55:45.120 | but we basically look at RNNs and
04:55:48.120 | linear attention, and when attention
04:55:50.120 | is all you need came out, and then
04:55:52.120 | we decided to, like, hey, there is
04:55:54.120 | a quadratic scaling problem. Why
04:55:56.120 | don't we try fixing that instead?
04:55:58.120 | So we end up developing our own
04:56:00.120 | branch, but we end up sharing
04:56:02.120 | ideas back and forth. And we do
04:56:04.120 | all this actively in Discord,
04:56:06.620 | GitHub, et cetera. This was so bad
04:56:09.120 | for a few years, right, that
04:56:11.120 | basically the average group's H
04:56:13.120 | index was so close to zero, right,
04:56:15.120 | AI actually came in and helped us
04:56:17.120 | write our first paper. Great. Now
04:56:19.120 | our H index is now three,
04:56:21.120 | apparently. But the thing is, like,
04:56:23.120 | a lot of these experiments led to
04:56:25.120 | results, and essentially we took
04:56:29.120 | the same ideas from linear
04:56:31.120 | attention, and we built on it. So
04:56:34.120 | to take a step back into, like,
04:56:36.120 | how does RWKV handle its own
04:56:37.620 | attention mechanic, and achieve
04:56:39.620 | the same goals of, like, O(n)
04:56:41.620 | compute, respectively, and in
04:56:44.620 | focus of our overall goal to make
04:56:46.620 | AI accessible to everyone, regardless
04:56:48.620 | of language, nation, or compute,
04:56:50.620 | that's our goal. We actually train
04:56:52.620 | our models primarily on over 100
04:56:54.620 | languages, which is another topic
04:56:56.620 | altogether, and our goal is to train
04:56:58.620 | to even 200 languages to cover all
04:57:00.620 | languages in the world. But at the
04:57:02.620 | same time, we work on this
04:57:05.120 | compute cost so that people can run
04:57:06.620 | it on Raspberry Pis and on anything.
04:57:09.620 | So, how did RWKV break the
04:57:12.620 | dependency of LSTM token flow?
04:57:14.620 | Because I think to understand
04:57:16.620 | architecture, right, it's probably
04:57:18.620 | easier to understand it from the
04:57:20.620 | RNN lens, because that's where we
04:57:22.620 | built on. Whereas state space kind
04:57:24.620 | of, like, tried to start anew, and
04:57:26.620 | took lessons from that, so there's
04:57:28.620 | a little bit of divergence there,
04:57:30.620 | aka this is our version of linear
04:57:33.120 | architecture. But if you step back,
04:57:34.620 | all foundation models, be it
04:57:36.620 | transformers or non-transformers, at
04:57:38.620 | a very high level, right, comes in
04:57:40.620 | a token, I mean, takes things into
04:57:42.620 | embeddings, and goes through a lot
04:57:44.620 | of layers, generate a lot of internal
04:57:46.620 | states, whether KV cache or RNN
04:57:49.620 | states or RWKV states, and outputs
04:57:51.620 | an embedding that we then do
04:57:53.620 | sampling on, and we just take more
04:57:55.620 | layers and more embeddings, and
04:57:57.620 | somehow that magically works. So,
04:57:59.620 | if you remember your ancient RNN
04:58:02.120 | lessons, which we call
04:58:03.620 | "Blessed Learning"
04:58:05.620 | these days, the general idea is
04:58:07.620 | that you have the embedding
04:58:09.620 | information flowing all the way up,
04:58:11.620 | and when, and you take that
04:58:13.620 | information, and you flow it back
04:58:15.620 | down, and then you process it as
04:58:17.620 | part of your LSTM layers. So, this
04:58:19.620 | is how it generally works. Karpathy
04:58:21.620 | is quoted as saying that RNNs are
04:58:23.620 | actually unreasonably effective.
04:58:25.620 | The problem is this is not scalable.
04:58:27.620 | To start doing work on the second
04:58:29.620 | token, you need to wait for the
04:58:31.620 | second token, yada yada. That is
04:58:33.120 | CPU land, not GPU land. So
04:58:35.120 | you can have an H100 and you
04:58:37.120 | can't even use 1% of it. So
04:58:39.120 | that's kind of why RNNs didn't
04:58:41.120 | really take off in the direction
04:58:43.120 | that we wanted, like, billions of
04:58:45.120 | parameters when it comes to training.
04:58:47.120 | So, what did RWKV version 0 do?
04:58:49.120 | We just did the dumbest, lamest
04:58:51.120 | thing. Sorry, this is the bottleneck
04:58:53.120 | for RNN. We did the dumb thing of
04:58:55.120 | removing that line. And it kind of
04:58:57.120 | worked. It trained, it sucked, but
04:58:59.620 | it kind of worked. Then we were
04:59:01.120 | like, hey, then no one cared
04:59:03.120 | because the loss was crap. But how
04:59:05.120 | do we improve that? And that's
04:59:07.120 | essentially where we move forward.
04:59:09.120 | Because if you see this kind of
04:59:11.120 | flow, right, you can actually get
04:59:13.120 | the, you can get your GPU saturated
04:59:15.120 | quickly where it essentially
04:59:17.120 | cascades respectively. So, I'm just
04:59:19.120 | waiting for this to loop again. So,
04:59:21.120 | it's like once you get your first
04:59:23.120 | layer, your token to be computed
04:59:25.120 | finished, you start to cascade your
04:59:27.120 | compute all the way until you
04:59:29.120 | saturate your GPU. So, we worked on it and
04:59:30.620 | we started going along the principle
04:59:32.620 | of as long as we keep this general
04:59:34.620 | architecture where we can cascade
04:59:36.620 | and be highly efficient with our
04:59:38.620 | architecture, nothing is sacred in
04:59:40.620 | our architecture. And we have done
04:59:42.620 | some crazy ideas. In fact, you ask
04:59:44.620 | us, if you ask me to explain some
04:59:46.620 | things in the paper, right,
04:59:48.620 | officially in the paper, I'll say
04:59:50.620 | we had this idea and we wrote it
04:59:52.620 | this way. The reality is someone
04:59:54.620 | came with a code, we tested it, it
04:59:56.620 | worked, and then we rationalized
04:59:58.620 | later. So, the general idea behind
05:00:01.120 | RWKV is that we generally have
05:00:04.120 | two major blocks that we do. We
05:00:06.120 | call it time mix and channel mix.
05:00:08.120 | And time mix generally handles
05:00:11.120 | long-term memory states where
05:00:13.120 | essentially, where we apply the
05:00:16.120 | matrix multiplication and SiLU
05:00:18.120 | activation functions into processing
05:00:20.120 | input embedding and then output
05:00:22.120 | embedding. I'm oversimplifying it
05:00:24.120 | because this calculation changed
05:00:26.620 | every version and we have like
05:00:28.120 | version 7 right now. Channel mix
05:00:30.120 | is similar to Bayes in the sense
05:00:32.120 | that it does shorter-term attention
05:00:34.120 | where you just look at the sister
05:00:36.120 | token or the token before it because
05:00:38.120 | there's a shift in the token shift
05:00:40.120 | matrix. I don't really want to go
05:00:42.120 | too much into the papers itself
05:00:44.120 | because we do have three papers on
05:00:46.120 | this. Basically, RWKV: RNN for the
05:00:48.120 | Transformer Era; Eagle and Finch:
05:00:50.120 | RWKV with Matrix-Valued States. Those are
05:00:52.120 | the updated version 5 and version 6.
05:00:54.620 | And Goldfinch is our hybrid model
05:00:57.120 | respectively. We are writing the
05:01:00.120 | paper already for V7, which is for
05:01:04.120 | RWKV7, codenamed Goose. All our
05:01:08.120 | architectures are codenamed by a
05:01:10.120 | bird. And I'm going to cover as
05:01:12.120 | well, QRWKV and Mamba-RWKV and
05:01:15.120 | RWKVMV. So where did that lead to?
05:01:18.120 | Because we were all GPU poor. And
05:01:21.120 | to be clear, most of this research
05:01:23.620 | is done only on a handful of H100s,
05:01:26.120 | which one Google researcher
05:01:28.120 | told me was his experiment
05:01:30.120 | budget for a single researcher. So
05:01:32.120 | our entire organization has less
05:01:34.120 | compute than a single researcher in
05:01:37.120 | Google. One of the things that we
05:01:39.120 | explored into was how do we convert
05:01:42.120 | Transformer models instead? Because
05:01:44.120 | someone already paid that million
05:01:46.120 | dollars onto training, so why don't
05:01:48.120 | we take advantage of those weights?
05:01:50.620 | And I believe Together AI worked on
05:01:53.120 | a similar conversion for the Mamba side of
05:01:56.120 | things, and we took some ideas from
05:01:58.120 | there as well, and we essentially did
05:02:00.120 | that for RWKV. And that led to
05:02:04.120 | QRWKV6, which we just dropped today,
05:02:06.120 | a 32B instruct preview model,
05:02:08.120 | where we took the Qwen 32B
05:02:10.120 | instruct model, freeze the feed
05:02:12.120 | forward layer, remove the QKV
05:02:15.120 | attention layer, and replace it with
05:02:17.120 | RWKV linear layers. So to be clear,
05:02:20.120 | this means we do not have the
05:02:21.620 | RWKV channel mix layer, we only have
05:02:23.620 | the time mix layer. But once we do
05:02:26.620 | that, we train the RWKV layer.
05:02:28.620 | Important is that the feed forward
05:02:30.620 | layer needs to be frozen, so the new
05:02:32.620 | attention can be learned. And then we
05:02:34.620 | unfreeze the feed forward layer and
05:02:36.620 | train all the layers together with a
05:02:38.620 | custom learning rate schedule so that
05:02:40.620 | they can learn how to work together.
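A rough PyTorch-style sketch of the two-stage conversion recipe described here: swap each attention block of a pretrained transformer for a new recurrent mixer, train only the new mixers while the feed-forward layers stay frozen, then unfreeze everything and train jointly at a lower learning rate. The toy modules, dummy objective, and step counts below are placeholders, not the actual QRWKV6 code:

```python
import torch
import torch.nn as nn

class LinearTimeMix(nn.Module):
    """Placeholder recurrent mixer standing in for an RWKV-style time-mix block."""
    def __init__(self, d):
        super().__init__()
        self.proj = nn.Linear(d, d)

    def forward(self, x):                               # x: (batch, seq, d)
        t = torch.arange(1, x.size(1) + 1, device=x.device).view(1, -1, 1)
        return self.proj(x.cumsum(dim=1) / t)           # causal mixing, linear in seq length

class Block(nn.Module):
    """Toy stand-in for one pretrained transformer block."""
    def __init__(self, d):
        super().__init__()
        self.attn = nn.MultiheadAttention(d, num_heads=4, batch_first=True)
        self.ffn = nn.Sequential(nn.Linear(d, 4 * d), nn.GELU(), nn.Linear(4 * d, d))

    def forward(self, x):
        a = self.attn(x, x, x)[0] if isinstance(self.attn, nn.MultiheadAttention) else self.attn(x)
        return x + self.ffn(x + a)

d = 64
model = nn.ModuleList([Block(d) for _ in range(2)])      # pretend this is the pretrained stack

# Stage 0: swap attention for the new mixer; freeze everything else so the
# feed-forward layers stay intact while the replacement "attention" is learned.
for blk in model:
    blk.attn = LinearTimeMix(d)
for p in model.parameters():
    p.requires_grad = False
for blk in model:
    for p in blk.attn.parameters():
        p.requires_grad = True

def run(x):
    for blk in model:
        x = blk(x)
    return x

x = torch.randn(2, 32, d)
opt = torch.optim.AdamW([p for p in model.parameters() if p.requires_grad], lr=1e-3)
for _ in range(3):                                       # stage 1: only the new layers learn
    loss = run(x).pow(2).mean()                          # dummy objective for the sketch
    loss.backward(); opt.step(); opt.zero_grad()

for p in model.parameters():                             # stage 2: unfreeze all layers and
    p.requires_grad = True                               # train jointly at a lower LR
opt = torch.optim.AdamW(model.parameters(), lr=1e-4)
for _ in range(3):
    loss = run(x).pow(2).mean()
    loss.backward(); opt.step(); opt.zero_grad()
```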
05:02:42.620 | The end result, surprisingly, and to
05:02:44.620 | be honest, to the frustration of the
05:02:46.620 | RWKV MOE team, which ended up
05:02:49.120 | on the same day, was that with just a
05:02:51.620 | few hours of training on two nodes, we
05:02:54.620 | managed to get it to be on par, kind of,
05:02:57.620 | with the original Qwen 32B model.
05:02:59.620 | So in fact, when the first run, right,
05:03:01.620 | that completely confused us. And I was
05:03:04.620 | telling Daniel Goldstein, who kind of
05:03:07.620 | leads most of our research coordination,
05:03:09.620 | "When you pitched me this idea, you
05:03:11.620 | told me at best you'll get the same
05:03:13.620 | level of performance. You didn't tell
05:03:16.120 | me that the MMOGrad score will shoot up."
05:03:19.620 | I don't know what's happening there,
05:03:21.620 | but it did. MMLU score dropping, that
05:03:24.620 | was expected, because if you think about
05:03:26.620 | it, when we were training all the layers,
05:03:28.620 | right, we were essentially, like,
05:03:30.620 | Frankensteining this thing, and we did
05:03:32.620 | brain damage to the feed forward network
05:03:34.620 | layer with the new RWKV layers, but
05:03:36.620 | 76%, hey, somehow it's retained, and we
05:03:38.620 | can probably further train this. We
05:03:40.620 | didn't even spend, like, three days
05:03:42.620 | training this, so there's a lot more
05:03:45.120 | that can be done, hence the preview.
05:03:46.620 | But this brings up a big question,
05:03:50.620 | because we are already now in the
05:03:52.620 | process of converting the 70B.
05:03:54.620 | This is actually extremely compute
05:03:56.620 | efficient to test our attention
05:03:58.620 | mechanic. It's like, it becomes a
05:04:00.620 | shortcut. We are already planning to
05:04:02.620 | do our version 7 and our hybrid
05:04:04.620 | architecture for it, because we don't
05:04:06.620 | train from scratch, and we get a really
05:04:08.620 | good model out of it. And the other
05:04:10.620 | thing that is uncomfortable to say
05:04:12.620 | is that because we are doing this right now
05:04:14.620 | on 70B, if this scales
05:04:18.120 | correctly to 128K context length -- I'm
05:04:20.120 | not even talking about a million, just 128K --
05:04:24.120 | the majority of enterprise workload today
05:04:27.120 | is just on 70B at under 32K context
05:04:30.120 | length. That means if this works and
05:04:33.120 | the benchmark matches it, it means we
05:04:35.120 | can replace the vast majority of
05:04:37.120 | current AI workload, unless you want
05:04:39.120 | super long context. And then, sorry,
05:04:41.120 | can someone give us more GPUs, because
05:04:43.620 | we do need the VRAM for super long
05:04:45.120 | context, sadly. So yeah, that's what
05:04:48.120 | we are working on, and essentially we
05:04:50.120 | are excited about this to just push it
05:04:52.120 | further. And this conversion process,
05:04:54.120 | to be clear, I don't think it's going
05:04:56.120 | to be exclusive to RWKV. It probably
05:04:59.120 | will work for Mamba as well. I don't
05:05:01.120 | see why not. And we will probably see
05:05:03.120 | more ideas, or more experiments, or
05:05:05.120 | more hybrids. Like, yeah, like one of
05:05:07.120 | the weirdest things that I wanted to
05:05:09.120 | say outright, and I confirmed this with
05:05:11.120 | the Black Mamba team and the Jamba
05:05:13.120 | team, because we did the Goldfinch
05:05:14.620 | hybrid model, is that none of us
05:05:16.620 | understand why a hybrid with a state space
05:05:19.620 | model, be it RWKV or state space, and
05:05:21.620 | transformer, performs better than the
05:05:25.620 | baseline of both. It's like, when you
05:05:28.620 | train one, you expect, and then you
05:05:30.620 | replace, you expect the same results.
05:05:32.620 | That's our pitch. That's our claim.
05:05:34.620 | But somehow when we jam both together,
05:05:36.620 | it outperforms both. And that's like
05:05:38.620 | one area of evaluation that, like, we
05:05:40.620 | only have four experiments, plus four
05:05:42.620 | hybrids, that a lot more needs to be
05:05:44.120 | done. But these are things that excite
05:05:46.120 | me, essentially, because that is what
05:05:48.120 | potentially we can move ahead for.
05:05:50.120 | Which brings us to what comes next.
05:05:53.120 | So this part is kind of just some,
05:06:03.120 | or we'll talk a little bit about stuff
05:06:05.120 | that we're excited about, maybe have
05:06:08.620 | a little wild speculation on what's
05:06:12.120 | coming next. And of course, this is
05:06:14.120 | also the part that will be more open
05:06:16.120 | to questions. So a couple things that
05:06:19.120 | I'm excited about is continued hardware
05:06:23.120 | model co-design for these models. So
05:06:27.120 | one of the things that we've put out
05:06:29.120 | recently is this library called
05:06:31.120 | Thunder Kittens. It's a CUDA library.
05:06:33.120 | And one of the things that we found
05:06:35.120 | frustrating is every time that we built
05:06:37.620 | something in PyTorch -- and I'm sure you had the exact
05:06:39.120 | same experience -- we'd have to go and
05:06:41.120 | spend two months in CUDA land writing
05:06:43.120 | these new efficient things. And if we
05:06:46.120 | decided to change one thing in PyTorch,
05:06:48.120 | like one line of PyTorch code is like
05:06:50.120 | a week of CUDA code, at least. So one
05:06:53.120 | of our goals with a library like
05:06:55.120 | Thunder Kittens, we just broke down
05:06:57.120 | what are the key principles, what are
05:06:59.120 | the key hardware things, what are the
05:07:02.120 | key compute pieces that you get from
05:07:04.120 | the hardware. So for example, on H100,
05:07:06.620 | everything really revolves around a warp
05:07:10.120 | group matrix multiply operation. So you
05:07:13.120 | really want your operation to be able
05:07:15.120 | to split into relatively small matrix
05:07:19.120 | matrix multiply operations. So like
05:07:21.120 | multiplying two 64 by 64 matrices, for
05:07:24.120 | example. And so if you know that ahead
05:07:26.120 | of time when you're designing your
05:07:28.120 | model, that probably gives you some
05:07:31.120 | information about how you set the state
05:07:33.120 | sizes, how you set the update, how you
05:07:35.620 | set the run time. So with Thunder
05:07:37.120 | Kittens, we basically built a whole
05:07:39.120 | library just around this basic idea
05:07:41.120 | that all your basic compute primitives
05:07:44.120 | should not be a float, but it should
05:07:46.120 | be a matrix and everything should just
05:07:48.120 | be matrix compute. And we've been
05:07:50.120 | using that to try to both re-implement
05:07:52.120 | some existing architectures and also
05:07:54.120 | start to design some new ones that are
05:07:56.120 | really designed with this core, with a
05:07:59.120 | tensor core primitive in mind. Another
05:08:02.120 | thing that we're, at least I'm excited
05:08:05.120 | about is we, over the last four or five
05:08:07.620 | years, we've really been looking at
05:08:09.620 | language models as the next thing. But
05:08:11.620 | if you've been paying attention to
05:08:13.620 | Twitter, there's been a bunch of new
05:08:15.620 | next generation models that are coming
05:08:17.620 | out. So there are video generation
05:08:20.620 | models that can run real time that are
05:08:23.620 | controlled by your mouse and your
05:08:25.620 | keyboard that I'm told if you play with
05:08:28.620 | them, they only have a few seconds of
05:08:31.620 | memory. Can we take that model? Can we
05:08:34.120 | do a super long context length so that
05:08:35.620 | you could actually maybe generate an
05:08:37.620 | entire game state at a time? What does
05:08:39.620 | that look like for the model? You're
05:08:41.620 | certainly not going to do a giant
05:08:43.620 | quadratic attention computation to try
05:08:45.620 | to run that. Maybe use some of these
05:08:48.620 | new models or some of these new video
05:08:50.620 | generation models that came out. So
05:08:52.620 | Sora came out two days ago now, but
05:08:55.620 | with super long queue times and super
05:08:57.620 | long generation times. So that's
05:08:59.620 | probably a quadratic attention
05:09:01.620 | operation at the bottom of it. What if
05:09:03.620 | I could do that and get the same
05:09:05.120 | quality, but a lot faster generation
05:09:07.120 | time or some of the demos that we saw
05:09:10.120 | from Paige earlier today? If I have a
05:09:13.120 | super long conversation with my Gemini
05:09:16.120 | bot, what if I wanted to remember
05:09:19.120 | everything that I've seen in the last
05:09:21.120 | week? I mean, maybe you don't for
05:09:24.120 | personal reasons, but what if I did?
05:09:26.120 | What does that mean for the
05:09:28.120 | architecture? And I think that's
05:09:30.120 | certainly something I'm pretty excited
05:09:32.620 | about it too. I think we were supposed
05:09:34.120 | to have some hot takes, but I honestly
05:09:36.120 | don't remember what our hot takes were.
05:09:38.120 | Yeah. Hot takes. Yes, these are our hot
05:09:42.120 | takes. I think the big one on Twitter
05:09:47.120 | that we saw, that we shared, was the
05:09:49.120 | question is like, is RAG relevant in
05:09:52.120 | the case of the future of state space
05:09:55.120 | models? Is what relevant?
05:09:58.120 | Right?
05:10:00.620 | Oh, sorry. I was like right or left?
05:10:03.120 | RAG. Hmm. Okay. We are live, so maybe
05:10:08.120 | I'll... Let's see. I haven't played too
05:10:15.120 | much with RAG, but when I have, I'll say
05:10:21.120 | I found it was a little bit challenging
05:10:25.120 | to do research on it because we had
05:10:28.120 | this experience over and over again
05:10:30.120 | where you could have an embedding model
05:10:33.620 | of any quality. So you could have a
05:10:35.620 | really, really bad embedding model, or
05:10:37.620 | you could have a really, really good
05:10:39.620 | one by any measure of good. And for the
05:10:41.620 | final RAG application, it kind of
05:10:43.620 | didn't matter. That's what I'll say
05:10:46.620 | about RAG while I'm being recorded. I
05:10:50.620 | know it doesn't actually answer the
05:10:52.620 | question, but...
05:10:53.620 | Yeah. So I think a lot of folks are
05:10:55.620 | like extremely excited of the idea of
05:10:58.120 | RWKV or state space potentially having
05:11:00.620 | infinite context. But I think the
05:11:03.620 | reality is that when we say infinite
05:11:05.620 | context, we just mean a different kind
05:11:07.620 | of infinite context, or as it's
05:11:09.620 | previously covered, you need to test the
05:11:11.620 | model differently. So think of it more
05:11:13.620 | along the lines of the human. Like, I
05:11:15.620 | don't remember what I eat for breakfast
05:11:17.620 | yesterday. Yeah, that's the statement
05:11:19.620 | that I'll say. And we humans are not
05:11:23.620 | quadratic transformers. If we did, if
05:11:27.120 | we increased our brain size for every
05:11:28.620 | second we live, we would have exploded
05:11:30.620 | by the time we are five years old or
05:11:32.620 | something like that. And I think
05:11:34.620 | basically fundamentally for us, right,
05:11:36.620 | regardless of whether RWKV, state
05:11:39.620 | space, XLSTM, et cetera, our general
05:11:42.620 | idea is that instead of that expanding
05:11:44.620 | state, that increase in computational
05:11:46.620 | cost, what if you have a fixed state
05:11:48.620 | size? And information theory dictates
05:11:51.620 | that that fixed state size will have a
05:11:55.120 | limit. Just how big of a limit is a
05:11:57.620 | question. Like RWKV is running at
05:12:00.620 | 40 megabytes for a state. Its future
05:12:03.620 | version might run into 400 megabytes.
05:12:05.620 | That is like millions of tokens in, if
05:12:09.620 | we're talking about mathematically, the
05:12:11.620 | maximum possibility. It's just that I
05:12:13.620 | guess we were all more inefficient
05:12:15.620 | about it, so maybe we hit 100,000. And
05:12:17.620 | that's kind of like the work we're
05:12:19.620 | doing, trying to push it and maximize
05:12:21.620 | it. And that's where the models will
05:12:24.120 | start differing because it will choose
05:12:25.620 | to forget things, it will choose to
05:12:27.620 | remember things. And that's why I think
05:12:29.620 | that there might be some element of
05:12:31.620 | RAG, but it may not be the same RAG.
05:12:33.620 | Maybe the model learned things. And
05:12:35.620 | it's like, hmm, I can't remember that
05:12:37.620 | article. Let me do a database search.
05:12:39.620 | Two search. Just like us humans, when
05:12:41.620 | we can't remember the article in the
05:12:43.620 | company, we do a search on Notion.
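The point about a fixed-size state having an information-theoretic ceiling can be made concrete with rough arithmetic. Only the 40 MB figure comes from the talk; the bits-per-token and efficiency numbers below are illustrative assumptions:

```python
state_bytes = 40 * 1024 * 1024        # ~40 MB recurrent state (figure quoted in the talk)
bits_per_token = 16                   # assumed: information retained per token if the
                                      # state were used near-losslessly

ceiling_tokens = state_bytes * 8 / bits_per_token
print(f"theoretical ceiling: ~{ceiling_tokens / 1e6:.0f}M tokens")             # ~21M tokens

efficiency = 0.005                    # assumed: real models use the state far less efficiently
print(f"a 0.5%-efficient state: ~{ceiling_tokens * efficiency:,.0f} tokens")   # ~100K tokens
```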
05:12:45.620 | Yeah. I think something that would be
05:12:47.620 | really interesting is if you could have
05:12:49.620 | facts that are -- so right now the one
05:12:53.120 | thing we're learning about language
05:12:54.620 | models is that all those parameters
05:12:56.620 | are around just to store random facts
05:12:58.620 | about the world. And this intuition
05:13:00.620 | comes from the observation that if you
05:13:02.620 | take a really small language model, it
05:13:04.620 | can do things like talk to you or kind
05:13:06.620 | of has like the style of conversation
05:13:08.620 | it can learn that. But where it will
05:13:10.620 | usually fall over compared to a much
05:13:12.620 | larger one is it will just be a lot
05:13:14.620 | less factual about things that it
05:13:16.620 | knows or that it can do. But that
05:13:18.620 | points to all those weights that we're
05:13:20.620 | spending, all that SGD that we're
05:13:22.620 | spending to train these models are
05:13:24.120 | just being used to store facts. And
05:13:26.120 | we have things like databases that are
05:13:28.120 | pretty good at storing facts. So I
05:13:30.120 | think one thing that would be really
05:13:32.120 | interesting is if we could actually
05:13:34.120 | have some sort of outside data store
05:13:36.120 | that a language model can look at that
05:13:38.120 | maybe, you know, has some sort of
05:13:41.120 | gradient descent in it, which would be
05:13:44.120 | quite interesting. And then maybe you
05:13:46.120 | could edit it, delete facts, you know,
05:13:48.120 | change who's president so that it
05:13:50.620 | doesn't get lost.
05:13:52.120 | Can we open up Q&A and hot takes?
05:13:54.120 | Sure.
05:13:56.120 | I have hot take Q&A. Do these scale?
05:14:00.120 | When 405B state space model rag exists,
05:14:05.120 | no one does long context, who's
05:14:07.120 | throwing in 2 million token questions,
05:14:10.120 | hot takes?
05:14:12.120 | The who's throwing in 2 million token
05:14:14.120 | question I think is a really good
05:14:16.120 | question. So I actually I was going to
05:14:18.120 | offer that as a hot take. I mean my
05:14:20.120 | hot take was going to be that long
05:14:21.620 | context doesn't matter. I know I just
05:14:23.620 | gave a whole talk about it. You know,
05:14:26.620 | what's the point of doing research if
05:14:28.620 | you can't, you know, play both sides.
05:14:30.620 | But I think one of the so I think for
05:14:34.620 | both of us the reason that we first got
05:14:36.620 | into this was just from the first
05:14:38.620 | principled questions of there's this
05:14:40.620 | quadratic thing. Clearly intelligence
05:14:43.620 | doesn't need to be quadratic. What is
05:14:45.620 | going on? Can we understand it better?
05:14:49.120 | You know, since then it's kind of
05:14:51.620 | turned into a race which has been
05:14:52.620 | exciting to watch like how much
05:14:53.620 | context you can take in. But I think
05:14:55.620 | it's right. Nobody is actually putting
05:14:57.620 | 2 million context prompt into these
05:14:59.620 | models. And, you know, if they are
05:15:02.620 | maybe we can go, you know, design a
05:15:05.620 | better model to do that particular
05:15:07.620 | thing. Yeah, what do you think about
05:15:09.620 | that? So you've also been working on
05:15:11.620 | this. Do you think long context
05:15:12.620 | matters?
05:15:13.620 | So I'm going to burn a bit. How many
05:15:15.620 | of you remember the news of Google
05:15:18.120 | Gemini is supporting 3 million
05:15:19.620 | context? Raise your hand.
05:15:21.620 | Yeah, 2 million.
05:15:23.620 | Oh, it's 2 million.
05:15:25.620 | Yeah, how many of you actually tried
05:15:31.620 | that?
05:15:32.620 | I use it a lot.
05:15:34.620 | You use it a lot for Mind's TV.
05:15:37.620 | I use it a lot.
05:15:46.120 | I throw his podcast in there.
05:15:48.620 | All right. So for some people there is
05:16:08.620 | use. And I think this
05:16:11.620 | might be where my opinion
05:16:15.120 | starts to differ because I think the
05:16:16.620 | big labs may have a bigger role in
05:16:18.620 | this. Because like even for RWKV,
05:16:20.620 | even when we train long context, the
05:16:21.620 | reason why I say VRAM is a problem is
05:16:23.620 | that because we need
05:16:25.620 | to back prop against the states, we
05:16:27.620 | actually need to maintain the state in
05:16:29.620 | between the tokens by the token length.
05:16:31.620 | So that means we need to actually roll
05:16:33.620 | out the whole 1 million context if we
05:16:35.620 | are actually training 1 million, which
05:16:37.620 | is the same for transformers actually.
05:16:39.620 | But it just means we don't magically
05:16:41.620 | reduce the VRAM consumption in the
05:16:44.120 | training time space. So that is the
05:16:45.620 | one that VRAM bottlenecks. And I'm
05:16:47.620 | neither OpenAI nor Google. So donate
05:16:50.620 | GPUs if you have too much of them.
05:16:53.620 | But then putting it back to another
05:16:55.620 | paradigm, right, is that I think O1
05:16:57.620 | style reasoning might be actually
05:17:00.620 | pushing that direction downward. In my
05:17:03.620 | opinion, this is my partial hot take,
05:17:05.620 | is that if let's say you have a super
05:17:07.620 | big 400B model and let's say you have
05:17:10.620 | a 70B model that may take double the
05:17:13.620 | tokens but gets the same result.
05:17:17.120 | Strictly speaking, a 70B, and this is
05:17:19.120 | even for transformer or non-transformer,
05:17:21.120 | right, will take less resources than
05:17:25.120 | that 400B model even if it did double
05:17:28.120 | the amount of thinking. And if that's
05:17:30.120 | the case, and we are still all trying
05:17:31.120 | to figure this out, maybe the direction
05:17:33.120 | for us is really getting the sub-200B
05:17:35.120 | to be as fast as efficient as possible
05:17:37.120 | with a very efficient architecture that
05:17:39.120 | some folks happen to be working on to
05:17:42.620 | just reason it out over larger and
05:17:45.120 | larger context.
05:17:46.120 | Yeah.
05:17:47.120 | More hot take. Why not throw GPU at
05:17:50.120 | problem? Cerebrus, grok, fast inference.
05:17:54.120 | I will accept those GPUs. Please send
05:17:56.120 | it over.
05:17:57.120 | I'm on Twitter, recently on BlueSky.
05:18:07.120 | I'm @realDanFu, so you can follow
05:18:10.120 | there.
05:18:11.620 | Same thing, Twitter, BlueSky, I guess
05:18:14.120 | GitHub, Discord, Pico Creator, P-I-C-O
05:18:17.120 | Creator.
05:18:18.120 | Can you all hear me? Yeah. One thing
05:18:25.120 | I'm super interested in is models that
05:18:28.120 | can watch forever. Obviously you cannot
05:18:32.120 | train something on infinite context
05:18:34.120 | length. How are you all thinking about
05:18:36.120 | that where you run on a much longer
05:18:39.620 | context length than is possible to
05:18:43.120 | train on?
05:18:44.120 | It's a great question. I think you
05:18:50.120 | guys probably had tweets along these
05:18:52.120 | lines, too. When we first started
05:18:54.120 | doing these things, because these are
05:18:56.120 | all recurrent models, in theory you
05:18:58.120 | could just run it forever. You could
05:19:00.120 | just run it forever, and at the very
05:19:02.120 | least it won't error out on you or
05:19:04.120 | crash. There's another question of
05:19:06.120 | whether it can actually use what it's
05:19:08.620 | seen in that infinite context. One
05:19:12.120 | place where probably the research in
05:19:14.120 | architectures ran faster than other
05:19:17.120 | research is actually the benchmarks
05:19:19.120 | for long context. You turn it on
05:19:21.120 | forever, you want to do everything or
05:19:23.120 | watch everything. What is it that you
05:19:25.120 | actually wanted to do? Can we build
05:19:27.120 | some benchmarks for that, then measure
05:19:29.120 | what's happening, and then ask the
05:19:31.120 | question, can the models do it? Is
05:19:33.120 | there something else that they need?
05:19:35.120 | I think that if I were to turn back
05:19:37.620 | the clock to 2022, that's probably one
05:19:39.120 | of the things I would have done
05:19:41.120 | differently, which would have been
05:19:43.120 | actually get some long context
05:19:45.120 | benchmarks out at the same time as we
05:19:47.120 | started pushing context length on all
05:19:49.120 | these models.
05:19:51.120 | >> I will also say the use case. I
05:19:53.120 | think we both agree that there's no
05:19:55.120 | infinite memory, and the model needs
05:19:57.120 | to be able to learn and decide. I
05:19:59.120 | think what we have observed is that
05:20:01.120 | one of the key advantages of this
05:20:03.120 | alternate attention mechanic that is
05:20:05.120 | not based on token position is that
05:20:07.120 | when you suddenly become crazy, when
05:20:08.620 | you go past the 8K training context
05:20:10.620 | length or a million context length,
05:20:12.620 | it's actually still stable. It's
05:20:14.620 | still able to run. It's still able
05:20:16.620 | to rationalize. It just starts
05:20:18.620 | forgetting things. But some of these
05:20:20.620 | things are still there in latent
05:20:22.620 | memory. Some of these things are
05:20:24.620 | still somewhat there. That's the
05:20:26.620 | whole point of why reading twice
05:20:28.620 | works, things like that. One of the
05:20:30.620 | biggest pushes in this direction is
05:20:32.620 | that I think both State Space and
05:20:35.120 | other researchers where they use this
05:20:36.620 | architecture for time series data,
05:20:38.620 | weather modeling. You're not asking
05:20:40.620 | what was the weather five days ago.
05:20:42.620 | You're asking what's the weather
05:20:44.620 | tomorrow based on the infinite length
05:20:46.620 | that we as on this earth and the
05:20:48.620 | computer will keep running. And they
05:20:51.620 | found that it is better than existing
05:20:55.620 | transformer or existing architecture
05:20:57.620 | in modeling this weather data.
05:20:59.620 | Control for the param size and stuff.
05:21:01.620 | I'm quite sure there are people with
05:21:04.120 | that. So there are things that in
05:21:05.620 | this case, right, there is future
05:21:07.620 | applications if your question is just
05:21:09.620 | what's next and not what's 10 years
05:21:13.620 | Thanks so much for having us.
05:21:19.620 | It's a video of us.
05:21:27.620 | It's going to come out on YouTube.
05:21:33.120 | All right. Thank you, Eugene and
05:21:35.620 | Dan. So we've talked about efficiency
05:21:38.620 | in large scale model. We've talked
05:21:40.620 | about grass root type of model that
05:21:44.620 | comes from two GPUs. No, there was
05:21:47.620 | more than that. It's not complete to
05:21:50.620 | then not talk about the on device type
05:21:53.620 | of size of model. For that, synthetic
05:21:56.620 | data is also very relevant. And
05:21:58.620 | joining us to talk about it will be
05:22:01.120 | Loubna Ben Allal. She is a research
05:22:03.620 | engineer at Hugging Face. And she's
05:22:06.620 | worked on pre-training data sets like
05:22:08.620 | Cosmopedia, which is going to be
05:22:11.620 | showcased in our event pamphlet
05:22:13.620 | towards the end of the day. If you
05:22:15.620 | stay long enough, you will get one of
05:22:17.620 | them. It's really artfully done. And
05:22:20.620 | Cosmopedia is featured on there. It's
05:22:23.620 | Loubna's work. And she's also the lead
05:22:25.620 | on training small language models,
05:22:28.120 | SmolLM and SmolLM2 at Hugging Face.
05:22:30.620 | The TL;DR on those models, they have
05:22:32.620 | very, very high performance without all
05:22:34.620 | the high cost associated with large
05:22:37.620 | models. So Loubna's going to talk about
05:22:39.620 | how she did that, as well as the
05:22:41.620 | synthetic data's role in that. Welcome.
05:22:44.620 | Thanks for the introduction. Let me
05:22:54.620 | see if you can see my screen. Okay,
05:22:57.120 | perfect. Okay, cool. We can get
05:23:21.620 | started. I'm very happy to be here.
05:23:23.620 | Thank you for the invitation. So I'm
05:23:26.120 | going to be talking about synthetic
05:23:27.620 | data in 2024, and then I'm going to be
05:23:29.620 | talking about small on-device models.
05:23:32.620 | So I think the most interesting thing
05:23:34.620 | about synthetic data this year is that
05:23:36.620 | like now we have it everywhere in the
05:23:38.620 | large language models pipeline. I think
05:23:41.620 | initially synthetic data was mainly
05:23:43.620 | used just for post-training, because
05:23:45.620 | naturally that's the part where we
05:23:47.620 | needed human annotators to show the
05:23:49.620 | models how they should answer
05:23:51.620 | instructions, how they should be
05:23:54.120 | useful and not toxic. And when we had
05:23:56.620 | LLMs that were really performant, we
05:23:58.620 | replaced the human annotators just
05:24:01.620 | with the synthetic data. And then
05:24:03.620 | after that, we realized that we don't
05:24:05.620 | really have good benchmarks to measure
05:24:07.620 | if models follow instructions well, if
05:24:10.620 | they are creative enough, or if they
05:24:12.620 | are chatty enough. So we also started
05:24:14.620 | using LLMs as judges. And I think this
05:24:17.620 | year and towards the end of last year,
05:24:19.620 | we also went to the pre-training parts,
05:24:23.120 | generating synthetic data for
05:24:24.620 | pre-training to kind of replace some
05:24:26.620 | parts of the web. And the motivation
05:24:28.620 | behind that is that you have a lot of
05:24:30.620 | control over synthetic data. You can
05:24:32.620 | control your prompt and basically also
05:24:34.620 | the kind of data that you generate. So
05:24:36.620 | instead of just trying to filter the
05:24:38.620 | web, you could try to get the LLM to
05:24:40.620 | generate what you think the best web
05:24:42.620 | pages could look like, and then train
05:24:44.620 | your models on that. So this is how we
05:24:46.620 | went from not having synthetic data at
05:24:48.620 | all in the LLM pipeline to having it
05:24:51.120 | on the web. And so the cool thing is
05:24:52.620 | today you can train an LLM with an
05:24:55.620 | entirely synthetic pipeline. For
05:24:57.620 | example, you can use our Cosmopedia
05:24:59.620 | data sets and you can train a 1B model
05:25:01.620 | on 150 billion tokens that are 100%
05:25:04.620 | synthetic, and those are also of good
05:25:06.620 | quality. And then you can instruction
05:25:08.620 | tune the model on a synthetic SFT data
05:25:10.620 | set. You can also do DPO on a synthetic
05:25:12.620 | data set. And then to evaluate if the
05:25:14.620 | model is good, you can use a benchmark
05:25:16.620 | that uses LLMs as a judge, for example,
05:25:19.120 | or AlpacaEval. So I think this is
05:25:20.620 | really mind-blowing because just a few
05:25:22.620 | years ago we wouldn't think this is
05:25:24.620 | possible. And I think there's a lot of
05:25:26.620 | concerns about model collapse, and I'm
05:25:28.620 | going to talk about that later. But we
05:25:30.620 | will see that if we use synthetic data
05:25:32.620 | properly and we curate it carefully,
05:25:34.620 | that shouldn't happen. And the reason
05:25:37.620 | synthetic data is very popular right
05:25:39.620 | now is that we have really strong
05:25:41.620 | models, both open and closed. It is
05:25:44.620 | really cheap and fast to use compared
05:25:46.620 | to human annotations, which cost a lot
05:25:48.620 | and take a lot of time. And also for
05:25:51.120 | open models, right now we have some
05:25:53.120 | really good inference frameworks. So
05:25:55.120 | if you have enough GPUs, it's really
05:25:57.120 | easy to spawn these GPUs and generate
05:25:59.120 | a lot of synthetic data. Some examples
05:26:01.120 | are vLLM, TGI, and TensorRT-LLM. Now let's
05:26:07.120 | talk about the elephant in the room,
05:26:09.120 | model collapse. Is this the end? If you
05:26:11.120 | look at the media and all of, for
05:26:13.120 | example, some papers in Nature, it's
05:26:15.120 | really scary because there's a lot of
05:26:17.620 | synthetic data out there in the web and
05:26:19.120 | naturally we train on the web. So we're
05:26:21.120 | going to be training a lot of synthetic
05:26:23.120 | data. And if model collapse is going to
05:26:25.120 | happen, we should really try to take
05:26:27.120 | that seriously. And the other issue is
05:26:30.120 | that, as I said, a lot of people think
05:26:32.120 | the web is polluted because there's a
05:26:34.120 | lot of synthetic data. And, for example,
05:26:36.120 | when we were building the FineWeb datasets
05:26:38.120 | here with Guilherme and Hynek, we were
05:26:40.120 | interested in how much synthetic data is
05:26:42.120 | there in the web. So there isn't really
05:26:44.120 | a method to properly measure the amount
05:26:47.120 | of synthetic data or to say if a web
05:26:48.620 | page is synthetic or not. But one thing
05:26:50.620 | we can do is to try to look for, like,
05:26:52.620 | proxy words. For example, expressions
05:26:54.620 | like as a large language model or words
05:26:57.620 | like delve that we know are actually
05:26:59.620 | generated by ChatGPT. We could try to
05:27:01.620 | measure the amount of these words in
05:27:03.620 | our datasets and compare them to the
05:27:05.620 | previous years. For example, here we
05:27:07.620 | measured, like, these words ratio in
05:27:09.620 | different dumps of Common Crawl. And we
05:27:11.620 | can see that, like, the ratio really
05:27:13.620 | increased after ChatGPT's release. So
05:27:16.620 | if we were to say the synthetic data
05:27:18.120 | amount didn't change, you would expect
05:27:20.120 | this ratio to stay constant, which is
05:27:22.120 | not the case. So there's a lot of
05:27:24.120 | synthetic data probably on the web. But
05:27:26.120 | does this really make models worse? So
05:27:28.120 | what we did is we trained different
05:27:30.120 | models on these different dumps. And we
05:27:32.120 | then computed their performance on
05:27:34.120 | popular, like, NLP benchmarks. And then
05:27:36.120 | we computed the aggregated score. And
05:27:38.120 | surprisingly, you can see that the
05:27:40.120 | latest dumps are actually even better
05:27:42.120 | than the dumps that are before. So if
05:27:44.120 | there's some synthetic data there, at
05:27:46.120 | least it did not make the models worse.
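A minimal sketch of the proxy-word measurement described above, assuming plain-text documents extracted from two Common Crawl dumps sit in local folders; the phrase list and folder names are illustrative, not the exact ones used for FineWeb.

```python
import re
from pathlib import Path

# Phrases that ChatGPT-style models overuse; illustrative list, not the exact one used.
PROXY_RE = re.compile(
    r"\bas a large language model\b|\bas an ai language model\b|\bdelve\b",
    re.IGNORECASE,
)

def proxy_word_ratio(doc_dir: str) -> float:
    """Fraction of documents in one dump that contain at least one proxy phrase."""
    docs = list(Path(doc_dir).glob("*.txt"))
    if not docs:
        return 0.0
    hits = sum(1 for p in docs if PROXY_RE.search(p.read_text(errors="ignore")))
    return hits / len(docs)

# Hypothetical folders holding extracted text from a pre- and a post-ChatGPT dump.
for dump in ["cc_2022_40", "cc_2024_10"]:
    print(dump, proxy_word_ratio(dump))
```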
05:27:48.620 | Yeah, which is really encouraging. So
05:27:51.620 | personally, I wouldn't say the web is
05:27:53.620 | polluted with synthetic data. Maybe it's
05:27:55.620 | even making it more rich. And the issue
05:27:59.620 | with, like, model collapse is that, for
05:28:01.620 | example, those studies, they were done
05:28:03.620 | at, like, a small scale. And you would
05:28:05.620 | ask the model to complete, for example,
05:28:07.620 | a Wikipedia paragraph. And then you
05:28:09.620 | would train it on these new generations.
05:28:11.620 | And you would do that iteratively. I
05:28:13.620 | think if you do that approach, it's
05:28:15.620 | going to have the same kind of behavior
05:28:17.120 | because the quality is going to be worse
05:28:19.120 | because the model is already small. And
05:28:21.120 | then if you train it just on these
05:28:23.120 | generations, you shouldn't expect it to
05:28:25.120 | become better. But what we're really
05:28:27.120 | doing here is that we take a model that
05:28:29.120 | is very large and we try to distill its
05:28:31.120 | knowledge into a model that is smaller.
05:28:33.120 | And in this way, you can expect to get,
05:28:35.120 | like, better performance for your small
05:28:37.120 | model. And using synthetic data for
05:28:39.120 | pre-training has become really popular
05:28:41.120 | after the Textbooks Are All You Need paper,
05:28:43.620 | where Microsoft trained the Phi series of small
05:28:45.120 | models on textbooks that were generated using a
05:28:49.120 | large LLM. And then they found that
05:28:51.120 | these models were actually better than
05:28:53.120 | models that are much larger. So this
05:28:55.120 | was really interesting. It was, like,
05:28:57.120 | first of its kind. But it was also met
05:28:59.120 | with a lot of skepticism, which is a
05:29:01.120 | good thing in research. It pushes you
05:29:03.120 | to question things. Because the data
05:29:05.120 | set that they trained on was not
05:29:07.120 | public. So people were not really sure
05:29:09.120 | if these models are really good or
05:29:11.120 | maybe there's just some data
05:29:13.120 | contamination. So it was really hard
05:29:14.620 | to check if you just have the weights
05:29:16.620 | of the models. And at Hugging Face,
05:29:18.620 | because we're, like, open source, we
05:29:20.620 | try to reproduce what they did. So
05:29:22.620 | this is our Cosmopedia data set. We
05:29:24.620 | basically tried to follow a similar
05:29:26.620 | approach to what they documented in the
05:29:28.620 | paper. And we created a synthetic data
05:29:30.620 | set of textbooks and blog posts and
05:29:32.620 | stories that had almost 30 billion
05:29:34.620 | tokens. And we trained some models on
05:29:36.620 | that. And we found that, like, the key
05:29:39.620 | ingredient to getting a good data set
05:29:42.120 | that is synthetic is trying as much as
05:29:43.620 | possible to keep it diverse. Because
05:29:45.620 | if you just throw the same prompts at
05:29:47.620 | your model, like, generate, like, a
05:29:49.620 | textbook about linear algebra, and even
05:29:51.620 | if you change the temperature, the
05:29:53.620 | textbooks are going to look alike. So
05:29:55.620 | there's no way you could scale to,
05:29:57.620 | like, millions of samples. And the way
05:29:59.620 | you do that is by creating prompts
05:30:01.620 | that have some seeds that make them
05:30:03.620 | diverse. In our case, the prompt, we
05:30:05.620 | would ask the model to generate a
05:30:07.620 | textbook, but make it related to an
05:30:10.120 | existing web page. And also, we try to
05:30:11.620 | frame it within -- to stay within topic.
05:30:13.620 | For example, here, we put, like, an
05:30:15.620 | extract about cardiovascular
05:30:17.620 | bioimaging. And then we ask the model
05:30:19.620 | to generate a textbook related to
05:30:21.620 | medicine that is also related to this
05:30:23.620 | web page. And this is a really nice
05:30:25.620 | approach because there's so many web
05:30:27.620 | pages out there. So you can be sure
05:30:29.620 | that your generations are going to
05:30:31.620 | be diverse when you change the seed
05:30:34.620 | example. One thing that's challenging
05:30:36.620 | with this is that you want the seed
05:30:39.120 | samples to be related to your topics.
05:30:41.620 | So we use, like, a search tool to try
05:30:44.620 | to go through all the FineWeb datasets and
05:30:46.620 | find the pages that are related to the
05:30:48.620 | topics we're interested in. And then
05:30:50.620 | we also do a lot of experiments with
05:30:52.620 | the type of generations we want the
05:30:54.620 | model to generate. For example, we ask
05:30:56.620 | it for textbooks for middle school
05:30:58.620 | students or a textbook for a college.
05:31:00.620 | And we found that, like, some generation
05:31:02.620 | styles help on some specific benchmarks
05:31:04.620 | while others help on other benchmarks.
05:31:06.620 | For example, college textbooks are
05:31:08.620 | good for MMLU, while middle school
05:31:10.120 | textbooks are good for benchmarks like
05:31:12.120 | OpenBookQA and PIQA. This is, like, a
05:31:16.120 | sample from, like, our search tool. For
05:31:18.120 | example, you have a top category, which
05:31:20.120 | is a topic, and then you have some
05:31:22.120 | subtopics, and then you have the topic
05:31:24.120 | hits, which are basically the web pages
05:31:26.120 | in fine web that belong to these topics.
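A minimal sketch of this seed-based prompting, assuming you already have topic-matched web extracts from the search tool; the template, audiences, and extracts below are illustrative rather than the exact Cosmopedia prompts.

```python
import random

# Illustrative audiences; Cosmopedia varies several such generation styles.
AUDIENCES = ["middle school students", "college students", "a general audience"]

PROMPT_TEMPLATE = (
    "Here is an extract from a web page about {topic}:\n\n{extract}\n\n"
    "Write a long and detailed textbook chapter related to {topic} that builds on "
    "ideas mentioned in this extract, aimed at {audience}."
)

def build_prompt(topic: str, extract: str, rng: random.Random) -> str:
    """Vary the seed extract and the target audience so generations stay diverse at scale."""
    return PROMPT_TEMPLATE.format(
        topic=topic, extract=extract[:1000], audience=rng.choice(AUDIENCES)
    )

rng = random.Random(0)
seed_pages = [  # (topic, web extract) pairs; both are made up here
    ("medicine", "Cardiovascular bioimaging lets clinicians visualize the heart ..."),
    ("linear algebra", "A matrix is a rectangular array of numbers arranged in rows ..."),
]
prompts = [build_prompt(t, e, rng) for t, e in seed_pages]
print(prompts[0])
```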
05:31:28.120 | And here you can see the comparison
05:31:30.120 | between Cosmopedia. We had two versions,
05:31:32.120 | V1 and V2 in blue and red. And you can
05:31:35.120 | see the comparison to fine web. And as
05:31:37.620 | you can see throughout the training,
05:31:39.120 | training on Cosmopedia was consistently
05:31:41.120 | better. So we managed to get a dataset
05:31:43.120 | that was actually good to train these
05:31:45.120 | models on. It's, of course, so much
05:31:47.120 | smaller than fine web. It's only 30
05:31:49.120 | billion tokens. But that's the scale
05:31:51.120 | that Microsoft's dataset was. So we kind
05:31:53.120 | of managed to reproduce a bit what they
05:31:55.120 | did. And the dataset is public, so
05:31:57.120 | everyone can go there, check if
05:31:59.120 | everything is all right. And this is
05:32:03.120 | a recent paper from NVIDIA, Nemotron
05:32:05.620 | CC. They took things a bit further,
05:32:08.120 | and they generated not a few billion
05:32:10.120 | tokens, but 1.9 trillion tokens, which
05:32:13.120 | is huge. And we can see later how they
05:32:16.120 | did that. It's more of, like, rephrasing
05:32:18.120 | the web. So we can see today that
05:32:20.120 | there's, like, some really huge
05:32:22.120 | synthetic datasets out there, and
05:32:24.120 | they're public, so, like, you can try
05:32:26.120 | to filter them even further if you want
05:32:28.120 | to get, like, more high-quality
05:32:29.120 | corpuses. So for this rephrasing the
05:32:33.620 | web, this approach was suggested in
05:32:36.120 | this paper by Pratyush, where basically
05:32:39.120 | in this paper, they take some samples
05:32:42.120 | from C4 datasets, and then they use an
05:32:44.120 | LLM to rewrite these samples into a
05:32:47.120 | better format. For example, they ask an
05:32:50.120 | LLM to rewrite the sample into a
05:32:52.120 | Wikipedia passage or into a Q&A page.
05:32:55.120 | And the interesting thing in this
05:32:57.120 | approach is that you can use a model
05:32:59.120 | that is small because it doesn't --
05:33:01.120 | rewriting doesn't require knowledge.
05:33:03.120 | It's just rewriting a page into a
05:33:04.620 | different style. So the model doesn't
05:33:06.620 | need to have, like, extensive knowledge
05:33:08.620 | of what it is rewriting,
05:33:10.620 | compared to just asking a model to
05:33:12.620 | generate a new textbook and not giving
05:33:14.620 | it, like, ground truth. So here they
05:33:16.620 | rewrite some samples from C4 into Q&A,
05:33:19.620 | into Wikipedia, and they find that
05:33:21.620 | doing this works better than training
05:33:23.620 | just on C4. And so what they did in
05:33:27.620 | Nemotron CC is a similar approach.
05:33:31.120 | They rewrite some pages from Common
05:33:32.620 | Crawl for two reasons. One is to,
05:33:35.620 | like, improve pages that are low
05:33:38.620 | quality. So they rewrite them into,
05:33:40.620 | for example, Wikipedia page so they
05:33:42.620 | look better. And another reason is to
05:33:44.620 | create more diverse datasets. So they
05:33:46.620 | have a dataset that they already
05:33:48.620 | heavily filtered, and then they take
05:33:50.620 | these pages that are already high
05:33:52.620 | quality and they ask the model to
05:33:54.620 | rewrite them into Q&A format into,
05:33:56.620 | like, open-ended questions or, like,
05:33:58.620 | multi-choice questions. So this way
05:34:00.620 | they can reuse the same page multiple
05:34:02.120 | times without fearing, like, having
05:34:04.120 | multiple duplicates because it's the
05:34:06.120 | same information but it's going to be
05:34:08.120 | rewritten differently. So I think
05:34:10.120 | that's also a really interesting
05:34:11.120 | approach for, like, generating
05:34:13.120 | synthetic data just by rephrasing the
05:34:15.120 | pages that you already have.
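A rough sketch of this rewrite-the-web idea, assuming a local instruction-tuned model served through Hugging Face transformers; the model name and the two rewrite styles are placeholders, not the exact setup of the WRAP or Nemotron-CC papers.

```python
from transformers import pipeline

# Placeholder small instruct model; the papers used their own rewriting models.
generator = pipeline("text-generation", model="HuggingFaceTB/SmolLM2-1.7B-Instruct")

STYLES = {
    "wikipedia": "Rewrite the following web text as a clear, factual encyclopedia passage:",
    "qa": "Rewrite the following web text as a list of questions and answers:",
}

def rephrase(document: str, style: str) -> str:
    prompt = f"{STYLES[style]}\n\n{document}\n\nRewritten text:"
    out = generator(prompt, max_new_tokens=512, do_sample=False)
    return out[0]["generated_text"][len(prompt):]  # keep only the completion

noisy_sample = "buy cheap widgets!!! our widgets r the best, click here now ..."
print(rephrase(noisy_sample, "wikipedia"))
```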
05:34:18.120 | There's also this approach called ProX, where
05:34:21.120 | they try to start from a web page and
05:34:23.120 | then they generate a program which
05:34:25.120 | finds how to rewrite that page to make
05:34:27.120 | it better and less noisy. For example,
05:34:29.620 | here you can see there's some leftover
05:34:31.120 | metadata in the web page and you don't
05:34:33.120 | necessarily want to keep that for
05:34:35.120 | training your model. So they train a
05:34:37.120 | model that can generate programs that
05:34:39.120 | can, like, normalize and remove lines
05:34:41.120 | that are extra. So I think this
05:34:43.120 | approach is also interesting but it's
05:34:45.120 | maybe less scalable than the approaches
05:34:47.120 | that I presented before. So that was
05:34:51.120 | it for, like, rephrasing and generating
05:34:53.120 | new textbooks. Another approach that I
05:34:55.120 | think is really good and becoming
05:34:57.120 | really popular for using synthetic
05:34:59.120 | data for pre-training is basically
05:35:01.620 | building better classifiers for
05:35:03.620 | filtering the web. For example, here we
05:35:06.620 | released a dataset called FineWeb-Edu,
05:35:08.620 | and the way we built it is by taking
05:35:11.620 | Llama 3 and asking it to rate the
05:35:13.620 | educational content of web pages from
05:35:16.620 | 0 to 5. So, for example, if a page is,
05:35:18.620 | like, a really good textbook that could
05:35:20.620 | be useful in a school setting, it would
05:35:22.620 | get a really high score. And if a page
05:35:24.620 | is just, like, an advertisement or
05:35:27.120 | promotional material, it would get a
05:35:28.620 | lower score. And then after that, we
05:35:30.620 | take these synthetic annotations and we
05:35:32.620 | train a classifier on them. It's a
05:35:34.620 | classifier, like, a BERT model. And then
05:35:37.620 | we run this classifier on all of
05:35:39.620 | FineWeb, which is a 15 trillion token
05:35:41.620 | dataset, and then we only keep the
05:35:43.620 | pages that have, like, a score that's
05:35:45.620 | higher than 3. So, for example, in our
05:35:47.620 | case, we went from 15 trillion tokens
05:35:49.620 | to just 1.5 trillion tokens. Those are
05:35:51.620 | really highly educational. And as you
05:35:53.620 | can see here, FineWeb-Edu outperforms
05:35:56.620 | all the other public web datasets by a
05:35:59.120 | large margin on a couple of benchmarks.
05:36:02.120 | Here I show the aggregated score. And
05:36:04.120 | you can see that this approach is really
05:36:06.120 | effective for filtering web datasets to
05:36:08.120 | get, like, better corpuses for training
05:36:10.120 | your LLMs. Others also try to do this
05:36:16.120 | approach. There's, for example, the
05:36:18.120 | DCLM dataset, where they also train the
05:36:20.120 | classifier, but not to detect
05:36:22.120 | educational content. Instead, they
05:36:24.620 | trained it on the OpenHermes dataset, which
05:36:27.120 | is a dataset for instruction tuning,
05:36:29.120 | and also on the Explain Like I'm Five (ELI5)
05:36:31.120 | subreddit. And then they also get
05:36:33.120 | really high-quality dataset, which is,
05:36:37.120 | like, very information-dense and can
05:36:39.120 | help you train some really good LLMs.
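A condensed sketch of this classifier-filtering recipe, assuming you already have pages annotated with 0-5 scores by a strong LLM; a TF-IDF plus logistic-regression model stands in for the small BERT-style classifier actually trained for FineWeb-Edu, and the example pages are made up. The point of the cheap classifier is that it can be run over trillions of tokens, which would be far too expensive to annotate directly with an LLM.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# (page_text, llm_score) pairs from prompting a strong LLM to rate educational
# value from 0 to 5; these few examples are purely illustrative.
annotated = [
    ("Photosynthesis converts light energy into chemical energy ...", 5),
    ("Limited time offer! Buy one get one free on all sneakers ...", 0),
    ("The quadratic formula solves ax^2 + bx + c = 0 ...", 4),
    ("Click here to win a prize, just enter your email now ...", 0),
]
texts, scores = zip(*annotated)
labels = [int(s >= 3) for s in scores]  # binary: educational enough or not

clf = make_pipeline(TfidfVectorizer(), LogisticRegression())
clf.fit(list(texts), labels)

# Run the cheap classifier over the whole web corpus and keep high-scoring pages.
web_pages = [
    "An introduction to Newton's laws of motion with worked examples ...",
    "Hot deals on sneakers, sign up today and win big ...",
]
kept = [page for page in web_pages if clf.predict([page])[0] == 1]
print(kept)
```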
05:36:42.120 | And then for Nemotron-CC,
05:36:44.120 | they also did this approach, but
05:36:46.120 | instead of using one classifier, they
05:36:48.120 | used an ensemble of classifiers. So,
05:36:50.120 | they used, for example, the DCLM
05:36:52.120 | classifier and also classifiers like
05:36:54.120 | the ones we used in FineWeb
05:36:55.620 | Educational. And then they combined
05:36:57.620 | these scores with an ensemble method
05:36:59.620 | to only retain the best high-quality
05:37:01.620 | pages. And they get a dataset that
05:37:03.620 | works even better than the ones we
05:37:05.620 | developed. So, that was it for, like,
05:37:09.620 | synthetic data for pre-training. Now,
05:37:11.620 | we can go back to post-training. I
05:37:13.620 | think there's a lot of interesting
05:37:14.620 | post-training datasets out there. One
05:37:16.620 | that was released recently is Agent
05:37:18.620 | Instruct by Microsoft, where they
05:37:20.620 | basically try to target some specific
05:37:23.120 | skills and improve the performance of
05:37:25.620 | models on them. For example, here you
05:37:27.620 | can see code, brain teasers, open
05:37:29.620 | domain QA. And they managed to get a
05:37:31.620 | dataset where, when
05:37:33.620 | fine-tuning Mistral 7B on it, it
05:37:35.620 | outperforms the original Instruct
05:37:37.620 | model that was released by Mistral.
05:37:42.620 | And as I said, to get good synthetic
05:37:44.620 | data, you really have to have a
05:37:46.620 | framework to make sure that your data
05:37:48.620 | is diverse. So, for example, for them,
05:37:50.620 | they always seed the generations of
05:37:52.620 | instructions on either source code or
05:37:54.120 | raw text documents. And then they
05:37:56.120 | rewrite them to make sure they're
05:37:58.120 | easier to generate instructions from.
05:38:00.120 | And then they use that for their,
05:38:02.120 | like, instruction data generation.
05:38:05.120 | There's also the Tulu 3 SFT mixture,
05:38:07.120 | which was released recently by Allen
05:38:09.120 | AI. It's also really good quality, and
05:38:11.120 | it covers a wide range of tasks. And
05:38:14.120 | the way they make sure that this
05:38:16.120 | dataset is diverse is by using
05:38:18.120 | personas from the Persona Hub
05:38:20.120 | datasets, which is basically a dataset
05:38:22.120 | of, like, I think over a million
05:38:23.620 | personas. And, for example, in the
05:38:25.620 | Tulu mixture to generate, like, a
05:38:27.620 | new code snippet, they would give,
05:38:29.620 | like, the model persona, for example,
05:38:31.620 | a machine learning researcher
05:38:33.620 | interested in neural networks, and
05:38:35.620 | then ask it to generate, like, a
05:38:37.620 | coding problem. This way, you make
05:38:39.620 | sure that your dataset is really
05:38:41.620 | diverse, and then you can further
05:38:43.620 | filter the datasets, for example,
05:38:45.620 | using the reward models. We also
05:38:47.620 | released a dataset called SmolTalk,
05:38:49.620 | and we also tried to cover a wide
05:38:51.620 | range of tasks. For example, when fine-tuning
05:38:53.120 | Mistral 7b on the dataset, we also
05:38:55.120 | outperformed the original Mistral
05:38:57.120 | Instruct on a number of benchmarks,
05:38:59.120 | notably on mathematics and
05:39:01.120 | instruction following with IFEval.
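A small sketch of persona-seeded generation in the spirit described above, again through a local instruct model in transformers; the personas and prompt wording are made up rather than taken from PersonaHub or the Tulu mixture.

```python
from transformers import pipeline

# Placeholder model; any sufficiently strong instruct model could play the teacher here.
generator = pipeline("text-generation", model="HuggingFaceTB/SmolLM2-1.7B-Instruct")

# A handful of illustrative personas; PersonaHub contains around a million of them.
personas = [
    "a machine learning researcher interested in neural networks",
    "a high-school chemistry teacher preparing a lab class",
    "an embedded-systems engineer optimizing firmware for microcontrollers",
]

def make_coding_problem(persona: str) -> str:
    """Condition the generation on a persona so samples differ even for the same task."""
    prompt = (
        f"You are {persona}. Write one self-contained Python coding exercise that "
        f"someone like you would find useful, followed by its solution.\n\nExercise:"
    )
    out = generator(prompt, max_new_tokens=400, do_sample=True, temperature=0.8)
    return out[0]["generated_text"][len(prompt):]

dataset = [{"persona": p, "sample": make_coding_problem(p)} for p in personas]
print(dataset[0]["sample"][:300])
```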
05:39:05.120 | Another paper that's really
05:39:07.120 | interesting I wanted to mention is
05:39:09.120 | this one called Multilingual Data
05:39:11.120 | Arbitrage by Cohere. And, basically,
05:39:13.120 | they want to generate a dataset for
05:39:15.120 | post-training that is multilingual,
05:39:17.120 | and they have a really interesting
05:39:19.120 | problem. It's the fact that there
05:39:21.120 | isn't a single model that's really
05:39:22.620 | good at all the languages they wanted.
05:39:24.620 | So, what they do is that, like, they
05:39:26.620 | use not just one teacher model, but
05:39:28.620 | multiple teachers. And then they have
05:39:30.620 | a router, which basically sends the
05:39:32.620 | prompts they have to all these models,
05:39:34.620 | and then they get the completions, and
05:39:36.620 | they have a reward model that rates
05:39:38.620 | all these generations and only keeps
05:39:40.620 | the best one. And this is, like,
05:39:42.620 | arbitrage in finance. So, I think
05:39:44.620 | what's interesting in this, it shows
05:39:46.620 | that, like, synthetic data, it doesn't
05:39:48.620 | have to come from a single model. And
05:39:50.620 | you can pull these models together
05:39:52.120 | and get, like, a dataset that's really
05:39:54.120 | high quality, and that's diverse, and
05:39:56.120 | that covers all your needs.
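A toy sketch of that multi-teacher arbitrage loop; the teacher functions and the reward scorer below are placeholders standing in for real model endpoints and a real reward model.

```python
from typing import Callable, Dict, List

# Placeholder teachers: in practice these would call different LLMs,
# each strong in a different language or domain.
def teacher_a(prompt: str) -> str: return f"[teacher A answer to] {prompt}"
def teacher_b(prompt: str) -> str: return f"[teacher B answer to] {prompt}"
def teacher_c(prompt: str) -> str: return f"[teacher C answer to] {prompt}"

TEACHERS: Dict[str, Callable[[str], str]] = {"a": teacher_a, "b": teacher_b, "c": teacher_c}

def reward(prompt: str, completion: str) -> float:
    """Placeholder reward model; a real one would score quality and helpfulness."""
    return float(len(completion))

def arbitrage(prompts: List[str]) -> List[dict]:
    """Route every prompt to all teachers and keep only the best-scored completion."""
    dataset = []
    for p in prompts:
        completions = {name: gen(p) for name, gen in TEACHERS.items()}
        best = max(completions, key=lambda name: reward(p, completions[name]))
        dataset.append({"prompt": p, "completion": completions[best], "teacher": best})
    return dataset

print(arbitrage(["Explain photosynthesis in Swahili."]))
```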
05:40:01.120 | I was supposed to put a meme there, but I
05:40:04.120 | ran out of time. Yeah, so that was it for,
05:40:08.120 | like, synthetic data. Now we can go
05:40:11.120 | to see what's happening in the small
05:40:13.120 | models field in 2024. I don't know if
05:40:18.120 | you know, but, like, now we have some
05:40:20.120 | models. For example, Llama 3.2 1B
05:40:23.620 | matches Llama 2 13B, which was
05:40:25.620 | released last year, on the LMSYS arena,
05:40:28.620 | which is basically the default go-to
05:40:30.620 | leaderboard for evaluating models using
05:40:32.620 | human evaluation. And as you can see
05:40:34.620 | here, the scores of the models are
05:40:36.620 | really close. So, I think we've made,
05:40:38.620 | like, a huge leap forward in terms of
05:40:40.620 | small models. Of course, that's just
05:40:43.620 | one data point, but there's more. For
05:40:45.620 | example, if you look at this chart from
05:40:48.120 | the Qwen 2.5 blog post, it shows that today we
05:40:50.620 | have some really good models that are
05:40:52.620 | only, like, 3 billion parameters and 4
05:40:54.620 | billion that score really high on MMLU,
05:40:57.620 | which is a really popular benchmark for
05:40:59.620 | evaluating models. And you can see here
05:41:02.620 | that the blue dots have more than 65 on
05:41:05.620 | MMLU, and the gray ones have less. And
05:41:08.620 | for example, Llama 33B had less, so now
05:41:11.620 | we have a 3b model that outperforms a
05:41:15.120 | 33b model that was released earlier on
05:41:17.620 | MMLU benchmark. So, I think now people
05:41:20.620 | are starting to realize that, like, we
05:41:22.620 | shouldn't just scale and scale models,
05:41:24.620 | but we should try to make them more
05:41:26.620 | efficient. I don't know if you knew, but
05:41:31.620 | you can also chat with a 3b+ model on
05:41:33.620 | your iPhone. For example, here, this is
05:41:35.620 | an app called PocketPal, where you can
05:41:37.620 | go and select a model from Hugging Face.
05:41:40.620 | It has a large choice. For example, here,
05:41:43.120 | we loaded Phi-3.5, which is 3.8 billion
05:41:47.620 | parameters on this iPhone, and we can
05:41:49.620 | chat with it, and you can see that even
05:41:51.620 | the latency is also acceptable. For
05:41:54.620 | example, here, I asked it to give me a
05:41:56.620 | joke about NeurIPS, so let's see what it
05:41:58.620 | has to say. Okay, why did the neural
05:42:04.620 | network attend NeurIPS? Because it heard
05:42:06.620 | there would be a lot of layers and fun,
05:42:08.620 | and it wanted to train its sense of
05:42:10.620 | humor. So, not very funny, but at least
05:42:12.620 | it can run on-device. Yeah, so I think
05:42:16.120 | now we have good small models, but we
05:42:18.120 | also have, like, good frameworks and
05:42:20.120 | tools to use these small models. So, I
05:42:22.120 | think we're really close to having, like,
05:42:24.120 | really on-edge and on-device models
05:42:27.120 | that are really good. And I think for a
05:42:29.120 | while, we've had this narrative that
05:42:31.120 | just training larger models is better.
05:42:34.120 | Of course, this is supported by
05:42:36.120 | scaling laws. As you can see here, for
05:42:39.120 | example, when we scale the model size,
05:42:41.620 | the loss is lower, and obviously, you
05:42:43.120 | get a better model. But -- and we can
05:42:46.120 | see this, for example, in the GPT family
05:42:48.120 | of models, how we went from just 100
05:42:50.120 | million parameters to more than a
05:42:52.120 | trillion parameters, and of course, we
05:42:54.120 | all observed the performance improvement
05:42:56.120 | when using the latest model. But one
05:42:59.120 | thing that we shouldn't forget is that
05:43:00.120 | when we scale the model, we also scale
05:43:02.120 | the inference cost and time, and so the
05:43:05.120 | largest models are going to cost so
05:43:07.120 | much more. So, I think now, instead of
05:43:11.120 | just building larger models, we should
05:43:12.620 | be focusing on building more efficient
05:43:14.620 | models. It's no longer a race for the
05:43:16.620 | largest models, since these models are
05:43:18.620 | really expensive to run, and they require,
05:43:20.620 | like, really good infrastructure to do
05:43:22.620 | that, and they cannot run on, for
05:43:24.620 | example, consumer hardware. And when
05:43:26.620 | you try to build more efficient models
05:43:28.620 | that match larger models, that's when
05:43:31.620 | you can really unlock some really
05:43:33.620 | interesting on-device use cases. And I
05:43:36.620 | think a trend that we're noticing now
05:43:38.620 | is the trend of training smaller models
05:43:40.620 | longer. For example, if you compare how
05:43:43.120 | long Llama 1 was trained compared to Llama
05:43:45.120 | 3, there is a huge increase in the
05:43:48.120 | pre-training length. Llama 1 was trained on
05:43:50.120 | one trillion tokens, but Llama 3 8B was
05:43:53.120 | trained on 15 trillion tokens. So, Meta
05:43:55.120 | managed to get a model that's the same
05:43:57.120 | size, but it performs so much better by
05:44:00.120 | choosing to, like, spend more
05:44:02.120 | during training, because as we know,
05:44:05.120 | training is a one-time cost, but
05:44:06.120 | inference is something that's ongoing.
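A rough back-of-the-envelope sketch of that trade-off, using the common approximations of about 6·N·D FLOPs for training and about 2·N FLOPs per generated token at inference; the lifetime token count is a made-up assumption.

```python
def training_flops(params: float, train_tokens: float) -> float:
    return 6 * params * train_tokens      # standard rough estimate, paid once

def inference_flops(params: float, tokens_served: float) -> float:
    return 2 * params * tokens_served     # rough per-token forward-pass cost, paid forever

N = 8e9        # an 8B-parameter model
D = 15e12      # trained on 15 trillion tokens, like Llama 3 8B
served = 1e12  # hypothetical tokens generated over the model's deployed lifetime

print(f"training:  {training_flops(N, D):.2e} FLOPs (one-time)")
print(f"inference: {inference_flops(N, served):.2e} FLOPs (keeps growing with usage)")
```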
05:44:10.120 | If you want to see what, like, the
05:44:12.620 | small model trends in 2024 are, I think this
05:44:15.620 | MobileLLM paper by Meta is interesting.
05:44:18.620 | They try to study different models that
05:44:21.620 | have, like, less than one billion
05:44:23.620 | parameters and find which architecture
05:44:25.620 | makes most sense for these models. For
05:44:27.620 | example, they find that depth is more
05:44:29.620 | important than width, so it's more
05:44:31.620 | important to have models that have, like,
05:44:33.620 | more layers than just making them more
05:44:35.620 | wide. They also find that GQA helps,
05:44:39.120 | that tying the embedding helps. So, I
05:44:40.620 | think it's a nice study overall for
05:44:42.620 | models that are just a few hundred
05:44:44.620 | million parameters.
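To make those findings concrete, here is a rough sub-billion-parameter config using transformers' LlamaConfig that favors depth over width, uses grouped-query attention, and ties the embeddings; the exact sizes are illustrative, not MobileLLM's published configurations.

```python
from transformers import LlamaConfig, LlamaForCausalLM

# Deep-and-narrow decoder with GQA and tied input/output embeddings,
# in the spirit of the MobileLLM findings (sizes are illustrative).
config = LlamaConfig(
    vocab_size=32000,
    hidden_size=576,            # narrow
    intermediate_size=1536,
    num_hidden_layers=30,       # deep
    num_attention_heads=9,
    num_key_value_heads=3,      # grouped-query attention
    tie_word_embeddings=True,   # share embedding and LM-head weights
    max_position_embeddings=2048,
)
model = LlamaForCausalLM(config)
print(f"{sum(p.numel() for p in model.parameters()) / 1e6:.0f}M parameters")
```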
05:44:47.620 | There's also the Apple Intelligence tech report, which
05:44:49.620 | is interesting. So, for Apple
05:44:51.620 | Intelligence, they had two models, one
05:44:53.620 | that was, like, on server and another
05:44:55.620 | model that was on device. It had three
05:44:57.620 | billion parameters, and I think the
05:44:59.620 | interesting part is that they trained
05:45:01.620 | this model using pruning and then
05:45:03.620 | distillation, and for example, they have
05:45:05.620 | this table where they show that, like,
05:45:08.120 | pruning and distillation works much
05:45:09.620 | better than training from scratch, and
05:45:11.620 | they also have some interesting insights
05:45:13.620 | about, like, how they specialize their
05:45:15.620 | models on specific tasks, like, for
05:45:17.620 | example, summarization and rewriting.
05:45:21.620 | There's also this paper by NVIDIA that
05:45:23.620 | was released recently. I think you've
05:45:25.620 | already had a talk about, like, hybrid
05:45:27.620 | models. That was also interesting, and
05:45:29.620 | this model, they used, like, hybrid
05:45:31.620 | architecture between state space
05:45:33.620 | models and transformers, and they
05:45:35.620 | managed to train a 1B model that's
05:45:37.620 | really performant without needing to
05:45:39.120 | train it on a lot of tokens. And
05:45:43.120 | regarding our work, we just recently
05:45:45.120 | released SmolLM2, so it's a series of
05:45:48.120 | three models which are the best in
05:45:50.120 | class in each model size. For example,
05:45:53.120 | our 1.7B model outperforms Llama 3.2 1B and
05:45:56.120 | also Qwen 2.5 1.5B, and how we managed to train
05:46:00.120 | this model is that we spent a lot of
05:46:02.120 | time trying to curate the pre-training
05:46:04.120 | dataset. We did a lot of ablations
05:46:06.620 | trying to find which datasets are good
05:46:08.120 | and also how to mix them. We also
05:46:10.120 | created some new math and code datasets
05:46:12.120 | that we're releasing soon, but we
05:46:14.120 | basically really spent a lot of time
05:46:16.120 | trying to find what's the best mixture
05:46:18.120 | that you can train these models on, and
05:46:20.120 | then we spent some time trying to, like,
05:46:22.120 | we also trained these models for very
05:46:24.120 | long. For example, SmolLM1 was trained
05:46:26.120 | only on 1 trillion tokens, but this
05:46:28.120 | model is trained on 11 trillion tokens,
05:46:30.120 | and we saw that the performance kept
05:46:32.120 | improving. The models didn't really
05:46:34.120 | plateau during training, which I think is
05:46:36.120 | great, and it shows that you can train
05:46:37.620 | such small models for very long and
05:46:39.620 | keep getting performance gains. What's
05:46:43.620 | interesting about SmolLM2 is that it's
05:46:45.620 | fully open. We also released, like, the
05:46:47.620 | pre-training code base, the fine-tuning
05:46:49.620 | code, and datasets, and also evaluation
05:46:51.620 | in this repository. Also, there's, like,
05:46:55.620 | really interesting small models for
05:46:57.620 | text, but also for vision. For example,
05:46:59.620 | here you can see SmolVLM, which is a
05:47:01.620 | 2B model that's really efficient. It
05:47:03.620 | doesn't consume a lot of RAM, and it
05:47:05.620 | also has a good performance. There's
05:47:08.120 | also Moondream 0.5B, which was released
05:47:11.120 | recently. It's, like, the smallest
05:47:13.120 | vision language model, and as you can
05:47:15.120 | see, there isn't, like, a big trade-off
05:47:17.120 | compared to Moondream 2B. So now I
05:47:21.120 | showed you that we have some really
05:47:23.120 | good small models. We also have the
05:47:25.120 | tools to use them, but why should you
05:47:27.120 | consider using small models, and when?
05:47:30.120 | I think, like, small models are really
05:47:32.120 | interesting because of the on-device
05:47:34.620 | infrastructure. Because these models are
05:47:36.120 | small and they can run fast, you can
05:47:38.120 | basically run them on your laptop, but
05:47:40.120 | also on your mobile phone, and this
05:47:42.120 | means that your data stays local.
05:47:44.120 | You don't have to send your queries to
05:47:46.120 | third parties, and this really enhances
05:47:48.120 | privacy. This was, for example, one of
05:47:50.120 | the big selling points for Apple
05:47:52.120 | Intelligence. Also, right now, we really
05:47:55.120 | have so many frameworks to do
05:47:57.120 | on-device inference. For example, there's
05:47:59.120 | MLX, MLC, llama.cpp, Transformers.js.
05:48:02.120 | So we have a lot of options, and each
05:48:04.120 | of them have, like, great features, so
05:48:05.620 | you have so many options for doing that.
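As one concrete example of how simple local inference has become, here is a minimal llama-cpp-python sketch; the GGUF filename is a placeholder for whichever small quantized model you downloaded, and llama.cpp is just one of the frameworks mentioned above.

```python
from llama_cpp import Llama

# Path to a small quantized GGUF model downloaded from the Hugging Face Hub;
# the filename here is a placeholder.
llm = Llama(model_path="./smollm2-1.7b-instruct-q4_k_m.gguf", n_ctx=2048)

out = llm(
    "Q: Why would I run a language model on my phone instead of in the cloud?\nA:",
    max_tokens=128,
    stop=["Q:"],
)
print(out["choices"][0]["text"].strip())
```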
05:48:09.620 | Small models are also really powerful
05:48:11.620 | if you choose to specialize them. For
05:48:13.620 | example, here, there's a startup called
05:48:15.620 | NuMind, which took SmolLM, and then
05:48:17.620 | they fine-tuned it on text extraction
05:48:19.620 | datasets, and they managed to get a
05:48:21.620 | model that's not very far from models
05:48:23.620 | that are much larger. So I think text
05:48:25.620 | extraction is, like, one use case where
05:48:27.620 | small models can be really performant,
05:48:29.620 | and it makes sense to use them instead
05:48:32.120 | of using larger models. You can also chat with
05:48:34.620 | these models in browser. For example,
05:48:36.620 | here, you can go there. You can load the
05:48:38.620 | model. You can even turn off your
05:48:40.620 | internet and just start chatting with
05:48:42.620 | the model locally. Speaking of text
05:48:45.620 | extraction, if you don't want to fine-tune
05:48:47.620 | the models, there's a really good method
05:48:49.620 | of structure generation, where you can
05:48:51.620 | basically force the models to follow a
05:48:53.620 | JSON schema that you defined. For
05:48:55.620 | example, here, we tried to force the
05:48:57.620 | model to follow a schema for extracting
05:49:01.120 | key information from GitHub issues. So
05:49:03.620 | you can input free text, which is a
05:49:05.620 | complaint about a GitHub repository,
05:49:07.620 | something not working, and then you can
05:49:09.620 | run it there, and the model can extract
05:49:11.620 | anything that is relevant for your
05:49:13.620 | GitHub issue creation. For example, the
05:49:15.620 | priority. For example, here, priority is
05:49:17.620 | high, the type of the issue, bug, and
05:49:19.620 | then a title and the estimation of how
05:49:21.620 | long this will take to fix. And you can
05:49:23.620 | just, like, do this in the browser. You
05:49:25.620 | can transform your text into a GitHub
05:49:27.620 | issue that's properly formatted.
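A hedged sketch of that structured-extraction idea: define the issue schema with Pydantic and validate whatever the model returns against it; dedicated constrained-decoding libraries such as Outlines can instead enforce the schema during generation itself. The field names, the model name, and the example complaint are all illustrative.

```python
import json
from pydantic import BaseModel
from transformers import pipeline

class GitHubIssue(BaseModel):
    title: str
    type: str          # e.g. "bug" or "feature"
    priority: str      # e.g. "low", "medium", "high"
    estimated_hours: float

generator = pipeline("text-generation", model="HuggingFaceTB/SmolLM2-1.7B-Instruct")  # placeholder

complaint = "The app crashes every time I upload a file larger than 10 MB. Started after update 2.3.1."
prompt = (
    "Extract a GitHub issue from the complaint below and answer with JSON only, "
    f"using exactly these keys: {list(GitHubIssue.model_fields)}.\n\n"
    f"Complaint: {complaint}\nJSON:"
)
raw = generator(prompt, max_new_tokens=200, do_sample=False)[0]["generated_text"][len(prompt):]

# With plain prompting the JSON may not parse; constrained decoding would guarantee it does.
issue = GitHubIssue.model_validate(json.loads(raw))
print(issue)
```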
05:49:31.620 | So what's next for synthetic data and
05:49:34.120 | small models? I think that domain-specific
05:49:37.120 | synthetic data is going to be -- it's
05:49:39.120 | already important. It's going to be even
05:49:41.120 | more important. For example, generating
05:49:43.120 | synthetic data for math. I think this
05:49:46.120 | really would help improve the reasoning
05:49:48.120 | of a lot of models, and a lot of people
05:49:50.120 | are doing it. For example, Qwen 2.5
05:49:52.120 | Math, everyone's trying to reproduce it.
05:49:55.120 | And so I think for synthetic data, trying
05:49:57.120 | to specialize it on some domains is
05:49:59.620 | going to be really important. And then
05:50:01.120 | for small models, I think specializing
05:50:03.120 | them through fine-tuning is also going
05:50:05.120 | to be really important. Because I think
05:50:07.120 | a lot of companies are just trying to
05:50:09.120 | use these large models because they are
05:50:11.120 | better. But on some tasks, I think you
05:50:13.120 | can already get decent performance with
05:50:15.120 | small models, so you don't need to pay,
05:50:17.120 | like, a cost that's much larger just to
05:50:19.120 | make your model better at your task by a
05:50:21.120 | few percent. And this is not just for
05:50:24.120 | text. I think it also applies for other
05:50:26.120 | modalities, like vision and audio. And
05:50:29.120 | I think you should also watch out for
05:50:30.620 | on-device frameworks and applications.
05:50:32.620 | For example, like the app I showed,
05:50:34.620 | PocketPal, Ollama, all these frameworks
05:50:36.620 | are becoming really popular, and I'm
05:50:38.620 | pretty sure that we're going to get,
05:50:40.620 | like, more of them in 2025, and users
05:50:42.620 | really like that. Maybe, to end, I
05:50:45.620 | should also say a hot take. I think
05:50:48.620 | that, like, in AI, we just started,
05:50:50.620 | like, with fine-tuning, for example,
05:50:52.620 | trying to make BERT work on some
05:50:54.620 | specific use cases and really struggling
05:50:56.620 | to do that. And then we had some models
05:50:58.620 | that are larger, so we just switched to,
05:51:00.120 | like, prompt engineering to get the
05:51:02.120 | models to solve our tasks. And I think
05:51:04.120 | we're going back to fine-tuning where
05:51:06.120 | we realize these models are really
05:51:07.120 | costly. It's better to use just a small
05:51:09.120 | model. We'll try to specialize it. So I
05:51:11.120 | think it's a little bit of a cycle, and
05:51:12.120 | we're going to start to see, like, more
05:51:14.120 | of fine-tuning and less of just, like,
05:51:16.120 | prompt engineering the models. So that
05:51:18.120 | was my talk. Thank you for following,
05:51:20.120 | and if you have any questions, we can
05:51:22.120 | take them now.
05:51:28.120 | >> Opening it up to questions. Hot takes.
05:51:32.620 | Go for it.
05:51:48.620 | >> Yeah. The talk only covers text
05:51:50.620 | because that's what I work on, but I
05:51:52.620 | think also, like, for the other
05:51:53.620 | modalities, it makes a lot of sense,
05:51:54.620 | and a lot of people are already using
05:51:57.120 | it, like, for building their speech
05:51:58.620 | models, they generate, like, synthetic
05:52:00.620 | audio data. I think that's also the
05:52:02.620 | case for vision. I think the same
05:52:04.620 | applies, like, if you want to avoid
05:52:05.620 | model collapse, you should make sure
05:52:06.620 | to, like, make your data diverse and
05:52:07.620 | also filter it. So I think everything
05:52:09.620 | I said also applies for the other
05:52:11.620 | modalities.
05:52:14.620 | >> Can I jump in with a question?
05:52:16.620 | >> Yeah.
05:52:17.620 | >> Talk about modalities. There are
05:52:19.620 | more, like, physical world modalities,
05:52:21.620 | so the bio modality or the action
05:52:23.620 | modality. Synthetic data there is
05:52:26.120 | quite hard to do, like, or at least
05:52:28.620 | do well. For example, the synthetic
05:52:30.620 | data for molecular structure or
05:52:32.620 | genomic sequences, that's really
05:52:34.620 | hard, but those are the areas, or to
05:52:37.620 | your point earlier, specialized
05:52:38.620 | synthetic data is very needed, and
05:52:40.620 | they are very data poor right now as
05:52:42.620 | modalities. What are your take on
05:52:44.620 | that?
05:52:45.620 | >> Yeah. I think, like, for you to
05:52:46.620 | get good synthetic data, you need to
05:52:47.620 | have a model that's already good at
05:52:49.620 | the tasks you want to generate
05:52:50.620 | synthetic data from. So, for example,
05:52:52.620 | for these use cases, the issue is
05:52:53.620 | that, like, we don't have, like,
05:52:55.120 | models that you could leverage for
05:52:56.620 | generating synthetic data from. So
05:52:58.620 | maybe if we solve that issue of
05:52:59.620 | having models just, like, by
05:53:00.620 | training on data that's not
05:53:01.620 | synthetic first, I think we're
05:53:03.620 | probably going to get there
05:53:04.620 | somewhere. Then synthetic data is
05:53:05.620 | going to be a natural follow-up.
05:53:14.620 | >> Any other questions, maybe, on
05:53:15.620 | device models?
05:53:20.620 | >> Apple Intelligence, I think,
05:52:22.120 | got a nod today from
05:53:25.620 | OpenAI, so there's that. All
05:53:30.620 | right. Well, thank you, Gopala.
05:53:31.620 | >> Thank you.
05:53:33.620 | >> Thank you.
05:53:58.120 | >> Okay. So now we have time for a
05:53:59.120 | little bit of a break. We were
05:54:01.120 | supposed to get these pamphlets
05:54:02.120 | at the start, but now we have the
05:54:03.620 | schedules halfway through the day.
05:54:06.620 | So really thanks to Toph1943 and
05:54:10.620 | the Singapore friends that helped
05:54:12.620 | us print this. It's super nice.
05:54:14.620 | It's like a little take-home
05:54:15.620 | thing that at the back has
05:54:17.620 | visualizations of all the papers,
05:54:20.620 | the top one of each segment that
05:54:23.620 | we had to pick, because we only
05:54:24.620 | got space for one. And so, yeah,
05:54:27.620 | we have Lubna's small LM in here,
05:54:31.120 | I think. Yeah, that's her paper
05:54:32.620 | right there. So really, really
05:54:35.620 | cute. Thanks to Toph and the
05:54:37.620 | Singapore crew for doing this.
05:54:39.620 | We'll be back at 3, 3 p.m. All
05:54:42.620 | right. Bit of a break.
05:54:44.620 | [Break]
06:40:37.620 | [Break]
06:40:41.120 | [Break]
06:40:44.620 | [Break]
06:40:48.120 | [Break]
06:40:51.620 | [Break]
06:40:55.120 | [Break]
06:40:59.120 | [Break]
06:41:02.120 | [Break]
06:41:05.620 | [Break]
06:41:09.120 | [Break]
06:41:12.620 | [Break]
06:41:16.120 | [Break]
06:41:19.620 | [Break]
06:41:23.120 | [Break]
06:41:27.120 | [Break]
06:41:30.120 | [Break]
06:41:33.620 | [Break]
06:41:37.120 | [Break]
06:41:40.620 | [Break]
06:41:44.120 | [Break]
06:41:47.620 | [Break]
06:41:51.120 | [Break]
06:41:55.120 | [Break]
06:41:58.120 | [Break]
06:42:01.620 | [Break]
06:42:05.120 | [Break]
06:42:08.620 | [Break]
06:42:12.120 | [Break]
06:42:15.620 | [Break]
06:42:19.620 | [Break]
06:42:22.620 | [Break]
06:42:26.120 | [Break]
06:42:29.620 | [Break]
06:42:33.120 | [Break]
06:42:36.620 | [Break]
06:42:40.120 | [Break]
06:42:43.620 | [Break]
06:42:47.620 | [Break]
06:42:50.620 | [Break]
06:42:54.120 | [Break]
06:42:57.620 | [Break]
06:43:01.120 | [Break]
06:43:04.620 | [Break]
06:43:08.120 | [Break]
06:43:11.620 | [Break]
06:43:15.620 | [Break]
06:43:18.620 | [Break]
06:43:22.120 | [Break]
06:43:25.620 | [Break]
06:43:29.120 | [Break]
06:43:32.620 | [Break]
06:43:36.120 | [Break]
06:43:39.620 | [Break]
06:43:43.620 | [Break]
06:43:46.620 | [Break]
06:43:50.120 | [Break]
06:43:53.620 | [Break]
06:43:57.120 | [Break]
06:44:00.620 | [Break]
06:44:04.120 | [Break]
06:44:07.620 | [Break]
06:44:11.620 | [Break]
06:44:14.620 | [Break]
06:44:18.120 | [Break]
06:44:21.620 | [Break]
06:44:25.120 | [Break]
06:44:28.620 | [Break]
06:44:32.120 | [Break]
06:44:35.620 | [Break]
06:44:39.620 | [Break]
06:44:42.620 | [Break]
06:44:46.120 | [Break]
06:44:49.620 | [Break]
06:44:53.120 | [Break]
06:44:56.620 | [Break]
06:45:00.120 | [Break]
06:45:03.620 | [Break]
06:45:07.620 | [Break]
06:45:10.620 | [Break]
06:45:14.120 | [Break]
06:45:17.620 | [Break]
06:45:21.120 | [Break]
06:45:24.620 | [Break]
06:45:28.120 | [Break]
06:45:31.620 | [Break]
06:45:35.620 | [Break]
06:45:38.620 | [Break]
06:45:42.120 | [Break]
06:45:45.620 | [Break]
06:45:49.120 | [Break]
06:45:52.620 | [Break]
06:45:56.120 | [Break]
06:45:59.620 | [Break]
06:46:03.620 | [Break]
06:46:06.620 | [Break]
06:46:10.120 | [Break]
06:46:13.620 | [Break]
06:46:17.120 | [Break]
06:46:20.620 | [Break]
06:46:24.120 | [Break]
06:46:27.620 | [Break]
06:46:31.620 | [Break]
06:46:34.620 | [Break]
06:46:38.120 | [Break]
06:46:41.620 | [Break]
06:46:45.120 | [Break]
06:46:48.620 | [Break]
06:46:52.120 | [Break]
06:46:55.620 | [Break]
06:46:59.620 | [Break]
06:47:02.620 | [Break]
06:47:06.120 | [Break]
06:47:09.620 | [Break]
06:47:13.120 | [Break]
06:47:16.620 | [Break]
06:47:20.120 | [Break]
06:47:23.620 | [Break]
06:47:27.620 | [Break]
06:47:30.620 | [Break]
06:47:34.120 | [Break]
06:47:37.620 | [Break]
06:47:41.120 | [Break]
06:47:44.620 | [Break]
06:47:48.120 | [Break]
06:47:51.620 | [Break]
06:47:55.620 | [Break]
06:47:58.620 | [Break]
06:48:02.120 | [Break]
06:48:05.620 | [Break]
06:48:09.120 | [Break]
06:48:12.620 | [Break]
06:48:16.120 | [Break]
06:48:19.620 | [Break]
06:48:23.620 | [Break]
06:48:26.620 | [Break]
06:48:30.120 | [Break]
06:48:33.620 | [Break]
06:48:37.120 | [Break]
06:48:40.620 | [Break]
06:48:44.120 | [Break]
06:48:48.120 | [Break]
06:48:51.120 | [Break]
06:48:54.620 | [Break]
06:48:58.120 | [Break]
06:49:01.620 | [Break]
06:49:05.120 | [Break]
06:49:08.620 | [Break]
06:49:12.120 | [Break]
06:49:16.120 | [Break]
06:49:19.120 | [Break]
06:49:22.620 | [Break]
06:49:26.120 | [Break]
06:49:29.620 | [Break]
06:49:33.120 | [Break]
06:49:36.620 | [Break]
06:49:40.120 | [Break]
06:49:44.120 | [Break]
06:49:47.120 | [Break]
06:49:50.620 | [Break]
06:49:54.120 | [Break]
06:49:57.620 | [Break]
06:50:01.120 | [Break]
06:50:04.620 | [Break]
06:50:08.120 | [Break]
06:50:12.120 | [Break]
06:50:15.120 | [Break]
06:50:18.620 | [Break]
06:50:22.120 | [Break]
06:50:25.620 | [Break]
06:50:29.120 | [Break]
06:50:32.620 | [Break]
06:50:36.120 | [Break]
06:50:40.120 | [Break]
06:50:43.120 | [Break]
06:50:46.620 | [Break]
06:50:50.120 | [Break]
06:50:53.620 | [Break]
06:50:57.120 | [Break]
06:51:00.620 | [Break]
06:51:04.120 | [Break]
06:51:08.120 | [Break]
06:51:11.120 | [Break]
06:51:14.620 | [Break]
06:51:18.120 | [Break]
06:51:21.620 | [Break]
06:51:25.120 | [Break]
06:51:28.620 | [Break]
06:51:32.120 | [Break]
06:51:36.120 | [Break]
06:51:39.120 | [Break]
06:51:42.620 | [Break]
06:51:46.120 | [Break]
06:51:49.620 | [Break]
06:51:53.120 | [Break]
06:51:56.620 | [Break]
06:52:00.120 | [Break]
06:52:04.120 | [Break]
06:52:07.120 | [Break]
06:52:10.620 | [Break]
06:52:14.120 | [Break]
06:52:17.620 | [Break]
06:52:21.120 | [Break]
06:52:24.620 | [Break]
06:52:28.120 | [Break]
06:52:32.120 | [Break]
06:52:35.120 | [Break]
06:52:38.620 | [Break]
06:52:42.120 | [Break]
06:52:45.620 | [Break]
06:52:49.120 | [Break]
06:52:52.620 | [Break]
06:52:56.120 | [Break]
06:53:00.120 | [Break]
06:53:03.120 | [Break]
06:53:06.620 | [Break]
06:53:10.120 | [Break]
06:53:13.620 | [Break]
06:53:17.120 | [Break]
06:53:20.620 | [Break]
06:53:24.120 | [Break]
06:53:28.120 | [Break]
06:53:31.120 | [Break]
06:53:34.620 | [Break]
06:53:38.120 | [Break]
06:53:41.620 | [Break]
06:53:45.120 | [Break]
06:53:48.620 | [Break]
06:53:52.120 | [Break]
06:53:56.120 | [Break]
06:53:59.120 | [Break]
06:54:02.620 | [Break]
06:54:06.120 | [Break]
06:54:09.620 | [Break]
06:54:13.120 | [Break]
06:54:16.620 | [Break]
06:54:20.120 | [Break]
06:54:24.120 | [Break]
06:54:27.120 | [Break]
06:54:30.620 | [Break]
06:54:34.120 | [Break]
06:54:37.620 | [Break]
06:54:41.120 | [Break]
06:54:44.620 | [Break]
06:54:48.120 | [Break]
06:54:52.120 | [Break]
06:54:55.120 | [Break]
06:54:58.620 | [Break]
06:55:02.120 | [Break]
06:55:05.620 | [Break]
06:55:09.120 | [Break]
06:55:12.620 | [Break]
06:55:16.120 | [Break]
06:55:20.120 | [Break]
06:55:23.120 | [Break]
06:55:26.620 | [Break]
06:55:30.120 | [Break]
06:55:33.620 | [Break]
06:55:37.120 | [Break]
06:55:40.620 | [Break]
06:55:44.120 | [Break]
06:55:48.120 | [Break]
06:55:51.120 | [Break]
06:55:54.620 | [Break]
06:55:58.120 | [Break]
06:56:01.620 | [Break]
06:56:05.120 | [Break]
06:56:08.620 | [Break]
06:56:12.120 | [Break]
06:56:16.120 | [Break]
06:56:19.120 | [Break]
06:56:22.620 | [Break]
06:56:26.120 | [Break]
06:56:29.620 | [Break]
06:56:33.120 | [Break]
06:56:36.620 | [Break]
06:56:40.120 | [Break]
06:56:44.120 | [Break]
06:56:47.120 | [Break]
06:56:50.620 | [Break]
06:56:54.120 | [Break]
06:56:57.620 | [Break]
06:57:01.120 | [Break]
06:57:04.620 | [Break]
06:57:08.120 | [Break]
06:57:12.120 | [Break]
06:57:15.120 | [Break]
06:57:18.620 | [Break]
06:57:22.120 | [Break]
06:57:25.620 | [Break]
06:57:29.120 | [Break]
06:57:32.620 | [Break]
06:57:36.120 | [Break]
06:57:40.120 | [Break]
06:57:43.120 | [Break]
06:57:46.620 | [Break]
06:57:50.120 | [Break]
06:57:53.620 | [Break]
06:57:57.120 | [Break]
06:58:00.620 | [Break]
06:58:04.120 | [Break]
06:58:08.120 | [Break]
06:58:11.120 | [Break]
06:58:14.620 | [Break]
06:58:18.120 | [Break]
06:58:21.620 | [Break]
06:58:25.120 | [Break]
06:58:28.620 | [Break]
06:58:32.120 | [Break]
06:58:36.120 | [Break]
06:58:39.120 | [Break]
06:58:42.620 | [Break]
06:58:46.120 | [Break]
06:58:49.620 | [Break]
06:58:53.120 | [Break]
06:58:56.620 | [Break]
06:59:00.120 | [Break]
06:59:04.120 | [Break]
06:59:07.120 | [Break]
06:59:10.620 | [Break]
06:59:14.120 | [Break]
06:59:17.620 | [Break]
06:59:21.120 | [Break]
06:59:24.620 | [Break]
06:59:28.120 | [Break]
06:59:32.120 | [Break]
06:59:35.120 | [Break]
06:59:38.620 | [Break]
06:59:42.120 | [Break]
06:59:45.620 | [Break]
06:59:49.120 | [Break]
06:59:52.620 | [Break]
06:59:56.120 | [Break]
07:00:00.120 | [Break]
07:00:03.120 | [Break]
07:00:06.620 | [Break]
07:00:10.120 | [Break]
07:00:13.620 | [Break]
07:00:17.120 | [Break]
07:00:20.620 | [Break]
07:00:24.120 | [Break]
07:00:28.120 | [Break]
07:00:31.120 | [Break]
07:00:34.620 | [Break]
07:00:38.120 | [Break]
07:00:41.620 | [Break]
07:00:45.120 | [Break]
07:00:48.620 | [Break]
07:00:52.120 | [Break]
07:00:56.120 | [Break]
07:00:59.120 | [Break]
07:01:02.620 | [Break]
07:01:06.120 | [Break]
07:01:09.620 | [Break]
07:01:13.120 | [Break]
07:01:16.620 | [Break]
07:01:20.120 | [Break]
07:01:24.120 | [Break]
07:01:27.120 | [Break]
07:01:30.620 | [Break]
07:01:34.120 | [Break]
07:01:37.620 | [Break]
07:01:41.120 | [Break]
07:01:44.620 | [Break]
07:01:48.120 | [Break]
07:01:52.120 | [Break]
07:01:55.120 | [Break]
07:01:58.620 | [Break]
07:02:02.120 | [Break]
07:02:05.620 | [Break]
07:02:09.120 | [Break]
07:02:12.620 | [Break]
07:02:16.120 | [Break]
07:02:20.120 | [Break]
07:02:23.120 | [Break]
07:02:26.620 | [Break]
07:02:30.120 | [Break]
07:02:33.620 | [Break]
07:02:37.120 | [Break]
07:02:40.620 | [Break]
07:02:44.120 | [Break]
07:02:48.120 | [Break]
07:02:51.120 | [Break]
07:02:54.620 | [Break]
07:02:58.120 | [Break]
07:03:01.620 | [Break]
07:03:05.120 | [Break]
07:03:08.620 | [Break]
07:03:12.120 | [Break]
07:03:16.120 | [Break]
07:03:19.120 | [Break]
07:03:22.620 | [Break]
07:03:26.120 | [Break]
07:03:29.620 | [Break]
07:03:33.120 | [Break]
07:03:36.620 | [Break]
07:03:40.120 | [Break]
07:03:44.120 | [Break]
07:03:47.120 | [Break]
07:03:50.620 | [Break]
07:03:54.120 | [Break]
07:03:57.620 | [Break]
07:04:01.120 | [Break]
07:04:04.620 | [Break]
07:04:08.120 | [Break]
07:04:12.120 | [Break]
07:04:15.120 | [Break]
07:04:18.620 | [Break]
07:04:22.120 | [Break]
07:04:25.620 | [Break]
07:04:29.120 | [Break]
07:04:32.620 | [Break]
07:04:36.120 | [Break]
07:04:40.120 | [Break]
07:04:43.120 | [Break]
07:04:46.620 | [Break]
07:04:50.120 | [Break]
07:04:53.620 | [Break]
07:04:57.120 | [Break]
07:05:00.620 | [Break]
07:05:04.120 | [Break]
07:05:08.120 | [Break]
07:05:11.120 | [Break]
07:05:14.620 | [Break]
07:05:18.120 | [Break]
07:05:21.620 | [Break]
07:05:25.120 | [Break]
07:05:28.620 | [Break]
07:05:32.120 | [Break]
07:05:36.120 | [Break]
07:05:39.120 | [Break]
07:05:42.620 | [Break]
07:05:46.120 | [Break]
07:05:49.620 | [Break]
07:05:53.120 | [Break]
07:05:56.620 | [Break]
07:06:00.120 | [Break]
07:06:04.120 | [Break]
07:06:07.120 | [Break]
07:06:10.620 | [Break]
07:06:14.120 | [Break]
07:06:17.620 | [Break]
07:06:21.120 | [Break]
07:06:24.620 | [Break]
07:06:28.120 | [Break]
07:06:32.120 | [Break]
07:06:35.120 | [Break]
07:06:38.620 | [Break]
07:06:42.120 | [Break]
07:06:45.620 | [Break]
07:06:49.120 | [Break]
07:06:52.620 | [Break]
07:06:56.120 | [Break]
07:07:00.120 | [Break]
07:07:03.120 | [Break]
07:07:06.620 | [Break]
07:07:10.120 | [Break]
07:07:13.620 | [Break]
07:07:17.120 | [Break]
07:07:20.620 | [Break]
07:07:24.120 | [Break]
07:07:28.120 | [Break]
07:07:31.120 | [Break]
07:07:34.620 | [Break]
07:07:38.120 | [Break]
07:07:41.620 | [Break]
07:07:45.120 | [Break]
07:07:48.620 | [Break]
07:07:52.620 | [Break]
07:07:55.620 | [Break]
07:07:59.120 | [Break]
07:08:02.620 | [Break]
07:08:06.120 | [Break]
07:08:09.620 | [Break]
07:08:13.120 | [Break]
07:08:16.620 | [Break]
07:08:20.620 | [Break]
07:08:23.620 | [Break]
07:08:27.120 | [Break]
07:08:30.620 | [Break]
07:08:34.120 | [Break]
07:08:37.620 | [Break]
07:08:41.120 | [Break]
07:08:44.620 | [Break]
07:08:48.620 | [Break]
07:08:51.620 | [Break]
07:08:55.120 | [Break]
07:08:58.620 | [Break]
07:09:02.120 | [Break]
07:09:05.620 | [Break]
07:09:09.120 | [Break]
07:09:12.620 | [Break]
07:09:16.620 | [Break]
07:09:19.620 | [Break]
07:09:23.120 | [Break]
07:09:26.620 | [Break]
07:09:30.120 | [Break]
07:09:33.620 | [Break]
07:09:37.120 | [Break]
07:09:40.620 | [Break]
07:09:44.620 | [Break]
07:09:47.620 | [Break]
07:09:51.120 | [Break]
07:09:54.620 | [Break]
07:09:58.120 | [Break]
07:10:01.620 | [Break]
07:10:05.120 | [Break]
07:10:08.620 | [Break]
07:10:12.620 | [Break]
07:10:15.620 | [Break]
07:10:19.120 | [Break]
07:10:22.620 | [Break]
07:10:26.120 | [Break]
07:10:29.620 | [Break]
07:10:33.120 | [Break]
07:10:36.620 | [Break]
07:10:40.620 | [Break]
07:10:43.620 | [Break]
07:10:47.120 | [Break]
07:10:50.620 | [Break]
07:10:54.120 | [Break]
07:10:57.620 | [Break]
07:11:01.120 | [Break]
07:11:04.620 | [Break]
07:11:08.620 | [Break]
07:11:11.620 | [Break]
07:11:15.120 | [Break]
07:11:18.620 | [Break]
07:11:22.120 | [Break]
07:11:25.620 | [Break]
07:11:29.120 | [Break]
07:11:32.620 | [Break]
07:11:36.620 | [Break]
07:11:39.620 | [Break]
07:11:43.120 | [Break]
07:11:46.620 | [Break]
07:11:50.120 | [Break]
07:11:53.620 | [Break]
07:11:57.120 | [Break]
07:12:00.620 | [Break]
07:12:04.620 | [Break]
07:12:07.620 | [Break]
07:12:11.120 | [Break]
07:12:14.620 | [Break]
07:12:18.120 | [Break]
07:12:21.620 | [Break]
07:12:25.120 | [Break]
07:12:28.620 | [Break]
07:12:32.620 | [Break]
07:12:35.620 | [Break]
07:12:39.120 | [Break]
07:12:42.620 | [Break]
07:12:46.120 | [Break]
07:12:49.620 | [Break]
07:12:53.120 | [Break]
07:12:56.620 | [Break]
07:13:00.620 | [Break]
07:13:03.620 | [Break]
07:13:07.120 | [Break]
07:13:10.620 | [Break]
07:13:14.120 | [Break]
07:13:17.620 | [Break]
07:13:21.120 | [Break]
07:13:24.620 | [Break]
07:13:28.620 | [Break]
07:13:31.620 | [Break]
07:13:35.120 | [Break]
07:13:38.620 | [Break]
07:13:42.120 | [Break]
07:13:45.620 | [Break]
07:13:49.120 | [Break]
07:13:52.620 | [Break]
07:13:56.620 | [Break]
07:13:59.620 | [Break]
07:14:03.120 | [Break]
07:14:06.620 | [Break]
07:14:10.120 | [Break]
07:14:13.620 | [Break]
07:14:17.120 | [Break]
07:14:20.620 | [Break]
07:14:24.620 | [Break]
07:14:27.620 | [Break]
07:14:31.120 | [Break]
07:14:34.620 | [Break]
07:14:38.120 | [Break]
07:14:41.620 | [Break]
07:14:45.120 | [Break]
07:14:48.620 | [Break]
07:14:52.620 | [Break]
07:14:55.620 | [Break]
07:14:59.120 | [Break]
07:15:02.620 | [Break]
07:15:06.120 | [Break]
07:15:09.620 | [Break]
07:15:13.120 | [Break]
07:15:16.620 | [Break]
07:15:20.620 | [Break]
07:15:23.620 | [Break]
07:15:27.120 | [Break]
07:15:30.620 | [Break]
07:15:34.120 | [Break]
07:15:37.620 | [Break]
07:15:41.120 | [Break]
07:15:44.620 | [Break]
07:15:48.620 | [Break]
07:15:51.620 | [Break]
07:15:55.120 | [Break]
07:15:58.620 | [Break]
07:16:02.120 | [Break]
07:16:05.620 | [Break]
07:16:09.120 | [Break]
07:16:12.620 | [Break]
07:16:16.620 | [Break]
07:16:19.620 | [Break]
07:16:23.120 | [Break]
07:16:26.620 | [Break]
07:16:30.120 | [Break]
07:16:33.620 | [Break]
07:16:37.120 | [Break]
07:16:40.620 | [Break]
07:16:44.620 | [Break]
07:16:47.620 | [Break]
07:16:51.120 | [Break]
07:16:54.620 | [Break]
07:16:58.120 | [Break]
07:17:01.620 | [Break]
07:17:05.120 | [Break]
07:17:08.620 | [Break]
07:17:12.620 | [Break]
07:17:15.620 | [Break]
07:17:19.120 | [Break]
07:17:22.620 | [Break]
07:17:26.120 | [Break]
07:17:29.620 | [Break]
07:17:33.120 | [Break]
07:17:36.620 | [Break]
07:17:40.620 | [Break]
07:17:43.620 | [Break]
07:17:47.120 | [Break]
07:17:50.620 | [Break]
07:17:54.120 | [Break]
07:17:57.620 | [Break]
07:18:01.120 | [Break]
07:18:04.620 | [Break]
07:18:08.620 | [Break]
07:18:11.620 | [Break]
07:18:15.120 | [Break]
07:18:18.620 | [Break]
07:18:22.120 | [Break]
07:18:25.620 | [Break]
07:18:29.120 | [Break]
07:18:32.620 | [Break]
07:18:36.620 | [Break]
07:18:39.620 | [Break]
07:18:43.120 | [Break]
07:18:46.620 | [Break]
07:18:50.120 | [Break]
07:18:53.620 | [Break]
07:18:57.120 | [Break]
07:19:00.620 | [Break]
07:19:04.620 | [Break]
07:19:07.620 | [Break]
07:19:11.120 | [Break]
07:19:14.620 | [Break]
07:19:18.120 | [Break]
07:19:21.620 | [Break]
07:19:25.120 | [Break]
07:19:28.620 | [Break]
07:19:32.620 | [Break]
07:19:35.620 | [Break]
07:19:39.120 | [Break]
07:19:42.620 | [Break]
07:19:46.120 | [Break]
07:19:49.620 | [Break]
07:19:53.120 | [Break]
07:19:56.620 | [Break]
07:20:00.620 | [Break]
07:20:03.620 | [Break]
07:20:07.120 | [Break]
07:20:10.620 | [Break]
07:20:14.120 | [Break]
07:20:17.620 | [Break]
07:20:21.120 | [Break]
07:20:24.620 | [Break]
07:20:28.620 | [Break]
07:20:31.620 | [Break]
07:20:35.120 | [Break]
07:20:38.620 | [Break]
07:20:42.120 | [Break]
07:20:45.620 | [Break]
07:20:49.120 | [Break]
07:20:52.620 | [Break]
07:20:56.620 | [Break]
07:20:59.620 | [Break]
07:21:03.120 | [Break]
07:21:06.620 | [Break]
07:21:10.120 | [Break]
07:21:13.620 | [Break]
07:21:17.120 | [Break]
07:21:20.620 | [Break]
07:21:24.620 | [Break]
07:21:27.620 | [Break]
07:21:31.120 | [Break]
07:21:34.620 | [Break]
07:21:38.120 | [Break]
07:21:41.620 | [Break]
07:21:45.120 | [Break]
07:21:48.620 | [Break]
07:21:52.620 | [Break]
07:21:55.620 | [Break]
07:21:59.120 | [Break]
07:22:02.620 | [Break]
07:22:06.120 | [Break]
07:22:09.620 | [Break]
07:22:13.120 | [Break]
07:22:16.620 | [Break]
07:22:20.620 | [Break]
07:22:23.620 | [Break]
07:22:27.120 | [Break]
07:22:30.620 | [Break]
07:22:34.120 | [Break]
07:22:37.620 | [Break]
07:22:41.120 | [Break]
07:22:44.620 | [Break]
07:22:48.620 | [Break]
07:22:51.620 | [Break]
07:22:55.120 | [Break]
07:22:58.620 | [Break]
07:23:02.120 | [Break]
07:23:05.620 | [Break]
07:23:09.120 | [Break]
07:23:12.620 | [Break]
07:23:16.620 | [Break]
07:23:19.620 | [Break]
07:23:23.120 | [Break]
07:23:26.620 | [Break]
07:23:30.120 | [Break]
07:23:33.620 | [Break]
07:23:37.120 | [Break]
07:23:40.620 | [Break]
07:23:44.620 | [Break]
07:23:47.620 | [Break]
07:23:51.120 | [Break]
07:23:54.620 | [Break]
07:23:58.120 | [Break]
07:24:01.620 | [Break]
07:24:05.120 | [Break]
07:24:08.620 | [Break]
07:24:12.620 | [Break]
07:24:15.620 | [Break]
07:24:19.120 | [Break]
07:24:22.620 | [Break]
07:24:26.120 | [Break]
07:24:29.620 | [Break]
07:24:33.120 | [Break]
07:24:36.620 | [Break]
07:24:40.620 | [Break]
07:24:43.620 | [Break]
07:24:47.120 | [Break]
07:24:50.620 | [Break]
07:24:54.120 | [Break]
07:24:57.620 | [Break]
07:25:01.120 | [Break]
07:25:04.620 | [Break]
07:25:08.620 | [Break]
07:25:11.620 | [Break]
07:25:15.120 | [Break]
07:25:18.620 | [Break]
07:25:22.120 | [Break]
07:25:25.620 | [Break]
07:25:29.120 | [Break]
07:25:32.620 | [Break]
07:25:36.620 | [Break]
07:25:39.620 | [Break]
07:25:43.120 | [Break]
07:25:46.620 | [Break]
07:25:50.120 | [Break]
07:25:53.620 | [Break]
07:25:57.120 | [Break]
07:26:00.620 | [Break]
07:26:04.620 | [Break]
07:26:07.620 | [Break]
07:26:11.120 | [Break]
07:26:14.620 | [Break]
07:26:18.120 | [Break]
07:26:21.620 | [Break]
07:26:25.120 | [Break]
07:26:28.620 | [Break]
07:26:32.620 | [Break]
07:26:35.620 | [Break]
07:26:39.120 | [Break]
07:26:42.620 | [Break]
07:26:46.120 | [Break]
07:26:49.620 | [Break]
07:26:53.120 | [Break]
07:26:56.620 | [Break]
07:27:00.620 | [Break]
07:27:03.620 | [Break]
07:27:07.120 | [Break]
07:27:10.620 | [Break]
07:27:14.120 | [Break]
07:27:17.620 | [Break]
07:27:21.120 | [Break]
07:27:24.620 | [Break]
07:27:28.620 | [Break]
07:27:31.620 | [Break]
07:27:35.120 | [Break]
07:27:38.620 | [Break]
07:27:42.120 | [Break]
07:27:45.620 | [Break]
07:27:49.120 | [Break]
07:27:52.620 | [Break]
07:27:56.620 | [Break]
07:27:59.620 | [Break]
07:28:03.120 | [Break]
07:28:06.620 | [Break]
07:28:10.120 | [Break]
07:28:13.620 | [Break]
07:28:17.120 | [Break]
07:28:20.620 | [Break]
07:28:24.620 | [Break]
07:28:27.620 | [Break]
07:28:31.120 | [Break]
07:28:34.620 | [Break]
07:28:38.120 | [Break]
07:28:41.620 | [Break]
07:28:45.120 | [Break]
07:28:48.620 | [Break]
07:28:52.620 | [Break]
07:28:55.620 | [Break]
07:28:59.120 | [Break]
07:29:02.620 | [Break]
07:29:06.120 | [Break]
07:29:09.620 | [Break]
07:29:13.120 | [Break]
07:29:16.620 | [Break]
07:29:20.620 | [Break]
07:29:23.620 | [Break]
07:29:27.120 | [Break]
07:29:30.620 | [Break]
07:29:34.120 | [Break]
07:29:37.620 | [Break]
07:29:41.120 | [Break]
07:29:44.620 | [Break]
07:29:48.620 | [Break]
07:29:51.620 | [Break]
07:29:55.120 | [Break]
07:29:58.620 | [Break]
07:30:02.120 | [Break]
07:30:05.620 | [Break]
07:30:09.120 | [Break]
07:30:12.620 | [Break]
07:30:16.620 | [Break]
07:30:19.620 | [Break]
07:30:23.120 | [Break]
07:30:26.620 | [Break]
07:30:30.120 | [Break]
07:30:33.620 | [Break]
07:30:37.120 | [Break]
07:30:40.620 | [Break]
07:30:44.620 | [Break]
07:30:47.620 | [Break]
07:30:51.120 | [Break]
07:30:54.620 | [Break]
07:30:58.120 | [Break]
07:31:01.620 | [Break]
07:31:05.120 | [Break]
07:31:08.620 | [Break]
07:31:12.620 | [Break]
07:31:15.620 | [Break]
07:31:19.120 | [Break]
07:31:22.620 | [Break]
07:31:26.120 | [Break]
07:31:29.620 | [Break]
07:31:33.120 | [Break]
07:31:36.620 | [Break]
07:31:40.620 | [Break]
07:31:43.620 | [Break]
07:31:47.120 | [Break]
07:31:50.620 | [Break]
07:31:54.120 | [Break]
07:31:57.620 | [Break]
07:32:01.120 | [Break]
07:32:04.620 | [Break]
07:32:08.620 | [Break]
07:32:11.620 | [Break]
07:32:15.120 | [Break]
07:32:18.620 | [Break]
07:32:22.120 | [Break]
07:32:25.620 | [Break]
07:32:29.120 | [Break]
07:32:32.620 | [Break]
07:32:36.620 | [Break]
07:32:39.620 | [Break]
07:32:43.120 | [Break]
07:32:46.620 | [Break]
07:32:50.120 | [Break]
07:32:53.620 | [Break]
07:32:57.120 | [Break]
07:33:00.620 | [Break]
07:33:04.620 | [Break]
07:33:07.620 | [Break]
07:33:11.120 | [Break]
07:33:14.620 | [Break]
07:33:18.120 | [Break]
07:33:21.620 | [Break]
07:33:25.120 | [Break]
07:33:28.620 | [Break]
07:33:32.620 | [Break]
07:33:35.620 | [Break]
07:33:39.120 | [Break]
07:33:42.620 | [Break]
07:33:46.120 | [Break]
07:33:49.620 | [Break]
07:33:53.120 | [Break]
07:33:56.620 | [Break]
07:34:00.620 | [Break]
07:34:03.620 | [Break]
07:34:07.120 | [Break]
07:34:10.620 | [Break]
07:34:14.120 | [Break]
07:34:17.620 | [Break]
07:34:21.120 | [Break]
07:34:24.620 | [Break]
07:34:28.620 | [Break]
07:34:31.620 | [Break]
07:34:35.120 | [Break]
07:34:38.620 | [Break]
07:34:42.120 | [Break]
07:34:45.620 | [Break]
07:34:49.120 | [Break]
07:34:52.620 | [Break]
07:34:56.620 | [Break]
07:34:59.620 | [Break]
07:35:03.120 | [Break]
07:35:06.620 | [Break]
07:35:10.120 | [Break]
07:35:13.620 | [Break]
07:35:17.120 | [Break]
07:35:20.620 | [Break]
07:35:24.620 | [Break]
07:35:28.120 | [Break]
07:35:31.120 | [Break]
07:35:34.620 | [Break]
07:35:38.120 | [Break]
07:35:41.620 | [Break]
07:35:45.620 | [Break]
07:35:48.620 | [Break]
07:35:52.120 | [Break]
07:35:55.620 | [Break]
07:35:59.120 | [Break]
07:36:02.620 | [Break]
07:36:06.120 | [Break]
07:36:09.620 | [Break]
07:36:13.620 | [Break]
07:36:16.620 | [Break]
07:36:20.120 | [Break]
07:36:23.620 | [Break]
07:36:27.120 | [Break]
07:36:30.620 | [Break]
07:36:34.120 | [Break]
07:36:37.620 | [Break]
07:36:41.620 | [Break]
07:36:44.620 | [Break]
07:36:48.120 | [Break]
07:36:51.620 | [Break]
07:36:55.120 | [Break]
07:36:58.620 | [Break]
07:37:02.120 | [Break]
07:37:05.620 | [Break]
07:37:09.620 | [Break]
07:37:12.620 | [Break]
07:37:16.120 | [Break]
07:37:19.620 | [Break]
07:37:23.120 | [Break]
07:37:26.620 | [Break]
07:37:30.120 | [Break]
07:37:33.620 | [Break]
07:37:37.620 | [Break]
07:37:40.620 | [Break]
07:37:44.120 | [Break]
07:37:47.620 | [Break]
07:37:51.120 | [Break]
07:37:54.620 | [Break]
07:37:58.120 | [Break]
07:38:01.620 | [Break]
07:38:05.620 | [Break]
07:38:08.620 | [Break]
07:38:12.120 | [Break]
07:38:15.620 | [Break]
07:38:19.120 | [Break]
07:38:22.620 | [Break]
07:38:26.120 | [Break]
07:38:29.620 | [Break]
07:38:33.620 | [Break]
07:38:36.620 | [Break]
07:38:40.120 | [Break]
07:38:43.620 | [Break]
07:38:47.120 | [Break]
07:38:50.620 | [Break]
07:38:54.120 | [Break]
07:38:57.620 | [Break]
07:39:01.620 | [Break]
07:39:04.620 | [Break]
07:39:08.120 | [Break]
07:39:11.620 | [Break]
07:39:15.120 | [Break]
07:39:18.620 | [Break]
07:39:22.120 | [Break]
07:39:25.620 | [Break]
07:39:29.620 | [Break]
07:39:32.620 | [Break]
07:39:36.120 | [Break]
07:39:39.620 | [Break]
07:39:43.120 | [Break]
07:39:46.620 | [Break]
07:39:50.120 | [Break]
07:39:53.620 | [Break]
07:39:57.620 | [Break]
07:40:00.620 | [Break]
07:40:04.120 | [Break]
07:40:07.620 | [Break]
07:40:11.120 | [Break]
07:40:14.620 | [Break]
07:40:18.120 | [Break]
07:40:21.620 | [Break]
07:40:25.620 | [Break]
07:40:28.620 | [Break]
07:40:32.120 | [Break]
07:40:35.620 | [Break]
07:40:39.120 | [Break]
07:40:42.620 | [Break]
07:40:46.120 | [Break]
07:40:49.620 | [Break]
07:40:53.620 | [Break]
07:40:56.620 | [Break]
07:41:00.120 | [Break]
07:41:03.620 | [Break]
07:41:07.120 | [Break]
07:41:10.620 | [Break]
07:41:14.120 | [Break]
07:41:17.620 | [Break]
07:41:21.620 | [Break]
07:41:24.620 | [Break]
07:41:28.120 | [Break]
07:41:31.620 | [Break]
07:41:35.120 | [Break]
07:41:38.620 | [Break]
07:41:42.120 | [Break]
07:41:45.620 | [Break]
07:41:49.620 | [Break]
07:41:52.620 | [Break]
07:41:56.120 | [Break]
07:41:59.620 | [Break]
07:42:03.120 | [Break]
07:42:06.620 | [Break]
07:42:10.120 | [Break]
07:42:13.620 | [Break]
07:42:17.620 | [Break]
07:42:20.620 | [Break]
07:42:24.120 | [Break]
07:42:27.620 | [Break]
07:42:31.120 | [Break]
07:42:34.620 | [Break]
07:42:38.120 | [Break]
07:42:41.620 | [Break]
07:42:45.620 | [Break]
07:42:48.620 | [Break]
07:42:52.120 | [Break]
07:42:55.620 | [Break]
07:42:59.120 | [Break]
07:43:02.620 | [Break]
07:43:06.120 | [Break]
07:43:09.620 | [Break]
07:43:13.620 | [Break]
07:43:16.620 | [Break]
07:43:20.120 | [Break]
07:43:23.620 | [Break]
07:43:27.120 | [Break]
07:43:30.620 | [Break]
07:43:34.120 | [Break]
07:43:37.620 | [Break]
07:43:41.620 | [Break]
07:43:44.620 | [Break]
07:43:48.120 | [Break]
07:43:51.620 | [Break]
07:43:55.120 | [Break]
07:43:58.620 | [Break]
07:44:02.120 | [Break]
07:44:05.620 | [Break]
07:44:09.620 | [Break]
07:44:12.620 | [Break]
07:44:16.120 | [Break]
07:44:19.620 | [Break]
07:44:23.120 | [Break]
07:44:26.620 | [Break]
07:44:30.120 | [Break]
07:44:33.620 | [Break]
07:44:37.620 | [Break]
07:44:40.620 | [Break]
07:44:44.120 | [Break]
07:44:47.620 | [Break]
07:44:51.120 | [Break]
07:44:54.620 | [Break]
07:44:58.120 | [Break]
07:45:01.620 | [Break]
07:45:05.620 | [Break]
07:45:08.620 | [Break]
07:45:12.120 | [Break]
07:45:15.620 | [Break]
07:45:19.120 | [Break]
07:45:22.620 | [Break]
07:45:26.120 | [Break]
07:45:29.620 | [Break]
07:45:33.620 | [Break]
07:45:36.620 | [Break]
07:45:40.120 | [Break]
07:45:43.620 | [Break]
07:45:47.120 | [Break]
07:45:50.620 | [Break]
07:45:54.120 | [Break]
07:45:57.620 | [Break]
07:46:01.620 | [Break]
07:46:04.620 | [Break]
07:46:08.120 | [Break]
07:46:11.620 | [Break]
07:46:15.120 | [Break]
07:46:18.620 | [Break]
07:46:22.120 | [Break]
07:46:25.620 | [Break]
07:46:29.620 | [Break]
07:46:32.620 | [Break]
07:46:36.120 | [Break]
07:46:39.620 | [Break]
07:46:43.120 | [Break]
07:46:46.620 | [Break]
07:46:50.120 | [Break]
07:46:53.620 | [Break]
07:46:57.620 | [Break]
07:47:00.620 | [Break]
07:47:04.120 | [Break]
07:47:07.620 | [Break]
07:47:11.120 | [Break]
07:47:14.620 | [Break]
07:47:18.120 | [Break]
07:47:21.620 | [Break]
07:47:25.620 | [Break]
07:47:28.620 | [Break]
07:47:32.120 | [Break]
07:47:35.620 | [Break]
07:47:39.120 | [Break]
07:47:42.620 | [Break]
07:47:46.120 | [Break]
07:47:49.620 | [Break]
07:47:53.620 | [Break]
07:47:56.620 | [Break]
07:48:00.120 | [Break]
07:48:03.620 | [Break]
07:48:07.120 | [Break]
07:48:10.620 | [Break]
07:48:14.120 | [Break]
07:48:17.620 | [Break]
07:48:21.620 | [Break]
07:48:24.620 | [Break]
07:48:28.120 | [Break]
07:48:31.620 | [Break]
07:48:35.120 | [Break]
07:48:38.620 | [Break]
07:48:42.120 | [Break]
07:48:45.620 | [Break]
07:48:49.620 | [Break]
07:48:52.620 | [Break]
07:48:56.120 | [Break]
07:48:59.620 | [Break]
07:49:03.120 | [Break]
07:49:06.620 | [Break]
07:49:10.120 | [Break]
07:49:13.620 | [Break]
07:49:17.620 | [Break]
07:49:20.620 | [Break]
07:49:24.120 | [Break]
07:49:27.620 | [Break]
07:49:31.120 | [Break]
07:49:34.620 | [Break]
07:49:38.120 | [Break]
07:49:41.620 | [Break]
07:49:45.620 | [Break]
07:49:48.620 | [Break]
07:49:52.120 | [Break]
07:49:55.620 | [Break]
07:49:59.120 | [Break]
07:50:02.620 | [Break]
07:50:06.120 | [Break]
07:50:09.620 | [Break]
07:50:13.620 | [Break]
07:50:16.620 | [Break]
07:50:20.120 | [Break]
07:50:23.620 | [Break]
07:50:27.120 | [Break]
07:50:30.620 | [Break]
07:50:34.120 | [Break]
07:50:37.620 | [Break]
07:50:41.620 | [Break]
07:50:44.620 | [Break]
07:50:48.120 | [Break]
07:50:51.620 | [Break]
07:50:55.120 | [Break]
07:50:58.620 | [Break]
07:51:02.120 | [Break]
07:51:05.620 | [Break]
07:51:09.620 | [Break]
07:51:12.620 | [Break]
07:51:16.120 | [Break]
07:51:19.620 | [Break]
07:51:23.120 | [Break]
07:51:26.620 | [Break]
07:51:30.120 | [Break]
07:51:33.620 | [Break]
07:51:37.620 | [Break]
07:51:40.620 | [Break]
07:51:44.120 | [Break]
07:51:47.620 | [Break]
07:51:51.120 | [Break]
07:51:54.620 | [Break]
07:51:58.120 | [Break]
07:52:01.620 | [Break]
07:52:05.620 | [Break]
07:52:08.620 | [Break]
07:52:12.120 | [Break]
07:52:15.620 | [Break]
07:52:19.120 | [Break]
07:52:22.620 | [Break]
07:52:26.120 | [Break]
07:52:29.620 | [Break]
07:52:33.620 | [Break]
07:52:36.620 | [Break]
07:52:40.120 | [Break]
07:52:43.620 | [Break]
07:52:47.120 | [Break]
07:52:50.620 | [Break]
07:52:54.120 | [Break]
07:52:57.620 | [Break]
07:53:01.620 | [Break]
07:53:04.620 | [Break]
07:53:08.120 | [Break]
07:53:11.620 | [Break]
07:53:15.120 | [Break]
07:53:18.620 | [Break]
07:53:22.120 | [Break]
07:53:25.620 | [Break]
07:53:29.620 | [Break]
07:53:32.620 | [Break]
07:53:36.120 | [Break]
07:53:39.620 | [Break]
07:53:43.120 | [Break]
07:53:46.620 | [Break]
07:53:50.120 | [Break]
07:53:53.620 | [Break]
07:53:57.620 | [Break]
07:54:00.620 | [Break]
07:54:04.120 | [Break]
07:54:07.620 | [Break]
07:54:11.120 | [Break]
07:54:14.620 | [Break]
07:54:18.120 | [Break]
07:54:21.620 | [Break]
07:54:25.620 | [Break]
07:54:28.620 | [Break]
07:54:32.120 | [Break]
07:54:35.620 | [Break]
07:54:39.120 | [Break]
07:54:42.620 | [Break]
07:54:46.120 | [Break]
07:54:49.620 | [Break]
07:54:53.620 | [Break]
07:54:56.620 | [Break]
07:55:00.120 | [Break]
07:55:03.620 | [Break]
07:55:07.120 | [Break]
07:55:10.620 | [Break]
07:55:14.120 | [Break]
07:55:17.620 | [Break]
07:55:21.620 | [Break]
07:55:24.620 | [Break]
07:55:28.120 | [Break]
07:55:31.620 | [Break]
07:55:35.120 | [Break]
07:55:38.620 | [Break]
07:55:42.120 | [Break]
07:55:45.620 | [Break]
07:55:49.620 | [Break]
07:55:52.620 | [Break]
07:55:56.120 | [Break]
07:55:59.620 | [Break]
07:56:03.120 | [Break]
07:56:06.620 | [Break]
07:56:10.120 | [Break]
07:56:13.620 | [Break]
07:56:17.620 | [Break]
07:56:20.620 | [Break]
07:56:24.120 | [Break]
07:56:27.620 | [Break]
07:56:31.120 | [Break]
07:56:34.620 | [Break]
07:56:38.120 | [Break]
07:56:41.620 | [Break]
07:56:45.620 | [Break]
07:56:48.620 | [Break]
07:56:52.120 | [Break]
07:56:55.620 | [Break]
07:56:59.120 | [Break]
07:57:02.620 | [Break]
07:57:06.120 | [Break]
07:57:09.620 | [Break]
07:57:13.620 | [Break]
07:57:16.620 | [Break]
07:57:20.120 | [Break]
07:57:23.620 | [Break]
07:57:27.120 | [Break]
07:57:30.620 | [Break]
07:57:34.120 | [Break]
07:57:37.620 | [Break]
07:57:41.620 | [Break]
07:57:44.620 | [Break]
07:57:48.120 | [Break]
07:57:51.620 | [Break]
07:57:55.120 | [Break]
07:57:58.620 | [Break]
07:58:02.120 | [Break]
07:58:05.620 | [Break]
07:58:09.620 | [Break]
07:58:12.620 | [Break]
07:58:16.120 | [Break]
07:58:19.620 | [Break]
07:58:23.120 | [Break]
07:58:26.620 | [Break]
07:58:30.120 | [Break]
07:58:33.620 | [Break]
07:58:37.620 | [Break]
07:58:40.620 | [Break]
07:58:44.120 | [Break]
07:58:47.620 | [Break]
07:58:51.120 | [Break]
07:58:54.620 | [Break]
07:58:58.120 | [Break]
07:59:01.620 | [Break]
07:59:05.620 | [Break]
07:59:08.620 | [Break]
07:59:12.120 | [Break]
07:59:15.620 | [Break]
07:59:19.120 | [Break]
07:59:22.620 | [Break]
07:59:26.120 | [Break]
07:59:29.620 | [Break]
07:59:33.620 | [Break]
07:59:36.620 | [Break]
07:59:40.120 | [Break]
07:59:43.620 | [Break]
07:59:47.120 | [Break]
07:59:50.620 | [Break]
07:59:54.120 | [Break]
07:59:57.620 | [Break]
08:00:01.620 | [Break]
08:00:04.620 | [Break]
08:00:08.120 | [Break]
08:00:11.620 | [Break]
08:00:15.120 | [Break]
08:00:18.620 | [Break]
08:00:22.120 | [Break]
08:00:25.620 | [Break]
08:00:29.620 | [Break]
08:00:32.620 | [Break]
08:00:36.120 | [Break]
08:00:39.620 | [Break]
08:00:43.120 | [Break]
08:00:46.620 | [Break]
08:00:50.120 | [Break]
08:00:53.620 | [Break]
08:00:57.620 | [Break]
08:01:00.620 | [Break]
08:01:04.120 | [Break]
08:01:07.620 | [Break]
08:01:11.120 | [Break]
08:01:14.620 | [Break]
08:01:18.120 | [Break]
08:01:21.620 | [Break]
08:01:25.620 | [Break]
08:01:28.620 | [Break]
08:01:32.120 | [Break]
08:01:35.620 | [Break]
08:01:39.120 | [Break]
08:01:42.620 | [Break]
08:01:46.120 | [Break]
08:01:49.620 | [Break]
08:01:53.620 | [Break]
08:01:56.620 | [Break]
08:02:00.120 | [Break]
08:02:03.620 | [Break]
08:02:07.120 | [Break]
08:02:10.620 | [Break]
08:02:14.120 | [Break]
08:02:17.620 | [Break]
08:02:21.620 | [Break]
08:02:24.620 | [Break]
08:02:28.120 | [Break]
08:02:31.620 | [Break]
08:02:35.120 | [Break]
08:02:38.620 | [Break]
08:02:42.120 | [Break]
08:02:45.620 | [Break]
08:02:49.620 | [Break]
08:02:52.620 | [Break]
08:02:56.120 | [Break]
08:02:59.620 | [Break]
08:03:03.120 | [Break]
08:03:06.620 | [Break]
08:03:10.120 | [Break]
08:03:13.620 | [Break]
08:03:17.620 | [Break]
08:03:20.620 | [Break]
08:03:24.120 | [Break]
08:03:27.620 | [Break]
08:03:31.120 | [Break]
08:03:34.620 | [Break]
08:03:38.120 | [Break]
08:03:41.620 | [Break]
08:03:45.620 | [Break]
08:03:48.620 | [Break]
08:03:52.120 | [Break]
08:03:55.620 | [Break]
08:03:59.120 | [Break]
08:04:02.620 | [Break]
08:04:06.120 | [Break]
08:04:09.620 | [Break]
08:04:13.620 | [Break]
08:04:16.620 | [Break]
08:04:20.120 | [Break]
08:04:23.620 | [Break]
08:04:27.120 | [Break]
08:04:30.620 | [Break]
08:04:34.120 | [Break]
08:04:37.620 | [Break]
08:04:41.620 | [Break]
08:04:44.620 | [Break]
08:04:48.120 | [Break]
08:04:51.620 | [Break]
08:04:55.120 | [Break]
08:04:58.620 | [Break]
08:05:02.120 | [Break]
08:05:05.620 | [Break]
08:05:09.620 | [Break]
08:05:12.620 | [Break]
08:05:16.120 | [Break]
08:05:19.620 | [Break]
08:05:23.120 | [Break]
08:05:26.620 | [Break]
08:05:30.120 | [Break]
08:05:33.620 | [Break]
08:05:37.620 | [Break]
08:05:40.620 | [Break]
08:05:44.120 | [Break]
08:05:47.620 | [Break]
08:05:51.120 | [Break]
08:05:54.620 | [Break]
08:05:58.120 | [Break]
08:06:01.620 | [Break]
08:06:05.620 | [Break]
08:06:08.620 | [Break]
08:06:12.120 | [Break]
08:06:15.620 | [Break]
08:06:19.120 | [Break]
08:06:22.620 | [Break]
08:06:26.120 | [Break]
08:06:29.620 | [Break]
08:06:33.620 | [Break]
08:06:36.620 | [Break]
08:06:40.120 | [Break]
08:06:43.620 | [Break]
08:06:47.120 | [Break]
08:06:50.620 | [Break]
08:06:54.120 | [Break]
08:06:57.620 | [Break]
08:07:01.620 | [Break]
08:07:04.620 | [Break]
08:07:08.120 | [Break]
08:07:11.620 | [Break]
08:07:15.120 | [Break]
08:07:18.620 | [Break]
08:07:22.120 | [Break]
08:07:25.620 | [Break]
08:07:29.620 | [Break]
08:07:32.620 | [Break]
08:07:36.120 | [Break]
08:07:39.620 | [Break]
08:07:43.120 | [Break]
08:07:46.620 | [Break]
08:07:50.120 | [Break]
08:07:53.620 | [Break]
08:07:57.620 | [Break]
08:08:00.620 | [Break]
08:08:04.120 | [Break]
08:08:07.620 | [Break]
08:08:11.120 | [Break]
08:08:14.620 | [Break]
08:08:18.120 | [Break]
08:08:21.620 | [Break]
08:08:25.620 | [Break]
08:08:28.620 | [Break]
08:08:32.120 | [Break]
08:08:35.620 | [Break]
08:08:39.120 | [Break]
08:08:42.620 | [Break]
08:08:46.120 | [Break]
08:08:49.620 | [Break]
08:08:53.620 | [Break]
08:08:56.620 | [Break]
08:09:00.120 | [Break]
08:09:03.620 | [Break]
08:09:07.120 | [Break]
08:09:10.620 | [Break]
08:09:14.120 | [Break]
08:09:17.620 | [Break]
08:09:21.620 | [Break]
08:09:24.620 | [Break]
08:09:28.120 | [Break]
08:09:31.620 | [Break]
08:09:35.120 | [Break]
08:09:38.620 | [Break]
08:09:42.120 | [Break]
08:09:45.620 | [Break]
08:09:49.620 | [Break]
08:09:52.620 | [Break]
08:09:56.120 | [Break]
08:09:59.620 | [Break]
08:10:03.120 | [Break]
08:10:06.620 | [Break]
08:10:10.120 | [Break]
08:10:13.620 | [Break]
08:10:17.620 | [Break]
08:10:20.620 | [Break]
08:10:24.120 | [Break]
08:10:27.620 | [Break]
08:10:31.120 | [Break]
08:10:34.620 | [Break]
08:10:38.120 | [Break]
08:10:41.620 | [Break]
08:10:45.620 | [Break]
08:10:48.620 | [Break]
08:10:52.120 | [Break]
08:10:55.620 | [Break]
08:10:59.120 | [Break]
08:11:02.620 | [Break]
08:11:06.120 | [Break]
08:11:09.620 | [Break]
08:11:13.620 | [Break]
08:11:16.620 | [Break]
08:11:20.120 | [Break]
08:11:23.620 | [Break]
08:11:27.120 | [Break]
08:11:30.620 | [Break]
08:11:34.120 | [Break]
08:11:37.620 | [Break]
08:11:41.620 | [Break]
08:11:44.620 | [Break]
08:11:48.120 | [Break]
08:11:51.620 | [Break]
08:11:55.120 | [Break]
08:11:58.620 | [Break]
08:12:02.120 | [Break]
08:12:05.620 | [Break]
08:12:09.620 | [Break]
08:12:12.620 | [Break]
08:12:16.120 | [Break]
08:12:19.620 | [Break]
08:12:23.120 | [Break]
08:12:26.620 | [Break]
08:12:30.120 | [Break]
08:12:33.620 | [Break]
08:12:37.620 | [Break]
08:12:40.620 | [Break]
08:12:44.120 | [Break]
08:12:47.620 | [Break]
08:12:51.120 | [Break]
08:12:54.620 | [Break]
08:12:58.120 | [Break]
08:13:01.620 | [Break]
08:13:05.620 | [Break]
08:13:08.620 | [Break]
08:13:12.120 | [Break]
08:13:15.620 | [Break]
08:13:19.120 | [Break]
08:13:22.620 | [Break]
08:13:26.120 | [Break]
08:13:29.620 | [Break]
08:13:33.620 | [Break]
08:13:36.620 | [Break]
08:13:40.120 | [Break]
08:13:43.620 | [Break]
08:13:47.120 | [Break]
08:13:50.620 | [Break]
08:13:54.120 | [Break]
08:13:57.620 | [Break]
08:14:01.620 | [Break]
08:14:04.620 | [Break]
08:14:08.120 | [Break]
08:14:11.620 | [Break]
08:14:15.120 | [Break]
08:14:18.620 | [Break]
08:14:22.120 | [Break]
08:14:25.620 | [Break]
08:14:29.620 | [Break]
08:14:32.620 | [Break]
08:14:36.120 | [Break]
08:14:39.620 | [Break]
08:14:43.120 | [Break]
08:14:46.620 | [Break]
08:14:50.120 | [Break]
08:14:53.620 | [Break]
08:14:57.620 | [Break]
08:15:00.620 | [Break]
08:15:04.120 | [Break]
08:15:07.620 | [Break]
08:15:11.120 | [Break]
08:15:14.620 | [Break]
08:15:18.120 | [Break]
08:15:21.620 | [Break]
08:15:25.620 | [Break]
08:15:28.620 | [Break]
08:15:32.120 | [Break]
08:15:35.620 | [Break]
08:15:39.120 | [Break]
08:15:42.620 | [Break]
08:15:46.120 | [Break]
08:15:49.620 | [Break]
08:15:53.620 | [Break]
08:15:56.620 | [Break]
08:16:00.120 | [Break]
08:16:03.620 | [Break]
08:16:07.120 | [Break]
08:16:10.620 | [Break]
08:16:14.120 | [Break]
08:16:17.620 | [Break]
08:16:21.620 | [Break]
08:16:24.620 | [Break]
08:16:28.120 | [Break]
08:16:31.620 | [Break]
08:16:35.120 | [Break]
08:16:38.620 | [Break]
08:16:42.120 | [Break]
08:16:45.620 | [Break]
08:16:49.620 | [Break]
08:16:52.620 | [Break]
08:16:56.120 | [Break]
08:16:59.620 | [Break]
08:17:03.120 | [Break]
08:17:06.620 | [Break]
08:17:10.120 | [Break]
08:17:13.620 | [Break]
08:17:17.620 | [Break]
08:17:20.620 | [Break]
08:17:24.120 | [Break]
08:17:27.620 | [Break]
08:17:31.120 | [Break]
08:17:34.620 | [Break]
08:17:38.120 | [Break]
08:17:41.620 | [Break]
08:17:45.620 | [Break]
08:17:48.620 | [Break]
08:17:52.120 | [Break]
08:17:55.620 | [Break]
08:17:59.120 | [Break]
08:18:02.620 | [Break]
08:18:06.120 | [Break]
08:18:09.620 | [Break]
08:18:13.620 | [Break]
08:18:16.620 | [Break]
08:18:20.120 | [Break]
08:18:23.620 | [Break]
08:18:27.120 | [Break]
08:18:30.620 | [Break]
08:18:34.120 | [Break]
08:18:37.620 | [Break]
08:18:41.620 | [Break]
08:18:44.620 | [Break]
08:18:48.120 | [Break]
08:18:51.620 | [Break]
08:18:55.120 | [Break]
08:18:58.620 | [Break]
08:19:02.120 | [Break]
08:19:05.620 | [Break]
08:19:09.620 | [Break]
08:19:12.620 | [Break]
08:19:16.120 | [Break]
08:19:19.620 | [Break]
08:19:23.120 | [Break]
08:19:26.620 | [Break]
08:19:30.120 | [Break]
08:19:33.620 | [Break]
08:19:37.620 | [Break]
08:19:40.620 | [Break]
08:19:44.120 | [Break]
08:19:47.620 | [Break]
08:19:51.120 | [Break]
08:19:54.620 | [Break]
08:19:58.120 | [Break]
08:20:01.620 | [Break]
08:20:05.620 | [Break]
08:20:08.620 | [Break]
08:20:12.120 | [Break]
08:20:15.620 | [Break]
08:20:19.120 | [Break]
08:20:22.620 | [Break]
08:20:26.120 | [Break]
08:20:29.620 | [Break]
08:20:33.620 | [Break]
08:20:36.620 | [Break]
08:20:40.120 | [Break]
08:20:43.620 | [Break]
08:20:47.120 | [Break]
08:20:50.620 | [Break]
08:20:54.120 | [Break]
08:20:57.620 | [Break]
08:21:01.620 | [Break]
08:21:04.620 | [Break]
08:21:08.120 | [Break]
08:21:11.620 | [Break]
08:21:15.120 | [Break]
08:21:18.620 | [Break]
08:21:22.120 | [Break]
08:21:25.620 | [Break]
08:21:29.620 | [Break]
08:21:32.620 | [Break]
08:21:36.120 | [Break]
08:21:39.620 | [Break]
08:21:43.120 | [Break]
08:21:46.620 | [Break]
08:21:50.120 | [Break]
08:21:53.620 | [Break]
08:21:57.620 | [Break]
08:22:00.620 | [Break]
08:22:04.120 | [Break]
08:22:07.620 | [Break]
08:22:11.120 | [Break]
08:22:14.620 | [Break]
08:22:18.120 | [Break]
08:22:21.620 | [Break]
08:22:25.620 | [Break]
08:22:28.620 | [Break]
08:22:32.120 | [Break]
08:22:35.620 | [Break]
08:22:39.120 | [Break]
08:22:42.620 | [Break]
08:22:46.120 | [Break]
08:22:49.620 | [Break]
08:22:53.620 | [Break]
08:22:56.620 | [Break]
08:23:00.120 | [Break]
08:23:03.620 | [Break]
08:23:07.120 | [Break]
08:23:10.620 | [Break]
08:23:14.120 | [Break]
08:23:17.620 | [Break]
08:23:21.620 | [Break]
08:23:24.620 | [Break]
08:23:28.120 | [Break]
08:23:31.620 | [Break]
08:23:35.120 | [Break]
08:23:38.620 | [Break]
08:23:42.120 | [Break]
08:23:45.620 | [Break]
08:23:49.620 | [Break]
08:23:52.620 | [Break]
08:23:56.120 | [Break]
08:23:59.620 | [Break]
08:24:03.120 | [Break]
08:24:06.620 | [Break]
08:24:10.120 | [Break]
08:24:13.620 | [Break]
08:24:17.620 | [Break]
08:24:20.620 | [Break]
08:24:24.120 | [Break]
08:24:27.620 | [Break]
08:24:31.120 | [Break]
08:24:34.620 | [Break]
08:24:38.120 | [Break]
08:24:41.620 | [Break]
08:24:45.620 | [Break]
08:24:48.620 | [Break]
08:24:52.120 | [Break]
08:24:55.620 | [Break]
08:24:59.120 | [Break]
08:25:02.620 | [Break]
08:25:06.120 | [Break]
08:25:09.620 | [Break]
08:25:13.620 | [Break]
08:25:16.620 | [Break]
08:25:20.120 | [Break]
08:25:23.620 | [Break]
08:25:27.120 | [Break]
08:25:30.620 | [Break]
08:25:34.120 | [Break]
08:25:37.620 | [Break]
08:25:41.620 | [Break]
08:25:44.620 | [Break]
08:25:48.120 | [Break]
08:25:51.620 | [Break]
08:25:55.120 | [Break]
08:25:58.620 | [Break]
08:26:02.120 | [Break]
08:26:05.620 | [Break]
08:26:09.620 | [Break]
08:26:12.620 | [Break]
08:26:16.120 | [Break]
08:26:19.620 | [Break]
08:26:23.120 | [Break]
08:26:26.620 | [Break]
08:26:30.120 | [Break]
08:26:33.620 | [Break]
08:26:37.620 | [Break]
08:26:40.620 | [Break]
08:26:44.120 | [Break]
08:26:47.620 | [Break]
08:26:51.120 | [Break]
08:26:54.620 | [Break]
08:26:58.120 | [Break]
08:27:01.620 | [Break]
08:27:05.620 | [Break]
08:27:08.620 | [Break]
08:27:12.120 | [Break]
08:27:15.620 | [Break]
08:27:19.120 | [Break]
08:27:22.620 | [Break]
08:27:26.120 | [Break]
08:27:29.620 | [Break]
08:27:33.620 | [Break]
08:27:36.620 | [Break]
08:27:40.120 | [Break]
08:27:43.620 | [Break]
08:27:47.120 | [Break]
08:27:50.620 | [Break]
08:27:54.120 | [Break]
08:27:57.620 | [Break]
08:28:01.620 | [Break]
08:28:04.620 | [Break]
08:28:08.120 | [Break]
08:28:11.620 | [Break]
08:28:15.120 | [Break]
08:28:18.620 | [Break]
08:28:22.120 | [Break]
08:28:25.620 | [Break]
08:28:29.620 | [Break]
08:28:32.620 | [Break]
08:28:35.620 | [Break]
08:28:39.120 | [Break]
08:28:42.620 | [Break]
08:28:46.120 | [Break]
08:28:49.620 | [Break]
08:28:53.120 | [Break]
08:28:56.620 | [Break]
08:29:00.620 | [Break]
08:29:03.620 | [Break]
08:29:07.120 | [Break]
08:29:10.620 | [Break]
08:29:14.120 | [Break]
08:29:17.620 | [Break]
08:29:21.120 | [Break]
08:29:24.620 | [Break]
08:29:28.620 | [Break]
08:29:31.620 | [Break]
08:29:35.120 | [Break]
08:29:38.620 | [Break]
08:29:42.120 | [Break]
08:29:45.620 | [Break]
08:29:49.120 | [Break]
08:29:52.620 | [Break]
08:29:56.620 | [Break]
08:29:59.620 | [Break]
08:30:03.120 | [Break]
08:30:06.620 | [Break]
08:30:10.120 | [Break]
08:30:13.620 | [Break]
08:30:17.120 | [Break]
08:30:20.620 | [Break]
08:30:24.620 | [Break]
08:30:27.620 | [Break]
08:30:31.120 | [Break]
08:30:34.620 | [Break]
08:30:38.120 | [Break]
08:30:41.620 | [Break]
08:30:45.120 | [Break]
08:30:48.620 | [Break]
08:30:52.620 | [Break]
08:30:55.620 | [Break]
08:30:59.120 | [Break]
08:31:02.620 | [Break]
08:31:06.120 | [Break]
08:31:09.620 | [Break]
08:31:13.120 | [Break]
08:31:16.620 | [Break]
08:31:20.620 | [Break]
08:31:23.620 | [Break]
08:31:27.120 | [Break]
08:31:30.620 | [Break]
08:31:34.120 | [Break]
08:31:37.620 | [Break]
08:31:41.120 | [Break]
08:31:44.620 | [Break]
08:31:48.620 | [Break]
08:31:51.620 | [Break]
08:31:55.120 | [Break]
08:31:58.620 | [Break]
08:32:02.120 | [Break]
08:32:05.620 | [Break]
08:32:09.120 | [Break]
08:32:12.620 | [Break]
08:32:16.620 | [Break]
08:32:19.620 | [Break]
08:32:23.120 | [Break]
08:32:26.620 | [Break]
08:32:30.120 | [Break]
08:32:33.620 | [Break]
08:32:37.120 | [Break]
08:32:40.620 | [Break]
08:32:44.620 | [Break]
08:32:47.620 | [Break]
08:32:51.120 | [Break]
08:32:54.620 | [Break]
08:32:58.120 | [Break]
08:33:01.620 | [Break]
08:33:05.120 | [Break]
08:33:08.620 | [Break]
08:33:12.620 | [Break]
08:33:15.620 | [Break]
08:33:19.120 | [Break]
08:33:22.620 | [Break]
08:33:26.120 | [Break]
08:33:29.620 | [Break]
08:33:33.120 | [Break]
08:33:36.620 | [Break]
08:33:40.620 | [Break]
08:33:43.620 | [Break]
08:33:47.120 | [Break]
08:33:50.620 | [Break]
08:33:54.120 | [Break]
08:33:57.620 | [Break]
08:34:01.120 | [Break]
08:34:04.620 | [Break]
08:34:08.620 | [Break]
08:34:11.620 | [Break]
08:34:15.120 | [Break]
08:34:18.620 | [Break]
08:34:22.120 | [Break]
08:34:25.620 | [Break]
08:34:29.120 | [Break]
08:34:32.620 | [Break]
08:34:36.620 | [Break]
08:34:39.620 | [Break]
08:34:43.120 | [Break]
08:34:46.620 | [Break]
08:34:50.120 | [Break]
08:34:53.620 | [Break]
08:34:57.120 | [Break]
08:35:00.620 | [Break]
08:35:04.620 | [Break]
08:35:07.620 | [Break]
08:35:11.120 | [Break]
08:35:14.620 | [Break]
08:35:18.120 | [Break]
08:35:21.620 | [Break]
08:35:25.120 | [Break]
08:35:28.620 | [Break]
08:35:32.620 | [Break]
08:35:35.620 | [Break]
08:35:39.120 | [Break]
08:35:42.620 | [Break]
08:35:46.120 | [Break]
08:35:49.620 | [Break]
08:35:53.120 | [Break]
08:35:56.620 | [Break]
08:36:00.620 | [Break]
08:36:03.620 | [Break]
08:36:07.120 | [Break]
08:36:10.620 | [Break]
08:36:14.120 | [Break]
08:36:17.620 | [Break]
08:36:21.120 | [Break]
08:36:24.620 | [Break]
08:36:28.620 | [Break]
08:36:31.620 | [Break]
08:36:35.120 | [Break]
08:36:38.620 | [Break]
08:36:42.120 | [Break]
08:36:45.620 | [Break]
08:36:49.120 | [Break]
08:36:52.620 | [Break]
08:36:56.620 | [Break]
08:36:59.620 | [Break]
08:37:03.120 | [Break]
08:37:06.620 | [Break]
08:37:10.120 | [Break]
08:37:13.620 | [Break]
08:37:17.120 | [Break]
08:37:20.620 | [Break]
08:37:24.620 | [Break]
08:37:27.620 | [Break]
08:37:31.120 | [Break]
08:37:34.620 | [Break]
08:37:38.120 | [Break]
08:37:41.620 | [Break]
08:37:45.120 | [Break]
08:37:48.620 | [Break]
08:37:52.620 | [Break]
08:37:55.620 | [Break]
08:37:59.120 | [Break]
08:38:02.620 | [Break]
08:38:06.120 | [Break]
08:38:09.620 | [Break]
08:38:13.120 | [Break]
08:38:16.620 | [Break]
08:38:20.620 | [Break]
08:38:23.620 | [Break]
08:38:27.120 | [Break]
08:38:30.620 | [Break]
08:38:34.120 | [Break]
08:38:37.620 | [Break]
08:38:41.120 | [Break]
08:38:44.620 | [Break]
08:38:48.620 | [Break]
08:38:51.620 | [Break]
08:38:55.120 | [Break]
08:38:58.620 | [Break]
08:39:02.120 | [Break]
08:39:05.620 | [Break]
08:39:09.120 | [Break]
08:39:12.620 | [Break]
08:39:16.620 | [Break]
08:39:19.620 | [Break]
08:39:23.120 | [Break]
08:39:26.620 | [Break]
08:39:30.120 | [Break]
08:39:33.620 | [Break]
08:39:37.120 | [Break]
08:39:40.620 | [Break]
08:39:44.620 | [Break]
08:39:47.620 | [Break]
08:39:51.120 | [Break]
08:39:54.620 | [Break]
08:39:58.120 | [Break]
08:40:01.620 | [Break]
08:40:05.120 | [Break]
08:40:08.620 | [Break]
08:40:12.620 | [Break]
08:40:15.620 | [Break]
08:40:19.120 | [Break]
08:40:22.620 | [Break]
08:40:26.120 | [Break]
08:40:29.620 | [Break]
08:40:33.120 | [Break]
08:40:36.620 | [Break]
08:40:40.620 | [Break]
08:40:43.620 | [Break]
08:40:47.120 | [Break]
08:40:50.620 | [Break]
08:40:54.120 | [Break]
08:40:57.620 | [Break]
08:41:01.120 | [Break]
08:41:04.620 | [Break]
08:41:08.620 | [Break]
08:41:11.620 | [Break]
08:41:15.120 | [Break]
08:41:18.620 | [Break]
08:41:22.120 | [Break]
08:41:25.620 | [Break]
08:41:29.120 | [Break]
08:41:32.620 | [Break]
08:41:36.620 | [Break]
08:41:39.620 | [Break]
08:41:43.120 | [Break]
08:41:46.620 | [Break]
08:41:50.120 | [Break]
08:41:53.620 | [Break]
08:41:57.120 | [Break]
08:42:00.620 | [Break]
08:42:04.620 | [Break]
08:42:07.620 | [Break]
08:42:11.120 | [Break]
08:42:14.620 | [Break]
08:42:18.120 | [Break]
08:42:21.620 | [Break]
08:42:25.120 | [Break]
08:42:28.620 | [Break]
08:42:32.620 | [Break]
08:42:35.620 | [Break]
08:42:39.120 | [Break]
08:42:42.620 | [Break]
08:42:46.120 | [Break]
08:42:49.620 | [Break]
08:42:53.120 | [Break]
08:42:56.620 | [Break]
08:43:00.620 | [Break]
08:43:03.620 | [Break]
08:43:07.120 | [Break]
08:43:10.620 | [Break]
08:43:14.120 | [Break]
08:43:17.620 | [Break]
08:43:21.120 | [Break]
08:43:24.620 | [Break]
08:43:28.620 | [Break]
08:43:31.620 | [Break]
08:43:35.120 | [Break]
08:43:38.620 | [Break]
08:43:42.120 | [Break]
08:43:45.620 | [Break]
08:43:49.120 | [Break]
08:43:52.620 | [Break]
08:43:56.620 | [Break]
08:43:59.620 | [Break]
08:44:03.120 | [Break]
08:44:06.620 | [Break]
08:44:10.120 | [Break]
08:44:13.620 | [Break]
08:44:17.120 | [Break]
08:44:20.620 | [Break]
08:44:24.620 | [Break]
08:44:27.620 | [Break]
08:44:31.120 | [Break]
08:44:34.620 | [Break]
08:44:38.120 | [Break]
08:44:41.620 | [Break]
08:44:45.120 | [Break]
08:44:48.620 | [Break]
08:44:52.620 | [Break]
08:44:55.620 | [Break]
08:44:59.120 | [Break]
08:45:02.620 | [Break]
08:45:06.120 | [Break]
08:45:09.620 | [Break]
08:45:13.120 | [Break]
08:45:16.620 | [Break]
08:45:20.620 | [Break]
08:45:23.620 | [Break]
08:45:27.120 | [Break]
08:45:30.620 | [Break]
08:45:34.120 | [Break]
08:45:37.620 | [Break]
08:45:41.120 | [Break]
08:45:44.620 | [Break]
08:45:48.620 | [Break]
08:45:51.620 | [Break]
08:45:55.120 | [Break]
08:45:58.620 | [Break]
08:46:02.120 | [Break]
08:46:05.620 | [Break]
08:46:09.120 | [Break]
08:46:12.620 | [Break]
08:46:16.620 | [Break]
08:46:19.620 | [Break]
08:46:23.120 | [Break]
08:46:26.620 | [Break]
08:46:30.120 | [Break]
08:46:33.620 | [Break]
08:46:37.120 | [Break]
08:46:40.620 | [Break]
08:46:44.620 | [Break]
08:46:47.620 | [Break]
08:46:51.120 | [Break]
08:46:54.620 | [Break]
08:46:58.120 | [Break]
08:47:01.620 | [Break]
08:47:05.120 | [Break]
08:47:08.620 | [Break]
08:47:12.620 | [Break]
08:47:15.620 | [Break]
08:47:19.120 | [Break]
08:47:22.620 | [Break]
08:47:26.120 | [Break]
08:47:29.620 | [Break]
08:47:33.120 | [Break]
08:47:36.620 | [Break]
08:47:40.620 | [Break]
08:47:43.620 | [Break]
08:47:47.120 | [Break]
08:47:50.620 | [Break]
08:47:54.120 | [Break]
08:47:57.620 | [Break]
08:48:01.120 | [Break]
08:48:04.620 | [Break]
08:48:08.620 | [Break]
08:48:11.620 | [Break]
08:48:15.120 | [Break]
08:48:18.620 | [Break]
08:48:22.120 | [Break]
08:48:25.620 | [Break]
08:48:29.120 | [Break]
08:48:32.620 | [Break]
08:48:36.620 | [Break]
08:48:39.620 | [Break]
08:48:43.120 | [Break]
08:48:46.620 | [Break]
08:48:50.120 | [Break]
08:48:53.620 | [Break]
08:48:57.120 | [Break]
08:49:00.620 | [Break]
08:49:04.620 | [Break]
08:49:07.620 | [Break]
08:49:11.120 | [Break]
08:49:14.620 | [Break]
08:49:18.120 | [Break]
08:49:21.620 | [Break]
08:49:25.120 | [Break]
08:49:28.620 | [Break]
08:49:32.620 | [Break]
08:49:35.620 | [Break]
08:49:39.120 | [Break]
08:49:42.620 | [Break]
08:49:46.120 | [Break]
08:49:49.620 | [Break]
08:49:53.120 | [Break]
08:49:56.620 | [Break]
08:50:00.620 | [Break]
08:50:03.620 | [Break]
08:50:07.120 | [Break]
08:50:10.620 | [Break]
08:50:14.120 | [Break]
08:50:17.620 | [Break]
08:50:21.120 | [Break]
08:50:24.620 | [Break]
08:50:28.620 | [Break]
08:50:31.620 | [Break]
08:50:35.120 | [Break]
08:50:38.620 | [Break]
08:50:42.120 | [Break]
08:50:45.620 | [Break]
08:50:49.120 | [Break]
08:50:52.620 | [Break]
08:50:56.620 | [Break]
08:50:59.620 | [Break]
08:51:03.120 | [Break]
08:51:06.620 | [Break]
08:51:10.120 | [Break]
08:51:13.620 | [Break]
08:51:17.120 | [Break]
08:51:20.620 | [Break]