Small AI Teams with Huge Impact — Vik Paruchuri, Datalab

00:00:00.000 | Okay, my name is Vikas. I'm the CEO of Datalab, and today I'm going to talk about how we got to

00:00:19.460 | 40k GitHub stars, seven-figure ARR, and trained state-of-the-art models with a team of three.

00:00:25.660 | So I spent the last year training these models, like Brittany mentioned, Marker, and Surya.

00:00:30.960 | I also built repositories around them. I left my AI research job, and I started a company and raised a seed round.

00:00:37.180 | I did not get enough sleep. It's important. And this is Datalab. So we made our first hire in January.

00:00:44.620 | We're now a team of four. Faraz is new enough that he's not pictured. We've grown revenue 5x since January.

00:00:50.260 | We're at seven-figure ARR, and our customers include Tier 1 AI labs, universities, Fortune 500, and AI startups,

00:00:57.320 | including Gamma, who I used to make this presentation.

00:01:01.100 | So today's focus, I'm going to talk about how we've grown with a small team.

00:01:05.280 | I'm going to talk about my philosophy on building teams and why I think we're at kind of an inflection point

00:01:11.260 | in how we think about building teams. And I'm really going to talk about this idea that headcount does not equal productivity.

00:01:17.180 | There's like this really persistent notion in Silicon Valley that you raise money, you hire a bunch of people,

00:01:21.520 | and you build more. But it almost never, in my opinion, works out perfectly that way.

00:01:26.240 | All right. So my last company was called Dataquest. I'm very fond of the data prefix, apparently.

00:01:32.040 | And we scaled to 30 people and 4 million ARR bootstrapped during COVID. It was an online education startup.

00:01:37.860 | And then, unfortunately, we had to do two rounds of layoffs post-COVID when online education kind of tanked.

00:01:44.060 | We went from 30 to 15, and then again from 15 to 7. And it was obviously awful for the people we had to lay off.

00:01:50.100 | But I noticed something really interesting. Productivity and happiness increased a couple of months after both layoffs,

00:01:56.160 | to the point where we were actually much more productive after both cycles than we were at the beginning.

00:02:00.920 | And I started to wonder why that was. Like, how could reducing the team so much actually improve productivity?

00:02:07.540 | And I came up with these four hypotheses. One, we'd hired a lot of specialists.

00:02:12.420 | So as you scale, like Grant mentioned in the earlier talk, you end up building these very specialized functions and teams.

00:02:18.220 | And those specialists often can't flex across the company to solve the key issues of the company.

00:02:23.340 | Two, we were a remote team, which required a lot of intentional process and heavy syncing,

00:02:29.180 | which just eats into your time and just makes it really hard to get on the same page.

00:02:33.940 | Because of that, we had a lot of meeting overload. And especially once we got middle management in place,

00:02:38.640 | people whose job is kind of professionally to manage, we ended up with just a lot of meetings on people's calendars

00:02:44.260 | and not enough time to actually work.

00:02:45.700 | And then senior people, we hired kind of a mix of experience, like most companies do.

00:02:50.160 | We hired junior, mid-level, senior.

00:02:52.320 | And then senior people ended up getting kind of tied down and doing a lot of work to manage the more junior people.

00:02:58.780 | We actually had a case where we had a three-person team, and we cut it down to one.

00:03:02.620 | And the team actually got much more productive because it freed up the senior person's time.

00:03:08.280 | And kind of every company, I feel like, goes through this journey.

00:03:11.500 | There's this initial golden period when everyone is aligned, you're on the same page,

00:03:15.640 | you're building this amazing stuff, and that's really when you build the core thing of your company.

00:03:20.460 | Like Google with Search or Microsoft with Windows.

00:03:24.380 | It's kind of when you figure out your business model.

00:03:26.460 | And then you hire a bunch to fill out the edges around it.

00:03:29.020 | Like you hire a bunch of enterprise sales, you hire a bunch of marketing, you hire a bunch of engineers

00:03:33.740 | who are kind of in very small boxes to build very small features.

00:03:36.680 | I had a friend at Amazon who worked there for two years and built a shopping cart button.

00:03:41.060 | It's fine, right?

00:03:42.860 | But, I mean, at that scale of org, that's kind of the tiny box you get fit in.

And you end up with a lot of bureaucracy, a lot of syncs, a lot of unclear priorities.

00:03:50.300 | And this pattern is unfortunately very common.

00:03:53.360 | But I started to think, what if that golden period just lasted forever?

00:03:56.860 | Why do you actually need to end it?

And as I started working with Jeremy Howard at Answer.AI,

00:04:03.680 | I got to understand his philosophy for building a company a little bit better.

00:04:07.360 | And his idea is basically hire less than 15 generalists.

00:04:11.520 | So people who can really do everything across the stack

00:04:14.400 | and really understand all aspects of the company,

00:04:16.740 | fill in the edges with AI and internal tooling.

So Jeremy's invested a lot recently in FastHTML and things like MonsterUI

00:04:24.200 | because he sees them as kind of building block libraries

00:04:26.680 | to really build out the other tools that the company's working on.

00:04:29.380 | And then use simple, boring tech, right?

00:04:31.640 | Like, you don't need to get too fancy.

00:04:33.060 | You don't need a Kubernetes cluster when you're a three-person company.

00:04:37.820 | But this unfortunately requires kind of a high cultural bar for folks.

00:04:42.240 | You need people who really want to and can understand everything you're doing.

00:04:47.040 | So you need engineers who talk to customers.

00:04:49.000 | You need go-to-market people who actually build.

00:04:51.840 | And that's not necessarily easy to find.

00:04:54.460 | You need high trust.

00:04:56.020 | So basically, you need people who are in it because they're building something together

00:05:00.580 | and not in it for other reasons like politics or personal advancement, etc.

00:05:05.100 | And everyone needs to really care about the customers and focus on them.

00:05:08.520 | I think these are the prerequisites for this kind of team working,

00:05:12.180 | this less-than-15-person team of generalists.

00:05:14.700 | I'll give you a quick example.

00:05:17.000 | So we recently trained a model, Surya OCR3.

00:05:20.220 | We recently shipped it but have not announced it yet.

So it's 500 million parameters, it supports 90 languages,

and it gets 99% accuracy on our challenging internal benchmarks that include math.

00:05:29.920 | And it also does some features that no other model does,

00:05:33.420 | like character-level bounding boxes.

00:05:34.820 | It uses PDF text as grounding at a line level.

00:05:37.560 | So it was a very challenging model to train.

00:05:40.080 | And in order to do it, Tharun, who's a research engineer at Datalab,

00:05:43.700 | and I had to handle the entire process from end to end.

00:05:46.740 | So that included talking to customers, figuring out what they wanted.

00:05:49.900 | It included reading a bunch of papers

00:05:52.060 | and figuring out the right architecture, prototyping,

00:05:54.780 | doing the model training itself,

00:05:56.360 | which you always hope is 90% architecture,

00:05:58.740 | but it's always 90% data cleaning.

00:06:00.440 | So building a data pipeline library, building out the data sets.

00:06:03.980 | Then we had to write the inference code.

00:06:05.680 | So we had to connect it to our repos,

00:06:07.980 | get the inference written for all our customers,

00:06:09.700 | and then integrate it into our products.

00:06:11.540 | So this is a scope that in a big company,

00:06:13.640 | you'd probably have four, ten,

00:06:15.740 | you'd have a lot of teams doing this.

00:06:17.240 | And every time you hand off between teams in a traditional company,

00:06:20.760 | you lose context, right?

00:06:22.280 | The people who talk to the customers

00:06:23.800 | lossily communicate it to the people who build,

00:06:26.900 | who lossily communicate it to the people who train the model.

It just becomes very inefficient.

00:06:31.940 | You end up eating a lot of time in just syncing context.

00:06:34.640 | It never gets fully synced.

00:06:36.200 | You're not able to build a great end-to-end experience as a result.

00:06:39.120 | And you have very slow feedback loops, right?

00:06:40.980 | Like you talk to a customer today,

00:06:42.280 | and it might impact your model training in months.

00:06:45.260 | Whereas if you have generalists who can work across the stack,

00:06:48.040 | you get seamless context, right?

00:06:49.720 | You never need to share context and do inefficient syncing.

00:06:52.180 | You get a really tight integration between all aspects of the company

00:06:55.540 | and very, very fast feedback cycles.

00:06:57.500 | And the reason we were able to do this is we used AI

00:07:00.440 | to take kind of the easy, low-leverage pieces of this,

00:07:03.660 | like building a data pipeline library

00:07:05.700 | or helping us really figure out how to integrate it into the API,

00:07:09.200 | whereas we did the higher-level work in each of these silos.

00:07:11.640 | So if you get one thing from this talk, this is the thing.

00:07:16.840 | More people does not equal more productivity.

00:07:18.720 | All right.

00:07:20.980 | And how do you make this work?

00:07:22.560 | Like how do you operationalize this?

00:07:24.000 | So the first thing you have to do is hire senior generalists.

00:07:27.080 | And senior to me does not mean years of experience.

00:07:29.320 | It really means maturity.

00:07:30.640 | You need people who can look at a problem and say,

00:07:32.900 | I'm going to figure out how to solve this.

00:07:34.360 | I'm going to do what it takes.

00:07:35.840 | And I really care enough to iterate with the customer to solve it.

00:07:38.640 | You need to avoid over-complication, right?

00:07:41.360 | Like I'm an engineer.

00:07:42.440 | A lot of us are engineers.

00:07:43.760 | We love over-complicating things.

00:07:45.420 | Like, hey, let me deploy this Kubernetes cluster

00:07:47.560 | and multi-stage pipeline to solve like a data extraction problem.

00:07:50.700 | But in reality, you need people who can go back

00:07:53.240 | and like kind of set aside the fixation on shiny tech

00:07:56.700 | and just do the simplest possible thing,

00:07:58.200 | which usually is I'm just going to write a shell script

00:08:00.240 | to run this on one machine.

00:08:01.340 | There's that famous like Hadoop versus shell script blog post

00:08:04.280 | from a few years ago

where you could replace a whole Hadoop cluster

00:08:07.520 | with just like a 64-core machine.

00:08:09.440 | You need people who appreciate that ethos.
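As a minimal sketch of that "simplest possible thing" idea (not the code from the blog post, and not Datalab's pipeline), the snippet below aggregates a file on one machine by streaming through it once. The tab-separated `key<TAB>value` format and the names `parse_records` and `total_by_key` are assumptions made up for illustration.

```python
from collections import Counter
from typing import Iterable, Iterator, Tuple


def parse_records(lines: Iterable[str]) -> Iterator[Tuple[str, int]]:
    """Yield (key, value) pairs from 'key<TAB>value' lines, skipping malformed ones."""
    for line in lines:
        parts = line.rstrip("\n").split("\t")
        if len(parts) != 2:
            continue  # wrong number of fields
        key, raw_value = parts
        try:
            yield key, int(raw_value)
        except ValueError:
            continue  # value is not an integer


def total_by_key(lines: Iterable[str]) -> Counter:
    """Stream through the input once, keeping only the running totals in memory."""
    totals: Counter = Counter()
    for key, value in parse_records(lines):
        totals[key] += value
    return totals
```

Because an open file is itself an iterable of lines, something like `with open("records.tsv") as f: totals = total_by_key(f)` would process a file far larger than memory without any cluster or job scheduler (the file name here is hypothetical).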

00:08:12.140 | And you need to work in person, I personally think.

00:08:14.660 | Remote is great for a lot of reasons,

00:08:17.000 | but it's not great for a small team that needs to move fast

00:08:20.060 | because you need to set up a lot of process.

00:08:22.140 | And process, to me, is kind of the death

00:08:24.660 | of this really fast collaboration and tight feedback loop.

00:08:27.440 | And then how do you do it architecturally?

00:08:30.320 | So I alluded to this a little bit,

00:08:33.060 | but you have to reuse components aggressively.

00:08:35.380 | So we reuse a lot of components

00:08:37.220 | between our on-prem and our API deployments.

00:08:39.260 | We keep our technology super simple.

00:08:41.060 | Like we don't use React.

00:08:42.260 | We don't use any fancy front-end frameworks.

00:08:43.920 | It's all server-rendered HTML

with, like, light HTMX and Alpine.

00:08:47.520 | And then super clean modular code

00:08:49.720 | that AI can really add to very well.

Like we re-architected our Marker repo

00:08:54.020 | to be extremely modular

00:08:55.720 | and easy to work with and well-documented.

00:08:57.900 | And that makes it much easier

00:08:59.520 | to use AI to actually add to it.

00:09:02.640 | All right, so basically, keep everything simple.

00:09:04.880 | Code is clean, readable, maintainable.

00:09:06.760 | Architecture, as few moving pieces as possible.

00:09:09.800 | Minimize your surface area.

00:09:11.720 | And then process.

00:09:12.600 | Minimize bureaucracy, high trust, continuous discussions.

00:09:16.160 | If you feel like someone's going to need

00:09:18.960 | a lot of management, like don't hire them.

00:09:20.580 | Like you need people who can move fast

00:09:22.760 | without being managed.

00:09:25.820 | All right, and then how do you fill in

00:09:27.480 | the edges with models?

00:09:28.360 | So a challenge we're going to face as we scale

00:09:31.240 | is this idea that we're a document processing

00:09:33.480 | document intelligence company.

00:09:34.640 | And every customer has a slightly different way

00:09:37.060 | that they want to parse their docs.

00:09:38.140 | And if you go back kind of to the last generation

of OCR companies, the way they solved this

was they hired a bunch of forward-deployed engineers.

They sat at a client site and just kind of iterated

00:09:47.700 | with them until it was good enough.

00:09:48.980 | But in the future, you can really train a model

00:09:51.700 | to handle this complexity, right?

00:09:53.100 | Like we can train a model to essentially loop over

00:09:55.280 | customer outputs until it gets to the right state.

00:09:58.000 | So you can kind of replace that entire

00:09:59.880 | forward-deployed engineering side of the org.
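The looping idea can be sketched roughly as a check-and-revise cycle. This is only an illustration under stated assumptions: `refine_until_valid`, `find_problems`, and `revise` are hypothetical names, and the toy checker and reviser below stand in for whatever trained model and customer-specific checks a real system would use.

```python
from typing import Callable, List, Tuple


def refine_until_valid(
    initial_output: str,
    find_problems: Callable[[str], List[str]],
    revise: Callable[[str, List[str]], str],
    max_rounds: int = 5,
) -> Tuple[str, bool]:
    """Revise an output until the checker reports no problems or rounds run out.

    Returns the last output and whether it passed the check.
    """
    output = initial_output
    for _ in range(max_rounds):
        problems = find_problems(output)
        if not problems:
            return output, True
        output = revise(output, problems)
    # Check the result of the final revision before giving up.
    return output, not find_problems(output)


# Toy stand-ins (hypothetical): the "customer requirement" is that dates use dashes.
def find_problems(output: str) -> List[str]:
    return ["dates must use '-' instead of '/'"] if "/" in output else []


def revise(output: str, problems: List[str]) -> str:
    # A real system would call a model here; this fixes one separator per round.
    return output.replace("/", "-", 1)
```

Calling `refine_until_valid("2024/01/05", find_problems, revise)` takes two revision rounds before the checker passes. The `max_rounds` cap matters in practice, since a loop like this needs a point at which it stops and hands the case to a person.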

00:10:02.280 | And then when does this model fail?

00:10:05.060 | Like we're early, right?

00:10:06.620 | I don't know exactly when this model falls apart.

00:10:08.880 | But Gamma, as we just saw, is a great example

00:10:12.080 | of a small team with very, very meaningful growth in ARR.

00:10:16.020 | I think the key is being able to say no, right?

00:10:18.340 | A lot of these edges are choices, right?

00:10:20.240 | You can choose to go hire a bunch of forward-deployed engineers

00:10:22.740 | and put them at your client sites,

00:10:23.920 | or you can choose to solve it a different way.

00:10:25.840 | And maybe that different way is slightly less efficient

00:10:27.980 | in terms of revenue, but it might be more efficient

00:10:30.780 | in terms of your long-term company trajectory and health.

00:10:33.220 | So it's really unknown if this will work forever,

00:10:36.760 | but in my opinion, it's your choice, right?

00:10:39.060 | Like you can choose to make this model work,

or you can choose to do the less efficient

"let's scale to hundreds of people" model.

00:10:47.020 | All right, so LLMs are surprisingly bad

00:10:50.200 | at generating Venn diagrams,

00:10:51.360 | so that explains why this slide is not so well done.

00:10:55.340 | But basically, we have three core roles,

00:10:57.420 | and the responsibilities overlap a lot.

00:10:59.700 | So everybody talks to customers,

00:11:01.520 | everybody builds product in some way,

00:11:03.820 | and research engineer and full-stack engineer

00:11:06.320 | overlap quite a bit.

00:11:07.340 | And then go-to-market is really like

00:11:09.500 | your traditional kind of sales, marketing,

00:11:11.180 | support functions all collapsed

00:11:13.300 | into kind of like a more generalist role.

00:11:14.980 | And really, like, I feel like politics

00:11:19.720 | are the death of small teams, right?

00:11:21.320 | Like we want people who only care about the work,

00:11:23.340 | the people around them, and customers, right?

00:11:25.580 | Like minimal ego.

00:11:26.900 | You need some ego to kind of advance your own ideas,

00:11:29.900 | but not so much that you're willing to fight for them

00:11:32.140 | at the detriment of kind of the health of the company.

00:11:34.320 | We pay top-of-market salary, right?

00:11:36.920 | Like it's always weird to me

00:11:38.020 | that startups pay 150 or 200K

00:11:40.160 | when they've raised 20 million, right?

00:11:42.520 | Like you should be able to hire fewer people

00:11:45.300 | with higher salaries and get more done.

00:11:47.620 | At least that's what I've seen.

00:11:49.280 | Meaningful work.

00:11:50.460 | So big challenges in scope, right?

00:11:52.080 | Like if you come in, you get to work across the stack,

00:11:54.140 | you get to ship things end-to-end.

00:11:55.940 | And that's very exciting for some people.

00:11:57.880 | It's not exciting to other people,

00:11:59.580 | and they kind of self-select.

00:12:00.720 | And then you really need a good way

00:12:03.320 | to screen for low ego and GSD, right?

00:12:05.840 | Like you need people who will ship,

00:12:07.100 | not talk about shipping.

00:12:08.320 | And that's another downside of remote culture,

00:12:11.260 | in my opinion, it gets very hard to tell the two apart.

00:12:13.640 | And then patience, right?

00:12:15.520 | Like the worst hires I've personally made

00:12:17.840 | have all been when I thought

00:12:18.980 | I had to fill a role very quickly.

00:12:20.300 | All of my best hires have been when I said,

00:12:22.620 | okay, let me find the best person and hire them.

00:12:25.940 | Even though I may not necessarily have a role today,

00:12:28.320 | they're a great generalist.

00:12:29.500 | This is actually a big debate in NBA and NFL drafting too,

00:12:33.720 | like best player available versus drafting for fit.

00:12:37.720 | All right, so really,

00:12:38.840 | I think the thing to think about as you scale

00:12:40.560 | is like, how do we scale productivity, not headcount?

00:12:43.140 | And you can do that in a few ways, right?

00:12:45.040 | Like you can raise salary bands as the company grows.

00:12:47.280 | So you hire more and more experienced people

00:12:49.660 | into the same role.

00:12:50.600 | You can invest more in compute, right?

Like one researcher with access to eight GPUs

00:12:54.920 | is less productive than one with access to 64 GPUs.

00:12:57.980 | You can invest in AI tools that multiply productivity, right?

00:13:00.900 | There's so many tools out there now

00:13:02.320 | that are worth paying for

00:13:03.960 | that can abstract away a lot of these edges for you.

00:13:06.740 | And finally, I'd be remiss if I didn't say,

00:13:10.880 | if this culture sounds interesting to you,

00:13:12.680 | drop me a line.

00:13:13.600 | Those are all my socials.

00:13:15.060 | We'd love to chat.

00:13:16.080 | All right.

00:13:17.680 | Yes, I think we do the microphone for questions, right?

00:13:22.580 | Okay.

00:13:26.160 | So when you went from 30 to 15 and then to seven,

00:13:30.320 | I mean, my takeaway from this whole talk

00:13:31.840 | is like the human touch points

00:13:33.640 | are really what slow things down, right?

00:13:35.020 | Was there any additional focus

00:13:40.300 | on reducing the domains that you were focusing on

00:13:43.140 | or like your capability sets

00:13:44.600 | or it was like basically your same product offering

00:13:47.900 | just with less folks focused on it?

00:13:49.560 | Yeah, that's a really good question.

00:13:50.980 | So at a very high level, we offered the same product,

00:13:53.660 | but we cut some features that were less relevant.

00:13:55.980 | Like we'd built up a lot of those edges

00:13:58.240 | that you kind of like end up building over the years.

00:14:01.340 | And we ended up slicing a lot of those edges.

00:14:03.480 | So I think what happens when you hire a lot of people

00:14:05.940 | is you don't have enough work

00:14:07.280 | and you start making work for people, right?

00:14:09.020 | And they end up building all of these edges

00:14:10.860 | that actually aren't that useful to the customer.

00:14:12.520 | But when you have a tiny team,

00:14:13.820 | there's so much work

00:14:14.820 | that you actually have to ruthlessly prioritize.

00:14:16.700 | And I think you always want to be in that zone.

And that's kind of where we ended back up.

00:14:25.800 | Oh, sorry.

00:14:27.900 | No worries.

00:14:28.580 | So it's a hypothetical question for you.

00:14:32.120 | So we take you and drop you in the middle

00:14:36.040 | of a giant company that's been around for 100 years,

00:14:38.980 | hundreds of thousands of employees,

00:14:40.880 | lots of bureaucracy, lots of ego.

It's gotten super comfortable with its revenue stream.

00:14:48.380 | And they're clearly folding over on themselves

00:14:51.220 | with too many people.

00:14:52.400 | How do you change that culture?

00:14:53.960 | Yeah, I'm not the right person for that.

00:14:56.520 | I've never done that before.

00:14:57.560 | I would say the people who want to change the culture,

00:15:02.080 | go start a small company and build the same thing,

00:15:04.140 | just build it better.

00:15:05.320 | That's a common pattern, right?

00:15:06.620 | That's a common disruption and growth cycle.

00:15:08.940 | I think that's the best way to do it.

00:15:10.800 | Once a culture gets ossified,

00:15:12.920 | I've worked at the State Department, Pepsi, UPS.

00:15:15.620 | Once a culture gets ossified enough,

00:15:17.480 | you're not going to change it.

00:15:18.600 | It just is what it is.

00:15:20.080 | Generally, with that pattern,

00:15:21.140 | what happens is these companies recognize

00:15:23.060 | that they're a target,

00:15:24.340 | and they start to buy up those small startups

00:15:26.300 | and crush them.

00:15:26.860 | Yeah, sometimes that happens.

00:15:28.640 | But Google is a great example

00:15:30.420 | of where that didn't happen, right?

00:15:31.760 | So you haven't talked about

00:15:35.780 | how you source these really good generalists.

00:15:39.320 | Yeah, that's a great question.

00:15:41.200 | Well, one way is this.

00:15:42.640 | Another way is just open source

00:15:47.340 | and Twitter are great ways to hire.

A lot of our best candidates

00:15:51.620 | have actually come from Twitter,

00:15:52.500 | which is weird.

00:15:53.040 | I refuse to call it X.

00:15:54.640 | It's still Twitter.

00:15:55.200 | But yeah, I don't have a great answer to that,

00:15:58.860 | but I think if you do good work

00:16:00.140 | and you put it out in public

00:16:01.460 | and you talk about how you're building,

00:16:02.880 | that seems to attract people

00:16:04.740 | who really care about this mission

00:16:05.980 | and want to build in the same way.

00:16:07.280 | At least that's been my experience.

00:16:08.800 | Thank you.

00:16:09.880 | Well, actually, it's related.

00:16:15.120 | So how do you structure

00:16:17.020 | your interview process and recruitment?

What does it look like?

Do you maybe do a trial period?

00:16:22.860 | Yeah, that's a great question.

00:16:24.880 | So three steps.

00:16:26.460 | So step one is people come in,

00:16:28.700 | we do a short chat.

00:16:30.080 | It's really like talking to a peer.

00:16:31.540 | Like, here's a challenge I'm having.

00:16:33.920 | Let me talk it through with you

00:16:35.340 | and see if we can solve it together.

00:16:36.600 | If that goes well,

00:16:37.840 | step two is let's think of a project

00:16:39.940 | we can build together.

00:16:40.760 | So we do a paid project.

00:16:41.840 | It's usually around 10 hours.

00:16:43.020 | We pay $1,000.

00:16:43.980 | It sounds like a lot,

00:16:45.700 | but it's actually a tiny amount of money

00:16:46.980 | to figure out if someone's a fit or not.

00:16:49.320 | And then we review the project

00:16:50.640 | and if it's good,

00:16:51.340 | we come in and just do a culture fit.

00:16:52.680 | Like, how does it feel

00:16:53.740 | if we're all just interacting

00:16:54.760 | as humans and people

00:16:55.920 | and if it feels like a good fit,

00:16:57.900 | like, it's a hire.

00:16:58.460 | Yeah, and what is your success rate?

Is it maybe 10% of the people

that go through that process?

00:17:06.960 | Oh, that's an interesting question.

00:17:09.220 | Like, usually we don't,

00:17:10.720 | once we kind of get someone

00:17:12.720 | to the beginning of the process,

00:17:13.800 | we have high confidence they'll be good.

00:17:15.400 | Like, we don't want to waste anyone's time.

00:17:17.040 | But we probably,

00:17:18.760 | of the people we've interviewed,

I think 40% we've ended up hiring.

00:17:22.420 | Nice.

00:17:22.820 | Thank you.

00:17:23.660 | All right.

00:17:27.240 | I'm out of time.

00:17:28.060 | Thank you, folks.

00:17:28.660 | This was great.

00:17:29.340 | Thank you.

00:17:29.640 | Thank you.

00:17:30.540 | We'll see you next time.