
Rajat Monga: TensorFlow | Lex Fridman Podcast #22


Chapters

0:00 Introduction
1:17 Early days of Google Brain
1:47 Early wins
3:36 Scaling
8:24 Google TensorFlow
12:11 Graph
13:19 Early days
14:46 Documentation
18:27 Hobbyist perspective
19:30 TensorFlow Extended
20:38 Deep learning
21:32 Datasets
24:38 Keras API
26:02 Python and TensorFlow
27:6 Ecosystem
29:2 Enable Machine Learning
33:05 Challenges
35:29 Starting over
37:52 Competition
44:15 How does TensorFlow grow
47:49 Predicting the future
49:47 TensorFlow for beginners
51:50 Team cohesion
54:38 Hiring process
57:22 Culture fit

Whisper Transcript

00:00:00.000 | The following is a conversation with Rajat Monga.
00:00:03.080 | He's an engineering director at Google,
00:00:04.920 | leading the TensorFlow team.
00:00:06.960 | TensorFlow is an open source library
00:00:09.160 | at the center of much of the work going on
00:00:11.000 | in the world in deep learning, both the cutting edge
00:00:13.480 | research and the large scale application
00:00:15.680 | of learning based approaches.
00:00:17.720 | But it's quickly becoming much more than a software library.
00:00:20.920 | It's now an ecosystem of tools for the deployment
00:00:23.760 | of machine learning in the cloud, on the phone,
00:00:25.720 | in the browser, on both generic and specialized hardware.
00:00:29.840 | TPU, GPU, and so on.
00:00:31.920 | Plus, there's a big emphasis on growing a passionate community
00:00:35.200 | of developers.
00:00:36.600 | Rajat, Jeff Dean, and a large team
00:00:38.880 | of engineers at Google Brain are working
00:00:40.920 | to define the future of machine learning with TensorFlow 2.0,
00:00:44.640 | which is now in alpha.
00:00:46.200 | I think the decision to open source TensorFlow
00:00:49.160 | is a definitive moment in the tech industry.
00:00:51.720 | It showed that open innovation can be successful
00:00:54.360 | and inspire many companies to open source their code
00:00:56.880 | to publish and, in general, engage
00:00:58.840 | in the open exchange of ideas.
00:01:01.200 | This conversation is part of the Artificial Intelligence
00:01:03.920 | podcast.
00:01:05.040 | If you enjoy it, subscribe on YouTube, iTunes,
00:01:07.840 | or simply connect with me on Twitter at Lex Fridman,
00:01:10.840 | spelled F-R-I-D. And now, here's my conversation
00:01:14.960 | with Rajat Monga.
00:01:17.920 | You were involved with Google Brain
00:01:19.720 | since its start in 2011 with Jeff Dean.
00:01:24.840 | It started with DistBelief, the proprietary machine learning
00:01:29.200 | library, and turned to TensorFlow in 2014,
00:01:32.800 | the open source library.
00:01:35.680 | So what were the early days of Google Brain like?
00:01:39.080 | What were the goals, the missions?
00:01:41.800 | How do you even proceed forward when
00:01:44.880 | there's so much possibilities before you?
00:01:47.720 | It was interesting back then when I started,
00:01:50.520 | or when we were even just talking about it.
00:01:55.320 | The idea of deep learning was interesting and intriguing
00:01:59.480 | in some ways.
00:02:00.400 | It hadn't yet taken off, but it held some promise.
00:02:04.840 | It had shown some very promising and early results.
00:02:08.640 | I think the idea where Andrew and Jeff had started
00:02:11.360 | was, what if we can take this, what people are doing
00:02:15.400 | in research, and scale it to what Google has in terms
00:02:19.160 | of the compute power, and also put that kind of data
00:02:23.960 | together, what does it mean?
00:02:25.280 | And so far, the results have been, if you scale the compute,
00:02:28.280 | scale the data, it does better.
00:02:30.160 | And would that work?
00:02:31.480 | And so that was the first year or two,
00:02:33.400 | can we prove that out, right?
00:02:35.080 | And with DistBelief, when we started the first year,
00:02:37.480 | we got some early wins, which is always great.
00:02:40.760 | - What were the wins like?
00:02:41.920 | What was the wins where you were,
00:02:44.120 | there's some promise to this, this is gonna be good?
00:02:46.600 | - I think there are two early wins where one was speech,
00:02:49.680 | that we collaborated very closely
00:02:51.400 | with the speech research team,
00:02:52.480 | who was also getting interested in this.
00:02:54.840 | And the other one was on images where we,
00:02:57.880 | the cat paper, as we call it,
00:02:59.480 | that was covered by a lot of folks.
00:03:03.160 | - And the birth of Google Brain was around neural networks.
00:03:07.040 | That was, so it was deep learning from the very beginning.
00:03:09.320 | That was the whole mission.
00:03:10.840 | So what would, in terms of scale,
00:03:15.040 | what was the sort of dream of what this could become?
00:03:20.040 | Like, were there echoes of this open source
00:03:23.120 | TensorFlow community that might be brought in?
00:03:26.260 | Was there a sense of TPUs?
00:03:28.640 | Was there a sense of like, machine learning
00:03:31.160 | is now gonna be at the core of the entire company?
00:03:33.720 | Is it going to grow into that direction?
00:03:36.040 | - Yeah, I think, so that was interesting.
00:03:38.320 | And like, if I think back to 2012 or 2011,
00:03:41.400 | and first was, can we scale it in the year
00:03:44.800 | or so we had started scaling it
00:03:46.440 | to hundreds and thousands of machines.
00:03:48.160 | In fact, we had some runs even going to 10,000 machines,
00:03:51.080 | and all of those showed great promise.
00:03:52.920 | In terms of machine learning at Google,
00:03:56.800 | the good thing was Google's been doing machine learning
00:03:58.780 | for a long time.
00:04:00.240 | Deep learning was new, but as we scaled this up,
00:04:03.760 | we showed that, yes, that was possible,
00:04:05.600 | and it was gonna impact lots of things.
00:04:07.840 | Like, we started seeing real products wanting to use this.
00:04:11.200 | Again, speech was the first.
00:04:12.760 | There were image things that photos came out of,
00:04:15.240 | and then many other products as well.
00:04:17.440 | So that was exciting.
00:04:18.940 | As we went into that a couple of years,
00:04:23.200 | externally also academia started to,
00:04:25.840 | there was lots of push on, okay, deep learning's interesting,
00:04:28.360 | we should be doing more, and so on.
00:04:30.640 | And so by 2014, we were looking at,
00:04:34.400 | okay, this is a big thing, it's gonna grow.
00:04:36.820 | And not just internally, externally as well.
00:04:39.480 | Yes, maybe Google's ahead of where everybody is,
00:04:42.320 | but there's a lot to do.
00:04:43.640 | So a lot of this started to make sense and come together.
00:04:46.720 | - So the decision to open source,
00:04:49.560 | I was just chatting with Chris Lattner about this,
00:04:52.240 | the decision to go open source with TensorFlow,
00:04:54.640 | I would say, for me personally,
00:04:57.080 | seems to be one of the big seminal moments
00:04:59.640 | in all of software engineering ever.
00:05:01.720 | I think that's when a large company like Google
00:05:04.640 | decides to take a large project
00:05:06.480 | that many lawyers might argue has a lot of IP,
00:05:10.840 | just decide to go open source with it,
00:05:12.940 | and in so doing, lead the entire world in saying,
00:05:15.280 | you know what, open innovation is a pretty powerful thing,
00:05:19.400 | and it's okay to do.
00:05:20.820 | That was, I mean, that's an incredible moment in time.
00:05:26.340 | So do you remember those discussions happening?
00:05:29.320 | Whether open source should be happening?
00:05:31.440 | What was that like?
00:05:32.720 | - I would say, I think, so the initial idea came from Jeff,
00:05:36.880 | who was a big proponent of this.
00:05:39.440 | I think it came off of two big things.
00:05:42.480 | One was research-wise, we were a research group.
00:05:46.320 | We were putting all our research out there,
00:05:49.640 | we were building on others' research,
00:05:51.720 | and we wanted to push the state of the art forward,
00:05:55.000 | and part of that was to share the research.
00:05:56.840 | That's how I think deep learning and machine learning
00:05:58.960 | has really grown so fast.
00:06:00.440 | So the next step was, okay, now,
00:06:03.360 | would software help with that?
00:06:05.360 | And it seemed like there were a few
00:06:08.460 | existing libraries out there,
00:06:10.360 | Theano being one, Torch being another, and a few others,
00:06:13.200 | but they were all done by academia,
00:06:15.520 | and so the level was significantly different.
00:06:18.080 | The other one was, from a software perspective,
00:06:22.040 | Google had done lots of software that we used internally,
00:06:27.040 | and we published papers.
00:06:29.120 | Often there was an open source project
00:06:31.720 | that came out of that,
00:06:32.640 | that somebody else picked up that paper and implemented,
00:06:35.440 | and they were very successful.
00:06:38.280 | Back then, it was like, okay, there's Hadoop,
00:06:41.460 | which has come off of tech that we've built.
00:06:44.160 | We know the tech we've built is way better
00:06:46.220 | for a number of different reasons.
00:06:47.880 | We've invested a lot of effort in that,
00:06:50.440 | and turns out we have Google Cloud,
00:06:54.320 | and we are now not really providing our tech,
00:06:57.520 | but we are saying, okay, we have Bigtable,
00:07:00.360 | which is the original thing.
00:07:02.040 | We are going to now provide HBase APIs on top of that,
00:07:04.720 | which isn't as good, but that's what everybody's used to.
00:07:07.480 | So there's this, like, can we make something that is better
00:07:10.960 | and really just provide?
00:07:12.320 | Helps the community in lots of ways,
00:07:14.320 | but also helps push a good standard forward.
00:07:18.320 | - So how does Cloud fit into that?
00:07:19.920 | There's a TensorFlow open source library,
00:07:22.680 | and how does the fact that you can use
00:07:26.280 | so many of the resources that Google provides
00:07:28.240 | and the Cloud fit into that strategy?
00:07:31.480 | - So TensorFlow itself is open,
00:07:33.600 | and you can use it anywhere, right?
00:07:34.920 | And we want to make sure that continues to be the case.
00:07:38.360 | On Google Cloud, we do make sure that
00:07:41.720 | there's lots of integrations with everything else,
00:07:43.840 | and we want to make sure that it works
00:07:45.400 | really, really well there.
00:07:47.320 | - You're leading the TensorFlow effort.
00:07:50.400 | Can you tell me the history and the timeline
00:07:51.880 | of TensorFlow project in terms of major design decisions,
00:07:55.880 | so like the open source decision,
00:07:58.160 | but really, you know, what to include and not?
00:08:01.600 | There's this incredible ecosystem
00:08:03.200 | that I'd like to talk about.
00:08:04.960 | There's all these parts, but what if you just,
00:08:07.960 | some sample moments that defined
00:08:12.960 | what TensorFlow eventually became through its,
00:08:16.000 | I don't know if you're allowed to say history when it's,
00:08:19.520 | but in deep learning, everything moves so fast
00:08:21.320 | in just a few years, is there any history?
00:08:23.480 | - Yes, yes.
00:08:24.880 | So looking back, we were building TensorFlow,
00:08:29.800 | I guess we open sourced it in 2015, November 2015.
00:08:34.280 | We started on it in summer of 2014, I guess.
00:08:38.640 | And somewhere three to six months in, late 2014,
00:08:43.000 | by then we had decided that,
00:08:44.920 | okay, there's a high likelihood we'll open source it.
00:08:47.120 | So we started thinking about that
00:08:48.920 | and making sure we're heading down that path.
00:08:51.520 | At that point, by that point, we had seen a few,
00:08:57.320 | you know, lots of different use cases at Google.
00:08:59.320 | So there were things like, okay,
00:09:01.000 | yes, you want to run it at large scale in the data center.
00:09:04.200 | Yes, we need to support different kind of hardware.
00:09:07.560 | We had GPUs at that point,
00:09:09.440 | we had our first TPU at that point,
00:09:11.880 | or it was about to come out, you know, roughly around that time.
00:09:15.720 | So the design sort of included those.
00:09:18.720 | We had started to push on mobile.
00:09:21.800 | So we were running models on mobile.
00:09:24.920 | At that point, people were customizing code.
00:09:28.160 | So we wanted to make sure TensorFlow
00:09:29.520 | could support that as well,
00:09:30.720 | so that that sort of became part of that overall design.
00:09:35.280 | - When you say mobile,
00:09:36.560 | you mean like pretty complicated algorithms
00:09:38.640 | running on the phone?
00:09:40.040 | - That's correct.
00:09:40.880 | So when you have a model that you deploy on the phone
00:09:44.320 | and run it the right--
00:09:45.320 | - So already at that time,
00:09:46.400 | there was ideas of running machine learning on the phone.
00:09:48.840 | - That's correct.
00:09:49.680 | We already had a couple of products
00:09:51.440 | that were doing that by then.
00:09:53.320 | And in those cases, we had basically customized
00:09:56.440 | handcrafted code or some internal libraries that we're using.
00:10:00.200 | - So I was actually at Google during this time
00:10:02.600 | in a parallel, I guess, universe,
00:10:04.600 | but we were using Theano and Caffe.
00:10:07.200 | - Yeah.
00:10:08.040 | - Was there some degree to which you were balancing,
00:10:11.640 | like trying to see what Caffe was offering people,
00:10:15.560 | trying to see what Theano was offering,
00:10:18.040 | that you want to make sure you're delivering
00:10:20.000 | on whatever that is, perhaps the Python part of thing,
00:10:23.760 | maybe did that influence any design decisions?
00:10:27.560 | - Totally.
00:10:28.400 | So when we built DistBelief,
00:10:29.640 | and some of that was in parallel
00:10:31.640 | with some of these libraries coming up,
00:10:33.440 | I mean, Theano itself is older,
00:10:35.400 | but we were building DistBelief focused on
00:10:40.520 | our internal thing because our systems were very different.
00:10:43.000 | By the time we got to this,
00:10:44.120 | we looked at a number of libraries that were out there.
00:10:47.160 | Theano, there were folks in the group
00:10:49.320 | who had experience with Torch, with Lua.
00:10:52.160 | There were folks here who had seen Caffe.
00:10:54.800 | I mean, actually, Yangqing Jia was here as well.
00:10:57.560 | There's, what other libraries?
00:11:03.040 | I think we looked at a number of things.
00:11:04.960 | Might even have looked at Chainer back then.
00:11:06.880 | I'm trying to remember if it was there.
00:11:09.440 | In fact, yeah, we did discuss ideas around,
00:11:12.080 | okay, should we have a graph or not?
00:11:14.240 | And they were, so putting all these together
00:11:19.360 | was definitely, you know,
00:11:20.520 | they were key decisions that we wanted.
00:11:22.680 | We had seen limitations in our prior DistBelief things.
00:11:27.280 | A few of them were just in terms of
00:11:30.960 | research was moving so fast, we wanted the flexibility.
00:11:33.800 | The hardware was changing fast,
00:11:36.400 | we expected to change that,
00:11:37.800 | so that those probably were two things.
00:11:39.920 | And yeah, I think the flexibility in terms of
00:11:43.520 | being able to express all kinds of crazy things
00:11:45.360 | was definitely a big one then.
00:11:47.000 | - So what, the graph decisions,
00:11:48.680 | without moving towards TensorFlow 2.0,
00:11:52.440 | there's more, by default, there'll be eager execution.
00:11:56.760 | So sort of hiding the graph a little bit
00:11:59.200 | because it's less intuitive in terms of
00:12:01.920 | the way people develop and so on.
00:12:03.600 | What was that discussion like in terms of using graphs?
00:12:06.760 | It seemed, it's kind of the Theano way,
00:12:09.360 | did it seem the obvious choice?
00:12:11.640 | - So I think where it came from was
00:12:14.400 | our DistBelief had a graph-like thing as well.
00:12:17.680 | A much more, it wasn't a general graph,
00:12:19.800 | it was more like a straight line thing.
00:12:21.880 | More like what you might think of Caffe,
00:12:25.080 | I guess, in that sense.
00:12:26.440 | But the graph was, and we always cared
00:12:30.040 | about the production stuff.
00:12:31.160 | Like even with DistBelief, we were deploying
00:12:32.560 | a whole bunch of stuff in production.
00:12:34.480 | So graph did come from that when we thought of,
00:12:37.480 | okay, should we do that in Python?
00:12:39.200 | And we experimented with some ideas where
00:12:41.800 | it looked a lot simpler to use,
00:12:43.880 | but not having a graph meant,
00:12:46.760 | okay, how do you deploy now?
00:12:47.960 | So that was probably what tilted the balance for us
00:12:51.200 | and eventually we ended up with a graph.
00:12:52.960 | - And I guess the question there is, did you,
00:12:55.400 | I mean, so production seems to be
00:12:57.400 | the really good thing to focus on,
00:12:59.880 | but did you even anticipate the other side of it
00:13:02.480 | where there could be, what is it,
00:13:04.600 | what are the numbers, something crazy,
00:13:06.640 | 41 million downloads?
00:13:09.000 | - Yep.
00:13:09.840 | (laughing)
00:13:12.080 | - I mean, was that even like a possibility
00:13:15.480 | in your mind that it would be as popular as it became?
00:13:19.200 | - So I think we did see a need for this
00:13:23.240 | a lot from the research perspective
00:13:27.600 | and like early days of deep learning in some ways.
00:13:30.960 | 41 million, no, I don't think I imagined this number then.
00:13:35.560 | It seemed like there's a potential future
00:13:41.720 | where lots more people would be doing this
00:13:43.800 | and how do we enable that?
00:13:45.720 | I would say this kind of growth,
00:13:48.160 | I probably started seeing somewhat after the open sourcing
00:13:52.680 | where it was like, okay, deep learning
00:13:55.840 | is actually growing way faster
00:13:57.920 | for a lot of different reasons
00:13:59.280 | and we are in just the right place to push on that
00:14:02.800 | and leverage that and deliver on lots of things
00:14:06.160 | that people want.
00:14:07.520 | - So what changed once you open sourced?
00:14:09.800 | Like how this incredible amount of attention
00:14:13.400 | from a global population of developers,
00:14:16.520 | how did the project start changing?
00:14:18.240 | I don't even actually remember during those times.
00:14:22.240 | I know looking now, there's really good documentation,
00:14:24.600 | there's an ecosystem of tools,
00:14:26.640 | there's a community, there's a YouTube channel now, right?
00:14:29.800 | - Yeah. (laughing)
00:14:31.200 | - It's very community driven.
00:14:33.840 | Back then, I guess 0.1 version,
00:14:37.600 | is that the version?
00:14:42.160 | - I think we called it 0.6 or 0.5,
00:14:42.160 | something like that, I forget what that is.
00:14:43.760 | - What changed leading into 1.0?
00:14:46.200 | - It's interesting, I think we've gone through
00:14:50.440 | a few things there.
00:14:51.680 | When we started out, when we first came out,
00:14:53.720 | people loved the documentation we have
00:14:56.120 | because it was just a huge step up from everything else
00:14:58.880 | because all of those were academic projects,
00:15:00.480 | people doing, who don't think about documentation.
00:15:03.400 | I think what that changed was,
00:15:07.000 | instead of deep learning being a research thing,
00:15:10.400 | some people who were just developers
00:15:12.600 | could now suddenly take this out
00:15:14.680 | and do some interesting things with it, right?
00:15:16.960 | Who had no clue what machine learning was before then.
00:15:20.280 | And that, I think really changed how things
00:15:23.280 | started to scale up in some ways and pushed on it.
00:15:26.720 | Over the next few months as we looked at,
00:15:30.400 | how do we stabilize things,
00:15:32.000 | as we look at not just researchers,
00:15:33.880 | now we want stability, people want to deploy things,
00:15:36.520 | that's how we started planning for 1.0.
00:15:39.000 | And there are certain needs for that perspective,
00:15:42.200 | and so again, documentation comes up,
00:15:44.360 | designs, more kinds of things to put that together.
00:15:48.200 | And so that was exciting to get that to a stage
00:15:52.240 | where more and more enterprises wanted to buy in
00:15:55.400 | and really get behind that.
00:15:57.760 | And I think post 1.0 and over the next few releases,
00:16:02.680 | that enterprise adoption also started to take off.
00:16:05.280 | I would say between the initial release and 1.0,
00:16:08.000 | it was, okay, researchers, of course,
00:16:11.080 | then a lot of hobbies and early interest,
00:16:13.760 | people excited about this who started to get on board,
00:16:15.960 | and then over the 1.x thing, lots of enterprises.
00:16:19.040 | - I imagine anything that's below 1.0
00:16:23.840 | gets pressure to be,
00:16:25.960 | enterprise probably wants something that's stable.
00:16:28.040 | - Exactly.
00:16:28.880 | - And do you have a sense now that TensorFlow is stable?
00:16:33.320 | Like it feels like deep learning in general
00:16:35.560 | is extremely dynamic field.
00:16:37.720 | There's so much changing.
00:16:39.000 | TensorFlow has been growing incredibly.
00:16:43.400 | You have a sense of stability at the helm of it?
00:16:46.760 | I mean, I know you're in the midst of it, but--
00:16:48.400 | - Yeah, I think in the midst of it,
00:16:51.680 | it's often easy to forget what an enterprise wants
00:16:55.120 | and what some of the people on that side want.
00:16:58.800 | There are still people running models
00:17:00.440 | that are three years old, four years old,
00:17:02.680 | so Inception is still used by tons of people.
00:17:06.040 | Even ResNet-50 is what, a couple of years old now or more,
00:17:08.960 | but there are tons of people who use that,
00:17:10.920 | and they're fine.
00:17:12.240 | They don't need the last couple of bits of performance
00:17:15.320 | or quality, they want some stability
00:17:17.720 | and things that just work.
00:17:19.640 | And so there is value in providing that
00:17:22.240 | with that kind of stability and making it really simpler,
00:17:25.240 | because that allows a lot more people to access it.
00:17:27.840 | And then there's the research crowd which wants,
00:17:31.240 | okay, they wanna do these crazy things
00:17:33.080 | exactly like you're saying, right?
00:17:34.320 | Not just deep learning in the straight-up models
00:17:37.080 | that used to be there, they want RNNs,
00:17:40.640 | and even RNNs are maybe old, there are transformers now,
00:17:43.480 | and now it needs to combine with RL and GANs and so on.
00:17:48.480 | So there's definitely that area,
00:17:51.200 | the boundary that's shifting and pushing
00:17:53.360 | the state of the art,
00:17:55.200 | but I think there's more and more of the past
00:17:57.200 | that's much more stable,
00:17:59.720 | and even stuff that was two, three years old
00:18:02.720 | is very, very usable by lots of people.
00:18:04.960 | So that part makes it a lot easier.
00:18:07.480 | - So I imagine, maybe you can correct me if I'm wrong,
00:18:09.840 | one of the biggest use cases is essentially
00:18:12.440 | taking something like ResNet-50
00:18:14.440 | and doing some kind of transfer learning
00:18:17.280 | on a very particular problem that you have.
00:18:19.600 | It's basically probably what majority of the world does.
00:18:23.120 | And you wanna make that as easy as possible.
00:18:27.400 | - So I would say, for the hobbyist perspective,
00:18:30.480 | that's the most common case, right?
00:18:32.840 | In fact, the apps on phones and stuff that you'll see,
00:18:35.440 | the early ones, that's the most common case.
00:18:37.720 | I would say there are a couple of reasons for that.
00:18:40.360 | One is that everybody talks about that.
00:18:43.520 | It looks great on slides.
00:18:45.840 | - Yeah, it's visual. - That's a part
00:18:46.920 | of the presentation, yeah, exactly.
00:18:48.940 | What enterprises want is, that is part of it,
00:18:53.200 | but that's not the big thing.
00:18:54.520 | Enterprises really have data
00:18:56.200 | that they wanna make predictions on.
00:18:58.120 | This is often what they used to do
00:19:00.440 | with the people who were doing ML,
00:19:01.880 | was just regression models, linear regression,
00:19:04.360 | logistic regression, linear models,
00:19:06.520 | or maybe gradient-boosted trees and so on.
00:19:09.880 | Some of them still benefit from deep learning,
00:19:11.820 | but they weren't that, that's the bread and butter,
00:19:14.520 | like the structured data and so on.
00:19:16.380 | So depending on the audience you look at,
00:19:18.280 | they're a little bit different.
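
For reference, here is a minimal sketch of the common transfer-learning case mentioned above, reusing a pretrained ResNet-50 from tf.keras as a frozen feature extractor. The five-class head and the training data are placeholders, not anything discussed in the conversation.

```python
import tensorflow as tf

# Pretrained ImageNet backbone without its classification head.
base = tf.keras.applications.ResNet50(
    weights="imagenet", include_top=False, input_shape=(224, 224, 3))
base.trainable = False  # freeze the pretrained weights

# Small new head trained on your own (hypothetical) 5-class problem.
model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(5, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(train_images, train_labels, epochs=3)  # supply your own dataset
```
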
00:19:19.600 | - And they just have, I mean, the best of enterprise
00:19:23.420 | probably just has a very large dataset,
00:19:26.520 | or deep learning can probably shine.
00:19:28.680 | - That's correct, that's right.
00:19:30.280 | And then the, I think the other pieces that they want,
00:19:33.280 | again, with 2.0, or the developer summit we put together,
00:19:36.440 | is the whole TensorFlow Extended piece,
00:19:39.040 | which is the entire pipeline.
00:19:40.640 | They care about stability across doing their entire thing.
00:19:43.600 | They want simplicity across the entire thing.
00:19:46.280 | I don't need to just train a model.
00:19:47.720 | I need to do that every day again, over and over again.
00:19:51.320 | - I wonder to which degree you have a role in,
00:19:54.340 | I don't know, so I teach a course on deep learning.
00:19:57.080 | I have people like lawyers come up to me and say,
00:20:00.740 | you know, say, "When is machine learning gonna enter legal,
00:20:04.140 | "the legal realm?"
00:20:05.600 | The same thing in all kinds of disciplines,
00:20:09.480 | immigration, insurance.
00:20:13.760 | Often when I see what it boils down to is,
00:20:16.360 | these companies are often a little bit old school
00:20:19.480 | in the way they organize the data.
00:20:20.880 | So the data is just not ready yet, it's not digitized.
00:20:24.040 | Do you also find yourself being in the role of
00:20:26.880 | an evangelist for, like,
00:20:29.320 | let's get, organize your data, folks,
00:20:33.120 | and then you'll get the big benefit of TensorFlow.
00:20:35.520 | Do you get those, have those conversations?
00:20:38.040 | - Yeah, yeah, I get all kinds of questions there from,
00:20:42.340 | okay, what can I, what do I need to make this work, right?
00:20:47.660 | Do we really need deep learning?
00:20:50.860 | I mean, there are all these things,
00:20:52.120 | I already used this linear model, why would this help?
00:20:55.240 | I don't have enough data, let's say, you know,
00:20:57.200 | or I wanna use machine learning,
00:21:00.040 | but I have no clue where to start.
00:21:01.800 | So it varies, all the way to the experts
00:21:04.980 | who ask about very specific things, so it's interesting.
00:21:08.600 | - Is there a good answer?
00:21:09.640 | It boils down to, oftentimes, digitizing data.
00:21:12.520 | So whatever you want automated,
00:21:14.480 | whatever data you want to make prediction based on,
00:21:17.560 | you have to make sure that it's in an organized form.
00:21:20.720 | And you've, like, within the TensorFlow ecosystem,
00:21:24.000 | there's now, you're providing more and more datasets
00:21:26.560 | and more and more pre-trained models.
00:21:28.960 | Are you finding yourself also the organizer of datasets?
00:21:32.400 | - Yes, I think with TensorFlow datasets
00:21:34.520 | that we just released, that's definitely come up
00:21:37.520 | where people want these datasets, can we organize them
00:21:40.120 | and can we make that easier?
00:21:41.440 | So that's definitely one important thing.
00:21:45.320 | The other related thing I would say is I often tell people,
00:21:47.680 | you know what, don't think of the most fanciest thing
00:21:50.960 | that the newest model that you see.
00:21:53.320 | Make something very basic work and then you can improve it.
00:21:56.400 | There's just lots of things you can do with it.
00:21:58.920 | - Yeah, start with the basics, true.
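
As a small illustration of the TensorFlow Datasets point above, here is a hedged sketch of loading a ready-made dataset and turning it into an input pipeline; the dataset name, batch size, and preprocessing are illustrative only.

```python
import tensorflow as tf
import tensorflow_datasets as tfds

# Pull a versioned, ready-made dataset instead of organizing files by hand.
ds_train = tfds.load("mnist", split="train", as_supervised=True)

# Standard tf.data pipeline: normalize, shuffle, batch, prefetch.
ds_train = (ds_train
            .map(lambda image, label: (tf.cast(image, tf.float32) / 255.0, label))
            .shuffle(10_000)
            .batch(32)
            .prefetch(tf.data.experimental.AUTOTUNE))
# The resulting dataset can be passed directly to model.fit(ds_train).
```
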
00:22:00.640 | One of the big things that makes TensorFlow
00:22:03.280 | even more accessible was the appearance,
00:22:06.120 | whenever that happened, of Keras,
00:22:08.360 | the Keras standard, sort of outside of TensorFlow.
00:22:12.400 | I think it was Keras on top of Theano at first,
00:22:17.760 | only, and then Keras became on top of TensorFlow.
00:22:22.480 | Do you know when Keras chose to also add TensorFlow
00:22:27.480 | as a backend, was it just the community
00:22:32.320 | that drove that initially?
00:22:33.960 | Do you know if there was discussions, conversations?
00:22:37.000 | - Yeah, so Francois started the Keras project
00:22:40.960 | before he was at Google and the first thing was Theano.
00:22:44.560 | I don't remember if that was after TensorFlow
00:22:47.160 | was created or way before.
00:22:48.480 | And then at some point when TensorFlow
00:22:52.080 | started becoming popular, there were enough similarities
00:22:54.200 | that he decided to, okay, create this interface
00:22:56.360 | and put TensorFlow as a backend.
00:22:58.200 | I believe that might still have been before
00:23:01.200 | he joined Google, so we weren't really talking about that.
00:23:06.200 | He decided on his own and thought that was interesting
00:23:09.760 | and relevant to the community.
00:23:11.320 | In fact, I didn't find out about him being at Google
00:23:17.120 | until a few months after he was here.
00:23:19.680 | He was working on some research ideas
00:23:21.880 | and doing Keras on his nights and weekends project.
00:23:24.520 | - Oh, interesting.
00:23:25.360 | So he wasn't part of the TensorFlow.
00:23:28.560 | He didn't join initially.
00:23:29.760 | - He joined research and he was doing some amazing research.
00:23:32.320 | He has some papers on that and research.
00:23:35.480 | He's a great researcher as well.
00:23:37.120 | And at some point we realized, oh,
00:23:40.640 | he's doing this good stuff.
00:23:42.480 | People seem to like the API and he's right here.
00:23:45.440 | So we talked to him and he said,
00:23:47.760 | okay, why don't I come over to your team
00:23:50.640 | and work with you for a quarter
00:23:52.840 | and let's make that integration happen.
00:23:55.560 | And we talked to his manager and he said,
00:23:56.880 | sure, my quarter's fine.
00:23:58.600 | And that quarter's been something like two years now.
00:24:02.480 | (laughing)
00:24:03.400 | So he's fully on this.
00:24:05.120 | - So Keras got integrated into TensorFlow
00:24:09.680 | like in a deep way.
00:24:12.040 | And now with 2.0, TensorFlow 2.0,
00:24:15.240 | sort of Keras is kind of the recommended way
00:24:18.760 | for a beginner to interact with TensorFlow.
00:24:21.720 | Which makes that initial sort of transfer learning
00:24:24.640 | or the basic use cases, even for enterprise,
00:24:28.080 | super simple, right?
00:24:29.320 | - That's correct.
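
To make that concrete, this is roughly what the recommended tf.keras beginner path looks like in 2.0; the layer sizes and the MNIST dataset are a conventional example, not something prescribed in the conversation.

```python
import tensorflow as tf

# Load a toy dataset and scale pixel values to [0, 1].
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

# Define, compile, train, and evaluate a small model with the Keras API.
model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(x_train, y_train, epochs=5)
model.evaluate(x_test, y_test)
```
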
00:24:30.440 | - So what was that decision like?
00:24:32.040 | That seems like,
00:24:32.880 | that's kind of a bold decision as well.
00:24:37.760 | - We did spend a lot of time thinking about that one.
00:24:41.240 | We had a bunch of APIs, some built by us.
00:24:46.040 | There was a parallel layers API that we were building.
00:24:48.800 | And when we decided to do Keras in parallel,
00:24:51.600 | so there were like, okay, two things that we are looking at.
00:24:54.440 | And the first thing we were trying to do
00:24:55.960 | is just have them look similar,
00:24:58.240 | like be as integrated as possible,
00:25:00.120 | share all of that stuff.
00:25:02.200 | There were also like three other APIs
00:25:04.040 | that others had built over time
00:25:05.880 | because we didn't have a standard one.
00:25:09.080 | But one of the messages that we kept hearing
00:25:11.480 | from the community, okay, which one do we use?
00:25:13.240 | And they kept saying like, okay,
00:25:14.480 | here's a model in this one,
00:25:15.600 | and here's a model in this one, which should I pick?
00:25:18.880 | So that's sort of like, okay,
00:25:20.960 | we had to address that straight on with 2.0.
00:25:24.080 | The whole idea was we need to simplify,
00:25:26.360 | we had to pick one.
00:25:27.400 | Based on where we were, we were like, okay,
00:25:31.240 | let's see what are the people like.
00:25:35.680 | And Keras was clearly one that lots of people loved.
00:25:39.320 | There were lots of great things about it.
00:25:41.640 | So we settled on that.
00:25:43.920 | - Organically, that's kind of the best way to do it.
00:25:46.440 | It was great.
00:25:47.520 | It was surprising, nevertheless,
00:25:48.760 | to sort of bring in an outside.
00:25:51.120 | I mean, there was a feeling like Keras might
00:25:54.080 | be almost like a competitor
00:25:55.440 | in a certain kind of way to TensorFlow.
00:25:58.040 | And in a sense, it became an empowering element
00:26:01.320 | of TensorFlow.
00:26:02.240 | - That's right.
00:26:03.280 | Yeah, it's interesting how you can put two things together
00:26:06.400 | which can align, right?
00:26:08.280 | And in this case, I think Francois, the team,
00:26:11.760 | and a bunch of us have chatted,
00:26:14.240 | and I think we all want to see the same kind of things.
00:26:17.360 | We all care about making it easier
00:26:18.840 | for the huge set of developers out there,
00:26:21.480 | and that makes a difference.
00:26:23.520 | - So Python has Guido van Rossum,
00:26:26.920 | who until recently held the position
00:26:28.960 | of benevolent dictator for life.
00:26:33.600 | Does a huge successful open source project
00:26:36.520 | like TensorFlow need one person who makes a final decision?
00:26:40.680 | So you've did a pretty successful TensorFlow Dev Summit
00:26:45.480 | just now, last couple of days.
00:26:47.520 | There's clearly a lot of different new features
00:26:51.080 | being incorporated in an amazing ecosystem and so on.
00:26:53.880 | How are those design decisions made?
00:26:57.320 | Is there a BDFL in TensorFlow,
00:27:02.800 | or is it more distributed and organic?
00:27:05.880 | - I think it's somewhat different, I would say.
00:27:08.800 | I've always been involved in the key design directions,
00:27:14.960 | but there are lots of things that are distributed
00:27:18.520 | where there are a number of people,
00:27:20.680 | Martin Wicke being one who has really driven
00:27:23.320 | a lot of our open source stuff, a lot of the APIs,
00:27:26.600 | and there are a number of other people
00:27:29.280 | who've been pushed and been responsible
00:27:32.760 | for different parts of it.
00:27:34.160 | We do have regular design reviews.
00:27:37.920 | Over the last year, we've really spent a lot of time
00:27:40.160 | opening up to the community and adding transparency.
00:27:43.320 | We're setting more processes in place,
00:27:45.920 | so RFCs, special interest groups,
00:27:49.200 | really grow that community and scale that.
00:27:52.160 | I think the kind of scale that ecosystem is in,
00:27:57.800 | I don't think we could scale with having me
00:27:59.560 | as the standpoint of decision maker.
00:28:02.320 | - I got it.
00:28:03.480 | So yeah, the growth of that ecosystem.
00:28:05.920 | Maybe you can talk about it a little bit.
00:28:08.080 | First of all, it started with Andrej Karpathy
00:28:10.760 | when he first did ConvNetJS.
00:28:13.160 | The fact that you can train in your own network
00:28:15.400 | in the browser in JavaScript was incredible.
00:28:18.520 | So now TensorFlow.js is really making that a serious,
00:28:23.960 | a legit thing, a way to operate,
00:28:27.560 | whether it's in the back end or the front end.
00:28:29.560 | Then there's the TensorFlow Extended,
00:28:31.400 | like you mentioned.
00:28:32.720 | There's TensorFlow Lite for mobile.
00:28:35.360 | And all of it, as far as I can tell,
00:28:37.480 | it's really converging towards being able to save models
00:28:42.480 | in the same kind of way.
00:28:43.480 | You can move around, you can train on the desktop,
00:28:46.680 | and then move it to mobile and so on.
00:28:48.720 | - That's right.
00:28:49.560 | - There's that cohesiveness.
00:28:52.280 | So can you maybe give me, whatever I missed,
00:28:56.120 | a bigger overview of the mission of the ecosystem
00:28:58.840 | that's trying to be built and where is it moving forward?
00:29:02.080 | - Yeah.
00:29:02.920 | So in short, the way I like to think of this is
00:29:06.760 | our goal is to enable machine learning.
00:29:09.720 | And in a couple of ways.
00:29:11.680 | One is we have lots of exciting things going on in ML today.
00:29:16.520 | We started with deep learning,
00:29:17.520 | but we now support a bunch of other algorithms too.
00:29:21.400 | So one is to, on the research side,
00:29:23.800 | keep pushing on the state of the art.
00:29:26.040 | How do we enable researchers
00:29:27.240 | to build the next amazing thing?
00:29:28.960 | So BERT came out recently.
00:29:31.760 | It's great that people are able to do new kinds of research.
00:29:33.960 | There are lots of amazing research
00:29:35.400 | that happens across the world.
00:29:37.520 | So that's one direction.
00:29:38.840 | The other is, how do you take that across
00:29:42.480 | all the people outside who want to take that research
00:29:45.200 | and do some great things with it
00:29:46.640 | and integrate it to build real products,
00:29:48.640 | to have a real impact on people?
00:29:51.800 | And so that's the other axis in some ways.
00:29:55.040 | At a high level, one way I think about it is
00:29:59.640 | there are a crazy number of compute devices across the world.
00:30:04.240 | And we often used to think of ML and training
00:30:07.920 | and all of this as, okay,
00:30:08.920 | something you do either in a workstation
00:30:10.840 | or the data center or cloud.
00:30:12.600 | But we see things running on the phones.
00:30:15.720 | We see things running on really tiny chips.
00:30:17.680 | I mean, we had some demos of the developer summit.
00:30:20.760 | And so the way I think about this ecosystem is,
00:30:25.760 | how do we help get machine learning on every device
00:30:29.960 | that has a compute capability?
00:30:32.560 | And that continues to grow.
00:30:33.800 | And so in some ways, this ecosystem has looked at
00:30:38.720 | various aspects of that and grown over time
00:30:41.160 | to cover more of those.
00:30:42.480 | And we continue to push the boundaries.
00:30:44.680 | In some areas, we've built more tooling
00:30:48.200 | and things around that to help you.
00:30:50.040 | I mean, the first tool we started was TensorBoard.
00:30:52.800 | if you want to look at just the training piece.
00:30:55.000 | TFX or TensorFlow Extended
00:30:58.120 | to really do your entire ML pipelines
00:31:00.440 | if you care about all that production stuff.
00:31:03.920 | But then going to the edge,
00:31:06.640 | going to different kinds of things.
00:31:09.520 | And it's not just us now.
00:31:11.840 | We're at a place where there are lots of libraries
00:31:14.480 | being built on top.
00:31:15.840 | So there are some for research,
00:31:17.800 | maybe things like TensorFlow Agents
00:31:20.080 | or TensorFlow Probability that started as research things
00:31:22.480 | or for researchers for focusing
00:31:24.240 | on certain kinds of algorithms.
00:31:26.160 | But they're also being deployed
00:31:27.320 | or used by production folks.
00:31:30.280 | And some have come from within Google,
00:31:33.360 | just teams across Google
00:31:34.760 | who wanted to build these things.
00:31:37.040 | Others have come from just the community
00:31:39.720 | because there are different pieces
00:31:41.840 | that different parts of the community care about.
00:31:44.640 | And I see our goal as enabling even that.
00:31:49.640 | We cannot and won't build every single thing.
00:31:53.280 | That just doesn't make sense.
00:31:54.880 | But if we can enable others
00:31:56.560 | to build the things that they care about,
00:31:58.360 | and there's a broader community that cares about that,
00:32:01.480 | and we can help encourage that,
00:32:02.920 | and that's great.
00:32:05.320 | That really helps the entire ecosystem, not just those.
00:32:08.640 | One of the big things about 2.0 that we're pushing on is,
00:32:11.880 | okay, we have these so many different pieces, right?
00:32:14.680 | How do we help make all of them work well together?
00:32:18.360 | So there are a few key pieces there that we're pushing on,
00:32:22.000 | one being the core format in there
00:32:23.920 | and how we share the models themselves
00:32:26.640 | through SavedModel and TensorFlow Hub and so on.
00:32:29.600 | And a few of the pieces that we really put this together.
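
A hedged sketch of that shared-format idea: the same SavedModel exported once can be reloaded for serving or reuse, and converted for mobile with the TensorFlow Lite converter. The tiny model and the "my_model" path are placeholders.

```python
import tensorflow as tf

# A trivial placeholder model; in practice this would be your trained model.
model = tf.keras.Sequential([tf.keras.layers.Dense(1, input_shape=(4,))])
model.compile(optimizer="adam", loss="mse")

# Export once in the common SavedModel format...
tf.saved_model.save(model, "my_model")

# ...reload it for serving or reuse...
reloaded = tf.saved_model.load("my_model")

# ...and convert the same artifact for on-device inference with TF Lite.
converter = tf.lite.TFLiteConverter.from_saved_model("my_model")
with open("my_model.tflite", "wb") as f:
    f.write(converter.convert())
```
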
00:32:34.040 | - I was very skeptical that that's,
00:32:35.920 | when TensorFlow.js came out,
00:32:37.320 | it didn't seem, or deeplearn.js.
00:32:40.200 | - Yeah, that was the first.
00:32:41.680 | - It seems like technically a very difficult project.
00:32:44.920 | As a standalone, it's not as difficult,
00:32:47.040 | but as a thing that integrates into the ecosystem,
00:32:50.000 | it seems very difficult.
00:32:51.280 | So, I mean, there's a lot of aspects of this
00:32:53.280 | you're making look easy, but,
00:32:54.720 | on the technical side,
00:32:57.240 | how many challenges have to be overcome here?
00:32:59.540 | - A lot.
00:33:01.560 | - And still have to be overcome.
00:33:03.120 | That's the question here, too.
00:33:04.960 | - There are lots of steps to it.
00:33:06.240 | I mean, we've iterated over the last few years,
00:33:08.080 | so there's a lot we've learned.
00:33:10.760 | I, yeah, often when things come together well,
00:33:14.280 | things look easy, and that's exactly the point.
00:33:16.480 | It should be easy for the end user,
00:33:18.400 | but there are lots of things that go behind that.
00:33:21.400 | If I think about still challenges ahead,
00:33:25.400 | there are,
00:33:26.760 | you know, we have a lot more devices coming on board,
00:33:32.920 | for example, from the hardware perspective.
00:33:35.360 | How do we make it really easy for these vendors
00:33:37.680 | to integrate with something like TensorFlow, right?
00:33:41.280 | So there's a lot of compiler stuff
00:33:43.680 | that others are working on.
00:33:45.360 | There are things we can do in terms of our APIs
00:33:48.360 | and so on that we can do.
00:33:49.680 | As we, you know, TensorFlow started
00:33:53.960 | as a very monolithic system,
00:33:55.840 | and to some extent it still is.
00:33:57.680 | There are less, lots of tools around it,
00:33:59.440 | but the core is still pretty large and monolithic.
00:34:02.960 | One of the key challenges for us to scale that out
00:34:05.760 | is how do we break that apart with clearer interfaces?
00:34:10.400 | It's, you know, in some ways it's software engineering 101,
00:34:14.560 | but for a system that's now four years old, I guess, or more,
00:34:19.560 | and that's still rapidly evolving
00:34:21.640 | and that we're not slowing down with,
00:34:24.040 | it's hard to, you know, change and modify
00:34:26.840 | and really break apart.
00:34:28.280 | It's sort of like, as people say, right,
00:34:29.960 | it's like changing the engine with a car running
00:34:32.640 | or fix that, that's exactly what we're trying to do.
00:34:35.120 | - So there's a challenge here
00:34:37.600 | because the downside of so many people
00:34:41.600 | being excited about TensorFlow
00:34:43.880 | and coming to rely on it in many of their applications
00:34:48.640 | is that you're kind of responsible,
00:34:52.080 | like it's the technical debt.
00:34:53.560 | You're responsible for previous versions
00:34:55.720 | to some degree still working.
00:34:57.640 | So when you're trying to innovate,
00:34:59.960 | I mean, it's probably easier to just start from scratch
00:35:03.800 | every few months.
00:35:04.960 | (laughs)
00:35:05.840 | - Absolutely.
00:35:07.240 | - So do you feel the pain of that?
00:35:09.360 | A 2.0 does break some back compatibility,
00:35:14.320 | but not too much.
00:35:15.400 | It seems like the conversion is pretty straightforward.
00:35:18.160 | Do you think that's still important
00:35:20.280 | given how quickly deep learning is changing?
00:35:22.920 | Can you just, the things that you've learned,
00:35:26.400 | can you just start over or is there pressure to not?
00:35:29.320 | - It's a tricky balance.
00:35:31.600 | So if it was just a researcher writing a paper
00:35:36.360 | who a year later will not look at that code again,
00:35:39.440 | sure, it doesn't matter.
00:35:40.760 | There are a lot of production systems
00:35:43.480 | that rely on TensorFlow,
00:35:44.720 | both at Google and across the world.
00:35:47.280 | And people worry about this.
00:35:49.760 | I mean, these systems run for a long time.
00:35:52.440 | So it is important to keep that compatibility and so on.
00:35:57.280 | And yes, it does come with a huge cost.
00:35:59.760 | There's, we have to think about a lot of things
00:36:03.000 | as we do new things and make new changes.
00:36:05.840 | I think it's a trade-off, right?
00:36:09.160 | You can, you might slow certain kinds of things down,
00:36:13.040 | but the overall value you're bringing because of that
00:36:15.480 | is much bigger because it's not just about
00:36:18.640 | breaking the person yesterday,
00:36:20.600 | it's also about telling the person tomorrow
00:36:23.720 | that you know what, this is how we do things.
00:36:26.360 | We're not going to break you when you come on board
00:36:28.600 | because there are lots of new people
00:36:29.920 | who are also going to come on board.
00:36:31.680 | You know, one way I like to think about this,
00:36:34.720 | and I always push the team to think about as well,
00:36:38.000 | when you want to do new things,
00:36:39.600 | you want to start with a clean slate,
00:36:42.040 | design with a clean slate in mind.
00:36:44.920 | And then we'll figure out how to make sure
00:36:47.520 | all the other things work.
00:36:48.680 | And yes, we do make compromises occasionally,
00:36:51.320 | but unless you design with the clean slate
00:36:55.240 | and not worry about that,
00:36:56.560 | you'll never get to a good place.
00:36:58.400 | - That's brilliant.
00:36:59.240 | So even if you are responsible in the idea stage,
00:37:04.080 | when you're thinking of new,
00:37:05.800 | just put all that behind you.
00:37:07.720 | Okay, that's really well put.
00:37:09.600 | So I have to ask this because a lot of students,
00:37:12.000 | developers ask me,
00:37:13.240 | how I feel about PyTorch versus TensorFlow.
00:37:16.320 | So I've recently completely switched
00:37:18.280 | my research group to TensorFlow.
00:37:20.920 | I wish everybody would just use the same thing,
00:37:23.280 | and TensorFlow is as close to that, I believe, as we have.
00:37:26.960 | But do you enjoy competition?
00:37:31.000 | So TensorFlow is leading in many ways,
00:37:34.320 | on many dimensions in terms of ecosystem,
00:37:36.760 | in terms of the number of users,
00:37:39.040 | momentum, power, production level, so on.
00:37:41.200 | But a lot of researchers are now also using PyTorch.
00:37:46.000 | Do you enjoy that kind of competition
00:37:47.520 | or do you just ignore it and focus on
00:37:49.800 | making TensorFlow the best that it can be?
00:37:52.320 | - So just like research or anything people are doing,
00:37:55.480 | it's great to get different kinds of ideas.
00:37:58.120 | And when we started with TensorFlow,
00:38:01.480 | like I was saying earlier,
00:38:03.320 | one, it was very important for us
00:38:05.560 | to also have production in mind.
00:38:07.440 | We didn't want just research, right?
00:38:09.000 | And that's why we chose certain things.
00:38:11.320 | Now PyTorch came along and said,
00:38:12.840 | you know what, I only care about research.
00:38:14.880 | This is what I'm trying to do.
00:38:16.320 | What's the best thing I can do for this?
00:38:18.400 | And it started iterating and said,
00:38:20.880 | okay, I don't need to worry about graphs.
00:38:22.560 | Let me just run things.
00:38:25.200 | I don't care if it's not as fast as it can be,
00:38:27.440 | but let me just make this part easy.
00:38:30.520 | And there are things you can learn from that, right?
00:38:32.600 | They again had the benefit of seeing what had come before,
00:38:36.800 | but also exploring certain different kinds of spaces.
00:38:40.560 | And they had some good things there,
00:38:43.600 | building on, say, things like Chainer and so on before that.
00:38:46.720 | So competition is definitely interesting.
00:38:49.360 | It made us, this is an area that we had thought about,
00:38:51.920 | like I said, very early on.
00:38:53.760 | Over time we had revisited this a couple of times,
00:38:56.640 | should we add this again?
00:38:59.040 | At some point we said, you know what,
00:39:01.080 | it seems like this can be done well,
00:39:02.920 | so let's try it again.
00:39:04.320 | And that's how we started pushing on eager execution.
00:39:07.720 | How do we combine those two together?
00:39:09.920 | Which has finally come very well together in 2.0,
00:39:13.160 | but it took us a while to get all the things together
00:39:15.760 | and so on.
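
For a flavor of what that combination looks like in 2.0, here is a small sketch: operations run eagerly like ordinary Python, and wrapping a function in tf.function traces it into a graph for the performance and deployment path. The values and the dense_layer helper are made up for illustration.

```python
import tensorflow as tf

# Eager by default: this runs immediately and prints a concrete value.
x = tf.constant([[2.0, 3.0]])
w = tf.constant([[1.0], [4.0]])
print(tf.matmul(x, w))  # tf.Tensor([[14.]], shape=(1, 1), dtype=float32)

@tf.function  # traced into a graph on first call, so it can be optimized and deployed
def dense_layer(x, w, b):
    return tf.nn.relu(tf.matmul(x, w) + b)

b = tf.constant([0.5])
print(dense_layer(x, w, b))  # same style of call, now backed by a graph
```
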
00:39:16.600 | - So let me ask, put another way,
00:39:19.360 | I think eager execution is a really powerful thing
00:39:21.880 | that was added.
00:39:22.720 | Do you think it wouldn't have been,
00:39:24.400 | you know, Muhammad Ali versus Frazier, right?
00:39:28.440 | Do you think it wouldn't have been added as quickly
00:39:31.240 | if PyTorch wasn't there?
00:39:33.800 | - It might have taken longer.
00:39:35.800 | Yeah, it was, I mean, we had tried some variants
00:39:38.240 | of that before, so I'm sure it would have happened,
00:39:40.960 | but it might have taken longer.
00:39:42.280 | - I'm grateful that TensorFlow is following
00:39:44.240 | the way they did.
00:39:45.080 | It's doing some incredible work last couple of years.
00:39:47.800 | What other things that we didn't talk about
00:39:49.640 | are you looking forward in 2.0?
00:39:52.640 | That comes to mind.
00:39:54.040 | So we talked about some of the ecosystem stuff,
00:39:56.520 | making it easily accessible through Keras,
00:40:00.000 | eager execution.
00:40:01.440 | Is there other things that we missed?
00:40:02.840 | - Yeah, so I would say one is just where 2.0 is,
00:40:07.520 | and you know, with all the things that we've talked about.
00:40:10.760 | I think as we think beyond that,
00:40:13.760 | there are lots of other things that it enables us to do
00:40:16.600 | and that we're excited about.
00:40:18.760 | So what it's setting us up for,
00:40:20.720 | okay, here are these really clean APIs.
00:40:22.520 | We've cleaned up the surface for what the users want.
00:40:25.640 | What it also allows us to do a whole bunch of stuff
00:40:28.320 | behind the scenes once we are ready with 2.0.
00:40:31.600 | So for example, in TensorFlow with graphs
00:40:36.600 | and all the things you could do,
00:40:37.720 | you could always get a lot of good performance
00:40:40.600 | if you spent the time to tune it, right?
00:40:43.280 | And we've clearly shown that, lots of people do that.
00:40:48.720 | With 2.0, with these APIs,
00:40:52.040 | where we are, we can give you a lot of performance
00:40:55.120 | just with whatever you do.
00:40:56.600 | Because we see it's much cleaner,
00:41:01.400 | we know most people are gonna do things this way.
00:41:03.720 | We can really optimize for that
00:41:05.520 | and get a lot of those things out of the box.
00:41:09.040 | And it really allows us, both for single machine
00:41:11.920 | and distributed and so on,
00:41:13.880 | to really explore other spaces behind the scenes
00:41:17.200 | after 2.0 in the future versions as well.
00:41:19.720 | So right now the team's really excited about that.
00:41:23.000 | That over time, I think we'll see that.
00:41:25.840 | The other piece that I was talking about
00:41:27.760 | in terms of just restructuring the monolithic thing
00:41:31.640 | into more pieces and making it more modular,
00:41:34.360 | I think that's gonna be really important
00:41:36.800 | for a lot of the other people in the ecosystem
00:41:41.800 | or the organizations and so on that wanted to build things.
00:41:44.840 | - Can you elaborate a little bit what you mean
00:41:46.400 | by making TensorFlow ecosystem more modular?
00:41:50.720 | - So the way it's organized today is there's one,
00:41:55.040 | there are lots of repositories
00:41:56.320 | in the TensorFlow organization at GitHub.
00:41:58.360 | The core one where we have TensorFlow,
00:42:01.120 | it has the execution engine,
00:42:04.120 | it has the key backends for CPUs and GPUs,
00:42:08.320 | it has the work to do distributed stuff.
00:42:12.600 | And all of these just work together
00:42:14.440 | in a single library or binary.
00:42:17.280 | There's no way to split them apart easily.
00:42:18.840 | I mean, there are some interfaces,
00:42:20.000 | but they're not very clean.
00:42:21.640 | In a perfect world, you would have clean interfaces
00:42:23.960 | where, okay, I wanna run it on my fancy cluster
00:42:27.760 | with some custom networking,
00:42:29.400 | just implement this and do that.
00:42:31.000 | I mean, we kind of support that,
00:42:32.680 | but it's hard for people today.
00:42:34.640 | I think as we are starting to see more interesting things
00:42:38.200 | in some of these spaces,
00:42:39.480 | having that clean separation will really start to help.
00:42:43.360 | And again, going to the large size of the ecosystem
00:42:47.400 | and the different groups involved there,
00:42:50.200 | enabling people to evolve
00:42:52.600 | and push on things more independently
00:42:54.400 | just allows it to scale better.
00:42:56.080 | - And by people, you mean individual developers and--
00:42:59.120 | - And organizations.
00:42:59.960 | - And organizations.
00:43:01.840 | So the hope is that everybody sort of major,
00:43:04.280 | I don't know, Pepsi or something uses,
00:43:06.920 | like major corporations go to TensorFlow to this kind of--
00:43:11.080 | - Yeah, if you look at enterprise like Pepsi or these,
00:43:13.680 | I mean, a lot of them are already using TensorFlow.
00:43:15.840 | They are not the ones that do the development
00:43:18.960 | or changes in the core.
00:43:20.400 | Some of them do, but a lot of them don't.
00:43:21.960 | I mean, they touch small pieces.
00:43:23.760 | There are lots of these,
00:43:25.680 | some of them being, let's say, hardware vendors
00:43:27.680 | who are building their custom hardware
00:43:29.000 | and they want their own pieces.
00:43:30.880 | Or some of them being bigger companies, say IBM.
00:43:34.200 | I mean, they're involved in some of our
00:43:36.520 | special interest groups,
00:43:38.160 | and they see a lot of users who want certain things
00:43:41.040 | and they want to optimize for that.
00:43:42.640 | So folks like that often.
00:43:44.480 | - Autonomous vehicle companies, perhaps.
00:43:46.360 | - Exactly, yes.
00:43:48.200 | - So yeah, like I mentioned,
00:43:50.040 | TensorFlow has been downloaded 41 million times,
00:43:52.800 | 50,000 commits, almost 10,000 pull requests,
00:43:56.520 | 1,800 contributors.
00:43:58.360 | So I'm not sure if you can explain it,
00:44:02.160 | but what does it take to build a community like that?
00:44:06.840 | In retrospect, what do you think,
00:44:09.200 | what is the critical thing that allowed
00:44:11.240 | for this growth to happen,
00:44:12.680 | and how does that growth continue?
00:44:14.640 | - Yeah, yeah, that's an interesting question.
00:44:17.960 | I wish I had all the answers there, I guess,
00:44:20.320 | so we could replicate it.
00:44:21.640 | I think there are a number of things
00:44:25.600 | that need to come together, right?
00:44:27.920 | One, just like any new thing,
00:44:32.520 | it is about, there's a sweet spot of timing,
00:44:35.960 | what's needed, does it grow with
00:44:38.920 | what's needed, so in this case, for example,
00:44:41.640 | TensorFlow has not just grown because it was a good tool,
00:44:43.680 | it's also grown with the growth of deep learning itself.
00:44:46.720 | So those factors come into play.
00:44:49.040 | Other than that, though,
00:44:50.360 | I think just hearing, listening to the community,
00:44:55.240 | what they need, being open to,
00:44:58.440 | like in terms of external contributions,
00:45:01.120 | we've spent a lot of time in making sure
00:45:04.560 | we can accept those contributions well,
00:45:06.880 | we can help the contributors in adding those,
00:45:09.520 | putting the right process in place,
00:45:11.360 | getting the right kind of community,
00:45:13.400 | welcoming them and so on.
00:45:15.200 | Like over the last year,
00:45:17.160 | we've really pushed on transparency,
00:45:19.080 | that that's important for an open source project.
00:45:22.320 | People want to know where things are going,
00:45:23.840 | and we're like, okay, here's a process
00:45:26.240 | where you can do that, here are our RFCs and so on.
00:45:29.400 | So thinking through, there are lots of community aspects
00:45:32.960 | that come into that you can really work on.
00:45:36.520 | As a small project, it's maybe easy to do
00:45:38.760 | because there's like two developers and you can do those.
00:45:42.200 | As you grow, putting more of these processes in place,
00:45:47.000 | thinking about the documentation,
00:45:49.160 | thinking about what do developers care about,
00:45:51.960 | what kind of tools would they want to use?
00:45:55.200 | All of these come into play, I think.
00:45:56.920 | - So one of the big things I think
00:45:58.480 | that feeds the TensorFlow fire
00:46:00.720 | is people building something on TensorFlow.
00:46:04.000 | And implement a particular architecture
00:46:07.720 | that does something cool and useful.
00:46:09.520 | And they put that on GitHub.
00:46:11.120 | And so it just feeds this growth.
00:46:15.600 | Do you have a sense that with 2.0 and 1.0
00:46:19.600 | that there may be a little bit of a partitioning
00:46:21.560 | like there is with Python 2 and 3,
00:46:24.080 | that there'll be code bases
00:46:26.040 | in the older versions of TensorFlow
00:46:28.360 | that will not be easily compatible?
00:46:31.120 | Or are you pretty confident that this kind of conversion
00:46:35.560 | is pretty natural and easy to do?
00:46:37.920 | - So we're definitely working hard
00:46:39.920 | to make that very easy to do.
00:46:41.440 | There's lots of tooling that we talked about
00:46:43.440 | at the developer summit this week.
00:46:45.760 | And we continue to invest in that tooling.
00:46:48.280 | It's, you know, when you think of these
00:46:50.920 | significant version changes, that's always a risk.
00:46:53.520 | And we are really pushing hard
00:46:55.720 | to make that transition very, very smooth.
00:46:59.200 | I think, so at some level, people want to move
00:47:03.640 | when they see the value in the new thing.
00:47:05.600 | They don't want to move just because it's a new thing.
00:47:07.680 | And some people do,
00:47:08.520 | but most people want a really good thing.
00:47:11.480 | And I think over the next few months
00:47:13.800 | as people start to see the value,
00:47:15.400 | we'll definitely see that shift happening.
00:47:17.640 | So I'm pretty excited and confident
00:47:19.720 | that we'll see people moving.
00:47:21.640 | As you said earlier, this field is also moving rapidly.
00:47:24.680 | So that'll help because we can do more things.
00:47:26.760 | And, you know, all the new things
00:47:28.080 | will clearly happen in 2.X.
00:47:29.520 | So people will have lots of good reasons to move.
00:47:32.280 | - So what do you think TensorFlow 3.0 looks like?
00:47:36.160 | Is that, is there, are things happening so crazy
00:47:40.360 | that even at the end of this year
00:47:42.520 | it seems impossible to plan for?
00:47:44.320 | Or is it possible to plan for the next five years?
00:47:48.600 | - I think it's tricky.
00:47:50.840 | There are some things that we can expect
00:47:54.560 | in terms of, okay, change, yes, change is going to happen.
00:47:58.040 | (laughing)
00:47:59.760 | Are there some things going to stick around
00:48:01.720 | and some things not going to stick around?
00:48:03.760 | I would say the basics of deep learning,
00:48:08.200 | the, you know, say convolution models
00:48:10.440 | or the basic kind of things,
00:48:12.760 | they'll probably be around in some form still in five years.
00:48:16.360 | Will RL and GANs stay?
00:48:18.640 | Very likely based on where they are.
00:48:21.240 | Will we have new things?
00:48:22.880 | Probably, but those are hard to predict.
00:48:24.720 | And some, directionally, some things that we can see is,
00:48:29.720 | you know, and things that we're starting to do, right,
00:48:32.800 | with some of our projects right now
00:48:35.480 | is just 2.0 combining eager execution and graphs
00:48:39.160 | where we're starting to make it more like
00:48:41.520 | just your natural programming language.
00:48:43.200 | You're not trying to program something else.
00:48:45.720 | Similarly with Swift for TensorFlow,
00:48:47.280 | we're taking that approach.
00:48:48.320 | Can you do something ground up, right?
00:48:50.080 | So some of those ideas seem like, okay,
00:48:52.160 | that's the right direction.
00:48:54.120 | In five years, we expect to see more in that area.
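As a minimal sketch of that eager-plus-graph direction, assuming TensorFlow 2.x: you write ordinary Python, and wrapping it in tf.function traces it into a graph, so you are not programming a separate graph language anymore.

```python
import tensorflow as tf

@tf.function  # traced into a graph; drop the decorator and it still runs eagerly
def train_step(w, x, y):
    with tf.GradientTape() as tape:
        loss = tf.reduce_mean((w * x - y) ** 2)
    grad = tape.gradient(loss, w)
    w.assign_sub(0.1 * grad)  # one step of plain gradient descent
    return loss

w = tf.Variable(0.0)
print(train_step(w, tf.constant(2.0), tf.constant(4.0)).numpy())
```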
00:48:57.160 | Other things we don't know is,
00:49:00.120 | will hardware accelerators be the same?
00:49:03.240 | Will we be able to train with four bits instead of 32 bits?
00:49:08.240 | - And I think the TPU side of things is exploring that.
00:49:11.520 | I mean, TPU's already on version three.
00:49:14.000 | It seems that the evolution of TPU and TensorFlow
00:49:17.560 | are sort of, they're co-evolving almost.
00:49:22.160 | In terms of both are learning from each other
00:49:24.760 | and from the community and from the applications
00:49:27.960 | where the biggest benefit is achieved.
00:49:29.720 | - That's right.
00:49:30.560 | - You've been trying to sort of,
00:49:32.320 | with eager with Keras to make TensorFlow
00:49:34.280 | as accessible and easy to use as possible.
00:49:36.480 | What do you think for beginners
00:49:38.040 | is the biggest thing they struggle with?
00:49:40.000 | Have you encountered that?
00:49:42.080 | Or is basically what Keras is solving
00:49:44.280 | is that eager, like we talked about?
00:49:47.400 | - Yeah, for some of them, like you said, right,
00:49:50.920 | beginners want to just be able to take some image model,
00:49:54.880 | they don't care if it's Inception or ResNet
00:49:57.040 | or something else, and do some training
00:49:59.560 | or transfer learning on their kind of model.
00:50:02.480 | Being able to make that easy is important.
00:50:04.440 | So in some ways, if you do that by providing them
00:50:08.600 | simple models, say, in Hub or so on,
00:50:11.400 | they don't care about what's inside that box,
00:50:13.720 | but they want to be able to use it.
00:50:15.160 | So we're pushing on, I think, different levels.
00:50:17.640 | If you look at just a component that you get
00:50:19.960 | which has the layers already smooshed in,
00:50:22.760 | the beginners probably just want that.
00:50:25.200 | Then the next step is, okay,
00:50:26.720 | look at building layers with Keras.
00:50:29.040 | If you go out to research,
00:50:30.240 | then they are probably writing custom layers themselves
00:50:33.120 | or doing their own loops.
00:50:34.360 | So there's a whole spectrum there.
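The beginner end of that spectrum looks roughly like the sketch below, assuming TensorFlow 2.x and the tensorflow_hub package; the Hub URL is just an example feature-vector module, treated as a black box, with only a small new head trained on top.

```python
import tensorflow as tf
import tensorflow_hub as hub

# Pre-trained image features from TF Hub; the user doesn't need to know
# what architecture is inside the box.
feature_extractor = hub.KerasLayer(
    "https://tfhub.dev/google/imagenet/mobilenet_v2_100_224/feature_vector/4",
    trainable=False,  # freeze the pre-trained weights for transfer learning
)

model = tf.keras.Sequential([
    tf.keras.layers.InputLayer(input_shape=(224, 224, 3)),
    feature_extractor,
    tf.keras.layers.Dense(5, activation="softmax"),  # e.g. 5 new classes
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
```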
00:50:36.320 | - And then providing the pre-trained models
00:50:38.640 | seems to really decrease the time from you trying to start.
00:50:43.640 | So you could basically in a Colab notebook
00:50:46.800 | achieve what you need.
00:50:49.080 | So I'm basically answering my own question
00:50:51.240 | because I think what TensorFlow delivered on recently
00:50:54.240 | is trivial for beginners.
00:50:56.920 | So I was just wondering if there was other pain points
00:51:00.720 | you're trying to ease, but I'm not sure there would be.
00:51:02.480 | - No, those are probably the big ones.
00:51:04.200 | I mean, I see high schoolers doing a whole bunch of things
00:51:07.040 | now, which is pretty amazing.
00:51:08.800 | - It's both amazing and terrifying.
00:51:10.680 | - Yes.
00:51:12.640 | - In a sense that when they grow up,
00:51:15.840 | some incredible ideas will be coming from them.
00:51:19.240 | So there's certainly a technical aspect to your work,
00:51:21.840 | but you also have a management aspect to your role
00:51:25.200 | with TensorFlow leading the project,
00:51:27.960 | a large number of developers and people.
00:51:31.080 | So what do you look for in a good team?
00:51:34.680 | What do you think?
00:51:35.960 | You know, Google has been at the forefront of exploring
00:51:38.400 | what it takes to build a good team.
00:51:40.440 | And TensorFlow is one of the most cutting edge technologies
00:51:45.520 | in the world.
00:51:46.360 | So in this context, what do you think makes for a good team?
00:51:49.360 | - It's definitely something I think a fair bit about.
00:51:53.160 | I think in terms of the team being able to deliver
00:51:58.160 | something well, one of the things that's important
00:52:02.640 | is a cohesion across the team.
00:52:05.800 | So being able to execute together in doing things.
00:52:10.440 | It's not an, like at this scale,
00:52:13.160 | an individual engineer can only do so much.
00:52:15.440 | There's a lot more that they can do together,
00:52:18.240 | even though we have some amazing superstars across Google
00:52:21.760 | and in the team, but there's, you know,
00:52:25.120 | often the way I see it is that the product
00:52:27.360 | of what the team generates is way larger than
00:52:30.520 | the sum of, you know, the individuals put together.
00:52:34.440 | And so how do we have all of them work together,
00:52:37.320 | the culture of the team itself.
00:52:40.040 | Hiring good people is important,
00:52:43.080 | but part of that is it's not just that, okay,
00:52:45.680 | we hire a bunch of smart people and throw them together
00:52:48.160 | and let them do things.
00:52:49.760 | It's also people have to care about what they're building.
00:52:52.960 | People have to be motivated for the right kind of things.
00:52:57.400 | That's often an important factor.
00:52:59.840 | And, you know, finally, how do you put that together
00:53:04.640 | with a somewhat unified vision of where we want to go?
00:53:08.840 | So are we all looking in the same direction
00:53:11.240 | or each of us going all over?
00:53:13.600 | And sometimes it's a mix.
00:53:16.120 | Google's a very bottom-up organization in some sense,
00:53:20.560 | also research even more so, and that's how we started.
00:53:26.400 | But as we've become this larger product and ecosystem,
00:53:30.920 | I think it's also important to combine that well
00:53:33.200 | with a mix of, okay, here's the direction we want to go in.
00:53:38.000 | There is exploration we'll do around that,
00:53:39.880 | but let's keep staying in that direction,
00:53:42.840 | not just all over the place.
00:53:44.480 | - And is there a way you monitor the health of the team?
00:53:46.880 | Sort of like, is there a way you know you did a good job?
00:53:51.880 | The team is good?
00:53:53.080 | Like, I mean, you're sort of, you're saying nice things,
00:53:56.240 | but it's sometimes difficult to determine how aligned.
00:54:00.880 | - Yes. - 'Cause it's not binary.
00:54:02.160 | It's not like, there's tensions and complexities and so on.
00:54:06.760 | And the other element of this is the notion of superstars.
00:54:09.240 | You know, there's so much, even at Google,
00:54:11.800 | such a large percentage of work is done
00:54:13.560 | by individual superstars too.
00:54:16.000 | So there's a, and sometimes those superstars
00:54:19.960 | can be against the dynamic of a team and those tensions.
00:54:23.280 | I mean, I'm sure in TensorFlow it might be
00:54:26.600 | a little bit easier because the mission of the project
00:54:28.880 | is so sort of beautiful.
00:54:31.760 | You're at the cutting edge, so it's exciting.
00:54:33.720 | - Yep.
00:54:34.840 | - But have you struggled with that?
00:54:36.720 | Has there been challenges?
00:54:38.360 | - There are always people challenges
00:54:39.880 | in different kinds of ways.
00:54:41.280 | That said, I think we've been,
00:54:43.040 | what's good about getting people who care
00:54:46.880 | and are, you know, have the same kind of culture,
00:54:50.440 | and that's Google in general to a large extent.
00:54:53.480 | But also, like you said, given that the project
00:54:56.120 | has had so many exciting things to do,
00:54:58.800 | there's been room for lots of people
00:55:00.760 | to do different kinds of things and grow,
00:55:02.440 | which does make the problem a bit easier, I guess.
00:55:06.480 | And it allows people, depending on what they're doing,
00:55:09.920 | if there's room around them, then that's fine.
00:55:12.320 | But yes, we do care about whether a superstar or not
00:55:18.160 | that they need to work well with the team across Google.
00:55:22.600 | - That's interesting to hear.
00:55:23.680 | So it's like superstar or not, the productivity broadly
00:55:28.000 | is about the team.
00:55:30.560 | - Yeah, yeah.
00:55:31.520 | I mean, they might add a lot of value,
00:55:33.000 | but if they're hurting the team, then that's a problem.
00:55:35.760 | - So in hiring engineers, it's so interesting, right?
00:55:39.080 | The hiring process, what do you look for?
00:55:41.880 | How do you determine a good developer
00:55:44.320 | or a good member of a team
00:55:46.280 | from just a few minutes or hours together?
00:55:48.620 | (chuckles)
00:55:50.320 | Again, no magic answers, I'm sure.
00:55:52.000 | - Yeah, yeah.
00:55:52.840 | I mean, Google has a hiring process
00:55:55.400 | that we've refined over the last 20 years, I guess,
00:55:58.840 | and that you've probably heard and seen a lot about.
00:56:02.280 | So we do work with the same hiring process,
00:56:05.040 | and that's really helped.
00:56:06.480 | For me in particular, I would say,
00:56:10.920 | in addition to the core technical skills,
00:56:14.240 | what does matter is their motivation in what they want to do
00:56:19.240 | because if that doesn't align well with where we want to go,
00:56:23.000 | that's not going to lead to long-term success
00:56:25.360 | for either them or the team.
00:56:27.320 | And I think that becomes more important
00:56:30.040 | the more senior the person is,
00:56:31.480 | but it's important at every level.
00:56:33.600 | Like even the junior most engineer,
00:56:34.960 | if they're not motivated to do well
00:56:36.400 | at what they're trying to do,
00:56:37.720 | however smart they are,
00:56:38.840 | it's going to be hard for them to succeed.
00:56:40.360 | - Does the Google hiring process touch on that passion?
00:56:44.540 | So like trying to determine,
00:56:46.480 | 'cause I think as far as I understand,
00:56:48.480 | maybe you can speak to it,
00:56:49.600 | that the Google hiring process sort of helps,
00:56:53.360 | in the initial, like determines the skill set there,
00:56:56.400 | is your puzzle-solving ability,
00:56:57.920 | problem-solving ability good?
00:56:59.920 | But like, I'm not sure,
00:57:02.540 | but it seems that determining whether the person
00:57:05.600 | has, like, fire inside them,
00:57:07.600 | that burns to do anything really,
00:57:09.040 | it doesn't really matter,
00:57:09.880 | it's just some cool stuff, I'm going to do it.
00:57:12.500 | That, I don't know,
00:57:15.320 | is that something that ultimately ends up
00:57:17.280 | when they have a conversation with you
00:57:18.840 | or once it gets closer to the team?
00:57:22.640 | - So one of the things we do have as part of the process
00:57:25.440 | is just a culture fit,
00:57:27.160 | like part of the interview process itself,
00:57:29.200 | in addition to just the technical skills,
00:57:31.040 | and each engineer or whoever the interviewer is,
00:57:34.280 | is supposed to rate the person on the culture
00:57:38.360 | and the culture fit with Google and so on.
00:57:40.000 | So that is definitely part of the process.
00:57:42.200 | Now, there are various kinds of projects
00:57:45.880 | and different kinds of things,
00:57:46.920 | so there might be variants
00:57:48.820 | in the kind of culture you want there and so on,
00:57:51.400 | and yes, that does vary.
00:57:52.760 | So for example,
00:57:54.000 | TensorFlow's always been a fast-moving project,
00:57:56.960 | and we want people who are comfortable with that.
00:57:59.420 | But at the same time now, for example,
00:58:02.680 | we are at a place where we are also a very full-fledged
00:58:04.820 | product and we want to make sure things that work
00:58:07.800 | really, really work, right?
00:58:09.320 | You can't cut corners all the time.
00:58:11.680 | So balancing that out and finding the people
00:58:14.320 | who are the right fit for those is important.
00:58:17.600 | And I think those kind of things do vary a bit
00:58:19.720 | across projects and teams and product areas across Google,
00:58:23.200 | and so you'll see some differences there
00:58:25.240 | in the final checklist.
00:58:27.720 | But a lot of the core culture,
00:58:29.400 | it comes along with just engineering excellence and so on.
00:58:32.820 | - What is the hardest part of your job?
00:58:37.600 | Take your pick, I guess.
00:58:41.040 | - It's fun, I would say, right?
00:58:44.440 | Hard, yes.
00:58:45.560 | I mean, lots of things at different times.
00:58:47.280 | I think that does vary.
00:58:49.200 | - So let me clarify that difficult things are fun
00:58:52.680 | when you solve them, right?
00:58:53.980 | (laughing)
00:58:55.440 | - Yes.
00:58:56.280 | - It's fun in that sense.
00:58:57.520 | - I think the key to a successful thing across the board,
00:59:02.520 | and in this case, it's a large ecosystem now,
00:59:05.360 | but even a small product, is striking that fine balance
00:59:09.840 | across different aspects of it.
00:59:12.040 | Sometimes it's how fast you go versus how perfect it is.
00:59:17.040 | Sometimes it's how do you involve this huge community?
00:59:21.480 | Who do you involve?
00:59:22.440 | Or do you decide, okay, now is not a good time
00:59:24.820 | to involve them because it's not the right fit?
00:59:28.640 | And sometimes it's saying no to certain kinds of things.
00:59:33.680 | Those are often the hard decisions.
00:59:35.760 | Some of them you make quickly
00:59:39.620 | because you don't have the time.
00:59:41.040 | Some of them you get time to think about them,
00:59:43.240 | but they're always hard.
00:59:44.520 | - So when both choices are pretty good,
00:59:46.880 | it's those decisions.
00:59:49.200 | What about deadlines?
00:59:50.400 | Is this, do you find TensorFlow to be driven
00:59:54.840 | by deadlines to a degree that a product might?
01:00:00.400 | Or is there still a balance to where,
01:00:03.920 | I mean, it's less deadline-driven.
01:00:04.920 | You had the Dev Summit, that came together incredibly.
01:00:08.920 | Looked like there's a lot of moving pieces and so on.
01:00:11.480 | So did that deadline make people rise to the occasion,
01:00:15.120 | releasing TensorFlow 2.0 Alpha?
01:00:18.040 | - Yeah.
01:00:18.880 | - I think that was done last minute as well.
01:00:20.400 | I mean, like up to the last point.
01:00:25.400 | - Again, it's one of those things that's,
01:00:28.440 | you need to strike the good balance.
01:00:29.920 | There's some value that deadlines bring
01:00:32.080 | that does bring a sense of urgency
01:00:33.960 | to get the right things together
01:00:35.760 | instead of getting the perfect thing out.
01:00:38.320 | You need something that's good and works well.
01:00:41.320 | And the team definitely did a great job
01:00:43.280 | in putting that together.
01:00:44.120 | So I was very amazed and excited by everything,
01:00:46.600 | how that came together.
01:00:48.760 | That said, across the year,
01:00:49.880 | we try not to put out official deadlines.
01:00:52.560 | We focus on key things that are important,
01:00:57.000 | figure out how much of it's important.
01:01:00.640 | And we are developing in the open,
01:01:03.920 | both internally and externally,
01:01:05.800 | everything's available to everybody.
01:01:07.960 | So you can pick and look at where things are.
01:01:11.240 | We do releases at a regular cadence.
01:01:13.240 | So fine, if something doesn't necessarily end up
01:01:16.080 | at this month, it'll end up in the next release
01:01:17.800 | in a month or two.
01:01:18.760 | And that's okay, but we want to get,
01:01:22.200 | like keep moving as fast as we can in these different areas.
01:01:25.200 | Because we can iterate and improve on things.
01:01:29.680 | Sometimes it's okay to put things out
01:01:32.000 | that aren't fully ready.
01:01:32.960 | We'll make sure it's clear that, okay,
01:01:34.640 | this is experimental, but it's out there
01:01:36.560 | if you want to try and give feedback.
01:01:38.000 | That's very, very useful.
01:01:39.440 | I think that quick cycle and quick iteration is important.
01:01:42.580 | That's what we often focus on
01:01:46.160 | rather than here's a deadline where you get everything else.
01:01:49.240 | - Is 2.0, is there pressure to make that stable?
01:01:52.880 | Or like, for example, WordPress 5.0 just came out
01:01:56.680 | and there was no pressure to,
01:01:59.920 | it was a lot of build updates,
01:02:01.800 | it was delivered way too late.
01:02:04.000 | And they said, okay, well,
01:02:06.000 | we're going to release a lot of updates
01:02:07.480 | really quickly to improve it.
01:02:09.720 | Do you see TensorFlow 2.0 in that same kind of way?
01:02:12.280 | Or is there this pressure to once it hits 2.0,
01:02:15.280 | once you get to the release candidate
01:02:16.800 | and then you get to the final,
01:02:19.000 | that's going to be the stable thing?
01:02:22.520 | - So it's going to be stable, just like 1.x was,
01:02:26.960 | where every API that's there is going to remain and work.
01:02:31.140 | It doesn't mean we can't change things under the covers.
01:02:34.840 | It doesn't mean we can't add things.
01:02:36.780 | So there's still a lot more for us to do
01:02:39.240 | and we're going to need to have more releases.
01:02:41.120 | So in that sense, there's still,
01:02:42.680 | I don't think we'd be done in like two months
01:02:44.760 | when we release this.
01:02:46.200 | - I don't know if you can say,
01:02:47.600 | but is there, you know,
01:02:49.920 | there's not external deadlines for TensorFlow 2.0,
01:02:53.800 | but is there internal deadlines,
01:02:57.120 | the artificial or otherwise,
01:02:58.600 | that you're trying to set for yourself?
01:03:00.920 | Or is it whenever it's ready?
01:03:03.120 | - So we want it to be a great product, right?
01:03:05.680 | And that's a big, important piece for us.
01:03:08.880 | TensorFlow's already out there.
01:03:11.200 | We have, you know, 41 million downloads for 1.x.
01:03:13.760 | So it's not like we have to have this.
01:03:16.440 | Yeah, exactly.
01:03:17.280 | So it's not like,
01:03:18.520 | a lot of the features that we're, you know,
01:03:20.200 | really polishing and putting together are there.
01:03:23.600 | We don't have to rush that just because.
01:03:26.200 | So in that sense,
01:03:27.040 | we want to get it right and really focus on that.
01:03:29.940 | That said, we have said that we are looking to get this out
01:03:32.600 | in the next few months, in the next quarter.
01:03:34.480 | And we, you know, as far as possible,
01:03:37.100 | we'll definitely try to make that happen.
01:03:39.960 | - Yeah, my favorite line was,
01:03:41.880 | spring is a relative concept.
01:03:44.360 | I love it.
01:03:45.200 | - Yes.
01:03:46.040 | - Spoken like a true developer.
01:03:47.720 | So, you know, something I'm really interested in,
01:03:50.240 | and your previous line of work is,
01:03:53.000 | before TensorFlow, you led a team at Google on search ads.
01:03:56.680 | I think this is a very interesting topic,
01:04:01.880 | on every level, on a technical level.
01:04:04.040 | Because at their best,
01:04:06.120 | ads connect people to the things they want and need.
01:04:09.160 | - Yep.
01:04:10.120 | - And at their worst, they're just these things
01:04:12.320 | that annoy the heck out of you,
01:04:14.960 | to the point of ruining the entire user experience
01:04:17.360 | of whatever you're actually doing.
01:04:19.060 | So they have a bad rep, I guess.
01:04:22.200 | And on the other end,
01:04:26.240 | so that this connecting users to the thing they need and want
01:04:29.680 | is a beautiful opportunity for machine learning to shine.
01:04:34.080 | Like huge amounts of data that's personalized,
01:04:36.360 | and you kind of map to the thing
01:04:37.880 | they actually want, won't get annoyed.
01:04:40.440 | So what have you learned from this,
01:04:43.240 | Google that's leading the world in this aspect?
01:04:45.200 | What have you learned from that experience?
01:04:47.600 | And what do you think is the future of ads?
01:04:51.600 | Take you back to the--
01:04:52.560 | - Yeah.
01:04:53.400 | (laughing)
01:04:54.240 | Yes, it's been a while,
01:04:55.280 | but I totally agree with what you said.
01:04:58.400 | I think the search ads,
01:05:01.480 | the way it was always looked at,
01:05:03.240 | and I believe it still is,
01:05:05.280 | it's an extension of what search is trying to do.
01:05:08.280 | The goal is to make the information
01:05:10.600 | and make the world's information accessible.
01:05:14.720 | With ads, it's not just information,
01:05:17.120 | but it may be products or other things
01:05:19.160 | that people care about.
01:05:20.800 | And so it's really important for them to align
01:05:23.840 | with what the users need.
01:05:26.480 | And in search ads,
01:05:29.160 | there's a minimum quality level
01:05:30.960 | before that ad would be shown.
01:05:32.320 | If you don't have an ad that hits that quality bar,
01:05:34.040 | it will not be shown even if we have it,
01:05:36.000 | and okay, maybe we lose some money there, that's fine.
01:05:39.640 | That is really, really important.
01:05:41.280 | And I think that is something I really liked
01:05:43.440 | about being there.
01:05:45.080 | Advertising is a key part.
01:05:48.200 | I mean, as a model, it's been around for ages, right?
01:05:51.720 | It's not a new model.
01:05:53.440 | It's been adapted to the web
01:05:54.880 | and became a core part of search
01:05:57.480 | and many other search engines across the world.
01:06:02.160 | I do hope, like I said,
01:06:04.440 | there are aspects of ads that are annoying
01:06:06.720 | and I go to a website
01:06:08.000 | and if it just keeps popping an ad in my face,
01:06:11.120 | not to let me read, that's gonna be annoying clearly.
01:06:13.880 | So I hope we can strike that balance
01:06:18.800 | between showing a good ad
01:06:23.120 | where it's valuable to the user
01:06:25.120 | and provides the monetization to the service.
01:06:31.040 | And this might be search,
01:06:32.080 | this might be a website, all of these.
01:06:34.840 | They do need the monetization
01:06:36.960 | for them to provide that service.
01:06:39.680 | But if it's done in that good balance
01:06:42.480 | between showing just some random stuff that's distracting
01:06:47.480 | versus showing something that's actually valuable.
01:06:50.960 | - So do you see it moving forward
01:06:53.480 | as to continue being a model
01:06:57.560 | that funds businesses like Google,
01:07:01.000 | that's a significant revenue stream?
01:07:05.200 | 'Cause that's one of the most exciting things,
01:07:08.160 | but also limiting things in the internet
01:07:09.720 | is nobody wants to pay for anything.
01:07:12.240 | And advertisements, again, coupled at their best,
01:07:15.400 | are actually really useful and not annoying.
01:07:17.600 | Do you see that continuing and growing and improving
01:07:22.360 | or do you see more Netflix-type models
01:07:26.720 | where you have to start to pay for content?
01:07:29.000 | - I think it's a mix.
01:07:30.360 | I think it's gonna take a long while
01:07:32.280 | for everything to be paid on the internet, if at all.
01:07:35.360 | Probably not.
01:07:36.200 | I mean, I think there's always gonna be things
01:07:37.840 | that are sort of monetized with things like ads.
01:07:40.800 | But over the last few years, I would say,
01:07:42.800 | we've definitely seen that transition
01:07:44.800 | towards more paid services across the web
01:07:48.640 | and people are willing to pay for them
01:07:50.400 | because they do see the value.
01:07:51.720 | I mean, Netflix is a great example.
01:07:53.640 | I mean, we have YouTube doing things.
01:07:56.560 | People pay for the apps they buy.
01:07:58.760 | More people, I find, are willing to pay
01:08:01.080 | for newspaper content,
01:08:03.120 | for the good news websites across the web.
01:08:07.240 | That wasn't the case even a few years ago, I would say.
01:08:11.040 | And I just see that change in myself as well
01:08:13.320 | and just lots of people around me.
01:08:14.840 | So definitely hopeful that we'll transition
01:08:17.160 | to that mixed model where maybe you get
01:08:20.840 | to try something out for free, maybe with ads,
01:08:24.160 | but then there's a more clear revenue model
01:08:27.400 | that sort of helps go beyond that.
01:08:29.360 | - So speaking of revenue,
01:08:33.440 | how is it that a person can use the TPU
01:08:37.160 | in a Google Colab for free?
01:08:39.440 | So what's the--
01:08:40.640 | (laughing)
01:08:42.000 | I guess the question is, what's the future of TensorFlow
01:08:47.000 | in terms of empowering, say, a class of 300 students
01:08:51.920 | and then the masses beyond MIT,
01:08:55.400 | what is going to be the future of them being able
01:08:57.840 | to do their homework in TensorFlow?
01:09:00.040 | Like, where are they going to train these networks, right?
01:09:02.880 | What's that future look like with TPUs,
01:09:06.480 | with cloud services, and so on?
01:09:08.960 | - I think a number of things there.
01:09:10.280 | I mean, TensorFlow, open source, you can run it wherever.
01:09:13.680 | You can run it on your desktop
01:09:15.040 | and your desktops always keep getting more powerful,
01:09:17.520 | so maybe you can do more.
01:09:19.560 | My phone is, like, I don't know how many times
01:09:21.440 | more powerful than my first desktop.
01:09:23.680 | - You'll probably train it on your phone, though.
01:09:25.240 | Yeah, that's true.
01:09:26.280 | - Right, so in that sense,
01:09:27.720 | the power you have in your hands is a lot more.
01:09:30.640 | Clouds are actually very interesting from, say,
01:09:34.440 | students' or courses' perspective
01:09:36.960 | because they make it very easy to get started.
01:09:40.080 | I mean, Colab, the great thing about it
01:09:42.080 | is go to a website and it just works.
01:09:45.160 | No installation needed, nothing,
01:09:47.600 | you're just there and things are working.
01:09:50.040 | That's really the power of cloud as well.
01:09:52.320 | And so I do expect that to grow.
01:09:55.360 | Again, Colab is a free service.
01:09:57.960 | It's great to get started, to play with things,
01:10:00.880 | to explore things.
01:10:02.200 | That said, with free you can only get so much.
01:10:06.200 | So just like we were talking about,
01:10:10.160 | free versus paid, and yeah,
01:10:12.240 | there are services you can pay for and get a lot more.
01:10:15.320 | - Great, so if I'm a complete beginner
01:10:17.720 | interested in machine learning and TensorFlow,
01:10:20.000 | what should I do?
01:10:21.640 | - Probably start with going to our website
01:10:23.560 | and playing there.
01:10:24.400 | - So just go to TensorFlow.org and start clicking on things?
01:10:26.600 | - Yep, check our tutorials and guides.
01:10:28.480 | There's stuff you can just click there
01:10:29.840 | and go to Colab and do things.
01:10:31.360 | No installation needed, you can get started right there.
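For a sense of what that first step looks like, the beginner quickstart on tensorflow.org is roughly the sketch below (assuming TensorFlow 2.x): load a small dataset, define a Keras model, train, and evaluate, all of which runs as-is in a Colab notebook.

```python
import tensorflow as tf

(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0  # scale pixels to [0, 1]

model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

model.fit(x_train, y_train, epochs=5)
model.evaluate(x_test, y_test)
```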
01:10:34.080 | - Okay, awesome, Rajat, thank you so much for talking today.
01:10:36.760 | - Thank you, Lex, it was great.
01:10:38.360 | (upbeat music)