
Jeremy Howard: Deep Learning Frameworks - TensorFlow, PyTorch, fast.ai | AI Podcast Clips


Transcript

From the perspective of deep learning frameworks, you work with fast.ai, this framework in particular, and PyTorch and TensorFlow. What are the strengths of each platform, from your perspective? So in terms of what we've done our research on and taught in our course, we started with Theano and Keras, then we switched to TensorFlow and Keras, then we switched to PyTorch, and then we switched to PyTorch and fast.ai.

And that kind of reflects a growth and development of the ecosystem of deep learning libraries. Theano and TensorFlow were great, but were much harder to teach and do research and development on because they define what's called a computational graph up front, a static graph, where you basically have to say, "Here are all the things that I'm going to eventually do in my model." And then later on you say, "Okay, do those things with this data." And you can't debug them, you can't do them step by step, you can't program them interactively in a Jupyter notebook and so forth.
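
For context, the static-graph workflow he's describing looks roughly like this in the TensorFlow 1.x API (still available today via tf.compat.v1): the whole computation is declared symbolically first, and data only flows through it in a later session call.

```python
# TensorFlow 1.x-style static graph: declare everything up front,
# then execute it later inside a session.
import tensorflow.compat.v1 as tf
tf.disable_eager_execution()

x = tf.placeholder(tf.float32, shape=(None, 3))   # symbolic input, holds no data yet
w = tf.Variable(tf.ones((3, 1)))
y = tf.matmul(x, w)                               # adds a graph node, computes nothing

with tf.Session() as sess:                        # only now do actual values flow
    sess.run(tf.global_variables_initializer())
    print(sess.run(y, feed_dict={x: [[1.0, 2.0, 3.0]]}))
```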

PyTorch was not the first, but PyTorch was certainly the strongest entrant to come along and say, "Let's not do it that way. Let's just use normal Python. And everything you know about in Python is just going to work, and we'll figure out how to make that run on the GPU as and when necessary." That turned out to be a huge leap in terms of what we could do with our research and what we could do with our teaching.
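
For contrast, here is a minimal sketch of that define-by-run style in PyTorch (a toy example; the shapes and control flow are made up for illustration):

```python
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"  # GPU as and when available
x = torch.randn(4, 3, device=device, requires_grad=True)

y = x
for _ in range(3):                 # ordinary Python control flow, decided at run time
    y = y * 2 if y.norm() < 10 else torch.tanh(y)

y.sum().backward()                 # autograd traced whichever path actually executed
print(x.grad)                      # inspect intermediates step by step, e.g. in Jupyter
```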

Because it wasn't limiting. Yeah, I mean, it was critical for us for something like DawnBench to be able to rapidly try things. It's just so much harder to be a researcher and practitioner when you have to do everything up front and you can't inspect it. The problem with PyTorch is that it's not at all accessible to newcomers, because you have to write your own training loop and manage the gradients and all this stuff.
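
To make that concrete, here's roughly the boilerplate a plain-PyTorch newcomer has to write, a hand-rolled training loop on made-up toy data:

```python
import torch
from torch import nn

model = nn.Linear(10, 1)
opt = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.MSELoss()
xb, yb = torch.randn(64, 10), torch.randn(64, 1)  # toy batch, for illustration only

for epoch in range(5):         # the loop every plain-PyTorch user manages by hand
    pred = model(xb)
    loss = loss_fn(pred, yb)
    opt.zero_grad()            # forget this and gradients silently accumulate
    loss.backward()
    opt.step()
```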

And it's also not great for researchers, because you're spending your time dealing with all this boilerplate and overhead rather than thinking about your algorithm. So we ended up writing this very multi-layered API where at the top level you can train a state-of-the-art neural network in three lines of code, and which kind of talks to an API, which talks to an API, which talks to an API, and you can dive in at any level and get progressively closer to the machine levels of control.
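
The top layer of that API looks roughly like this in the fastai library (a sketch; exact names have shifted between versions, e.g. cnn_learner vs. vision_learner):

```python
from fastai.vision.all import *

# Top-level fastai API: fine-tune an ImageNet-pretrained model
# on the Oxford-IIIT Pets dataset in a few lines.
path = untar_data(URLs.PETS)
dls = ImageDataLoaders.from_name_re(
    path, get_image_files(path/"images"),
    pat=r"(.+)_\d+.jpg", item_tfms=Resize(224))
learn = vision_learner(dls, resnet34, metrics=error_rate)
learn.fine_tune(1)
```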

And this is the fastai library. That's been critical for us and for our students and for lots of people who have won deep learning competitions with it and written academic papers with it. It's made a big difference. We're still limited by Python, though, and particularly this problem with things like recurrent neural nets, say, where you just can't change things unless you accept them running so slowly that it's impractical.

So in the latest incarnation of the course, and with some of the research we're now starting to do, we're starting to do some stuff in Swift. I think we're three years away from that being super practical, but I'm in no hurry. I'm very happy to invest the time to get there.

But, you know, with that, we actually already have a nascent version of the fastai library for vision running on Swift for TensorFlow. Because Python for TensorFlow is not going to cut it. It's just a disaster. What they did was they tried to replicate the bits that people were saying they liked about PyTorch, this kind of interactive computation, but they didn't actually change their foundational runtime components.

So they kind of added this syntactic sugar they call TF eager, TensorFlow eager, which makes it look a lot like PyTorch, but it's 10 times slower than PyTorch to actually do a step. That's because they didn't invest the time in retooling the foundations, because their code base is so horribly complex.
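
For reference, a single eager-mode step in TensorFlow 2 does look a lot like PyTorch on the surface, with gradients recorded by a tape (the 10x slowdown he mentions is his measurement, not something this sketch shows):

```python
import tensorflow as tf

w = tf.Variable(tf.ones((3, 1)))
x = tf.random.normal((4, 3))

with tf.GradientTape() as tape:   # eager mode: ops execute immediately, PyTorch-style
    loss = tf.reduce_sum(x @ w)
grad = tape.gradient(loss, w)     # gradient computed on the fly, no session needed
w.assign_sub(0.1 * grad)          # a hand-rolled SGD step
```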

Yeah, I think it's probably very difficult to do that kind of retooling. Yeah, well, particularly the way TensorFlow was written. It was written by a lot of people very quickly in a very disorganized way. So like when you actually look in the code, as I do often, I'm always just like, oh, God, what were they thinking?

It's just, it's pretty awful. So I'm really extremely negative about the potential future for Python for TensorFlow. But Swift for TensorFlow can be a different beast altogether. It can basically be a layer on top of MLIR that takes advantage of, you know, all the great compiler stuff that Swift builds on with LLVM.

And yeah, I think it will be absolutely fantastic. Well, you're inspiring me to try. I haven't truly felt the pain of TensorFlow 2.0 Python; it's fine by me. But yeah, I mean, it does the job if you're using predefined things that somebody's already written. But if you actually compare, you know, like I've had to do, because I've been having to do a lot of stuff with TensorFlow recently, you actually compare like, okay, I want to write something from scratch.

And I just keep finding it's like, oh, it's running 10 times slower than PyTorch. So is the biggest cost, let's throw running time out the window, how long it takes you to program? That's not too different now, thanks to TensorFlow Eager. But because so many things take so long to run, you wouldn't want to run it at 10 times slower.

Like, you just go, oh, this is taking too long. And also, there are a lot of things which are just less programmable, like tf.data, the way data processing works in TensorFlow, which is just this big mess. It's incredibly inefficient. And they kind of had to write it that way because of the TPU problems I described earlier.
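
For reference, a tf.data pipeline is declared as a chain of transformations on a dataset object rather than as ordinary Python loops; a minimal sketch (AUTOTUNE lives under tf.data.experimental in older TF 2.x releases):

```python
import tensorflow as tf

# Declarative tf.data pipeline: map/shuffle/batch/prefetch are dataset-level
# transformations, not plain Python you can step through line by line.
ds = (tf.data.Dataset.from_tensor_slices(tf.range(1000))
        .map(lambda x: x * 2, num_parallel_calls=tf.data.AUTOTUNE)
        .shuffle(buffer_size=100)
        .batch(32)
        .prefetch(tf.data.AUTOTUNE))

for batch in ds.take(1):
    print(batch.shape)   # (32,)
```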

So I just, you know, I just feel like they've got this huge technical debt, which they're not going to solve without starting from scratch. So here's an interesting question. If there's a new student starting today, what would you recommend they use? Well, I mean, we obviously recommend fast.ai and PyTorch, because we teach new students.

And that's what we teach with. So we would very strongly recommend that, because it will let you get on top of the concepts much more quickly. So then you'll become an actual practitioner, and you'll also learn the actual state-of-the-art techniques, you know, so you actually get world-class results.

Honestly, it doesn't much matter what library you learn, because switching from Chainer to MXNet to TensorFlow to PyTorch is going to be a couple of days' work, as long as you understand the foundations well. But do you think Swift will creep in there as a thing that people start using?

Not for a few years, particularly because like Swift has no data science community, libraries, tooling. And the Swift community has a total lack of appreciation and understanding of numeric computing. So like they keep on making stupid decisions, you know, for years they've just done dumb things around performance and prioritization.

That's clearly changing now because the creator of Swift, Chris Lattner, is working at Google on Swift for TensorFlow. So like that's a priority. It'll be interesting to see what happens with Apple, because Apple hasn't shown any sign of caring about numeric programming in Swift. So I mean, hopefully they'll get off their ass and start appreciating this, because currently all of their low-level libraries are not written in Swift.

They're not particularly Swifty at all. Stuff like Core ML, they're really pretty rubbish. So yeah, there's a long way to go. But at least one nice thing is that Swift for TensorFlow can actually directly use Python code and Python libraries, and literally the entire lesson one notebook of fast.ai runs in Swift right now, in Python mode.

So that's a nice intermediate thing.