TensorFlow Tutorial (Sherry Moore, Google Brain)

So I'm going to take a picture so I remember how many of you are here. Smile. Like Sami says, my name is Sherry Moore. I work in the Google Brain team. So today I'll be giving a tutorial on TensorFlow. First I'll talk a little bit about what TensorFlow is and how it works, how we use it at Google.

And then the important part is that I'm going to work with you together to build a couple models to solve the most classic machine learning problems, so-called get your feet wet for those of you from New Zealand. Anybody from New Zealand? So hopefully at the end, you'll be going home with all the tools that you have to build all the wonderful things that you have watched today, like all the image recognition, the training of different colors, arts, making music.

So that's the goal. So before I go any further, has everybody installed TensorFlow? Yay, brilliant. Thank you. And I would like to acknowledge-- so I know the link here says Sherry-Ann, but if you have Wolf G, TF tutorial is perfectly fine. Wolf is actually my colleague who spent all the time verifying installation on every single platform.

So I would really like to thank him. Thanks, Wolf, if you're watching. And also, I have my wonderful product boss or product manager in the audience somewhere. So if you guys have any request for TensorFlow, make sure that you go find him and tell him why TensorFlow must support this feature.

Or is Zach somewhere? All right, so there he is. So with that, we can move forward to talk about TensorFlow. So what exactly is TensorFlow? TensorFlow is a machine learning library that we developed at Google. And we open sourced it last November. And ever since then, we have become the most, most popular machine learning library on GitHub.

How do we know? Because we have over 32,000 stars. Those of you who track GitHub, you know how hard it is to get one of those acknowledgments. And we also have over 14,000 forks. And we have over 8,000 contributions from 400 individual developers. And we designed this specifically for machine learning.

However, as you'll see later, its really flexible dataflow infrastructure makes it suitable for pretty much any application that can fit into that model. Basically, if your model can be asynchronous and fire when data is ready, it can probably use TensorFlow. Originally, we worked alongside other researchers.

As a matter of fact, I was really fortunate. When I joined the team, I sat right next to Alex, the person who invented AlexNet. So that's how closely we worked together. As we developed TensorFlow, they would tell us, no, this is not how we use it. Yes, when you do this, it makes our lives a lot easier.

And this is why we believe that we have developed an infrastructure that will work really well for researchers. And also, being Google, we always have in mind that we would like to go from research to prototyping to production in no time. We don't want you to write code that typically just gets thrown away.

We want you to write code that can literally cut and paste and save in a file and productize it immediately. So TensorFlow is really designed with that in mind. So we are halfway into your deep learning school. So can anybody tell me, if you want to build a neural net, what must you have?

What are the primitives? What are the-- yeah, primitive, I think, is the word I'm looking for. What must you have to build a neural net? Anybody? What is in the neural net? That's a very good answer. So in a neural net, you have neurons. That's right. So all these neurons, what do they operate on?

What do all these neurons do? They process data, and they operate on data. And they do something, such as convolution, matrix multiplication, max pooling, average pooling, dropout, whatever that is. So in TensorFlow, all the data is held in something called a tensor. A tensor is nothing more than a multidimensional array.
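As a rough illustration of "a tensor is a multidimensional array", here is a sketch in plain Python (not the TensorFlow API). The `shape` helper and the example values are assumptions made up for this note; they only show how a scalar, a vector, a matrix, and a batch of 28x28 images differ in the number of dimensions.

```python
def shape(tensor):
    """Return the dimensions of a regularly nested list, outermost first."""
    dims = []
    while isinstance(tensor, list):
        dims.append(len(tensor))
        if not tensor:
            break
        tensor = tensor[0]
    return tuple(dims)

scalar = 3.0                                   # 0 dimensions
vector = [1.0, 2.0, 3.0]                       # 1 dimension
matrix = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]    # 2 dimensions
# A hypothetical batch of four 28x28 grayscale images: 3 dimensions.
batch_of_images = [[[0.0] * 28 for _ in range(28)] for _ in range(4)]

print(shape(scalar), shape(vector), shape(matrix), shape(batch_of_images))
```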

For those of you who are familiar with NumPy arrays, it's very similar to the ndarray. And the graph, I think one of the gentlemen earlier this morning described, there is this concept of the graph, which is a composition of all these neurons that do different functions. And all these neurons are connected to each other through their inputs and outputs.

So as data become available, they would fire-- by fire, I mean they do what they're designed to do, such as doing matrix multiplication or convolution. And then they will produce output for the next computation node that's connected to the output. So by doing this-- so I don't know how many of you can actually see this animation.

Yeah? So this is to really visualize how TensorFlow works. All these nodes, the oval ones are computation. The rectangle ones are stateful nodes. So all these nodes, they would generate output, or they take input. And as soon as all the inputs for a particular node are available, it would do its thing, produce output.
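The firing rule described here (a node runs as soon as all of its inputs are available) can be sketched with a toy executor. This is plain Python, not how the TensorFlow runtime is implemented; `run_dataflow` and the small example graph are hypothetical names chosen for illustration.

```python
def run_dataflow(nodes, feeds):
    """Run a toy dataflow graph.

    nodes: {name: (function, [input names])}
    feeds: {name: value} for data that is available from the start.
    Returns all computed values and the order in which nodes fired.
    """
    values = dict(feeds)
    pending = dict(nodes)
    order = []
    while pending:
        # A node is ready once every one of its inputs has a value.
        ready = [name for name, (_, inputs) in pending.items()
                 if all(i in values for i in inputs)]
        if not ready:
            raise ValueError("some nodes can never fire: missing inputs")
        for name in ready:
            fn, inputs = pending.pop(name)
            values[name] = fn(*[values[i] for i in inputs])
            order.append(name)
    return values, order

# Hypothetical graph computing relu(w * x + b).
graph = {
    "mul": (lambda w, x: w * x, ["w", "x"]),
    "add": (lambda m, b: m + b, ["mul", "b"]),
    "relu": (lambda a: max(0.0, a), ["add"]),
}
values, order = run_dataflow(graph, {"w": 2.0, "x": 3.0, "b": -1.0})
print(order, values["relu"])
```

Each node waits only on its own inputs, which is what lets independent parts of a larger graph run asynchronously.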

And then the tensors, all the data, which are held in tensors, will flow through your network. Therefore, TensorFlow. Yeah? So everybody's like, wow, this sounds like magic. How does it work? So who said-- is it Sir Arthur Clarke that says, any sufficiently-- what's the word? Any sufficiently advanced technology is indistinguishable from magic.

So that's what this is. It's just really awesome. Excuse me for a second. I know I want to get through this as quickly as possible so we can actually do the lab that you're all dying to do. So as any good infrastructure-- so this is-- I want to give you a little image of how we designed TensorFlow.

Just like any well-designed infrastructure, it has to be really modular. Because being modular allows you to innovate, to upgrade, to improve, to modify, to do whatever you want with any piece, as long as you keep the APIs consistent. Everybody can work in parallel. It's really empowering. I think that's one of the wonderful things that's done at Google.

Pretty much any infrastructure at Google is really modular. They talk really well to each other. All you need to maintain is the API stability. So in this case, we have a front end. I think you guys must have seen some examples of how you construct a graph. So we have the front end libraries written in your favorite language.

And if C++ and Python are not your favorite languages, feel free to contribute. We always welcome contributions. So you construct your graph in your favorite language. And this graph will be sent to-- we call it the core TensorFlow execution system. That's your runtime. And that's what you all will be running today on your laptop when you open your Python notebook or Jupyter notebook.

So the execution runtime, depending on where you are going to run this application, it will send the kernel to the corresponding device. So it could be a CPU, could be a GPU, could be a phone, could be TPU. Anybody knows what TPU is? Brilliant, very nice. I was at Strata.

I said, anybody knows what TPU is? And everybody's like, hmm, translation? So this is good. So just to highlight our portability, today you'll be running TensorFlow on your laptop. We run it in our data center. Everybody can run it on your iPhone, your Android phone. I would love to see people putting it on Raspberry Pi, because can you imagine, you can just write your own TensorFlow application.

It could be your security system, because somebody just stole my bike and my security camera captured all this grainy stuff that I cannot make out. Wouldn't it be nice if you did machine learning on this thing and it just started taking high-resolution pictures when things are moving, rather than constantly capturing all those grainy images, which are totally useless?

So I think the application-- literally, applications are limitless. Your imagination is the limit. So we talked about what TensorFlow is, how it works. How do we use it at Google? We use it everywhere. I think you have seen some of the examples. We use it to recognize pictures. This is actually done with Inception.

It can recognize 1,000 categories of images out of the box. You have to retrain it if you want it to recognize, say, all your relatives or your pets. But it's not difficult. And I have links for you to-- actually, if you want to train on your own images, it's really easy.

You should totally try it. Wouldn't it be fun if you go to your 40-year reunion and you just go, I know who you are. Just show off a little. It would be brilliant. And we also use it to do Google Voice Search. This is one that's super awesome. So how many of you use Smart Reply?

Have you ever used Smart Reply? Yeah, yeah, this is awesome, especially for those of you who are doing what you're not supposed to do-- texting while driving, you see an email coming in, and you can just say, oh, yes, I'll be there. So based on the statistics that we collected in February, over 10% of all the responses sent on mobile are actually done by our Smart Reply.

I believe if we have-- maybe Zach can collect some stats for me later. And maybe by now, it'll be like 80%. It's actually really funny. At the very beginning, when we train it, the first answer is always, I love you. We're like, that's probably not the right answer. We also play games.

All of you, I'm sure, have followed this. There are all kinds of games that are being developed. It's really fun to watch if you watch it. It can literally come up with scenarios for you to play as well. It not only learns to play the game, but learns how to make a game for you.

It's fascinating. And of course, art. I think many of you have done this deep dream. If we have time in the end of the lab, we can try this. So if we are super fast, we can all try to make some art. And all those, what I just talked about, of course, Google being this wonderful, generous company, wants to share our knowledge.

So we have actually published all our models. So if you go to that link, you'll find all of these: Inception, captioning, a language model on a billion words, the latest ResNet on CIFAR-10, sequence to sequence, which I think Quoc will be talking about tomorrow. And we have many other high-level libraries.

So today, the lab that we will do will be on the core TensorFlow APIs. But there are tons of new higher-level APIs, such as Keras, which some of you mentioned. And we have TF-Slim. We have Pretty Tensor. We have TF Learn. We have many libraries that are developed on top of the core TensorFlow APIs.

And we encourage people to do so. If whatever is out there does not fit your needs perfectly, go for it. Develop your own. And we welcome the contribution. And we published a lot of that here. I might have blurred some of the boundaries. But these are basically all the models and libraries that we have produced.

And we really love contribution. If you have developed a really cool model, please do send to us. And we will showcase your work. So that's the introduction of TensorFlow. How does everybody feel? Are you all ready to get started? All right, so OK, before you bring up your Python notebook, I want to say what we are going to do first.

So as I mentioned, there are two classic machine learning problems that everybody does. One is linear regression. The other is classification. So we are going to do two simple labs to cover those. I do have a lot of small exercises you can play with. I encourage you to play with it to be a lot more comfortable.

So the first one is linear regression. So I'm sure it has been covered, yeah, in today's lectures. Somebody must have covered linear regression. Can anybody give me a one-line summary? What is a linear regression problem? Anybody? The professors-- Well, if you don't know, go Google it. So I didn't know the audience when Sammy asked me to do this.

So I wrote this for one of the high schools. So I think it still kind of makes sense, right? Because all of us have played this game at one point in our lives. Like, if you tell me 5, I tell you 10, and you try to guess what the equation is, we must have all done this.

I think my friends are still doing on Facebook saying, oh, only genius can solve this kind of equation. And then they would be like, yeah, I solved it. I was like, my god, if anybody-- I will unfriend you guys if you click on another one of those. But basically, this is what we are trying to do in the first lab.

So we will have a mystery equation. It's really simple. It's just a linear-- literally a line. And then I will tell you that this is the formula. But I'm not going to give you w and b. All of you have learned by now, w stands for weight and b stands for bias.

So the idea is that if you are given enough samples, if you are given enough x and y values, you should be able to make a pretty good guess what w and b is. So that's what we are going to do. So now you can bring up your Jupyter Notebook if you don't have it up already.

Yeah, everybody have it up? Yes? Can I see a show of hands, everybody? Those of-- yeah, brilliant. All right. So for pretty much any models, these are going to come up over and over again. And just to make sure that you're all paying attention, I do have-- I asked Sammy if I was supposed to bring Shrek, and he said no.

But I do have a lot of TensorFlow stickers, and I have all kinds of little toys. So later, I'm going to ask this question. Whoever can answer will get some mystery present. So really pay attention, OK? So pretty much whenever you build any model, there are, I would say, four things that you will need.

You need input. You need data. So you're going to see in both labs, we're going to be defining some data. You're going to be building an inference graph. I think in other lectures, it's also called a forward graph, to the point that it produces logits, the logistic outputs. And then you're going to have training operations, which is where you would define a loss, an optimizer.

And I think that's pretty much it. Hang on. And there's a fourth thing. Yeah, and then you will basically run the graph. So the three important things, OK, you'll always have your data, your inference graph. You always have to define your loss and your optimizer. And the training is basically to minimize your loss.
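The four pieces listed here (data, inference, loss plus optimizer, and running the training) can be sketched for the mystery-line problem in plain Python. This is not the lab's TensorFlow code: the true values `TRUE_W` and `TRUE_B`, the mean squared error loss, the learning rate, and the starting guess are all assumptions picked for illustration.

```python
import random

# 1. Data: samples from a line whose w and b the model does not get to see.
rng = random.Random(0)
TRUE_W, TRUE_B = 0.1, 0.3            # assumed values for this sketch only
xs = [rng.random() for _ in range(100)]
ys = [TRUE_W * x + TRUE_B for x in xs]

# 2. Inference: the model's guess for y given x.
def predict(w, b, x):
    return w * x + b

# 3. Training ops: a loss (mean squared error) and an optimizer step
#    (plain gradient descent on w and b).
def loss(w, b):
    return sum((predict(w, b, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def train_step(w, b, learning_rate=0.5):
    n = len(xs)
    grad_w = sum(2 * (predict(w, b, x) - y) * x for x, y in zip(xs, ys)) / n
    grad_b = sum(2 * (predict(w, b, x) - y) for x, y in zip(xs, ys)) / n
    return w - learning_rate * grad_w, b - learning_rate * grad_b

# 4. Run: start from a bad guess and repeat the training step.
w, b = -1.0, 0.0
initial_loss = loss(w, b)
for step in range(500):
    w, b = train_step(w, b)

print(round(w, 3), round(b, 3), loss(w, b) < initial_loss)
```

Given enough x and y samples, the loop recovers w and b closely without ever being told them, which is the point of the first lab.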

So I'm going to be asking that later. All right. So now we know what we're going to do. So you can go to that lab. Yeah, everybody have it? So Shift, Return. We'll run the first one. You say, I have no idea what's happening. Hit Return again. Still nothing.

However, let's see what we are producing here. So you can also do the same on your laptop. You can uncomment that plot. You're going to see, so you know what kind of data you're generating. So in this case, when you hit Return, what are we seeing? This is your input data.

This is when you try to make a guess, when your friend tells you, oh, give me x and y. So this is when your x is 0.2, your y is 0.32. So this is basically your input data. Yeah, everybody following? If at any point you're kind of lost, raise your hand, and your buddy next to you will be able to help you.

So now-- oh, OK, I want to say one more thing. So today, the labs are all on really core TensorFlow APIs. The reason I want to do that-- I know there are a lot of people who use Keras or use another thing that we heavily advertise, which is tf.contrib, tf.contrib.learn.

So I feel like I'm giving you all the ingredients. So even though you could go to Whole Foods and buy the package meal, maybe one day you don't like the way they cook it. So I'm giving you all your lobsters, your Kobe beef, so that you can actually assemble whatever you want to build yourself.

So this next one is very key. It's a very key concept. Here you'll see variables. So variable in TensorFlow is how-- it's corresponding to the square. Any of you remember this slide? OK, I'm going to switch quickly. Don't freak out. So actually, I wanted you all to commit this little graph to your memory, because you'll be seeing this over and over again.

And it makes a lot more sense when you have this visual representation. So in TensorFlow, the way we hold all the data, the weights and the biases associated with your network is using something called variable. It's a stateful operation. I'm going to switch back, OK? So this is what we are doing in section 1.3.

We are building those square nodes in your network to hold these weights and biases. And they are the ones that change when you train. That's where the gradients will be applied, so that they will eventually resemble the target network that you are trying to train for. So now you have built it.

Wonderful. So you can Shift-Return. Do you see anything? Nope. So exactly what have we built? Let's uncomment. Take a look. So these are called the variable objects. So at the bottom of the slide for this lab, I have a link, which is our g3doc docs, the API docs, which are available on GitHub.

I think you should always have that up, so whenever you want to do something, you would know what kind of operations are possible with this object. For example, I can say here, what's the name of this? Oh, it's called Variable_6. Why is it called Variable_6? Oh, it's because when I create this variable, I didn't give it a name.

So I can say Sherry's-- Sherry weight. I hope that's not-- but so see, now my variable is called Sherry weight. Same thing with my-- so this would be a good practice, because later-- Sherry bias-- oh, because I ran this so many times. Every single time you run, if you don't restart, that is going to continue to grow your current graph.
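Why an unnamed variable ends up with a numbered default name after a cell has been re-run several times can be sketched with a toy name registry. This is plain Python and only an assumption about the general idea (one counter per base name inside a graph that keeps growing); `ToyGraph` and `unique_name` here are hypothetical, not the TensorFlow implementation.

```python
class ToyGraph:
    """Keeps one counter per base name so every node name stays unique."""

    def __init__(self):
        self._name_counts = {}

    def unique_name(self, base="Variable"):
        count = self._name_counts.get(base, 0)
        self._name_counts[base] = count + 1
        # First use keeps the bare name; later uses get a numeric suffix.
        return base if count == 0 else "%s_%d" % (base, count)

graph = ToyGraph()

# Re-running a cell that creates an unnamed variable adds a new node each
# time, because the graph is never cleared between runs.
default_names = [graph.unique_name() for _ in range(7)]
print(default_names[0], default_names[-1])

# Giving the variable an explicit name makes it easy to find later.
print(graph.unique_name("sherry_weight"))
```

Restarting the notebook kernel corresponds to starting over with a fresh `ToyGraph()`, so the counters begin again.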

So to avoid that confusion, let me restart it. Restart. I had to wait. Sorry. So now, so we have done-- built our input, built our inference graph. Now we can actually build our training graph. And as you have all learned, we need to define a loss function. We need to define an optimizer.

I think it's also called something else-- regularizer, maybe some other terms. And your ultimate goal is to minimize your loss. So I'm not going to do it here, but you can do it at your leisure. You can uncomment all these things that you have created and see what they are.

And I can tell you these are different operations. So that's how you actually get to learn about the network that you have built really well. In the next line, I'm also not going to uncomment, but you should at one point. This is how you can see what you have built.

So actually, why don't we do that? Because this is really critical. And as you debug, this would become-- so this is the network that you have built. They have names, different names. They have inputs and outputs. They have attributes. And this is how we connect all these nodes together.

This is your neural net. So what you're seeing right now is your neural net that you have just built. Yeah? Everybody following? So now, the next step-- now you're done. You build your network. You build all your training. Now, let's do some training. So in TensorFlow, do you remember in the architecture that I showed, you have the front end, C++ and Python front end.

You use that to build your graphs. And then you send a graph to your runtime. And this is exactly what we're doing here. This is how we talk to the runtime. We create something called a session. You get a handle to the session. And then when you say run, you're basically sending this session, your graph.

So this is different from the other machine learning libraries. I forgot which one. Those are so-called imperative. It happens as you type. TensorFlow is different. You have to construct your graph. And then you create a session to talk to your runtime so that it knows how to run on your different devices.
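The difference between constructing a graph and running it can be sketched with a toy deferred-execution model. This is plain Python, not the TensorFlow API; `Node`, `constant`, `add`, `multiply`, and `ToySession` are hypothetical stand-ins that only show why nothing is printed while the graph is being built.

```python
class Node:
    """One operation in a toy graph; creating it computes nothing."""

    def __init__(self, fn, *inputs):
        self.fn = fn
        self.inputs = inputs

def constant(value):
    return Node(lambda: value)

def add(a, b):
    return Node(lambda x, y: x + y, a, b)

def multiply(a, b):
    return Node(lambda x, y: x * y, a, b)

class ToySession:
    """Stand-in for a runtime: evaluates a node only when asked to."""

    def run(self, node):
        args = [self.run(i) for i in node.inputs]
        return node.fn(*args)

# Construction phase: this only wires nodes together.
y = add(multiply(constant(3.0), constant(4.0)), constant(1.0))
print(type(y).__name__)   # y is a graph node, not a number

# Execution phase: a value exists only after run() is called.
result = ToySession().run(y)
print(result)
```

In an imperative library the multiplication would happen on the line where it is written; here it happens only inside `run`.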

That's a very important concept because people constantly compare. And it's just different. So now you can also uncomment to see what the initial values are. But we're not going to do that. We're just going to run it. And now we're going to train. The data is not so-- what do you think of the data?

Did we succeed in guessing? Is everybody following what we are trying to do? Yeah? Yes? No? So what was our objective before I started the lab? What did I say our objective was? Find the mic. Yes, to guess the mystery function. So have we succeeded? It's really hard to tell.

All right, so now all of you can go to the end and uncomment this part. Let's see how successful we are. So the green line was what we have initialized our weight and bias to. Yeah? The blue dots were the initial values, the target values. And the red dots are our trained values.

Make sense? So how successful are we? Great big success? Yeah, I would say so. So any questions? Any questions so far? So what are the things? So everybody should play with this. You're not going to break it. This is a notebook, Python notebook. The worst that happens is they would just say, OK, clear all.

Like, well, I just did, and change it. So what can you play with? Since today you learned all these concepts about different loss functions, different optimizers, all this crazy different inputs, different data. So now you can play with it. How about instead of-- let's pick one. So instead of gradient descent, what are the other optimizers?

How can you find out? I guess that's a better question. If I want to know what other optimizers are available in TensorFlow, how can I find out? Very good. Yes, the GitHub g3doc link with the APIs. I'm going to switch one more tab. Bear with me.

So this is-- when you go there, this is what you can find. You can find all the-- let me make it bigger. So you can find all the different optimizers. So you can play with that. So maybe gradient descent is not the best optimizer you can use. So you go there and say, what are the other optimizers?

And then you can literally come here and search optimizer. Well, you can say, wow, I have Adadelta, Adagrad, Adam. I'm sure there are more-- Momentum. So we also welcome contributions. If you don't like any of these, please do go contribute a new optimizer. Send a pull request.
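What "trying a different optimizer" means can be sketched by swapping the update rule while keeping the loss fixed. This is plain Python on a made-up one-parameter loss, not the lab's code; the function names, the learning rate, and the momentum value are assumptions for illustration.

```python
def gradient_descent_step(param, grad, state, learning_rate=0.1):
    """Plain gradient descent: step against the gradient; no state needed."""
    return param - learning_rate * grad, state

def momentum_step(param, grad, state, learning_rate=0.1, momentum=0.9):
    """Momentum: accumulate a running velocity and step along it."""
    velocity = momentum * state + grad
    return param - learning_rate * velocity, velocity

def minimize(step_fn, start=5.0, steps=200):
    """Minimize the toy loss f(p) = (p - 2)^2 with the given update rule."""
    param, state = start, 0.0
    for _ in range(steps):
        grad = 2.0 * (param - 2.0)
        param, state = step_fn(param, grad, state)
    return param

print(minimize(gradient_descent_step))
print(minimize(momentum_step))
```

Only the `step_fn` argument changes between the two runs, which mirrors how the notebook cell can swap one optimizer for another without touching the data or the inference code.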

We would love to have it. So I would like to say this over and over again. We love contribution. It's an open source project. So keep that in mind. We would love to see your code or your models on GitHub. So back to this one. How is everybody feeling?

This is too simple? Yeah? [Audience comment, partly inaudible.] Oh, is that right? Hit Tab to see all the other optimizers, you mean? Oh, brilliant. See, I didn't even know that.

Learn something new every day. Let me go there. Tab. Here? Oh, yay. So this is even easier. Thank you. Clearly, I don't program in Notebook as often as I should have. So this is where you can-- all the wonderful things that you can do. Thank you. This is probably a little too low level.

I think it has everything. But that's a very good tip. Thank you. So anything else you would like to see with linear regression? It's too simple. You guys all want to recognize some digits. All right. So that sounds like a consensus to me. So let's move. If you just go to the bottom, you can say-- click on this one.

So this is our MNIST model. So before we start the lab, so once again, what are we trying to do? So we have all these handwritten digits. What does MNIST stand for? Does anybody know? What does MNIST stand for? Very good. See, somebody can Google. Very good. So it stands for, I think, Modified National Institute of Standards and Technology, something like that.

So they have this giant collection of digits. So if you go to the post office, you already know that it's trivial. It's a solved problem. But I don't know if they actually use machine learning. But our goal today is to build a little network using TensorFlow that can recognize these digits.

Once again, we will not have all the answers. So all we know is that the network, the input will give us a 1. And then we'll say it's a 9. And then we have the so-called ground truth. And then they will look at it and say, no, you're wrong.

And then we'll have to say, OK, fine. This is the difference. We are going to train the network that way. So that's our goal. Yeah? Everybody see the network on the slide? So now we can go to the lab. So can anybody tell me what are the three or four things that's really important whenever you build a network?

What's the first one? Your data. Second one? Inference graph. Third one? Your train graph. And with this lab, I'm going to teach you a little bit more. They are like the wok. Like when you go to a restaurant, I not only give you your lobster or your Kobe beef, I'm also going to give you a little wok so you can cook it.

So in this lab, I'll also teach some absolutely critical additional infrastructure pieces, such as how to save a checkpoint, how to load from a checkpoint, and how you evaluate your network. I think somebody at one point asked, how do you know the network is good enough? You evaluate it to see if it's good enough.

So those are the three new pieces of information that I'll be teaching you. And also, I'll teach you a really, really useful concept. It's called placeholder. That was requested by all the researchers. We didn't used to have it, but they all came to us and say, when I train, I want to be able to feed my network any data we want.

So that's a really key concept that's really useful for any practical training. Whenever you start writing real training code, I think that will come in handy. So those are the, I think, four concepts now that I will introduce in this lab that's slightly different from the previous one-- how to save checkpoint, how to load from checkpoint, how to run evaluation, and how to use placeholders.

I think the placeholder is actually going to be the first one. So once again, we have our typical boilerplate stuff. So you hit Return, you import a bunch of libraries. The second one, this is just for convenience. I define a set of constants. Some of them you can play with, such as the maximum number of steps, where you're going to save all your data, how big the batch sizes are, but some other things that you cannot change because of the data that I'm providing you.

For example, the MNIST pictures. Any questions so far? So now we'll read some data. Is everybody there in 2.3? I'm at 2.3 right now. So now I use-- if you don't have /tmp, it might be an issue, but hopefully you do. If you don't have /tmp, change the directory name.

So the next one is where we build inference. So can anybody just glance and then tell me what we're building? What kind of network? How many layers am I building? I have two hidden layers. You have all learned hidden layers today. And I also have a linear layer, which will produce logits.
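The shape of that inference graph (two hidden layers followed by a linear layer that produces logits) can be sketched as a forward pass in plain Python. This is not the lab's TensorFlow code: the hidden layer sizes, the ReLU activation, and the random initialization are assumptions, while the 28x28 input and the 10 output classes follow from MNIST itself.

```python
import random

def dense(inputs, weights, biases):
    """Fully connected layer. weights is [n_in][n_out]; biases is [n_out]."""
    return [sum(x * weights[i][j] for i, x in enumerate(inputs)) + biases[j]
            for j in range(len(biases))]

def relu(values):
    return [max(0.0, v) for v in values]

def make_layer(n_in, n_out, rng):
    """Small random weights and zero biases (an assumed initialization)."""
    weights = [[rng.gauss(0.0, 0.1) for _ in range(n_out)] for _ in range(n_in)]
    return weights, [0.0] * n_out

def inference(image, layers):
    hidden1 = relu(dense(image, *layers[0]))
    hidden2 = relu(dense(hidden1, *layers[1]))
    return dense(hidden2, *layers[2])   # linear layer: logits, no activation

rng = random.Random(0)
IMAGE_PIXELS, NUM_CLASSES = 28 * 28, 10
HIDDEN1, HIDDEN2 = 16, 8                # assumed sizes, kept small for speed
layers = [make_layer(IMAGE_PIXELS, HIDDEN1, rng),
          make_layer(HIDDEN1, HIDDEN2, rng),
          make_layer(HIDDEN2, NUM_CLASSES, rng)]

fake_image = [rng.random() for _ in range(IMAGE_PIXELS)]
logits = inference(fake_image, layers)
print(len(logits))
```

Each layer has its own weights and biases, one logit comes out per digit class, and the untrained logits are meaningless until training adjusts those weights.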

That's correct. So that's what all the inference graphs will always do. They always construct your graph, and they produce logistic outputs. So once again, here you can uncomment it and see what kind of graph you have built. Once you have done the whole tutorial by yourself, you can actually run TensorBoard, and you can actually load this graph that you have saved.

And you can visualize it, like what I have shown in the slide. I didn't draw that slide by hand. It's actually produced by TensorBoard. So you can see the connection of all your nodes. So I feel that that visual representation is really important. Also, it's very easy for you to validate that you have indeed built the graph that you thought you built.

Sometimes people call something repeatedly, and they have generated this gigantic graph. They're like, oh, that wasn't what I meant. So being able to visualize is really important. Any questions so far? See here, I have good habits. I actually gave all my variables names. Once again, the hidden layer 1, hidden layer 2.

They all have weights and biases, weights and biases, et cetera. So now we're going to build our train graph. So here is-- actually, here, there's no new concept. Once again, you define the loss function. We once again pick gradient descent as our optimizer. We added a global step variable.

That's what we will use later when we save our checkpoints. So you actually know at which point, what checkpoint this corresponds to. Otherwise, if you always save it to the same name, then later you say, wow, this result is so wonderful. But how long did it take? You have no idea.

So that's a training concept that we introduced. It's called global step, basically how long you have trained. And we usually save that with the checkpoint so you know which checkpoint has the best information. Yeah, everybody is good at 2.5? So now the next one is the additional stuff that I just mentioned.

That little wok that I'm giving you now to cook your stuff. So one is a placeholder. So we are going to define two, one to hold your images and the other to hold your labels. We build it this way so that we only need to build a graph once.

And we will be able to use it for training, inference, and evaluation later. It's very handy. You don't have to do it this way. And one of the exercises I put in my slides is to try to do it differently. But this is a very handy way, and it gets you very far with minimum work.
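The placeholder idea (build the graph once, then feed it different data at run time) can be sketched with a toy model. This is plain Python, not the TensorFlow API; `Placeholder`, `Op`, `run`, and the feed dictionary are hypothetical stand-ins, and the "graph" here simply counts how many predictions match their labels.

```python
class Placeholder:
    """A named slot in the graph whose value is supplied at run time."""

    def __init__(self, name):
        self.name = name

class Op:
    """A computation over other nodes."""

    def __init__(self, fn, *inputs):
        self.fn = fn
        self.inputs = inputs

def run(node, feed):
    """Evaluate a node, taking placeholder values from the feed dict."""
    if isinstance(node, Placeholder):
        if node not in feed:
            raise KeyError("no value fed for placeholder " + node.name)
        return feed[node]
    return node.fn(*[run(i, feed) for i in node.inputs])

# Build the graph once, with two slots: predictions and labels.
predictions = Placeholder("predictions")
labels = Placeholder("labels")
num_correct = Op(lambda p, l: sum(1 for a, b in zip(p, l) if a == b),
                 predictions, labels)

# Reuse the same graph with a "training" batch and an "evaluation" batch.
train_result = run(num_correct, {predictions: [1, 9, 3], labels: [1, 4, 3]})
eval_result = run(num_correct, {predictions: [5, 5], labels: [5, 3]})
print(train_result, eval_result)
```

Nothing about the graph changes between the two calls; only the fed values do.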

So as I said in the slides, I know I don't have any highlighters, beams. But you see there it says, after you create your placeholders, I said, add to collection and remember this op. And later we'll see how we're going to call this op and how we're going to use it.

And the next one, we're going to call our inference, build our inference. Is everybody following this part OK? And once again, we remember our logits. And then we create our train op and our loss op, just like with linear regression. Just like with the linear regression, we're going to initialize all our variables.
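For a classifier, the loss op has to turn "the network said 9 but the ground truth is 1" into a number that training can reduce. One common choice is softmax cross-entropy, sketched below in plain Python. The transcript does not say which loss the lab uses, so treat this as an assumption; the logits are made-up values.

```python
import math

def softmax(logits):
    """Turn logits into probabilities that sum to 1."""
    largest = max(logits)                      # subtract for numerical stability
    exps = [math.exp(v - largest) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, true_label):
    """Negative log probability assigned to the correct class."""
    return -math.log(softmax(logits)[true_label])

# Made-up logits for one image: the network strongly believes it is a 9.
logits = [0.0] * 10
logits[9] = 5.0

loss_if_truth_is_1 = cross_entropy(logits, 1)   # confident and wrong
loss_if_truth_is_9 = cross_entropy(logits, 9)   # confident and right
print(loss_if_truth_is_1 > loss_if_truth_is_9)
```

A confident wrong answer gives a large loss and a confident right answer gives a small one, so minimizing the loss pushes the logits toward the ground truth.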

And now at the bottom of this cell, that's the second new concept that I'm introducing, which is the saver. This is what you will use to do checkpoints, to save the states of your network so that later you can evaluate it. Or if your training was interrupted, you can load from a previous checkpoint and continue training from there, rather than always reinitialize all your variables and start from scratch.

When you're training really big networks, such as Inception, it's absolutely critical. Because I think when I first trained Inception, it took probably six days. And then later, when we had 50 replicas, it still took-- like, state of the art is still 2 and 1/2 days. You don't want to have to start from scratch every single time.
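The checkpoint workflow (save the variables together with the global step, then reload the latest one to resume or evaluate) can be sketched with files in a temporary directory. This is plain Python using JSON, not the TensorFlow saver or its file format; the function names, the file naming scheme, and the variable values are assumptions for illustration.

```python
import json
import os
import tempfile

def save_checkpoint(train_dir, variables, global_step):
    """Write the variables to a file whose name records the global step."""
    path = os.path.join(train_dir, "checkpoint-%d.json" % global_step)
    with open(path, "w") as f:
        json.dump({"global_step": global_step, "variables": variables}, f)
    return path

def latest_checkpoint(train_dir):
    """Return the path of the checkpoint with the highest step, or None."""
    prefix, suffix = "checkpoint-", ".json"
    steps = [int(name[len(prefix):-len(suffix)])
             for name in os.listdir(train_dir)
             if name.startswith(prefix) and name.endswith(suffix)]
    if not steps:
        return None
    return os.path.join(train_dir, "checkpoint-%d.json" % max(steps))

def restore_checkpoint(path):
    with open(path) as f:
        state = json.load(f)
    return state["variables"], state["global_step"]

train_dir = tempfile.mkdtemp()
save_checkpoint(train_dir, {"w": 0.05, "b": 0.2}, global_step=100)
save_checkpoint(train_dir, {"w": 0.1, "b": 0.3}, global_step=200)

# After an interruption, pick up from the most recent checkpoint.
variables, step = restore_checkpoint(latest_checkpoint(train_dir))
print(step, sorted(os.listdir(train_dir)))
```

Putting the step in the file name is what lets you tell later how long a given set of weights was trained.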

So yeah, everybody got that? The placeholder and the saver. So now it's 2.7. We're going to go to 2.7. Lots of code. Can anybody tell me what it's trying to do? So this is an-- yes. So it's trying to minimize loss. We can actually see this. So we'll run it once, OK?

Where did I go? OK. Very fast. It's done. But what if I really want to see what it's doing? So Python is wonderful. So I would like to actually see-- did somebody show how you know your training is going well? They show the loss going down, going down. Oh, I think my training is going really well.

So we're going to do something similar. Sorry. So I'm going to create a variable. What do you call it? Losses? Which is just an array. So here, I'm actually going to remember it. Append. So what am I collecting? Matplotlib. Anybody remember this? It's a plot. Let's try this. Oh, look at that.

Now, do you see your loss going down? So as you train, your loss actually goes down. So this is how, when you do large-scale training, this is what we typically do. We have a gazillion of these jobs running. In the morning, we would just glance at it, and we know, oh, which one is doing really, really well.
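The pattern used here (append the loss at every step, then look at the curve) can be sketched without the notebook. This is plain Python on a made-up one-parameter problem, not the MNIST training loop; the target value and learning rate are assumptions.

```python
losses = []          # one entry per training step
w = 0.0
TARGET = 3.0         # assumed optimum of the toy loss (w - TARGET)^2

for step in range(50):
    grad = 2.0 * (w - TARGET)
    w -= 0.1 * grad                      # one gradient descent step
    losses.append((w - TARGET) ** 2)     # remember the loss after the step

# A quick text check that training is going well: the loss should shrink.
print(losses[0], losses[len(losses) // 2], losses[-1])
```

In a notebook with Matplotlib available, passing `losses` to `matplotlib.pyplot.plot` would draw the same curve that the demo shows.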

So of course, that's just when you are prototyping. That's a really, really handy tool. But I'm going to show you something even better. Oh, that's part of the exercise. Man, I don't have it. So as one of the exercises, I also put the answers in the backup slides that you guys are welcome to cut and paste into a cell.

Then you can actually run all the evaluation sets against your checkpoint so that you know how well you're performing. So you don't have to rely on your eyes, glancing, oh, my loss is going down, or relying on validating a single image. But see, this is how easy it is.

This is how easy it is to prototype. And you can learn it. Very often, our researchers will cut and paste their Colab code and put it in a file, and that's basically their algorithm. And they will publish that with their paper. They would send it to our data scientists or production people.

We would actually prototype some of their research. This is how easy, literally, from research to prototyping to production. Really streamlined, and you can do it in no time. So for those of you who have run this step, can you do an ls in your data path, wherever you saved that, wherever you declare your trainer to be?

What do you see in there? Checkpoints. That's right. That's the money. That's after all this work, all this training on all these gazillion machines. That's where all your weights, your biases are stored, so that later you can load this network up and do your inference to recognize images, to reply to email, to do art, et cetera, et cetera.

So that's really critical. But how do we use it? Have no fear. All right, let's move on to 2.8, if you are not already there. So can somebody tell me what we are trying to do first? That's right. First, we load the checkpoint. And you remember all the things that we told our program to remember, the logits, and the image placeholder, and the label placeholder.

How are we going to use it now? We're going to feed it some images from our evaluation and see what it thinks. So now if you hit Return, what's the ground truth? Five. What's our prediction? Three. What's the actual image? Could be three, could be five. But so the machine is getting pretty close.

I would say that's a three. OK, let's try a different one. So you can hit Return again in the same cell. Oh, I need to somehow move this. So what's the ground truth this time? Yeah, I got it right. So you can keep hitting. You can keep hitting Return and see how well it's doing.

But instead of hitting Return 100 times and counting how many times it has gotten it wrong, as I said, that's one of the exercises, and I also put the answer in the slides, so you can cut and paste and actually do a complete validation on the whole validation set.
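Running the whole validation set instead of checking one image at a time amounts to an accuracy loop. The sketch below assumes a hypothetical `predict` callable and a tiny made-up validation list. It is not the answer from the slides.

```python
def accuracy(predict, examples):
    """Fraction of (input, label) pairs for which predict(input) == label."""
    if not examples:
        raise ValueError("validation set is empty")
    correct = sum(1 for x, label in examples if predict(x) == label)
    return correct / len(examples)


def toy_predict(x):
    """Hypothetical stand-in for a trained classifier: misreads one hard input."""
    return 3 if x == "ambiguous" else x


validation_set = [(5, 5), (3, 3), ("ambiguous", 5), (7, 7)]
score = accuracy(toy_predict, validation_set)
```

A single number over the full set replaces eyeballing individual predictions.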

But what do you think? So you can actually handwrite a different digit. But the trick is that a lot of people actually tried that and told me it doesn't seem to work. So remember on the slide, I said this is what the machine sees. This is what your eye sees, and this is what the machine sees.

So in the MNIST data set, all the numbers are between 0 and 1, I believe. I could be wrong, but I believe it's between 0 and 1. So if you just use a random tool like your phone, you write a number and you upload it, number one, the picture might be too big and you need to scale it down.

Number two, it might have a different representation. Sometimes it's from 0 to 255, and you need to scale it to the range that MNIST uses. That's how you have trained your network. If you train your network with those data, then it should be able to recognize the same kind of data, just like when we teach a baby, right?
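The two preprocessing steps mentioned here, shrinking the picture and rescaling its pixel values, could look roughly like this. It assumes the 0 to 1 range the speaker believes MNIST uses, and uses simple block averaging as one possible way to shrink an image. The helper names are hypothetical.

```python
def normalize_pixels(image, max_value=255):
    """Scale pixel values in [0, max_value] to floats in [0.0, 1.0]."""
    return [[pixel / max_value for pixel in row] for row in image]


def downscale(image, factor):
    """Shrink an image by averaging each factor x factor block of pixels."""
    height, width = len(image), len(image[0])
    if height % factor or width % factor:
        raise ValueError("image size must be a multiple of factor")
    small = []
    for top in range(0, height, factor):
        row = []
        for left in range(0, width, factor):
            block = [image[top + dy][left + dx]
                     for dy in range(factor) for dx in range(factor)]
            row.append(sum(block) / len(block))
        small.append(row)
    return small


# A 4x4 "photo" with values in 0..255, reduced to 2x2 and rescaled to 0..1.
photo = [[0, 0, 255, 255],
         [0, 0, 255, 255],
         [255, 255, 0, 0],
         [255, 255, 0, 0]]
prepared = normalize_pixels(downscale(photo, 2))
```

For a real handwritten digit the target size would be the 28 by 28 grid that MNIST images use, and the result should match whatever range the network was trained on.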

If you have never been exposed to something, you are not going to be able to recognize it. Just like with Oriol, one of our colleagues, who wrote that captioning program a while ago. Any time when it sees something that it doesn't recognize-- has anybody played with that captioning software? It's super fun.

So you can take a picture and say, two people eating pizza or dog surfing. But any time it sees something that it has never been trained on, it would say, man talking on a cell phone. So for a while, we had a lot of fun with it. We would put a watermelon on the post, and it would say, man talking on a cell phone.

You put a bunch of furniture in the room with nothing, and it would say, man talking on a cell phone. So it was really fun. But just like with your numbers, if you have never trained it with that style-- like if I write Chinese characters here, it's never going to recognize it.

But this is pretty fun, so you can play with it. You can see how well-- see every time. See, so far, it's 100% other than the first one, which I cannot tell either. So what are some of the exercises that we can do here? What do you want to do with this lab?

It's too easy, huh? Because I made this so easy, because I didn't know that you guys are all experts by now. Otherwise, I would have done a much harder lab. Let me see what things we can do. So you can uncomment all the graphs. Oh, so here's one. Actually, you already see it.

So try this. Can you guys try saving the checkpoint, say, every 100 steps? And you're going to have a gazillion, but they're tiny, tiny checkpoints, so it's OK. And try to run evaluation with a different checkpoint and see what you get. Do you know how to do that? Yeah, everybody know how to do that?

So the idea is that when you run the evaluation, it's very similar. So we typically run training and evaluation, or validation, in parallel. So as it trains, every so often, say, every half an hour, depending on your problem, so with Inception, every 10 minutes, we would also run evaluation to see how well our model is doing.

So if our model gets to, say, 78.6%, which I believe is the state of the art, it would be like, oh, my model's done training. So that's why you want to save checkpoints often and then validate them often. If you're done with that already, this is the last thing I want to show you.
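The save-often, validate-often loop can be sketched as below. Everything here is a toy assumption: the model is a single number, `evaluate` just reports it as an accuracy, and the 0.786 target simply borrows the figure quoted above.

```python
def evaluate(model_state):
    """Hypothetical evaluation: the toy model state is its own accuracy."""
    return model_state


def train_with_periodic_eval(total_steps, checkpoint_every, target_accuracy):
    """Snapshot the model every checkpoint_every steps and evaluate each snapshot."""
    skill = 0.0        # scalar standing in for the model's weights
    checkpoints = {}   # step -> saved copy of the model state
    history = []       # (step, accuracy) pairs, one per checkpoint
    for step in range(1, total_steps + 1):
        skill += 0.01 * (1.0 - skill)   # toy update: slowly approach 1.0
        if step % checkpoint_every == 0:
            checkpoints[step] = skill
            acc = evaluate(checkpoints[step])
            history.append((step, acc))
            if acc >= target_accuracy:
                break                   # good enough: stop training
    return checkpoints, history


checkpoints, history = train_with_periodic_eval(
    total_steps=1000, checkpoint_every=100, target_accuracy=0.786)
```

Evaluating an early entry in `checkpoints` against a later one shows the same effect as loading an early checkpoint in the lab: the earlier snapshot scores worse.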

If you're done with that already, did you notice anything? If you try to load from a really early checkpoint, how good is it when it tries to identify the digits? Just take a wild guess. Yeah, very bad. Maybe every other one is wrong. But this MNIST is such a small data set.

It's very easy to train. And we have such a deep network. If you only have one layer, maybe it won't get it right. So another exercise-- I think all these you can do after this session-- is really try to learn to run evaluation from scratch rather than-- actually, another part-- but run evaluation on the complete validation set.

That's a really necessary skill to develop as you build bigger models and you need to run validation. So I think this is the end of my lab. I do have bonus labs. But I want to cover this first. The bottom line is that TensorFlow is really-- it's for machine learning.

It's really from research to prototyping to production. It's really designed for that. And I really hope everybody in the audience can give it a try. And if there are any features that you find it lacking that you would like to see implemented, either send us pull requests. We always welcome contribution.

Or talk to my wonderful product manager, Zach, sitting over there. He is taking requests for features. So with that, yeah, thanks and have fun. Thank you, Sherry. We have time for questions for those who actually tried it. See, it's so well done. Everybody feel like they're experts. They're all ready to go make arts now, right?

Go deep dream. Cool. If there are no questions-- oh, there's one question, I think, someone who's trying desperately. Hi, my name is Pichin Lo. And first of all, thank you for introducing TensorFlow and for designing it. I have two questions. So the first question is, I know that TensorFlow has a C++ API, right?

So let's say if I use Keras or any of the Python front end, I train a model. Does TensorFlow support that I can pull out the C++ model of it and then just use that? Yes, you can. So even if I use, for example, Keras custom layer that I code using Python, I still can get those things?

That's correct. Oh, there it goes. It's just the front end that's different, how you construct the graph. Nice. But we are not as complete on our C++ API design. For example, a lot of the training libraries are not complete yet. But for the simple models, yes, you can do it.

Well, let's say-- not the training, but let's say if I just want the testing part. Because I don't need to do-- I mean, the training I can always do in Python. We do have that already. Actually, if you go to our website, there's a label image .cc. I think that's literally just loading from a checkpoint and running the inference in C++.

That's all written in C++. So that's a good example to follow. A second one. So another thing that I noticed that you support almost everything except Windows. Everything except what? I mean, iOS, Android, everything. Oh, have no fear. Actually, we are actively doing that. But when I first joined the team, I think there were 10 of us.

And we had to do everything. Like before open sourcing, all of us were in the conference room together. We were all writing docs. We were fixing everything. So now we have more people. That's like top of our list. We would love to support it. So I'm just curious, because I mean, when I look at the roadmap, I didn't see a clear timeline for Windows.

But the thing I know is that the reason why you cannot support Windows is because of Bazel. Bazel doesn't support Windows. So let's say, theoretically-- I mean, what do you think? I know that Bazel will get Windows support at some point in November. That is what they say.

So once Bazel can run in Windows, can I expect like just like immediately do TensorFlow, or do you foresee some other problem? Maybe Zach would like to take that question. Offline. OK. OK, that's it. So yeah, let's talk offline. Yeah, sure. Thank you very much. Hi, great presentation and session.

My name is Yuri Zifoysh. I have a question about TPUs. Are they available right now for testing and playing for non-Google employees? Are we-- Is TPU-- are TPUs available outside? I don't think so at the moment. Do you know when it might be available in the Google Cloud? Zach, would you like to take that one?

It might be afterwards. I'm so glad we have a product boss here so that he can-- I'm sorry. OK, thank you. Hi, nice tutorial. I have a question. Are there any plans to integrate TensorFlow with open source frameworks, like Mesos and HDFS, to make the distributed TensorFlow run-- Easy.

So there are definitely plans. We are also always actively working on new features. But we cannot provide a solid timeline right now. So we do have plans. We do have projects in progress. But we cannot commit on a timeline. So I cannot give you a time saying, yes, by November, you have what.

So thank you. But if you have this type of question, I think Zach is the best person to answer. Oh, hi. I was wondering, does TensorFlow have any examples to load your own data? Of what-- which data? So the current example has a MNIST data set. Are there examples out there to load your own data set?

Yes. Yes, definitely. I think we have two. One is called TensorFlow for Poets. I think that one-- that example shows you how you can load your own data set. I think-- is there another one? Zach, are you aware of another one that might be loading your own data set?

I know we have retraining model. If you go to TensorFlow, we have an example to do retraining. Those you can download from anywhere. So in our example, we just downloaded a bunch of flowers. So you can definitely download whatever pictures that you want to retrain. Thank you. Hello. Thank you for your presentation.

I have a question concerning the training. You can train using TensorFlow in any-- virtually in any system, like Android. And what about the model? Do you provide anything to move the model to Android? Because generally, you program in Java there. Yes. So that's a beautiful thing. You remember the architecture that I showed?

Yes. You build a model, and then just send it to the runtime. It's the same model running on any of the different platforms. It can be a laptop, Android. Do you have your own specific format for the model? Or it's just-- You build the same model. Because the model is just a bunch of matrices and values.

Is there any special format for your model? Because sometimes it is bigger. Yes. So I would not recommend training, say, Inception on your phone, because all the convolution and the backprop will probably kill it 10 times over. So definitely-- so there will be that type of limitation. I think you guys talked about the number of parameters.

If it blows the memory footprint on your phone, it's just not going to work. And if the compute-- especially for convolution, it uses a lot of compute. Yes. That's for training. But-- But for inference, you can run it anywhere. OK. Thank you. It's the same model. You just restore it.

There actually are examples like label image. That's the C++ version. I think I also wrote one. It's called classify image. It's in Python. That's also-- you can run it on your phone. So any of these, you can write your own and load a checkpoint and run it on your phone as well.

So definitely, I encourage you to do that. Thank you. Cool. Hi. I have a question related to TensorFlow serving. So I went through the online documentation and currently, I think it requires some coding in C++ and then combined with Python. Is there only going to be only Python solution that's going to be provided?

Or is it always going to be-- I think you need to do some first step to create a module and then just import it into Python. I am actually surprised to hear that because I'm pretty sure that you can write the model in just Python or just C++. You don't have to write it in one way or the other.

They might have a special exporter tool. At one point, that was the case. They wrote their exporter in C++. I think that's probably what you were talking about. But you don't have to build it in any specific way. The model is just-- you can write in whatever language you like, as long as it produces that graph.

And that's all it needs. So TensorFlow serving, the tutorial, actually, if you go on the site, it had those steps, actually. OK, so I will look into that. So maybe you can come find me later, and I'll see what the situation is. I do know that at one point, they were writing the exporter in C++ only.

But that should have changed by now because we are doing another version of TensorFlow serving. And is there any plan to provide APIs for other languages? Like MXNet has something called MXNetJS. You mean the front end? Front end, yes. Yeah, yeah, yeah. We have Go. I think we have Go.

We have some other languages. Maybe Zach can speak more to it. And once again, if those languages are not your favorite, please do contribute. And if you would like us to do it, talk to Zach. And maybe he can put that-- maybe-- I don't know. Because as somebody asked, for Android, you need a Java front end.

So I think that's going to help out in integrating these models with-- Yeah, that's great feedback. We'll definitely take note. Thank you. Thank you. I have a question. I have an embedded GPU board, the TX1, which has an ARM processor. And I really wanted to work with TensorFlow, but I got to know that it can only run on x86 boards.

So when can we expect the TensorFlow can support ARM processors? We will have to get back to you after I have consulted with my product boss, see when we can add that support. Thank you. Sorry. One last question. Thanks for the presentation, Sherry. I have a question regarding the-- when you have the model and you want to run inference, is it possible to make an executable out of it so they can drop it into a container or run it separately from serving?

Is that something that you guys are looking into? Just run the inference? Yeah, just have it as a binary. Yeah, you can definitely do that. Right now you can? Yeah, you are always able to do that. You mean just save the-- you want to-- What I mean is that if you can package it into a single binary source that you can just pass around.

Yes, yes. We actually do that today. That's how the label image works. It's just its own individual binary. OK. It actually converted all the checkpoints into constants. So it doesn't even need to do the slow, et cetera. It just reads a bunch of constants and runs it. So it's super fast.

Thank you. Cool. You're welcome. Thanks, Sherry, again. We're going to take a short break of 10 minutes. Let me remind you, for those who haven't noticed yet, but all the slides of all the talks will be available on the website. So do not worry. They will be available at some point, as soon as we get them from the speakers.

Oh, I forgot to ask my bonus question. But in any case, I have a lot of TensorFlow stickers up here. If you would like one to proudly display on your laptop, come get it.
