Neural networks learning spirals

Let's use TensorFlow Playground to see what kind of neural network can learn to partition the space for a binary classification problem between blue and orange dots. First is an easier binary classification problem with a circle and a ring distributed around it. Second is a more difficult binary classification problem of two dueling spirals. This little visualization tool at playground.tensorflow.org is really useful for getting an intuition about how the size of the network and the various hyperparameters affect what kinds of representations the network is able to learn. The input to the network is the position of a point in the 2D plane, and the output of the network is the classification of whether it's an orange or a blue dot. We'll hold all the hyperparameters constant for this little experiment and just vary the number of neurons and hidden layers. The hyperparameters are a batch size of one, a learning rate of 0.03, the ReLU activation function, and L1 regularization with a rate of 0.001.

So let's start with one hidden layer and one neuron and gradually increase the size of the network to see what kind of representation it's able to learn. Keep your eye on the right side of the screen, which shows the test loss and the training loss, and on the plot that shows sample points from the two distributions; the shading in the background of the plot shows the partitioning function that the neural network is learning. A successful function is able to separate the orange and the blue dots. One hidden layer with one neuron... two neurons... three neurons...
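The circle-versus-ring experiment above can be sketched outside the browser. Here is a minimal NumPy version, assuming a synthetic inner-disc/outer-ring dataset in Playground's spirit: one hidden ReLU layer, a sigmoid output, SGD with batch size one, and a learning rate of 0.03. The L1 regularization is omitted for brevity, and all dataset details (radii, sample counts) are illustrative rather than Playground's exact values.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_circle_data(n=200):
    """Inner disc (class 1) vs. surrounding ring (class 0)."""
    m = n // 2
    r = np.concatenate([rng.uniform(0.0, 2.0, m),    # blue: inner disc
                        rng.uniform(3.0, 5.0, n - m)])  # orange: outer ring
    theta = rng.uniform(0.0, 2 * np.pi, n)
    X = np.stack([r * np.cos(theta), r * np.sin(theta)], axis=1) / 5.0  # scale to [-1, 1]
    y = np.concatenate([np.ones(m), np.zeros(n - m)])
    return X, y

def train_mlp(X, y, hidden=4, lr=0.03, epochs=60):
    """One hidden ReLU layer, sigmoid output, SGD with batch size one."""
    W1 = rng.normal(0.0, 1.0, (2, hidden)); b1 = np.zeros(hidden)
    W2 = rng.normal(0.0, 1.0, (hidden, 1)); b2 = np.zeros(1)
    for _ in range(epochs):
        for i in rng.permutation(len(X)):
            x = X[i:i + 1]                          # a "batch" of one point
            h = np.maximum(0.0, x @ W1 + b1)        # ReLU hidden activations
            p = 1.0 / (1.0 + np.exp(-(h @ W2 + b2)))  # sigmoid output
            d2 = p - y[i]                           # dBCE/dlogit for the output
            gW2, gb2 = h.T @ d2, d2.ravel()
            d1 = (d2 @ W2.T) * (h > 0)              # backprop through ReLU
            gW1, gb1 = x.T @ d1, d1.ravel()
            W2 -= lr * gW2; b2 -= lr * gb2
            W1 -= lr * gW1; b1 -= lr * gb1
    return W1, b1, W2, b2

def predict(params, X):
    W1, b1, W2, b2 = params
    h = np.maximum(0.0, X @ W1 + b1)
    return (1.0 / (1.0 + np.exp(-(h @ W2 + b2)))).ravel()

X, y = make_circle_data()
params = train_mlp(X, y)
acc = np.mean((predict(params, X) > 0.5) == y)
```

Because the data is radially separable, even a handful of ReLU units can carve out a polygon around the inner disc, which is the same effect you watch emerge in the Playground shading.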
Now let's take a look at the trickier spiral dataset, keeping most of the hyperparameters the same but decreasing the learning rate to 0.01 and adding extra features to the input of the neural network beyond just the coordinates of the point: the squares of the coordinates, their product, and the sine of each coordinate. Let's start with one hidden layer... Two hidden layers, with two neurons in the second layer...
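The spiral run relies on those hand-crafted input features on top of the raw coordinates. A small sketch of that feature expansion, with an illustrative two-spiral generator standing in for Playground's exact dataset:

```python
import numpy as np

def make_spirals(n=200, noise=0.0, rng=np.random.default_rng(0)):
    """Two interleaved spirals, one per class (a stand-in for Playground's data)."""
    m = n // 2
    t = np.linspace(0.0, 3 * np.pi, m)      # angle along each spiral arm
    r = t / (3 * np.pi) * 5.0               # radius grows with angle, up to 5
    X0 = np.stack([r * np.cos(t), r * np.sin(t)], axis=1)
    X1 = -X0                                # second spiral: first one rotated 180 degrees
    X = np.concatenate([X0, X1]) + noise * rng.normal(size=(2 * m, 2))
    y = np.concatenate([np.zeros(m), np.ones(m)])
    return X, y

def playground_features(X):
    """Expand (x1, x2) into the seven Playground-style inputs:
    x1, x2, x1^2, x2^2, x1*x2, sin(x1), sin(x2)."""
    x1, x2 = X[:, 0], X[:, 1]
    return np.stack([x1, x2, x1 ** 2, x2 ** 2, x1 * x2,
                     np.sin(x1), np.sin(x2)], axis=1)

X, y = make_spirals()
F = playground_features(X)
```

Feeding the network these nonlinear features instead of raw coordinates changes what a given architecture can represent, which is why the same small networks behave so differently on the spiral problem.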
There you go. That's a basic illustration with playground.tensorflow.org, which I recommend you try, showing the connection between neural network architecture, dataset characteristics, and different training hyperparameters. It's important to note that the initialization of the neural network has a big impact in many of these cases, but the purpose of this video was not to show the minimal neural network architecture that's able to represent the spiral dataset, but rather to provide a visual intuition about which kinds of networks are able to learn which kinds of datasets. There you go. I hope you enjoyed these quick little videos, whether they make you think, give you new kinds of insights, or are just fun and inspiring. See you next time, and remember: try to challenge yourself and learn something new every day.