
Neural networks learning spirals


Transcript

Let's use TensorFlow Playground to see what kind of neural network can learn to partition the space for the binary classification problem between the blue and the orange dots. First is an easier binary classification problem with a circle and a ring distribution around it. Second is a more difficult binary classification problem of two dueling spirals.
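For readers who want to reproduce the two datasets outside the browser, here is a minimal sketch in plain Python. The function names, the radius, the number of turns, and the noise model are all assumptions chosen for illustration; they are not taken from the playground's source.

```python
import math
import random

def make_circle(n=200, radius=5.0, seed=0):
    """Inner disc (label +1) surrounded by a ring (label -1)."""
    rng = random.Random(seed)
    points = []
    for i in range(n):
        inner = i < n // 2
        # Assumed geometry: inner class within half the radius,
        # outer class in a ring between 70% and 100% of the radius.
        if inner:
            r = rng.uniform(0.0, 0.5 * radius)
        else:
            r = rng.uniform(0.7 * radius, radius)
        angle = rng.uniform(0.0, 2.0 * math.pi)
        label = 1 if inner else -1
        points.append((r * math.cos(angle), r * math.sin(angle), label))
    return points

def make_spirals(n=200, radius=5.0, turns=1.75, noise=0.0, seed=0):
    """Two interleaved spiral arms, the second rotated by half a turn."""
    rng = random.Random(seed)
    points = []
    half = n // 2
    for label, phase in ((1, 0.0), (-1, math.pi)):
        for i in range(half):
            frac = i / half                       # 0 at the center, ~1 at the tip
            r = frac * radius
            t = turns * frac * 2.0 * math.pi + phase
            x = r * math.sin(t) + rng.uniform(-1.0, 1.0) * noise
            y = r * math.cos(t) + rng.uniform(-1.0, 1.0) * noise
            points.append((x, y, label))
    return points
```

Each point is an `(x, y, label)` tuple, so the first two entries are the network input and the third is the class to predict.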

This little visualization tool on playground.tensorflow.org is really useful for getting an intuition about how the size of the network and the various hyperparameters affect what kinds of representations the network is able to learn. The input to the network is the position of the point in the 2D plane and the output of the network is the classification of whether it's an orange or a blue dot.

We'll hold all the hyperparameters constant for this little experiment and just vary the number of neurons and hidden layers. The hyperparameters are a batch size of one, a learning rate of 0.03, the ReLU activation function, and L1 regularization with a rate of 0.001. So let's start with one hidden layer and one neuron and gradually increase the size of the network to see what kind of representation it's able to learn.
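To make these settings concrete, here is a rough sketch of a one-hidden-layer network trained one example at a time. Only the learning rate, the L1 rate, the batch size of one, and the ReLU hidden activation come from the video. The tanh output, the squared-error loss, the labels of +1 and -1, the initialization range, and every function name are assumptions made for illustration, not a description of how the playground is implemented.

```python
import math
import random

LEARNING_RATE = 0.03  # from the video
L1_RATE = 0.001       # from the video
# Batch size of one: every call to sgd_step uses a single example.

def init_network(n_inputs, n_hidden, seed=0):
    """Small random weights (assumed range) and constant biases."""
    rng = random.Random(seed)
    return {
        "w1": [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs)]
               for _ in range(n_hidden)],
        "b1": [0.1] * n_hidden,
        "w2": [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)],
        "b2": 0.0,
    }

def forward(net, x):
    """Return hidden pre-activations, ReLU activations, and the tanh output."""
    pre = [sum(w * xi for w, xi in zip(row, x)) + b
           for row, b in zip(net["w1"], net["b1"])]
    hidden = [max(0.0, p) for p in pre]
    out = math.tanh(sum(w * h for w, h in zip(net["w2"], hidden)) + net["b2"])
    return pre, hidden, out

def sign(v):
    return (v > 0) - (v < 0)

def sgd_step(net, x, target):
    """One gradient step on one example; returns the loss before the update."""
    pre, hidden, out = forward(net, x)
    # Assumed loss: 0.5 * (out - target)^2, with a tanh output unit.
    d_out = (out - target) * (1.0 - out * out)
    for j in range(len(hidden)):
        w2_old = net["w2"][j]
        # Backpropagate through ReLU using the weight from before the update.
        d_hidden = d_out * w2_old * (1.0 if pre[j] > 0 else 0.0)
        # The L1 penalty adds L1_RATE * sign(w) to each weight gradient.
        net["w2"][j] -= LEARNING_RATE * (d_out * hidden[j] + L1_RATE * sign(w2_old))
        for i in range(len(x)):
            w = net["w1"][j][i]
            net["w1"][j][i] -= LEARNING_RATE * (d_hidden * x[i] + L1_RATE * sign(w))
        net["b1"][j] -= LEARNING_RATE * d_hidden
    net["b2"] -= LEARNING_RATE * d_out
    return 0.5 * (out - target) ** 2
```

Growing the network in this sketch means passing a larger `n_hidden` to `init_network`; adding more hidden layers would need an extra loop in `forward` and `sgd_step`, which is left out here to keep the example short.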

Keep your eye on the right side of the screen, which shows the test loss and the training loss, and on the plot that shows sample points from the two distributions. The shading in the background of the plot shows the partitioning function that the neural network is learning. So a successful function is able to separate the orange and the blue dots.

One hidden layer with one neuron, two neurons, three neurons, four neurons, eight neurons. Now let's take a look at the trickier spiral dataset, keeping most of the hyperparameters the same but decreasing the learning rate to 0.01 and adding extra features to the input of the neural network: not just the coordinates of the point but also the squares of the coordinates, their product, and the sine of each coordinate.
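The extra inputs can be sketched as a small feature-expansion function. The playground labels these inputs X1, X2, X1², X2², X1X2, sin(X1), and sin(X2); the function name and the ordering of the list below are assumptions for illustration.

```python
import math

def expand_features(x1, x2):
    """Map a 2D point to the seven inputs used for the spiral experiment."""
    return [
        x1,            # raw coordinates
        x2,
        x1 * x1,       # squares of the coordinates
        x2 * x2,
        x1 * x2,       # product of the coordinates
        math.sin(x1),  # sine of each coordinate
        math.sin(x2),
    ]
```

With these features the network receives seven numbers per point instead of two, so curved boundaries that would otherwise need more hidden neurons are partly supplied by the inputs themselves.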

Let's start with one hidden layer, one neuron, two neurons, four neurons, six neurons, eight neurons. Two hidden layers, two neurons in the second layer, four neurons, six neurons, eight neurons. There you go. That's a basic illustration, using playground.tensorflow.org, which I recommend you try, of the connection between neural network architecture, dataset characteristics, and different training hyperparameters.

It's important to note that the initialization of the neural network has a big impact in many of the cases but the purpose of this video was not to show the minimal neural network architecture that's able to represent the spiral dataset but rather to provide a visual intuition about which kind of networks are able to learn which kinds of datasets.

There you go. I hope you enjoyed these quick little videos, whether they make you think, give you new kinds of insights, or are just fun and inspiring. See you next time, and remember, try to challenge yourself and learn something new every day.