Neural networks: Nodes and hidden layers

To build a neural network that learns nonlinearities, begin with a familiar model structure: a linear model of the form $y' = b + w_1x_1 + w_2x_2 + w_3x_3$.

We can visualize this equation as shown below, where $x_1$, $x_2$, and $x_3$ are our three input nodes (in blue), and $y'$ is our output node (in green).
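To make the math concrete, here's a minimal Python sketch of this model. The weight and bias values are illustrative placeholders, not the randomly initialized values in the widget:

```python
# y' = b + w1*x1 + w2*x2 + w3*x3, with placeholder parameter values.
w1, w2, w3 = 0.5, -1.2, 2.0  # illustrative weights
b = 0.1                      # illustrative bias

x1, x2, x3 = 1.0, 2.0, 3.0   # the input values used in Exercise 1
y = b + w1 * x1 + w2 * x2 + w3 * x3
print(y)  # 0.1 + 0.5 - 2.4 + 6.0 = 4.2
```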

Exercise 1

In the model above, the weight and bias values have been randomly initialized. Perform the following tasks to familiarize yourself with the interface and explore the linear model. You can ignore the Activation Function dropdown for now; we'll discuss this topic later on in the module.

  1. Click the Play (▶️) button above the network to calculate the value of the output node for the input values $x_1 = 1.00$, $x_2 = 2.00$, and $x_3 = 3.00$.

  2. Click the second node in the input layer, and increase its value from 2.00 to 2.50. Note that the value of the output node changes. Select the output node (in green) and review the Calculations panel to see how the output value was calculated.

  3. Click the output node (in green) to see the weight ($w_1$, $w_2$, $w_3$) and bias ($b$) parameter values. Decrease the weight value for $w_3$ (again, note that the value of the output node and the calculations below have changed). Then, increase the bias value. Review how these changes have affected the model output; the code sketch after this list walks through steps 2 and 3.
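If you'd like to reproduce steps 2 and 3 outside the interactive widget, here's a self-contained sketch. The starting parameter values and the adjustment amounts are arbitrary assumptions:

```python
def linear_model(x, w, b):
    """y' = b + w1*x1 + w2*x2 + w3*x3"""
    return b + sum(w_i * x_i for w_i, x_i in zip(w, x))

w, b = [0.5, -1.2, 2.0], 0.1                 # placeholder starting parameters
print(linear_model([1.0, 2.0, 3.0], w, b))   # 4.2

# Step 2: increase x2 from 2.00 to 2.50; the output shifts by w2 * 0.5.
print(linear_model([1.0, 2.5, 3.0], w, b))   # 3.6

# Step 3: decrease w3, then increase b (the amounts here are arbitrary).
w[2] -= 0.5
b += 1.0
print(linear_model([1.0, 2.5, 3.0], w, b))   # 3.1
```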

Adding layers to the network

Note that when you adjusted the weight and bias values of the network in Exercise 1, the overall mathematical relationship between the inputs and the output didn't change. Our model is still a linear model.

But what if we add another layer to the network, in between the input layer and the output layer? In neural network terminology, additional layers between the input layer and the output layer are called hidden layers, and the nodes in these layers are called neurons.

The value of each neuron in the hidden layer is calculated the same way as the output of a linear model: multiply each of its inputs (the nodes in the previous network layer) by a unique weight parameter, sum the results, and add the bias. Similarly, the value of each neuron in the next layer (here, the output layer) is calculated the same way, using the hidden layer's neuron values as inputs.
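Here's a sketch of that two-step computation in Python. The network shape (3 inputs, 4 hidden neurons, 1 output) matches the model in Exercise 2 below; the parameter values are random placeholders:

```python
import random

random.seed(0)  # reproducible placeholder parameters

# Shape: 3 inputs -> 4 hidden neurons -> 1 output.
# Each hidden neuron has 3 weights (one per input) plus a bias.
hidden_w = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(4)]
hidden_b = [random.uniform(-1, 1) for _ in range(4)]
# The output node has 4 weights (one per hidden neuron) plus a bias.
output_w = [random.uniform(-1, 1) for _ in range(4)]
output_b = random.uniform(-1, 1)

def forward(x):
    # Each hidden neuron: weighted sum of the inputs, plus its bias.
    hidden = [b + sum(w_i * x_i for w_i, x_i in zip(w, x))
              for w, b in zip(hidden_w, hidden_b)]
    # Output node: weighted sum of the hidden-layer values, plus its bias.
    return output_b + sum(w_i * h_i for w_i, h_i in zip(output_w, hidden))

print(forward([1.0, 2.0, 3.0]))
```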

This new hidden layer allows our model to recombine the input data using another set of parameters. Can this help our model learn nonlinear relationships?

Exercise 2

We've added a hidden layer containing four neurons to the model.

Click the Play (▶️) button above the network to calculate the value of the four hidden-layer nodes and the output node for the input values $x_1 = 1.00$, $x_2 = 2.00$, and $x_3 = 3.00$.

Then explore the model, and use it to answer the following questions.

How many parameters (weights and biases) does this neural network model have?
  - 4: Incorrect. Our original model in Exercise 1 had four parameters: $w_1$, $w_2$, $w_3$, and $b$. Because this model contains a hidden layer, there are more parameters.

  - 12: Incorrect. Note that the total number of parameters includes both the parameters used to calculate the node values in the hidden layer from the input values, and the parameters used to calculate the output value from the node values in the hidden layer.

  - 16: Incorrect. Note that the total number of parameters includes both the weight parameters and the bias parameters.

  - 21: Correct. There are 4 parameters used to calculate each of the 4 node values in the hidden layer (3 weights, one for each input value, plus a bias), which sums to 16 parameters. Then there are 5 parameters used to calculate the output value: 4 weights (one for each node in the hidden layer) and a bias. In total, this neural network has 21 parameters.
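You can double-check that count with a quick calculation:

```python
# Parameters per layer = (inputs to the layer + 1 bias) * nodes in the layer.
inputs, hidden, outputs = 3, 4, 1
hidden_params = (inputs + 1) * hidden    # (3 weights + 1 bias) * 4 neurons = 16
output_params = (hidden + 1) * outputs   # 4 weights + 1 bias = 5
print(hidden_params + output_params)     # 21
```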

Try modifying the model parameters, and observe the effect on the hidden-layer node values and the output value (you can review the Calculations panel below to see how these values were calculated).

Can this model learn nonlinearities?

  - Yes: Incorrect. Click each of the nodes in the hidden layer and the output node, and review the calculations below. What do you notice about all of these calculations?

  - No: Correct. If you click each of the nodes in the hidden layer and review the calculations below, you'll see that all of them are linear (comprising only multiplication and addition operations). If you then click the output node and review its calculation, you'll see that it is also linear. Linear calculations performed on the output of linear calculations are still linear, which means this model cannot learn nonlinearities.
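In fact, composing linear functions always produces another linear function. The NumPy sketch below (with arbitrary placeholder parameters) shows that a two-layer linear network like the one in Exercise 2 collapses into an equivalent single linear model:

```python
import numpy as np

rng = np.random.default_rng(0)

# Placeholder parameters for a 3 -> 4 -> 1 linear network.
W1 = rng.normal(size=(4, 3))  # hidden-layer weights
b1 = rng.normal(size=4)       # hidden-layer biases
W2 = rng.normal(size=(1, 4))  # output weights
b2 = rng.normal(size=1)       # output bias

x = np.array([1.0, 2.0, 3.0])

# Two-layer computation: hidden = W1 @ x + b1, output = W2 @ hidden + b2.
two_layer = W2 @ (W1 @ x + b1) + b2

# The same computation rewritten as a single linear model:
# W2 @ (W1 @ x + b1) + b2 = (W2 @ W1) @ x + (W2 @ b1 + b2)
W_eq = W2 @ W1
b_eq = W2 @ b1 + b2
one_layer = W_eq @ x + b_eq

print(np.allclose(two_layer, one_layer))  # True
```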