In this exercise, you'll revisit the graph of fuel-efficiency data from the Parameters exercise. But this time, you'll use gradient descent to learn the optimal weight and bias values for a linear model that minimizes loss.
Complete the three tasks below the graph.
Task #1: Adjust the Learning Rate slider below the graph to set a learning rate of 0.03. Click the Start button to run gradient descent.
How long does the model training take to converge (reach a stable minimum loss value)? What is the MSE value at model convergence? What weight and bias values produce this value?
Click the plus icon to see the solution
When we set a learning rate of 0.03, the model converged after approximately 30 seconds, achieving an MSE of just under 3 with weight and bias values of –2.08 and 23.098, respectively. This indicates we've picked a good learning rate.
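The gradient-descent loop the graph animates can be sketched in a few lines of Python. The dataset below is a small hypothetical stand-in (a feature such as car weight versus a label such as miles per gallon), not the exercise's actual fuel-efficiency data, so the learned weight and bias won't match the values above.

```python
# Minimal gradient-descent sketch for a linear model y = w*x + b,
# trained to minimize mean squared error (MSE).
# NOTE: hypothetical data -- the converged values won't match the
# exercise's -2.08 / 23.098.
xs = [3.50, 3.69, 3.44, 3.43, 4.34, 4.42, 2.37]  # feature (e.g. weight, 1000s of lbs)
ys = [18.0, 15.0, 18.0, 16.0, 15.0, 14.0, 24.0]  # label (e.g. miles per gallon)

def mse(w, b):
    """Mean squared error of predictions w*x + b against ys."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def gradients(w, b):
    """Partial derivatives of MSE with respect to w and b."""
    n = len(xs)
    dw = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
    db = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
    return dw, db

def train(learning_rate, steps, w=0.0, b=0.0):
    """Run `steps` iterations of gradient descent and return (w, b)."""
    for _ in range(steps):
        dw, db = gradients(w, b)
        w -= learning_rate * dw  # step against the gradient
        b -= learning_rate * db
    return w, b

w, b = train(learning_rate=0.03, steps=10_000)
print(f"w={w:.3f}, b={b:.3f}, MSE={mse(w, b):.3f}")
```

With a well-chosen learning rate like 0.03, the gradients shrink toward zero and the loss settles at a stable minimum, which is what "convergence" means in the graph.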
Task #2: Click the Reset button below the graph to reset the Weight and Bias values in the graph. Adjust the Learning Rate slider to a value around 1.10e–5. Click the Start button to run gradient descent.
What do you notice about how long it takes the model training to converge this time?
Click the plus icon to see the solution
After several minutes, model training still hasn't converged: small updates to the Weight and Bias values continue to produce slightly lower loss values. This suggests that picking a higher learning rate would enable gradient descent to find the optimal weight and bias values more quickly.
Task #3: Click the Reset button below the graph to reset the Weight and Bias values in the graph. Adjust the Learning Rate slider up to 1. Click the Start button to run gradient descent.
What happens to the loss values as gradient descent runs? How long will model training take to converge this time?
Click the plus icon to see the solution
Loss values fluctuate wildly while remaining high (MSE over 300). This indicates that the learning rate is too high: each update overshoots the minimum, so model training will never reach convergence.
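The contrast across the three tasks can be reproduced with a short standalone sketch: the same gradient-descent loop run at a good, a too-small, and a too-large learning rate. The dataset is hypothetical (not the exercise's fuel-efficiency data), so the exact loss values differ, but the qualitative behavior matches.

```python
# Compares gradient descent at three learning rates on a linear model
# y = w*x + b with MSE loss. Hypothetical data for illustration only.
xs = [3.50, 3.69, 3.44, 3.43, 4.34, 4.42, 2.37]
ys = [18.0, 15.0, 18.0, 16.0, 15.0, 14.0, 24.0]

def mse(w, b):
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def train(learning_rate, steps):
    w = b = 0.0
    n = len(xs)
    for _ in range(steps):
        dw = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        db = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= learning_rate * dw
        b -= learning_rate * db
    return w, b

good = mse(*train(0.03, 10_000))  # converges to the minimum loss
tiny = mse(*train(1e-5, 10_000))  # same step count, still far from the minimum
huge = mse(*train(1.0, 30))       # overshoots every step; loss explodes
                                  # (only 30 steps, to avoid float overflow)
print(f"lr=0.03: MSE={good:.2f}  lr=1e-5: MSE={tiny:.2f}  lr=1.0: MSE={huge:.2e}")
```

With the tiny rate, loss is still decreasing after all 10,000 steps (Task 2's behavior); with a rate of 1, the updates bounce back and forth across the minimum and the loss grows without bound (Task 3's behavior).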