Categorical data: Feature cross exercises

Playground is an interactive application that lets you manipulate various aspects of training and testing a machine learning model. With Playground, you can select features and adjust hyperparameters, and then discover how your choices influence a model.

This page contains two Playground exercises.

Exercise 1: A basic feature cross

For this exercise, focus on the following parts of the Playground user interface:

Underneath FEATURES, notice the three potential model features:
- x₁
- x₂
- x₁x₂
Underneath OUTPUT, you'll see a square containing orange and blue dots. Imagine that you're looking at a square forest, where orange dots mark the position of sick trees and blue dots mark the position of healthy trees.
Between FEATURES and OUTPUT, if you look very closely, you'll see three faint dashed lines, one connecting each feature to the output. The width of each dashed line symbolizes the weight currently associated with that feature. These lines are very faint because the starting weight for each feature is initialized to 0. As a weight grows or shrinks, so does the thickness of its line.

Task 1: Explore Playground by doing the following:

Click on the faint line that connects feature x₁ to the output. A popup appears.
In the popup, enter the weight 1.0.
Press Enter.

Notice the following:

The dashed line for x₁ becomes thicker as the weight increases from 0 to 1.0.
An orange and blue background now appears.
- The orange background shows where the model guesses the sick trees are.
- The blue background shows where the model guesses the healthy trees are.
The model is doing a terrible job; about half of its guesses are wrong.
Because the weight is 1.0 for x₁ and 0 for the other features, the model's output equals the value of x₁ exactly.
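The observations above can be reproduced with a minimal sketch. It assumes the model's output is a plain weighted sum of the three features, and it uses a hypothetical forest in which a tree is healthy when x₁ and x₂ share a sign; neither detail is stated on this page, and the names `model_output`, `labels`, and `fraction_wrong` are illustrative only.

```python
import numpy as np

def model_output(x1, x2, w1=0.0, w2=0.0, w12=0.0):
    """Weighted sum of the three features: x1, x2, and the cross x1*x2."""
    return w1 * x1 + w2 * x2 + w12 * (x1 * x2)

# Hypothetical forest: 1,000 trees scattered over a square centered at the origin.
rng = np.random.default_rng(seed=0)
x1 = rng.uniform(-5.0, 5.0, size=1000)
x2 = rng.uniform(-5.0, 5.0, size=1000)

# Assumed labeling: +1 (healthy, blue) when x1 and x2 share a sign,
# -1 (sick, orange) otherwise.
labels = np.where(x1 * x2 > 0, 1, -1)

# Task 1 setting: weight 1.0 on x1, weight 0 on everything else.
output = model_output(x1, x2, w1=1.0)
matches_x1 = bool(np.allclose(output, x1))

# A positive output counts as a "healthy" guess, a negative one as "sick".
guesses = np.where(output > 0, 1, -1)
fraction_wrong = float(np.mean(guesses != labels))
print(f"output equals x1: {matches_x1}, fraction wrong: {fraction_wrong:.2f}")
```

With only x₁ weighted, the guess depends on which side of the vertical axis a tree sits, so under the assumed labeling roughly half of the guesses come out wrong.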

Task 2: Change the weights of any or all of the three features so that the model (the background colors) successfully predicts sick and healthy trees. The solution appears just below Playground.

Solution to Task 2:

w₁ (weight for x₁) = 0
w₂ (weight for x₂) = 0
weight for x₁x₂ = any positive value

Just for fun, what happens if you enter a negative weight for the feature cross?
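If you want to check different weight settings outside Playground, the following sketch may help. It makes the same assumptions as the page's forest analogy plus two of its own: the output is a weighted sum of the three features, and a tree is healthy when x₁ and x₂ share a sign. The `accuracy` helper is hypothetical.

```python
import numpy as np

# Hypothetical forest: trees scattered over a square centered at the origin.
rng = np.random.default_rng(seed=1)
x1 = rng.uniform(-5.0, 5.0, size=1000)
x2 = rng.uniform(-5.0, 5.0, size=1000)

# Assumed labeling: +1 (healthy) when x1 and x2 share a sign, -1 (sick) otherwise.
labels = np.where(x1 * x2 > 0, 1, -1)

def accuracy(w1, w2, w12):
    """Fraction of trees whose label matches the sign of the weighted sum."""
    output = w1 * x1 + w2 * x2 + w12 * (x1 * x2)
    guesses = np.where(output > 0, 1, -1)
    return float(np.mean(guesses == labels))

# Only the feature cross carries weight; try several values, including a negative one.
for w12 in (0.1, 1.0, 5.0, -1.0):
    print(f"w12 = {w12:+.1f}: accuracy = {accuracy(0.0, 0.0, w12):.2f}")
```

Only the sign of the feature-cross weight affects which side of the decision each tree falls on; its size scales the output without moving the boundary.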

Exercise 2: A more sophisticated feature cross

For the second exercise, look at the arrangement of orange dots (sick trees) and blue dots (healthy trees) in the square underneath OUTPUT, noticing the following:

The dots form roughly circular patterns.
The arrangement of dots is noisy; for example, notice the occasional blue dots in the outer ring of orange dots. Consequently, even a great model is unlikely to correctly predict each dot.

Task 1: Explore the Playground UI by doing the following:

Click the Run/Pause button, which is a white triangle inside a black circle. Playground will begin training the model; observe the Epochs counter increasing.
After the system has trained for at least 300 epochs, press that same Run/Pause button to pause the training.
Look at the model. Is the model making good predictions? In other words, are the blue dots generally surrounded by a blue background, and are the orange dots generally surrounded by an orange background?
Examine the value of Test loss, which appears just below OUTPUT. Is this value closer to 1.0 (higher loss) or closer to 0 (lower loss)?
Reset Playground by pressing the curvy arrow to the left of the Run/Pause button.

Task 2: Build a better model by doing the following:

Select or deselect any combination of the five possible features.
Adjust the learning rate.
Train the system for at least 500 epochs.
Examine the value of Test loss. Can you get a Test loss less than 0.2?

Solutions appear below Playground.

Solution to Task 1:

The model is terrible. Notice, for example, that many of the orange dots are swimming in a sea of blue. Furthermore, Test loss is very high.
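One way to see why this happens is sketched below. It assumes the model in Task 1 uses only the linear features x₁ and x₂ (the page does not say which features are selected by default), and it uses a hypothetical data generator, `make_rings`, that places healthy trees in an inner disk and sick trees in an outer ring. Instead of training, the sketch sweeps many straight-line boundaries and keeps the best accuracy any of them reaches.

```python
import numpy as np

def make_rings(n, rng, noise=0.3):
    """Hypothetical data: a healthy (1) inner disk inside a sick (0) outer ring."""
    half = n // 2
    radius = np.concatenate([rng.uniform(0.0, 2.0, half),
                             rng.uniform(3.0, 5.0, half)])
    angle = rng.uniform(0.0, 2.0 * np.pi, 2 * half)
    x1 = radius * np.cos(angle) + rng.normal(0.0, noise, 2 * half)
    x2 = radius * np.sin(angle) + rng.normal(0.0, noise, 2 * half)
    y = np.concatenate([np.ones(half), np.zeros(half)])
    return x1, x2, y

rng = np.random.default_rng(seed=2)
x1, x2, y = make_rings(1000, rng)

# Try many straight-line boundaries w1*x1 + w2*x2 + b = 0 and keep the best one.
best_linear_accuracy = 0.0
for direction in np.linspace(0.0, 2.0 * np.pi, 72, endpoint=False):
    w1, w2 = np.cos(direction), np.sin(direction)
    for b in np.linspace(-6.0, 6.0, 61):
        guesses = (w1 * x1 + w2 * x2 + b > 0).astype(float)
        best_linear_accuracy = max(best_linear_accuracy,
                                   float(np.mean(guesses == y)))

print(f"best accuracy over the sweep: {best_linear_accuracy:.2f}")
```

No straight line can enclose the inner disk, so even the best boundary in the sweep leaves a large share of trees on the wrong side.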

Solution to Task 2:

You can improve model performance by doing the following:

Select both polynomial transforms (x₁² and x₂²) and unselect the other three possible features.
Reduce the learning rate to 0.001 or lower.
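The effect of swapping in the squared features can be sketched outside Playground. This is not Playground's implementation: it is a small logistic regression trained by full-batch gradient descent on a hypothetical ring-shaped dataset (`make_rings`), with features standardized for stable steps. Its learning rate is the sketch's own choice and is not comparable to the Playground setting above. For brevity it reports training accuracy rather than Test loss.

```python
import numpy as np

def make_rings(n, rng, noise=0.3):
    """Hypothetical data: a healthy (1) inner disk inside a sick (0) outer ring."""
    half = n // 2
    radius = np.concatenate([rng.uniform(0.0, 2.0, half),
                             rng.uniform(3.0, 5.0, half)])
    angle = rng.uniform(0.0, 2.0 * np.pi, 2 * half)
    x1 = radius * np.cos(angle) + rng.normal(0.0, noise, 2 * half)
    x2 = radius * np.sin(angle) + rng.normal(0.0, noise, 2 * half)
    y = np.concatenate([np.ones(half), np.zeros(half)])
    return x1, x2, y

def train_and_score(features, y, learning_rate=0.5, epochs=500):
    """Fit logistic regression by gradient descent; return training accuracy."""
    # Standardize each feature column (a convenience of this sketch).
    X = (features - features.mean(axis=0)) / features.std(axis=0)
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # predicted probability of "healthy"
        w -= learning_rate * (X.T @ (p - y)) / len(y)
        b -= learning_rate * float(np.mean(p - y))
    return float(np.mean(((X @ w + b) > 0) == (y == 1)))

rng = np.random.default_rng(seed=3)
x1, x2, y = make_rings(1000, rng)

linear_accuracy = train_and_score(np.column_stack([x1, x2]), y)
squared_accuracy = train_and_score(np.column_stack([x1**2, x2**2]), y)
print(f"x1, x2 features:     accuracy = {linear_accuracy:.2f}")
print(f"x1^2, x2^2 features: accuracy = {squared_accuracy:.2f}")
```

The squared features help because x₁² + x₂² is the squared distance from the center, so a linear boundary in the squared features corresponds to a circle (or ellipse) in the original square.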
