You may recall from the Feature cross exercises in the Categorical data module, that the following classification problem is nonlinear:

"Nonlinear" means that you can't accurately predict a label with a model of the form . In other words, the "decision surface" is not a line.
However, if we perform a feature cross on our features and , we can then represent the nonlinear relationship between the two features using a linear model: where is the feature cross between and :

Now consider the following dataset:

You may also recall from the Feature cross exercises that determining the correct feature crosses to fit a linear model to this data took a bit more effort and experimentation.
But what if you didn't have to do all that experimentation yourself? Neural networks are a family of model architectures designed to find nonlinear patterns in data. During training of a neural network, the model automatically learns the optimal feature crosses to perform on the input data to minimize loss.
In the following sections, we'll take a closer look at how neural networks work.