Production ML Systems: Test Your Knowledge

You are using machine learning to build a classification model that predicts unicorn appearances. Your dataset details 10,000 unicorn appearances and 10,000 unicorn non-appearances. The dataset contains the location, time of day, elevation, temperature, humidity, tree cover, presence of a rainbow, and several other features.

After launching your unicorn appearance predictor, you will need to keep your model fresh by retraining on new data. Because you are gathering too much new data to train on, you decide to limit the training data by sampling the new data over a window of time. You also need to account for daily and annual patterns in unicorn appearances. What window of time do you choose?

One day, because a larger window would result in lots of data and your model would take too long to train.

One week, so that your dataset is not too large but you can still smooth out patterns.

One year, to ensure that your model is not biased by yearly patterns.
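As a rough illustration of limiting training data to a time window, the sketch below keeps only records from a trailing window and then downsamples them uniformly. The function name `sample_window`, the record layout, and the example cap of 1,000 rows are all assumptions made for illustration, not part of the exercise. The default of 365 days matches the one-year option, so the window spans a full annual cycle and every time of day.

```python
import random
from datetime import datetime, timedelta


def sample_window(records, now, window_days=365, max_examples=1000, seed=0):
    """Keep records inside the trailing window, then downsample uniformly.

    `records` is assumed to be a list of dicts with a `timestamp` key
    (a hypothetical layout chosen for this sketch).
    """
    cutoff = now - timedelta(days=window_days)
    recent = [r for r in records if cutoff <= r["timestamp"] <= now]
    if len(recent) <= max_examples:
        return recent
    # Uniform sampling without replacement keeps the seasonal and
    # time-of-day mix of the window roughly intact.
    rng = random.Random(seed)
    return rng.sample(recent, max_examples)


# Hypothetical data: one record every 6 hours going back two years.
now = datetime(2024, 1, 1)
records = [
    {"timestamp": now - timedelta(hours=6 * i), "appeared": i % 2 == 0}
    for i in range(4 * 730)
]
sample = sample_window(records, now)
```

Shrinking `window_days` to 1 or 7 would make training cheaper, but the sample would then contain no information about patterns that repeat on a yearly scale.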

Training-serving skew: the format of the serving data gradually changed at some point after the model started serving.

You used accuracy as a metric during training.

Your model is stale.

None of the above.

You didn't have enough training examples for Antarctica.

You used dynamic training instead of static training.

Your model has become stale.

All of the above.

The high quality of your predictions leads users to easily find unicorns, affecting unicorn appearance behavior itself.

Urban areas are difficult to model.

Unicorn appearances are reported multiple times in heavily populated areas, skewing your training data.

Switch the model from dynamic training to static training.

Switch the model from dynamic inference to static inference.

Validate the model quality before serving.

None of the above solutions would help.
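The option "Validate the model quality before serving" can be sketched as a simple gate that compares a freshly trained candidate against the model currently in production on a held-out set. Everything here is hypothetical: the helper names, the `max_drop` tolerance, the toy holdout data, and the use of accuracy as the metric are illustrative assumptions, and a real pipeline would pick its own metric and threshold.

```python
def accuracy(predict, examples):
    """Fraction of (features, label) pairs that `predict` labels correctly."""
    correct = sum(1 for features, label in examples if predict(features) == label)
    return correct / len(examples)


def passes_quality_gate(candidate, baseline, holdout, max_drop=0.01):
    """Return True if the candidate is at most `max_drop` worse than the baseline."""
    return accuracy(candidate, holdout) >= accuracy(baseline, holdout) - max_drop


# Hypothetical holdout set of (features, label) pairs.
holdout = [
    ({"rainbow": True}, 1),
    ({"rainbow": False}, 0),
    ({"rainbow": True}, 1),
    ({"rainbow": False}, 1),
]


def baseline(features):
    # Stand-in for the model currently being served.
    return int(features["rainbow"])


def broken_candidate(features):
    # Stand-in for a retrained model that has degraded.
    return 0


serve_new_model = passes_quality_gate(broken_candidate, baseline, holdout)
```

With a gate like this in front of deployment, a candidate that has regressed is held back and the previous model keeps serving.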