Datasets, generalization, and overfitting: Test Your Knowledge

  1. Which of the following is an example of a stationary dataset?

  2. You are preparing to train a model for predicting the sale price of used cars using a dataset that contains the following features: year, model, and mileage. When you examine the dataset, you discover that mileage values are missing for 150 out of 2,500 examples. Which of the following options would be reasonable actions to take? (Choose all that apply)

    Choose as many answers as you see fit.

  3. You are training a streaming service's movie-recommendation model to predict whether or not a user will enjoy a given movie. Which of the following would be reasonable proxy labels for "User enjoyed the movie"? (Choose all that apply)

    Choose as many answers as you see fit.

  4. True or False: Training your model until it achieves a low loss value on your test data is a good way to prevent overfitting.

  5. Fill in the blank in the following sentence:
    Regularization improves your model's ability to generalize to new data by penalizing ___ during training.