Working with Categorical Data: Test Your Knowledge

  1. Which of the following are examples of categorical data? (Choose all that apply)

  2. True or False: Machine labels are generally considered more desirable than labels provided by human raters.

  3. You are training a model on a dataset that includes the feature eye_color, which takes one of the following six values: amber, blue, brown, gray, green, hazel.
    Which of the following are valid encodings for an eye_color value of blue? (Choose all that apply)

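As a rough aid for thinking about question 3, the sketch below shows two common ways to encode a categorical value: as a vocabulary index and as a one-hot vector. The helper names `index_encode` and `one_hot_encode` are hypothetical, and the alphabetical vocabulary order is an assumption; a different ordering would shift the position of blue.

```python
# Vocabulary from question 3; the alphabetical ordering is an assumption.
EYE_COLORS = ["amber", "blue", "brown", "gray", "green", "hazel"]


def index_encode(value, vocabulary):
    """Return the position of `value` in the vocabulary (hypothetical helper)."""
    return vocabulary.index(value)


def one_hot_encode(value, vocabulary):
    """Return a vector with a single 1 at the value's vocabulary position."""
    vector = [0] * len(vocabulary)
    vector[vocabulary.index(value)] = 1
    return vector


blue_index = index_encode("blue", EYE_COLORS)
blue_one_hot = one_hot_encode("blue", EYE_COLORS)
print(blue_index)
print(blue_one_hot)
```

The one-hot vector always has as many entries as the vocabulary has values, with exactly one entry set to 1.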

  4. In which of the following scenarios would it make sense to apply feature hashing?
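For background on question 4, here is a minimal sketch of what feature hashing does: it maps each category string to one of a fixed number of buckets, so no explicit vocabulary is needed. The function name `hash_bucket`, the bucket count of 10, and the choice of MD5 are all assumptions made for illustration, not part of the quiz.

```python
import hashlib


def hash_bucket(value, num_buckets):
    """Map a category string to a bucket in [0, num_buckets) (hypothetical helper).

    MD5 is used only because it is deterministic across runs; any stable
    hash function would serve the same purpose here.
    """
    digest = hashlib.md5(value.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_buckets


NUM_BUCKETS = 10  # assumed bucket count for illustration
buckets = {word: hash_bucket(word, NUM_BUCKETS) for word in ["amber", "blue", "brown"]}
print(buckets)
```

Because distinct values can land in the same bucket, hashing trades some collisions for a bounded feature size.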

  5. You are performing a feature cross of the following two categorical features:

    • apple_color, which takes one of these four values: green, red, white, or yellow
    • apple_texture, which takes one of these two values: crisp or mushy

    How many entries are in the resulting feature-cross vector?
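One way to check the count in question 5 is to enumerate the crossed categories directly. The sketch below assumes the cross is the Cartesian product of the two vocabularies, one-hot encoded; the helper name `cross_one_hot` is hypothetical.

```python
from itertools import product

# Vocabularies taken from question 5.
APPLE_COLORS = ["green", "red", "white", "yellow"]
APPLE_TEXTURES = ["crisp", "mushy"]

# Every (color, texture) pair becomes one position in the crossed vector.
CROSSED_VOCAB = list(product(APPLE_COLORS, APPLE_TEXTURES))


def cross_one_hot(color, texture):
    """One-hot encode a (color, texture) pair over the crossed vocabulary."""
    vector = [0] * len(CROSSED_VOCAB)
    vector[CROSSED_VOCAB.index((color, texture))] = 1
    return vector


num_entries = len(CROSSED_VOCAB)
red_crisp = cross_one_hot("red", "crisp")
print(num_entries)
print(red_crisp)
```

The length of the crossed vector is the product of the two vocabulary sizes, which is why feature crosses of large vocabularies grow quickly.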