As you prepare your data for model training and evaluation, keep fairness in mind and audit for potential sources of bias, so you can mitigate their effects before releasing your model into production.
Where might bias lurk? Here are some red flags to look out for in your dataset.
Missing feature values
If one or more features in your dataset are missing values for a large number of examples, that could indicate that certain key characteristics of your dataset are under-represented.
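One way to start such an audit is to measure how often each feature is missing, and whether the gaps cluster in a particular group. The following is a minimal sketch, not a definitive method: the rows, the field names, and the helper functions `missing_rates` and `missing_rate_by_group` are all hypothetical, and `None` is assumed to mark a missing value.

```python
# Hypothetical examples for illustration only; None marks a missing value.
examples = [
    {"breed": "toy poodle", "age": 2, "temperament": "excitable"},
    {"breed": "toy poodle", "age": 5, "temperament": "calm"},
    {"breed": "golden retriever", "age": 7, "temperament": "calm"},
    {"breed": "unknown mixed breed", "age": 4, "temperament": None},
    {"breed": "unknown mixed breed", "age": None, "temperament": None},
    {"breed": "unknown mixed breed", "age": 3, "temperament": "excitable"},
]


def missing_rates(rows, features):
    """Return the fraction of rows in which each feature is None or absent."""
    total = len(rows)
    return {
        feature: sum(1 for row in rows if row.get(feature) is None) / total
        for feature in features
    }


def missing_rate_by_group(rows, feature, group_key):
    """Return the fraction of missing `feature` values within each group."""
    counts = {}  # group -> (rows seen, rows missing the feature)
    for row in rows:
        group = row.get(group_key)
        seen, missing = counts.get(group, (0, 0))
        counts[group] = (seen + 1, missing + int(row.get(feature) is None))
    return {group: missing / seen for group, (seen, missing) in counts.items()}


rates = missing_rates(examples, ["breed", "age", "temperament"])
by_breed = missing_rate_by_group(examples, "temperament", "breed")

print(rates)
print(by_breed)
```

If the per-group rates differ sharply, as they do for the mixed-breed rows in this made-up sample, the missing values are probably not random and deserve a closer look.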
Exercise: Check your understanding
You discover that 1,500 of the 5,000 examples in the training set are missing temperament values. Which of the following are potential sources of bias you should investigate?
Unexpected feature values
When exploring data, you should also look for examples that contain feature values that stand out as especially uncharacteristic or unusual. These unexpected feature values could indicate problems that occurred during data collection or other inaccuracies that could introduce bias.
Exercise: Check your understanding
Review the following hypothetical set of examples for training a rescue-dog adoptability model.
breed | age (yrs) | weight (lbs) | temperament | shedding_level |
---|---|---|---|---|
toy poodle | 2 | 12 | excitable | low |
golden retriever | 7 | 65 | calm | high |
labrador retriever | 35 | 73 | calm | high |
french bulldog | 0.5 | 11 | calm | medium |
unknown mixed breed | 4 | 45 | excitable | high |
basset hound | 9 | 48 | calm | medium |
The oldest dog to have their age verified by Guinness World Records was Bluey, an Australian Cattle Dog who lived to be 29 years and 5 months. Given that, it seems quite implausible that the labrador retriever is actually 35 years old, and more likely that the dog's age was either calculated or recorded inaccurately (maybe the dog is actually 3.5 years old). This error could also be indicative of broader accuracy issues with age data in the dataset that merit further investigation.
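A plausibility check like the one described above can be automated. The sketch below runs a simple range check over the age and weight columns from the table. The `find_out_of_range` helper and the field names are hypothetical, and the age ceiling is an assumption taken from the verified record of 29 years and 5 months; a real audit would choose bounds per feature with domain knowledge.

```python
# Rows copied from the example table (age in years, weight in pounds).
rows = [
    {"breed": "toy poodle", "age_yrs": 2, "weight_lbs": 12},
    {"breed": "golden retriever", "age_yrs": 7, "weight_lbs": 65},
    {"breed": "labrador retriever", "age_yrs": 35, "weight_lbs": 73},
    {"breed": "french bulldog", "age_yrs": 0.5, "weight_lbs": 11},
    {"breed": "unknown mixed breed", "age_yrs": 4, "weight_lbs": 45},
    {"breed": "basset hound", "age_yrs": 9, "weight_lbs": 48},
]

# Assumed plausibility bounds: the upper age limit is the verified
# record age of 29 years and 5 months.
PLAUSIBLE_RANGES = {"age_yrs": (0, 29 + 5 / 12)}


def find_out_of_range(rows, ranges):
    """Return (row index, feature, value) for each value outside its range."""
    flagged = []
    for index, row in enumerate(rows):
        for feature, (low, high) in ranges.items():
            value = row.get(feature)
            if value is not None and not (low <= value <= high):
                flagged.append((index, feature, value))
    return flagged


flagged = find_out_of_range(rows, PLAUSIBLE_RANGES)
for index, feature, value in flagged:
    print(f"{rows[index]['breed']}: {feature}={value} is outside the plausible range")
```

A check like this only flags candidates for review. Deciding whether a flagged value is a typo, a unit error, or a genuine outlier still requires looking at how the data was collected.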
Data skew
Any sort of skew in your data, where certain groups or characteristics may be under- or over-represented relative to their real-world prevalence, can introduce bias into your model.
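To make skew concrete, you can compare each group's share of the dataset against a reference share. The sketch below is illustrative only: the size labels, the reference proportions, and the `representation_gaps` helper are all invented for the example, and in practice the reference shares would have to come from a trustworthy estimate of real-world prevalence.

```python
from collections import Counter

# Hypothetical size-group labels for ten training examples.
size_groups = ["small"] * 6 + ["medium"] * 3 + ["large"] * 1

# Hypothetical reference shares (assumed, not real-world statistics).
reference_shares = {"small": 0.3, "medium": 0.4, "large": 0.3}


def representation_gaps(values, reference):
    """Return dataset share minus reference share for each group.

    Positive values mean the group is over-represented in the dataset;
    negative values mean it is under-represented.
    """
    counts = Counter(values)
    total = len(values)
    return {group: counts[group] / total - share for group, share in reference.items()}


gaps = representation_gaps(size_groups, reference_shares)
for group, gap in gaps.items():
    print(f"{group}: {gap:+.2f}")
```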
When auditing model performance, it's important not only to look at results in aggregate, but to break out results by subgroup. For example, in the case of our rescue-dog adoptability model, to ensure fairness, it's not sufficient to simply look at overall accuracy. We should also audit performance by subgroup to ensure the model performs equally well for each dog breed, age group, and size group.
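The sketch below shows how an aggregate metric can hide a weak subgroup. The labels, predictions, age groups, and the `accuracy_by_group` helper are all hypothetical; accuracy is used only because it is the metric mentioned above, and the same breakdown applies to other metrics.

```python
from collections import defaultdict

# Hypothetical ground-truth labels (1 = adopted), model predictions,
# and the age group of each example.
labels = [1, 1, 0, 1, 0, 1, 0, 0]
predictions = [1, 1, 0, 1, 0, 0, 1, 0]
age_groups = ["puppy", "puppy", "puppy", "adult", "adult", "senior", "senior", "senior"]


def accuracy_by_group(labels, predictions, groups):
    """Return the fraction of correct predictions within each group."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for label, prediction, group in zip(labels, predictions, groups):
        total[group] += 1
        correct[group] += int(label == prediction)
    return {group: correct[group] / total[group] for group in total}


overall = sum(int(l == p) for l, p in zip(labels, predictions)) / len(labels)
per_group = accuracy_by_group(labels, predictions, age_groups)

print(f"overall: {overall:.2f}")
for group, accuracy in per_group.items():
    print(f"{group}: {accuracy:.2f}")
```

In this made-up sample the overall accuracy looks reasonable, yet every error falls on senior dogs, which is exactly the kind of gap that an aggregate number conceals.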
Later in this module, in Evaluating for Bias, we'll take a closer look at different methods for evaluating models by subgroup.