Fairness: Types of bias

Machine learning (ML) models are not inherently objective. ML practitioners train models by feeding them a dataset of training examples, and human involvement in the provision and curation of this data can make a model's predictions susceptible to bias.

When building models, it's important to be aware of common human biases that can manifest in your data, so you can take proactive steps to mitigate their effects.

Reporting bias

Reporting bias occurs when the frequency of events, properties, and/or outcomes captured in a dataset doesn't accurately reflect their real-world frequency. This bias can arise because people tend to document circumstances that are unusual or especially memorable, assuming that the ordinary can go without saying.

Historical bias

Historical bias occurs when historical data reflects inequities that existed in the world at the time the data was collected. A model trained on such data can learn, and then perpetuate, those past inequities.

Automation bias

Automation bias is a tendency to favor results generated by automated systems over those generated by non-automated systems, irrespective of the error rates of each.

Selection bias

Selection bias occurs when a dataset's examples are chosen in a way that doesn't reflect their real-world distribution. Selection bias can take many different forms, including coverage bias, non-response bias, and sampling bias, described below; the sketch after these subtypes shows one simple way to test a sample against a reference distribution.

Coverage bias

Coverage bias occurs when data is not selected in a representative fashion. For example, a model trained on phone surveys of a product's existing customers won't capture the perspective of consumers who chose a competing product.

Non-response bias

Non-response bias (also called participation bias) occurs when data ends up being unrepresentative because of participation gaps in the data-collection process. For example, people with strong opinions may respond to a survey more readily than people with moderate ones.

Sampling bias

Sampling bias occurs when proper randomization is not used during data collection, so some members of the intended population are less likely to be included than others.
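
The following sketch illustrates one lightweight check for selection bias: comparing the group shares in a training sample against a trusted reference distribution for the population the model will serve. It's a minimal illustration in plain Python; the group names, percentages, and the 5-percentage-point divergence threshold are all hypothetical.

```python
from collections import Counter

def shares(values):
    """Return each value's relative frequency in a list of examples."""
    counts = Counter(values)
    total = len(values)
    return {value: count / total for value, count in counts.items()}

# Hypothetical reference distribution for the population the model will
# serve, e.g., taken from census data.
population = {"urban": 0.55, "suburban": 0.30, "rural": 0.15}

# Hypothetical training sample collected through an online survey, which
# over-represents urban respondents (coverage bias).
training_sample = ["urban"] * 70 + ["suburban"] * 25 + ["rural"] * 5

sample_shares = shares(training_sample)

# Flag any group whose share in the sample diverges from the reference
# distribution by more than 5 percentage points.
for group, expected in population.items():
    observed = sample_shares.get(group, 0.0)
    if abs(observed - expected) > 0.05:
        print(f"{group}: {observed:.0%} of sample vs. {expected:.0%} of population")
```

A check like this won't catch every form of selection bias (non-response bias, for example, requires knowing who declined to participate), but it's a cheap first test to run before training.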

Group attribution bias

Group attribution bias is a tendency to generalize what is true of individuals to the entire group to which they belong. Group attribution bias often manifests in the two forms described below.

In-group bias

In-group bias is a preference for members of a group to which you also belong, or for characteristics that you share with that group.

Out-group homogeneity bias

Out-group homogeneity bias is a tendency to stereotype individual members of a group to which you don't belong, or to see their characteristics as more uniform than they really are.

Implicit bias

Implicit bias occurs when assumptions are made based on one's own mental models and personal experiences that don't necessarily apply more generally.

Confirmation bias

Confirmation bias occurs when model builders unconsciously process data in ways that affirm their preexisting beliefs and hypotheses.

Experimenter's bias

Experimenter's bias is a form of confirmation bias in which a model builder keeps training a model until it produces a result that aligns with their original hypothesis.

Exercise: Check your understanding

Which of the following types of bias could have contributed to the skewed predictions in the college admissions model described in the introduction?

Historical bias

The admissions model was trained on student records from the past 20 years. If minority students were underrepresented in this data, the model could have reproduced the same historical inequities when making predictions on new student data.

In-group bias

The admissions model was trained by current university students, who could have had an unconscious preference for admitting students who came from backgrounds similar to their own, which could have affected how they curated or feature-engineered the data on which the model was trained.

Confirmation bias

The admissions model was trained by current university students, who likely had preexisting beliefs about what types of qualifications correlate with success in the computer science program. They could have inadvertently curated or feature-engineered the data so that the model affirmed these existing beliefs.

Automation bias

Automation bias might explain why the admissions committee chose to use an ML model to make admissions decisions; they might have believed an automated system would produce better results than decisions made by humans. However, automation bias doesn't provide any insight into why the model's predictions ended up being skewed.