Once a source of bias has been identified in the training data, we can take proactive steps to mitigate its effects. There are two main strategies that machine learning (ML) engineers typically employ to remediate bias:
- Augmenting the training data.
- Adjusting the model's loss function.
Augmenting the training data
If an audit of the training data has uncovered issues with missing, incorrect, or skewed data, the most straightforward way to address the problem is often to collect additional data.
However, while augmenting the training data can be the ideal solution, it may also be infeasible, either because suitable data isn't available or because resource constraints impede collection. For example, gathering more data might be too costly or time-consuming, or might not be viable due to legal or privacy restrictions.
Adjusting the model's loss function
In cases where collecting additional training data isn't viable, another approach to mitigating bias is to adjust how loss is calculated during model training. We typically use a loss function like log loss to penalize incorrect model predictions. However, log loss does not take subgroup membership into consideration. So instead of relying on log loss alone, we can choose a loss function designed to penalize errors in a fairness-aware fashion that counteracts the imbalances we've identified in our training data.
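To make the idea concrete, here's a minimal sketch of what a fairness-aware loss could look like. It's purely illustrative (not a function from any particular library): it adds a penalty to standard log loss when the average predictions for two subgroups diverge. The function name, arguments, and penalty weight are placeholders.

```python
import tensorflow as tf

# Standard log loss (binary cross-entropy) penalizes incorrect predictions
# but is unaware of subgroup membership.
bce = tf.keras.losses.BinaryCrossentropy()

def fairness_aware_loss(y_true, y_pred, group_mask, penalty_weight=1.0):
    """Illustrative only: log loss plus a penalty on the gap between the
    mean predictions of two subgroups.

    y_true, y_pred: tensors of shape (batch_size, 1).
    group_mask: tensor of shape (batch_size, 1); 1.0 for examples in the
        subgroup of interest, 0.0 for all other examples.
    """
    base_loss = bce(y_true, y_pred)

    # Mean prediction for each subgroup (epsilon guards against empty groups).
    eps = 1e-8
    in_group_mean = tf.reduce_sum(y_pred * group_mask) / (
        tf.reduce_sum(group_mask) + eps)
    out_group_mean = tf.reduce_sum(y_pred * (1.0 - group_mask)) / (
        tf.reduce_sum(1.0 - group_mask) + eps)

    # Penalize differences between the two groups' prediction distributions.
    return base_loss + penalty_weight * tf.abs(in_group_mean - out_group_mean)
```

In practice you wouldn't hand-roll a loss like this; the library described below implements more principled versions of the same idea.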
The TensorFlow Model Remediation Library provides utilities for applying two different bias-mitigation techniques during model training:
- MinDiff: MinDiff aims to balance the errors for two different slices of data (e.g., male/female students versus nonbinary students) by adding a penalty for differences in the prediction distributions for the two groups. A brief code sketch follows this list.
- Counterfactual Logit Pairing: Counterfactual Logit Pairing (CLP) aims to ensure that changing a sensitive attribute of a given example doesn't alter the model's prediction for that example. For example, if a training dataset contains two examples whose feature values are identical, except one has a `gender` value of `male` and the other has a `gender` value of `nonbinary`, CLP will add a penalty if the predictions for these two examples are different.
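As a rough sketch of what applying MinDiff might look like, the following uses the library's Keras integration with a tiny synthetic dataset standing in for real training data. The model architecture, hyperparameters, and dataset construction are placeholder assumptions, and the exact argument names should be verified against the library's documentation.

```python
import numpy as np
import tensorflow as tf
from tensorflow_model_remediation import min_diff

# Tiny synthetic stand-ins for real data: two numeric features, one binary label.
def make_dataset(num_examples, seed):
    rng = np.random.default_rng(seed)
    features = rng.normal(size=(num_examples, 2)).astype('float32')
    labels = rng.integers(0, 2, size=(num_examples, 1)).astype('float32')
    return tf.data.Dataset.from_tensor_slices((features, labels)).batch(32)

train_ds = make_dataset(512, seed=0)      # full training set
majority_ds = make_dataset(256, seed=1)   # slice 1 (e.g., male/female students)
minority_ds = make_dataset(64, seed=2)    # slice 2 (e.g., nonbinary students)

# A placeholder binary classifier; substitute your own architecture.
original_model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dense(1, activation='sigmoid'),
])

# Wrap the original model so training adds a MinDiff penalty when the
# prediction distributions for the two slices diverge.
min_diff_model = min_diff.keras.MinDiffModel(
    original_model=original_model,
    loss=min_diff.losses.MMDLoss(),  # distance between prediction distributions
    loss_weight=1.0)

# Pack the main training data together with the two slices.
packed_train_ds = min_diff.keras.utils.pack_min_diff_data(
    original_dataset=train_ds,
    sensitive_group_dataset=minority_ds,
    nonsensitive_group_dataset=majority_ds)

# Compile and train as usual; the original loss still drives overall accuracy,
# while the MinDiff loss nudges the two slices' predictions closer together.
min_diff_model.compile(
    optimizer='adam',
    loss=tf.keras.losses.BinaryCrossentropy(),
    metrics=['accuracy'])
min_diff_model.fit(packed_train_ds, epochs=5)
```

The library provides an analogous Keras wrapper for Counterfactual Logit Pairing; consult its documentation for the corresponding utilities.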
The techniques you choose for adjusting the loss function depend on the model's use cases. In the next section, we'll take a closer look at how to approach the task of evaluating a model for fairness by considering these use cases.