Data Wrangling for Fair Classification

Lacramioara Mazilu, Norman Paton, Nikolaos Konstantinou, Alvaro Fernandes

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review


Whenever decisions that affect people are informed by classifiers, there is a risk that the decisions can discriminate against certain groups as a result of bias in the training data. There has been significant work to address this, based on pre-processing the inputs to the classifier, changing the classifier itself, or post-processing the results of the classifier. However, upstream from these steps, there may be a variety of data wrangling processes that select and integrate the data used to train the classifier, and these steps could themselves lead to bias. In this paper, we propose an approach that generates schema mappings in ways that take into account the bias observed in classifiers trained on the results of different mappings. The approach searches a space of candidate interventions in the mapping generation process, each of which changes how the mappings are generated, guided by a bias-aware fitness function. The approach is evaluated using the Adult Census and German Credit data sets.
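
The abstract does not spell out the fitness function or the form of the interventions, so the following is only a minimal sketch of what a bias-aware fitness function and a selection step over candidate interventions might look like. Everything in it is an assumption made for illustration: the use of demographic parity difference as the bias measure, the linear trade-off controlled by `bias_weight`, the intervention names, and the toy predictions. It is not the paper's method.

```python
from typing import Dict, List, Sequence, Tuple


def demographic_parity_difference(preds: Sequence[int], groups: Sequence[str]) -> float:
    """Gap between the highest and lowest positive-prediction rates across groups.

    Assumed bias measure for this sketch; the paper may use a different one.
    """
    totals: Dict[str, int] = {}
    positives: Dict[str, int] = {}
    for pred, group in zip(preds, groups):
        totals[group] = totals.get(group, 0) + 1
        positives[group] = positives.get(group, 0) + (1 if pred == 1 else 0)
    rates = [positives[g] / totals[g] for g in totals]
    return max(rates) - min(rates) if rates else 0.0


def accuracy(preds: Sequence[int], labels: Sequence[int]) -> float:
    """Fraction of predictions that match the true labels."""
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def fitness(preds: Sequence[int], labels: Sequence[int], groups: Sequence[str],
            bias_weight: float = 0.5) -> float:
    """Hypothetical bias-aware fitness: reward accuracy, penalise group disparity."""
    return ((1.0 - bias_weight) * accuracy(preds, labels)
            - bias_weight * demographic_parity_difference(preds, groups))


Outcome = Tuple[List[int], List[int], List[str]]  # (predictions, labels, groups)


def select_intervention(outcomes: Dict[str, Outcome], bias_weight: float = 0.5) -> str:
    """Return the candidate intervention whose classifier outcome has the best fitness."""
    return max(outcomes, key=lambda name: fitness(*outcomes[name], bias_weight=bias_weight))


# Toy data: classifier outputs obtained after training on the result of each
# (hypothetical) mapping-generation intervention.
labels = [1, 1, 1, 0, 0, 0]
groups = ["a", "a", "a", "b", "b", "b"]
outcomes: Dict[str, Outcome] = {
    "no_intervention": ([1, 1, 1, 0, 0, 0], labels, groups),   # accurate, but all positives go to group "a"
    "exclude_source_x": ([1, 1, 0, 1, 0, 0], labels, groups),  # less accurate, smaller disparity
}
best = select_intervention(outcomes)
```

With equal weight on accuracy and disparity, the sketch prefers the intervention that narrows the gap between groups even though it costs some accuracy; lowering `bias_weight` shifts the preference back towards accuracy.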
Original language: English
Title of host publication: Proceedings of the Workshops of the EDBT/ICDT 2021 Joint Conference
Publication status: Published - 23 Mar 2021

