Fairness-Aware Data Integration

Lacramioara Mazilu, Norman Paton, Nikolaos Konstantinou, Alvaro Fernandes

Research output: Contribution to journal › Article › peer-review

Abstract

Machine learning can be applied in applications that take decisions that impact people’s lives. Such techniques have the potential to make decision making more objective, but there is also a risk that the decisions can discriminate against certain groups as a result of bias in the underlying data. Reducing bias, or promoting fairness, has been a focus of significant investigation in machine learning, for example based on pre-processing the training data, changing the learning algorithm, or post-processing the results of the learning. However, prior to these activities, data integration discovers and integrates the data that is used for training, and data integration processes have the potential to produce data that leads to biased conclusions. In this paper, we propose an approach that generates schema mappings in ways that take into account: (i) properties that are intrinsic to mapping results that may give rise to bias in analyses; and (ii) bias observed in classifiers trained on the results of different sets of mappings. The approach explores a space of different ways of integrating the data, using a tabu search algorithm, guided by bias-aware objective functions that represent different types of bias. The resulting approach is evaluated using the Adult Census and German Credit datasets, to explore the extent to which and the circumstances in which the approach can increase the fairness of the results of the data integration process.
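
The search described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the candidate mappings, the toy rows, the `bias` objective (a difference in positive-label rates between two groups) and the parameters `iterations` and `tenure` are all assumptions made for illustration. The sketch only shows the general shape of a tabu search over subsets of candidate mappings, guided by a bias-aware objective.

```python
from itertools import chain

# Hypothetical toy data: each candidate mapping yields rows of
# (protected_group, positive_label). Not taken from the paper.
CANDIDATE_MAPPINGS = [
    [("a", 1), ("a", 1), ("b", 0)],
    [("b", 1), ("b", 1), ("a", 0)],
    [("a", 1), ("b", 0), ("b", 0)],
    [("a", 0), ("b", 1)],
]


def bias(selection):
    """Assumed objective: absolute difference in positive-label rate
    between groups 'a' and 'b' in the union of the selected mappings."""
    rows = list(chain.from_iterable(
        m for m, keep in zip(CANDIDATE_MAPPINGS, selection) if keep))
    rates = []
    for group in ("a", "b"):
        labels = [y for g, y in rows if g == group]
        if not labels:
            return 1.0  # a missing group is treated as maximal bias
        rates.append(sum(labels) / len(labels))
    return abs(rates[0] - rates[1])


def tabu_search(start, iterations=20, tenure=2):
    """Flip one mapping in or out per step; recently flipped positions
    are tabu for `tenure` steps. No aspiration criterion is used."""
    current = best = tuple(start)
    tabu = []  # indices of recently flipped mappings
    for _ in range(iterations):
        neighbours = []
        for i in range(len(current)):
            if i in tabu:
                continue
            cand = current[:i] + (1 - current[i],) + current[i + 1:]
            neighbours.append((bias(cand), i, cand))
        if not neighbours:
            break
        # Move to the best non-tabu neighbour, even if it is worse
        # than the current selection, to escape local minima.
        score, move, current = min(neighbours)
        tabu = (tabu + [move])[-tenure:]
        if score < bias(best):
            best = current
    return best, bias(best)


best_selection, best_bias = tabu_search((1, 0, 0, 0))
```

Here a selection is a tuple of 0/1 flags, one per candidate mapping. A classifier-based objective, as in point (ii) of the abstract, could in principle be swapped in for `bias` without changing the search loop.
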
Original language: English
Journal: Journal of Data and Information Quality
Publication status: Accepted/In press - 17 Feb 2022
