Publication
Title
Real-life performance of fairness interventions: introducing a new benchmarking dataset for fair ML
Author
Abstract
Some researchers evaluate their fair Machine Learning (ML) algorithms on simulated data with both a fair and a biased version of the labels. The fair labels reflect what labels individuals deserve, while the biased labels reflect labels obtained through a biased decision process. Given such data, fair algorithms are evaluated by measuring how well they can predict the fair labels after being trained on the biased ones. The main problem with these approaches is that they are based on simulated data, which is unlikely to capture the full complexity and noise of real-life decision problems. In this paper, we show how we created a new, more realistic dataset with both fair and biased labels. For this purpose, we started with an existing dataset containing information about high school students and whether they passed an exam. Through a human experiment, in which participants estimated the school performance of these students from short descriptions, we collected a biased version of these labels. We show how this new dataset can be used to evaluate fair ML algorithms, and how some fairness interventions that perform well in traditional evaluation schemes do not necessarily perform well with respect to the unbiased labels in our dataset, leading to new insights into the performance of debiasing techniques.
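The evaluation scheme described above (train on biased labels, score against fair labels) can be sketched with toy data. This is a minimal illustration, not the paper's dataset or method: the data generator, the 30% label-flip rate, the threshold classifier, and all names are assumptions made for this example.

```python
import random

random.seed(0)

# Toy data: one score per student; the (hypothetical) fair label is
# whether the score crosses a passing threshold of 50.
n = 1000
scores = [random.gauss(50, 15) for _ in range(n)]
group = [random.choice([0, 1]) for _ in range(n)]  # protected attribute
fair = [int(s >= 50) for s in scores]

# Biased labels: passing decisions for group 1 are sometimes flipped to
# "fail", mimicking a biased human decision process (flip rate assumed).
biased = [
    0 if (g == 1 and f == 1 and random.random() < 0.3) else f
    for f, g in zip(fair, group)
]

def fit_threshold(xs, ys):
    """"Train" a one-parameter classifier: pick the cutoff that best
    reproduces the given (biased) labels."""
    best_t, best_acc = None, -1.0
    for t in range(30, 70):
        acc = sum(int(x >= t) == y for x, y in zip(xs, ys)) / len(ys)
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t

t = fit_threshold(scores, biased)
preds = [int(s >= t) for s in scores]

# Traditional evaluation scores against the biased labels; the scheme in
# the abstract scores the same predictions against the fair labels.
acc_biased = sum(p == y for p, y in zip(preds, biased)) / n
acc_fair = sum(p == y for p, y in zip(preds, fair)) / n
print(f"accuracy vs biased labels: {acc_biased:.2f}")
print(f"accuracy vs fair labels:   {acc_fair:.2f}")
```

A gap between the two printed accuracies illustrates why a model tuned to biased decisions can look good under traditional evaluation yet miss the labels individuals actually deserve.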
Language
English
Source (book)
Proceedings of the 38th ACM/SIGAPP Symposium on Applied Computing (SAC '23), March 27 – March 31, 2023, Tallinn, Estonia
Publication
ACM, 2023
ISBN
978-1-4503-9517-5
DOI
10.1145/3555776.3577634
Volume/pages
p. 350-357
ISI
001124308100049
UAntwerpen
Faculty/Department
Research group
Publication type
Subject
Affiliation
Publications with a UAntwerp address
Identifier
Creation 16.10.2023
Last edited 08.05.2024