Title
|
|
|
|
Top-K contextual bandits with equity of exposure
| |
Author
|
|
|
|
| |
Abstract
|
|
|
|
The contextual bandit paradigm provides a general framework for decision-making under uncertainty. It is theoretically well-defined and well-studied, and many personalisation use-cases can be cast as a bandit learning problem. Because this allows for the direct optimisation of utility metrics that rely on online interventions, such as click-through rate (CTR), the framework has become an attractive choice for practitioners. Historically, the literature on this topic has focused on a one-sided, user-focused notion of utility, largely disregarding the perspective of content providers in online marketplaces (for example, musical artists on streaming services). When this perspective is not properly taken into account, recommendation systems in such environments are known to lead to unfair distributions of attention and exposure, which can directly affect the income of the providers. Recent work has shed light on this, and there is now a growing consensus that some notion of “equity of exposure” might be preferable in many recommendation use-cases. We study how the top-K contextual bandit problem relates to issues of disparate exposure, and how this disparity can be minimised. The predominant approach in practice is to greedily rank the top-K items according to their estimated utility, as this is optimal according to the well-known Probability Ranking Principle. Instead, we introduce a configurable tolerance parameter that defines an acceptable decrease in utility for a maximal increase in fairness of exposure. We propose a personalised exposure-aware arm selection algorithm that handles this relevance-fairness trade-off at the user level, as recent work suggests that users’ openness to randomisation may vary greatly across the global population. Our model-agnostic algorithm deals with arm selection instead of utility modelling, and can therefore be implemented on top of any existing bandit system with minimal changes.
We conclude with a case study on carousel personalisation in music recommendation: empirical observations highlight the effectiveness of our proposed method and show that exposure disparity can be significantly reduced with a negligible impact on user utility. |
| |
Language
|
|
|
|
English
| |
Source (book)
|
|
|
|
RecSys '21 : Fifteenth ACM Conference on Recommender Systems, September 2021
| |
Publication
|
|
|
|
Association for Computing Machinery, 2021
| |
ISBN
|
|
|
|
978-1-4503-8458-2
| |
DOI
|
|
|
|
10.1145/3460231.3474248
| |
Volume/pages
|
|
|
|
p. 310-320
| |
ISI
|
|
|
|
000744461300031
| |
|