Title
Regularization oversampling for classification tasks: to exploit what you do not know
Author
Abstract
In numerous binary classification tasks, the two groups of instances are not equally represented, which often implies that the training data lack sufficient information to model the minority class correctly. Furthermore, many traditional classification models make arbitrarily overconfident predictions outside the range of the training data. These issues severely limit the deployment and usefulness of such models in real-life applications. In this paper, we propose the boundary regularizing out-of-distribution (BROOD) sampler, which adds artificial data points on the edge of the training data. By exploiting these artificial samples, we are able to regularize the decision surface of discriminative machine learning models and make more prudent predictions. Moreover, in many applications it is crucial to correctly classify as many positive instances as possible within the limited pool of instances that can be investigated with the available resources. By smartly assigning predetermined nonuniform class probabilities outside the training data, we can emphasize certain data regions and improve classifier performance on various material classification metrics. The good performance of the proposed methodology is illustrated in a case study comprising both balanced and imbalanced benchmark classification data sets.
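The sketch below illustrates the general idea summarized in the abstract, not the authors' actual BROOD sampler: artificial points are placed just outside the training data cloud and given a predetermined, nonuniform class assignment before fitting a standard classifier. The expansion factor `margin` and the out-of-distribution positive-class probability `p_out` are hypothetical choices introduced here for illustration only.

```python
# Minimal sketch (assumption: NOT the paper's BROOD implementation) of
# boundary oversampling with predetermined class probabilities.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Imbalanced toy data set.
X, y = make_classification(n_samples=500, n_features=2, n_redundant=0,
                           weights=[0.9, 0.1], random_state=0)

# Push each training point outward from the data centroid by a fixed factor,
# so the artificial samples sit on the edge of the training data.
centroid = X.mean(axis=0)
directions = X - centroid
margin = 1.25  # hypothetical expansion factor
X_art = centroid + directions * margin

# Predetermined nonuniform class probability outside the training data:
# encode a soft label by duplicating the artificial points for both classes
# with proportional sample weights.
p_out = 0.7  # hypothetical out-of-distribution positive-class probability
X_aug = np.vstack([X, X_art, X_art])
y_aug = np.concatenate([y, np.ones(len(X_art)), np.zeros(len(X_art))])
w_aug = np.concatenate([np.ones(len(X)),
                        np.full(len(X_art), p_out),
                        np.full(len(X_art), 1.0 - p_out)])

clf = RandomForestClassifier(random_state=0)
clf.fit(X_aug, y_aug, sample_weight=w_aug)
```

The duplicate-and-weight trick is only one way to feed soft labels to a classifier that accepts sample weights; a model that supports probabilistic targets directly could use the predetermined probabilities without duplication.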
Language
English
Source (journal)
Information Sciences. - New York
Publication
New York : 2023
ISSN
0020-0255
DOI
10.1016/j.ins.2023.03.146
Volume/pages
635 (2023), p. 169-194