Publication
Title
Predicting and analyzing injury severity : a machine learning-based approach using class-imbalanced proactive and reactive data
Author
Abstract
Although the utility of machine learning (ML) techniques is established in the occupational accident domain using reactive data, their use in predicting injury severity from both reactive and proactive data is new. This necessitates investigating the significance of both types of data in predicting injury severity using ML techniques. In addition, unstructured text and class imbalance in the data often make analysis difficult. To address these issues, two types of data collected from a steel plant are used in this study: investigation reports (i.e., reactive data) and inspection reports (i.e., proactive data). The datasets are merged to generate a mixed dataset. Topic modeling is used to handle the unstructured text. Four oversampling algorithms, namely Synthetic Minority Over-sampling Technique (SMOTE), borderline SMOTE (BLSMOTE), Majority Weighted Minority Oversampling Technique (MWMOTE), and k-means SMOTE (KMSMOTE), are used separately to handle the class imbalance. Thereafter, six prediction algorithms, namely support vector machine, artificial neural network, Naïve Bayes, k-nearest neighbour, classification and regression tree analysis, and random forest, are applied to the reactive and mixed datasets separately for injury severity prediction. The results reveal that KMSMOTE performs better than the others in balancing the datasets and therefore helps achieve higher prediction performance in terms of average recall, F1-score and geometric mean. In addition, it is shown statistically that injury severity prediction is significantly better using the mixed dataset than using the reactive dataset alone. Finally, a set of 19 crisp safety decision rules is generated using the tolerance rough set approach (TRSA), which can explain the factors responsible for the injury severity outcomes, namely ‘Fatal’, ‘Medical case’, and ‘First-aid’.
Language
English
Source (journal)
Safety science. - Amsterdam, 1991, currens
Publication
Amsterdam : 2020
ISSN
0925-7535
DOI
10.1016/J.SSCI.2020.104616
Volume/pages
125 (2020), p. 1-23
ISI
000524359800023
Record
Creation 19.02.2020
Last edited 22.08.2024