Automatic detection of cyberbullying in social media text

Van Hee, Cynthia; Jacobs, Gilles; Emmery, Chris; Desmet, Bart; Lefever, Els; Verhoeven, Ben; De Pauw, Guy; Daelemans, Walter; Hoste, Veronique

doi:10.1371/JOURNAL.PONE.0203794

Title

Automatic detection of cyberbullying in social media text

Author

Van Hee, Cynthia

Jacobs, Gilles

Emmery, Chris

Desmet, Bart

Lefever, Els

Verhoeven, Ben

De Pauw, Guy

Daelemans, Walter

Hoste, Veronique

Abstract

While social media offer great communication opportunities, they also increase the vulnerability of young people to threatening situations online. Recent studies report that cyberbullying constitutes a growing problem among youngsters. Successful prevention depends on the adequate detection of potentially harmful messages, and the information overload on the Web requires intelligent systems to identify potential risks automatically. The focus of this paper is on automatic cyberbullying detection in social media text by modelling posts written by bullies, victims, and bystanders of online bullying. We describe the collection and fine-grained annotation of a cyberbullying corpus for English and Dutch and perform a series of binary classification experiments to determine the feasibility of automatic cyberbullying detection. We make use of linear support vector machines exploiting a rich feature set and investigate which information sources contribute most to the task. Experiments on a hold-out test set reveal promising results for the detection of cyberbullying-related posts. After optimisation of the hyperparameters, the classifier yields an F1 score of 64% for English and 61% for Dutch, and considerably outperforms baseline systems.

Language

English

Source (journal)

PLoS ONE

Publication

2018

ISSN

1932-6203

DOI

10.1371/JOURNAL.PONE.0203794

Volume/pages

13:10 (2018), p. 1-22

Article Reference

e0203794

ISI

000446632700008

Pubmed ID

30296299

Medium

E-only publication

Full text (Publisher's DOI)

https://doi.org/10.1371/JOURNAL.PONE.0203794

Full text (open access)

https://repository.uantwerpen.be/docman/irua/88787a/154735.pdf

Faculty/Department

Faculty of Arts. Linguistics

Research group

Centre for Computational Linguistics, Psycholinguistics and Sociolinguistics (CLiPS)

Project info

Automatic Monitoring for Cyberspace Applications (AMiCA)

Publication type

A1 Journal article

Subject

Engineering sciences. Technology

Affiliation

Publications with a UAntwerp address

Identifier

Creation

09.11.2018

Last edited

09.10.2023

To cite this reference

https://hdl.handle.net/10067/1547350151162165141