Impact Of Content Features For Automatic Online Abuse Detection

Papegnies, Etienne; Labatut, Vincent; Dufour, Richard; Linares, Georges

doi:10.1007/978-3-319-77116-8_30

arXiv:1704.03289 (cs)

[Submitted on 11 Apr 2017]

Authors:Etienne Papegnies (LIA), Vincent Labatut (LIA), Richard Dufour (LIA), Georges Linares (LIA)

Abstract: Online communities have gained considerable importance in recent years due to the increasing number of people connected to the Internet. Moderating user content in online communities is mainly performed manually, and reducing the workload through automatic methods is of great financial interest for community maintainers. Often, the industry uses basic approaches such as bad-word filtering and regular expression matching to assist the moderators. In this article, we consider the task of automatically determining if a message is abusive. This task is complex since messages are written in a non-standardized way, including spelling errors, abbreviations, and community-specific codes. First, we evaluate the system that we propose using standard features of online messages. Then, we evaluate the impact of adding pre-processing strategies, as well as original specific features developed for the community of an online in-browser strategy game. Finally, we analyze the usefulness of this wide range of features using feature selection. This work can lead to two possible applications: 1) automatically flagging potentially abusive messages to draw the moderator's attention to a narrow subset of messages; and 2) fully automating the moderation process by deciding whether a message is abusive without any human intervention.
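To make the baseline mentioned in the abstract concrete, the following is a minimal sketch of bad-word filtering combined with regular expression matching, preceded by a light normalization step. It is not the system evaluated in the paper: the word list, the character substitution table, and the function names `normalize` and `flag_message` are all hypothetical and chosen only for illustration.

```python
import re

# Hypothetical word list for illustration only; a real deployment would
# maintain a much larger, community-specific list.
BAD_WORDS = {"idiot", "moron", "stupid"}

# Hypothetical character substitutions sometimes used to evade filters.
LEET_MAP = str.maketrans(
    {"0": "o", "1": "i", "3": "e", "4": "a", "5": "s", "@": "a", "$": "s"}
)

# Matches a run of three or more identical characters ("stuuupid").
REPEAT_RE = re.compile(r"(.)\1{2,}")

# Extracts lowercase alphabetic tokens from the normalized message.
TOKEN_RE = re.compile(r"[a-z]+")


def normalize(message: str) -> str:
    """Lowercase, undo simple substitutions, and collapse long repeats."""
    lowered = message.lower().translate(LEET_MAP)
    return REPEAT_RE.sub(r"\1", lowered)


def flag_message(message: str) -> bool:
    """Return True if any normalized token is in the bad-word list."""
    tokens = TOKEN_RE.findall(normalize(message))
    return any(token in BAD_WORDS for token in tokens)


if __name__ == "__main__":
    for text in ["Nice game, well played", "You are an 1d10t", "so stuuupid"]:
        print(text, "->", flag_message(text))
```

A filter of this kind only catches the variants its rules anticipate, which is consistent with the abstract's remark that non-standardized writing makes the task complex; a flagged message would typically still go to a moderator for review.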

Subjects:	Information Retrieval (cs.IR); Social and Information Networks (cs.SI)
Cite as:	arXiv:1704.03289 [cs.IR]
	(or arXiv:1704.03289v1 [cs.IR] for this version)
	https://doi.org/10.48550/arXiv.1704.03289
Journal reference:	International Conference on Computational Linguistics and Intelligent Text Processing (CICLing), Apr 2017, Budapest, Hungary
Related DOI:	https://doi.org/10.1007/978-3-319-77116-8_30

Submission history

From: Etienne Papegnies [via CCSD proxy]
[v1] Tue, 11 Apr 2017 13:59:33 UTC (457 KB)
