Article

Using KNN and SVM Based One-Class Classifier for Detecting Online Radicalization on Twitter

ICDCIT 2015: Proceedings of the 11th International Conference on Distributed Computing and Internet Technology - Volume 8956

Pages 431 - 442

https://doi.org/10.1007/978-3-319-14977-6_47

Published: 05 February 2015

Abstract

Twitter is the largest and most popular micro-blogging website on the Internet. Because of its low publication barrier, anonymity, and wide penetration, Twitter has become an easy platform for extremists to disseminate their ideologies and opinions by posting tweets that promote hate and extremism. Millions of tweets are posted on Twitter every day, and it is practically impossible for Twitter moderators or intelligence and security analysts to manually identify such tweets, users, and communities. However, automatic classification of tweets into pre-defined categories is a non-trivial problem because tweets are short (the maximum length of a tweet is 140 characters) and their content is noisy (incorrect grammar, spelling mistakes, standard and non-standard abbreviations, and slang). We frame the detection of hate and extremism promoting tweets as a one-class (unary-class) categorization problem, learning a statistical model from a training set that contains only objects of one class. We propose several linguistic features, such as the presence of war-related, religious, negative-emotion, and offensive terms, to discriminate hate and extremism promoting tweets from other tweets. We employ a single-class SVM and a KNN algorithm for the one-class classification task. We conduct a case study on Jihad, perform a characterization study of the tweets, and measure the precision and recall of the machine-learning-based classifiers. Experimental results on a large, real-world dataset demonstrate that the proposed approach is effective, with F-scores of 0.60 and 0.83 for the KNN and SVM classifiers, respectively.
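To make the one-class setup concrete, the following is a minimal sketch in Python, not the authors' implementation. It assumes tiny hypothetical lexicons for the four feature categories named in the abstract, a leave-one-out distance threshold for the KNN classifier, and made-up counts for the F-score; the abstract specifies none of these details. The names `LEXICONS`, `featurize`, `OneClassKNN`, and `f_score` are illustrative only.

```python
import math
import re

# Hypothetical mini-lexicons (assumption): the abstract names the feature
# categories but does not list the actual terms used in the paper.
LEXICONS = {
    "war": {"war", "fight", "attack", "weapon"},
    "religious": {"faith", "holy", "prayer", "sacred"},
    "negative_emotion": {"hate", "anger", "fear", "revenge"},
    "offensive": {"scum", "filth", "traitor"},
}


def featurize(tweet):
    """Map a tweet to a vector of lexicon hit counts, one per category."""
    tokens = re.findall(r"[a-z']+", tweet.lower())
    return [sum(tok in terms for tok in tokens) for terms in LEXICONS.values()]


def knn_score(x, train, k):
    """Mean Euclidean distance from x to its k nearest training vectors."""
    dists = sorted(math.dist(x, t) for t in train)
    return sum(dists[:k]) / k


class OneClassKNN:
    """Accept a tweet if it lies as close to the target-class training
    data as the training tweets lie to each other (assumed decision rule)."""

    def __init__(self, k=2):
        self.k = k

    def fit(self, tweets):
        self.train = [featurize(t) for t in tweets]
        # Leave-one-out score of every training vector against the rest;
        # the largest score becomes the acceptance threshold.
        scores = [
            knn_score(x, self.train[:i] + self.train[i + 1:], self.k)
            for i, x in enumerate(self.train)
        ]
        self.threshold = max(scores)
        return self

    def predict(self, tweet):
        return knn_score(featurize(tweet), self.train, self.k) <= self.threshold


def f_score(tp, fp, fn):
    """Harmonic mean of precision and recall, from confusion counts."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Synthetic target-class examples only: no negative examples are used.
target_tweets = [
    "holy war attack fills them with hate",
    "fight the traitor scum, revenge is sacred",
    "weapon and prayer, anger and war",
    "attack attack, faith and fear, filth",
]
clf = OneClassKNN(k=2).fit(target_tweets)
in_class = clf.predict("war and hate, a holy fight")
out_of_class = clf.predict("great coffee this morning")
```

The classifier is fitted on target-class tweets alone, which is what distinguishes one-class learning from ordinary binary classification. The SVM variant could be sketched the same way by fitting an off-the-shelf one-class SVM, such as scikit-learn's `OneClassSVM`, on the same feature vectors.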

Index Terms

Using KNN and SVM Based One-Class Classifier for Detecting Online Radicalization on Twitter
1. Computing methodologies
  1. Machine learning
    1. Learning paradigms
      1. Supervised learning
        Supervised learning by classification

Published In

ICDCIT 2015: Proceedings of the 11th International Conference on Distributed Computing and Internet Technology - Volume 8956

February 2015

460 pages

ISBN: 9783319149769

Editors:
Raja Natarajan, School of Technology and Computer Science, Tata Institute of Fundamental Research, Mumbai, India
Gautam Barua, Department of Computer Science and Engineering, Indian Institute of Technology Guwahati, Guwahati, India
Manas Patra, Department of Computer Science, Berhampur University, Berhampur, India

Publisher

Springer-Verlag

Berlin, Heidelberg
