Best of Touché 2023 Task 4: Testing Data Augmentation and Label Propagation for Multilingual Multi-target Stance Detection

Avila, Jorge; Rodrigo, Álvaro; Centeno, Roberto

doi:10.1007/978-3-031-71736-9_13

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14958)

Included in the following conference series:

International Conference of the Cross-Language Evaluation Forum for European Languages

Abstract

Touché 2023 Task 4 evaluated stance detection in a multilingual, multi-target setting with a small annotated dataset. We therefore tested approaches that increase the training data by (1) adding new samples obtained by back-translating the original training data and (2) adding samples from unlabeled data using label propagation. Back-translation improved the results, while label propagation worsened performance. We obtained the best results with a transformer-based model fine-tuned in two steps: first on a related, larger dataset and then on the development data. This shows the usefulness of including related data in our approach and suggests further research on exploiting other datasets and data augmentation. Moreover, given that the current results are close to an F1 score of 0.35, there is still room for improvement in this task.
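The back-translation step (1) can be illustrated with a minimal sketch. It is not the authors' implementation: the pivot language, the `(text, lang, label)` sample layout, the label string and the `translate` callable are all assumptions made for illustration. Any translation backend could be wrapped to match the `translate(text, source, target)` signature; a toy dictionary lookup stands in for it here so the example runs offline.

```python
from typing import Callable, Iterable, List, Tuple

Sample = Tuple[str, str, str]  # (text, language code, stance label); hypothetical layout
Translate = Callable[[str, str, str], str]  # (text, source, target) -> translated text


def back_translate(text: str, lang: str, pivot: str, translate: Translate) -> str:
    """Translate into the pivot language and back into the original language."""
    pivoted = translate(text, lang, pivot)
    return translate(pivoted, pivot, lang)


def augment(samples: Iterable[Sample], pivot: str, translate: Translate) -> List[Sample]:
    """Keep every original sample and add a back-translated copy when it differs."""
    augmented: List[Sample] = []
    for text, lang, label in samples:
        augmented.append((text, lang, label))
        if lang == pivot:
            continue  # a round trip through the same language adds nothing
        paraphrase = back_translate(text, lang, pivot, translate)
        if paraphrase.strip() and paraphrase != text:
            augmented.append((paraphrase, lang, label))  # label is inherited unchanged
    return augmented


# Toy stand-in for a real translation service (assumption, for illustration only).
_TOY_TABLE = {
    ("Estoy a favor de la propuesta", "es", "en"): "I support the proposal",
    ("I support the proposal", "en", "es"): "Apoyo la propuesta",
}


def toy_translate(text: str, source: str, target: str) -> str:
    return _TOY_TABLE.get((text, source, target), text)


train = [
    ("Estoy a favor de la propuesta", "es", "in_favor"),
    ("I agree with this idea", "en", "in_favor"),
]
augmented_train = augment(train, pivot="en", translate=toy_translate)
```

The key design point is that the paraphrase inherits the label of its source sample, which is only safe when the round trip preserves the stance of the text.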

Notes

1. https://futureu.europa.eu/?locale=en
2. https://en.wikipedia.org/wiki/Sardines_movement
3. https://pypi.org/project/deep-translator/
4. https://huggingface.co/Helsinki-NLP/opus-mt-ine-en
5. https://huggingface.co/sentence-transformers/paraphrase-multilingual-mpnet-base-v2
6. https://scikit-learn.org/stable/modules/generated/sklearn.semi_supervised.LabelSpreading.html
7. The final model selected in each case was decided based on the results obtained during previous tests.
8. https://huggingface.co/roberta-base
9. https://huggingface.co/xlm-roberta-large
10. https://huggingface.co/bert-base-uncased
11. https://www.tira.io/
12. It is important to note that the actual labels of the test set have never been made public and, therefore, no checks or additional experiments could be performed.
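Notes 5 and 6 point to a multilingual sentence encoder and to scikit-learn's LabelSpreading, which suggests how the label-propagation step in the abstract may have been set up. The sketch below is a plain-Python stand-in for that idea, not the authors' pipeline and not scikit-learn's code: it runs the standard label-spreading iteration F ← αSF + (1 − α)Y on a dense RBF graph. The toy 2-D points replace real sentence embeddings, and `alpha`, `gamma` and `n_iter` are illustrative values.

```python
import math
from typing import List, Sequence


def rbf_affinity(points: Sequence[Sequence[float]], gamma: float) -> List[List[float]]:
    """Dense RBF affinity matrix W with a zero diagonal."""
    n = len(points)
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                d2 = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
                w[i][j] = math.exp(-gamma * d2)
    return w


def label_spreading(points, labels, n_classes, alpha=0.2, gamma=1.0, n_iter=50):
    """Predict a class for every point; `labels` uses -1 for unlabeled points."""
    n = len(points)
    w = rbf_affinity(points, gamma)
    deg = [sum(row) for row in w]
    # Symmetric normalisation: S = D^{-1/2} W D^{-1/2}
    s = [[w[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]
    # One-hot seed matrix Y; rows of unlabeled points stay all-zero
    y = [[1.0 if labels[i] == c else 0.0 for c in range(n_classes)] for i in range(n)]
    f = [row[:] for row in y]
    for _ in range(n_iter):
        # F <- alpha * S F + (1 - alpha) * Y
        f = [
            [
                alpha * sum(s[i][k] * f[k][c] for k in range(n)) + (1 - alpha) * y[i][c]
                for c in range(n_classes)
            ]
            for i in range(n)
        ]
    return [max(range(n_classes), key=lambda c: row[c]) for row in f]


# Two well-separated toy clusters, one labeled seed in each (illustrative data).
points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (3.0, 3.0), (3.1, 2.9), (2.9, 3.2)]
labels = [0, -1, -1, 1, -1, -1]
predicted = label_spreading(points, labels, n_classes=2)
```

On this toy data every unlabeled point takes the label of the seed in its own cluster. With real text, the propagated labels are only as reliable as the embedding neighbourhoods, which is one possible reason such pseudo-labels can hurt a downstream classifier.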

Acknowledgments

This work has been partially funded by the Spanish Research Agency (Agencia Estatal de Investigación), DeepInfo project PID2021-127777OB-C22 (MCIU/AEI/FEDER, UE) and the HOLISTIC ANALYSIS OF ORGANISED MISINFORMATION ACTIVITY IN SOCIAL NETWORKS project (PCI2022-135026-2).

Author information

Authors and Affiliations

NLP & IR Group at UNED, Madrid, Spain
Jorge Avila, Álvaro Rodrigo & Roberto Centeno

Corresponding authors

Correspondence to Jorge Avila or Roberto Centeno.

Editor information

Editors and Affiliations

Université Grenoble Alpes, CNRS, Grenoble, France
Lorraine Goeuriot
Université Grenoble Alpes, CNRS, Grenoble, France
Philippe Mulhem
Université Grenoble Alpes, CNRS, Grenoble, France
Georges Quénot
Univ. Grenoble Alpes, CNRS, Grenoble, France
Didier Schwab
University of Padova, Padua, Italy
Giorgio Maria Di Nunzio
Sorbonne University, Paris, France
Laure Soulier
University of Stavanger, Stavanger, Norway
Petra Galuščáková
University of Essex, Colchester, UK
Alba García Seco de Herrera
University of Padova, Padua, Italy
Guglielmo Faggioli
University of Padova, Padua, Italy
Nicola Ferro

A Hyperparameters

A.1 Run 1

subsample: 0.5
min_child_weight: 5
max_depth: 7
learning rate: 0.01
colsample_bytree: 0.5
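The names above match keyword arguments used by gradient-boosted tree libraries such as XGBoost, but this page does not name the model used in Run 1, so that reading is an assumption. Under that assumption, the settings could be collected in a dictionary as sketched below; `RUN1_PARAMS` and `check_params` are hypothetical names introduced for illustration.

```python
# Run 1 settings as listed in the appendix. The key "learning_rate" is an
# assumed spelling of the entry printed as "learning rate".
RUN1_PARAMS = {
    "subsample": 0.5,         # fraction of training rows sampled per tree
    "min_child_weight": 5,    # minimum sum of instance weights in a child node
    "max_depth": 7,           # maximum tree depth
    "learning_rate": 0.01,    # shrinkage applied to each boosting step
    "colsample_bytree": 0.5,  # fraction of features sampled per tree
}


def check_params(params: dict) -> None:
    """Basic sanity checks on value ranges (hypothetical helper)."""
    for key in ("subsample", "colsample_bytree"):
        if not 0.0 < params[key] <= 1.0:
            raise ValueError(f"{key} must be in (0, 1]")
    if params["max_depth"] < 1:
        raise ValueError("max_depth must be a positive integer")
    if params["learning_rate"] <= 0:
        raise ValueError("learning_rate must be positive")


check_params(RUN1_PARAMS)
# If the model were an XGBoost classifier (an assumption), the dictionary could
# be passed as keyword arguments, e.g. xgboost.XGBClassifier(**RUN1_PARAMS).
```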

A.2 Runs 2, 3, 4, 5 and 6

(See Table 3).

Table 3. Hyperparameters for runs 2, 3, 4 and 5

About this paper

Cite this paper

Avila, J., Rodrigo, Á., Centeno, R. (2024). Best of Touché 2023 Task 4: Testing Data Augmentation and Label Propagation for Multilingual Multi-target Stance Detection. In: Goeuriot, L., et al. Experimental IR Meets Multilinguality, Multimodality, and Interaction. CLEF 2024. Lecture Notes in Computer Science, vol 14958. Springer, Cham. https://doi.org/10.1007/978-3-031-71736-9_13

DOI: https://doi.org/10.1007/978-3-031-71736-9_13
Published: 14 September 2024
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-71735-2
Online ISBN: 978-3-031-71736-9
eBook Packages: Computer Science, Computer Science (R0)
