Abstract
Touché 2023 Task 4 evaluated stance detection in a multilingual, multi-target setting with a small annotated dataset. We therefore tested approaches that enlarge the training data by (1) adding samples obtained by back-translating the original training data and (2) adding samples from unlabeled data via label propagation. Back-translation improved the results, while label propagation worsened performance. We obtained the best results with a transformer-based model fine-tuned in two steps: first on a related, larger dataset and then on the development data. This shows the usefulness of including related data in our approach and suggests further research on exploiting other datasets and data augmentation. Moreover, since the current results are close to an F1 score of 0.35, there is still room for improvement in this task.
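To make the label propagation idea concrete, the following is a minimal sketch and not the implementation used in this work. It assumes each text is already represented as a numeric feature vector (for example, a sentence embedding), builds a fully connected graph with RBF edge weights, and repeatedly averages neighbour label distributions while keeping the labeled samples fixed. The function name `label_propagation` and the parameters `sigma` and `n_iter` are hypothetical and chosen only for illustration.

```python
import math


def label_propagation(points, labels, n_classes, sigma=1.0, n_iter=50):
    """Propagate labels over a fully connected RBF similarity graph.

    points: list of feature vectors (lists of floats).
    labels: class index for labeled points, None for unlabeled ones.
    Returns one predicted class index per point.
    """
    n = len(points)

    # Edge weights: RBF kernel on squared Euclidean distance, no self-loops.
    weights = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                d2 = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
                weights[i][j] = math.exp(-d2 / (2 * sigma ** 2))

    # Start from one-hot distributions for labeled points, uniform otherwise.
    def initial(label):
        if label is None:
            return [1.0 / n_classes] * n_classes
        return [1.0 if c == label else 0.0 for c in range(n_classes)]

    dist = [initial(label) for label in labels]

    for _ in range(n_iter):
        new_dist = []
        for i in range(n):
            total = sum(weights[i])
            if labels[i] is not None or total == 0.0:
                # Clamp labeled points; leave isolated points unchanged.
                new_dist.append(dist[i])
                continue
            # Weighted average of the neighbours' label distributions.
            new_dist.append([
                sum(weights[i][j] * dist[j][c] for j in range(n)) / total
                for c in range(n_classes)
            ])
        dist = new_dist

    # Assign each point its most probable class.
    return [max(range(n_classes), key=lambda c: d[c]) for d in dist]


# Toy data: two well-separated 1-D clusters with one labeled seed each.
toy_points = [[0.0], [0.1], [0.2], [5.0], [5.1], [5.2]]
toy_labels = [0, None, None, 1, None, None]
predicted = label_propagation(toy_points, toy_labels, n_classes=2)
```

On clean, well-separated toy data like this the unlabeled points inherit the label of their nearest seed. With noisy text representations, propagated labels can be wrong, which is one plausible way such added samples could hurt a downstream classifier.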
Notes
- 7. The final model selected in each case was decided based on the results obtained during previous tests.
- 12. It is important to note that the actual labels of the test set have never been made public and therefore, no checks or additional experiments could be performed.
Acknowledgments
This work has been partially funded by the Spanish Research Agency (Agencia Estatal de Investigación), DeepInfo project PID2021-127777OB-C22 (MCIU/AEI/FEDER, UE) and the HOLISTIC ANALYSIS OF ORGANISED MISINFORMATION ACTIVITY IN SOCIAL NETWORKS project (PCI2022-135026-2).
A Hyperparameters
A.1 Run 1
- subsample: 0.5
- min_child_weight: 5
- max_depth: 7
- learning rate: 0.01
- colsample_bytree: 0.5
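For reuse in scripts, the Run 1 values listed above can be collected in a single mapping. This is only an illustrative sketch: the variable name `RUN1_PARAMS` is hypothetical, the key `learning_rate` is an assumed spelling of the "learning rate" entry, and the appendix as shown here does not state which library consumes these values (the names resemble those of gradient-boosted tree libraries, but that is an assumption).

```python
# Run 1 hyperparameters copied from the list above.
# "learning_rate" is an assumed key name for the "learning rate" entry.
RUN1_PARAMS = {
    "subsample": 0.5,
    "min_child_weight": 5,
    "max_depth": 7,
    "learning_rate": 0.01,
    "colsample_bytree": 0.5,
}
```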
A.2 Runs 2, 3, 4, 5 and 6
(See Table 3).
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Avila, J., Rodrigo, Á., Centeno, R. (2024). Best of Touché 2023 Task 4: Testing Data Augmentation and Label Propagation for Multilingual Multi-target Stance Detection. In: Goeuriot, L., et al. Experimental IR Meets Multilinguality, Multimodality, and Interaction. CLEF 2024. Lecture Notes in Computer Science, vol 14958. Springer, Cham. https://doi.org/10.1007/978-3-031-71736-9_13
DOI: https://doi.org/10.1007/978-3-031-71736-9_13
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-71735-2
Online ISBN: 978-3-031-71736-9
eBook Packages: Computer Science, Computer Science (R0)