Abstract
Scientific claim verification helps researchers find, for a given claim, the relevant scientific papers in a large corpus together with sentence-level evidence. Because the corpus contains a huge number of papers, most existing scientific claim verification systems work in two stages: they first roughly retrieve a set of candidate papers with simple but fast methods such as similarity measures, and then apply large but relatively slow deep neural models for accurate classification. To improve the recall of the overall system by improving the recall of this rough abstract retrieval stage, we propose an approach that also uses a neural classification model for the rough retrieval stage. To keep the proposal scalable, we further propose a distillation-based method that yields a lightweight model for rough retrieval. Experimental results on the benchmark dataset SciFact show that our approach outperforms existing works.
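To make the distillation idea concrete, the sketch below shows a standard Hinton-style soft-label distillation objective, under which a lightweight "rough retrieval" student could be trained to mimic a large, slow relevance classifier (the teacher). This is a minimal illustrative sketch, not the authors' exact training objective: the function name distillation_loss, the binary relevant/not-relevant setup, the temperature and alpha values, and the use of PyTorch are all assumptions for illustration.

# Minimal sketch (assumption: Hinton-style soft-label distillation, not the
# paper's exact objective) of training a lightweight student retrieval model
# to imitate a large neural relevance classifier (the teacher).
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, hard_labels,
                      temperature=2.0, alpha=0.5):
    """Combine soft-label KL divergence (teacher -> student) with the usual
    cross-entropy on gold relevance labels."""
    # Soften both distributions with the temperature before comparing them.
    soft_targets = F.log_softmax(teacher_logits / temperature, dim=-1)
    soft_student = F.log_softmax(student_logits / temperature, dim=-1)
    kd = F.kl_div(soft_student, soft_targets, log_target=True,
                  reduction="batchmean") * (temperature ** 2)
    ce = F.cross_entropy(student_logits, hard_labels)
    return alpha * kd + (1.0 - alpha) * ce

# Toy usage: 4 claim-abstract pairs, binary relevance (relevant / not relevant).
student_logits = torch.randn(4, 2, requires_grad=True)
teacher_logits = torch.randn(4, 2)          # produced by the large, slow model
labels = torch.tensor([1, 0, 1, 0])
loss = distillation_loss(student_logits, teacher_logits, labels)
loss.backward()

Under this kind of objective, the student can be made much smaller than the teacher, which is what would allow it to score every abstract in the corpus at rough-retrieval speed.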
Acknowledgements
This work was partially supported by Grant Number 23H03402.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Zhang, Z., Li, J., Fukumoto, F. (2023). An Efficient Approach for Improving the Recall of Rough Abstract Retrieval in Scientific Claim Verification. In: Iliadis, L., Papaleonidas, A., Angelov, P., Jayne, C. (eds) Artificial Neural Networks and Machine Learning – ICANN 2023. ICANN 2023. Lecture Notes in Computer Science, vol 14261. Springer, Cham. https://doi.org/10.1007/978-3-031-44198-1_6
DOI: https://doi.org/10.1007/978-3-031-44198-1_6
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-44197-4
Online ISBN: 978-3-031-44198-1
eBook Packages: Computer Science, Computer Science (R0)