Automated Mining of Leaderboards for Empirical AI Research

Kabongo, Salomon; D’Souza, Jennifer; Auer, Sören

doi:10.1007/978-3-030-91669-5_35

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 13133)

Included in the following conference series:

International Conference on Asian Digital Libraries

Abstract

With the rapid growth of research publications, empowering scientists to keep an overview of scientific progress is of paramount importance. In this regard, the leaderboards facet of information organization provides an overview of the state of the art by aggregating empirical results from various studies addressing the same research challenge. Crowdsourcing efforts such as PapersWithCode are devoted to the construction of leaderboards, predominantly for various subdomains of Artificial Intelligence. Leaderboards provide machine-readable scholarly knowledge that has proven directly useful for scientists tracking research progress, and their construction could be greatly expedited with automated text mining.

This study presents a comprehensive approach for generating leaderboards for knowledge-graph-based scholarly information organization. Specifically, we investigate the problem of automated leaderboard construction using state-of-the-art transformer models, viz. BERT, SciBERT, and XLNet. Our analysis reveals an optimal approach that significantly outperforms existing baselines for the task, with evaluation scores above 90% in F1. This, in turn, offers new state-of-the-art results for leaderboard extraction. As a result, a vast share of empirical AI research can be organized in next-generation digital libraries as knowledge graphs.
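To make the F1 figure concrete, the following is a minimal sketch of a set-based F1 score between gold and predicted (task, dataset, metric) triples for one paper. It is an illustration under stated assumptions, not the evaluation protocol of the chapter: the function name `f1_score`, the triple layout, and the example triples are all hypothetical.

```python
from typing import Set, Tuple

# Hypothetical representation of one leaderboard annotation:
# a (task, dataset, metric) triple. This layout is an assumption.
Triple = Tuple[str, str, str]


def f1_score(gold: Set[Triple], predicted: Set[Triple]) -> float:
    """Set-based F1 between gold and predicted triples.

    Returns 0.0 when either set is empty or when there is no overlap.
    """
    if not gold or not predicted:
        return 0.0
    true_positives = len(gold & predicted)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


# Made-up example data, for illustration only.
gold = {
    ("Question Answering", "SQuAD", "F1"),
    ("Question Answering", "SQuAD", "EM"),
}
predicted = {
    ("Question Answering", "SQuAD", "F1"),
    ("Language Modelling", "WikiText-103", "Perplexity"),
}

score = f1_score(gold, predicted)
print(f"F1 = {score:.2f}")
```

Here one of the two predicted triples is correct and one of the two gold triples is recovered, so precision and recall are both 0.5. How scores are aggregated across papers (micro versus macro averaging) is a separate choice that this sketch does not address.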

Notes

1. https://paperswithcode.com
2. https://www.orkg.org/orkg/benchmarks
3. https://github.com/Kabongosalomon/task-dataset-metric-nli-extraction, http://doi.org/10.5281/zenodo.5105798
4. They also evaluated extracting the best score as an automated task, which proved very challenging owing to the inconsistency with which best scores are reported and the resulting inability of PDF-to-text extractors to mine the data effectively.
5. https://paperswithcode.com/
6. Our corpus was downloaded from the PwC GitHub repository (https://github.com/paperswithcode/paperswithcode-data) and was constructed by combining the information in the files "All papers with abstracts" and "Evaluation tables", which included article URLs and crowdsourced TDM annotation metadata.
7. https://semanticscholar.org

Acknowledgements

This work was co-funded by the Federal Ministry of Education and Research (BMBF) of Germany for the project LeibnizKILabor (grant no. 01DD20003) and by the European Research Council for the project ScienceGRAPH (Grant agreement ID: 819536).

Author information

Authors and Affiliations

L3S Research Center, Leibniz University of Hannover, Hannover, Germany
Salomon Kabongo & Sören Auer
TIB Leibniz Information Centre for Science and Technology, Hannover, Germany
Jennifer D’Souza & Sören Auer

Corresponding authors

Correspondence to Salomon Kabongo, Jennifer D’Souza or Sören Auer.

Editor information

Editors and Affiliations

National Taiwan Normal University, Taipei, Taiwan
Hao-Ren Ke
Nanyang Technological University, Singapore, Singapore
Chei Sian Lee
Kyoto University, Kyoto, Japan
Kazunari Sugiyama

Cite this paper

Kabongo, S., D’Souza, J., Auer, S. (2021). Automated Mining of Leaderboards for Empirical AI Research. In: Ke, HR., Lee, C.S., Sugiyama, K. (eds) Towards Open and Trustworthy Digital Societies. ICADL 2021. Lecture Notes in Computer Science, vol 13133. Springer, Cham. https://doi.org/10.1007/978-3-030-91669-5_35

DOI: https://doi.org/10.1007/978-3-030-91669-5_35
Published: 30 November 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-91668-8
Online ISBN: 978-3-030-91669-5
eBook Packages: Computer Science, Computer Science (R0)
