DOI: 10.1145/2034691.2034703
demonstration

The art of mathematics retrieval

Published: 19 September 2011

Abstract

The design and architecture of MIaS (Math Indexer and Searcher), a system for mathematics retrieval, are presented, and design decisions are discussed. We argue for an approach based on Presentation MathML that uses the similarity of math subformulae. The system was implemented as a math-aware search engine built on the state-of-the-art Apache Lucene.
Scalability issues were checked against more than 400,000 arXiv documents with 158 million mathematical formulae. Almost three billion MathML subformulae were indexed using a Solr-compatible Lucene.
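
To make the idea of indexing subformulae concrete, here is a minimal sketch, assuming that every element subtree of a Presentation MathML expression counts as one subformula. It is an illustration only, not the MIaS implementation: the function name `subformulae`, the depth bookkeeping, and the sample expression are all hypothetical, and no Lucene indexing is shown.

```python
import xml.etree.ElementTree as ET


def subformulae(mathml: str):
    """Yield (depth, serialized_subtree) for every element of a MathML string.

    Assumption: each element subtree is treated as one indexable subformula.
    """
    root = ET.fromstring(mathml)
    stack = [(0, root)]
    while stack:
        depth, node = stack.pop()
        yield depth, ET.tostring(node, encoding="unicode").strip()
        # Push children in reverse so they are visited in document order.
        for child in reversed(list(node)):
            stack.append((depth + 1, child))


# Hypothetical sample: a + b^2 in Presentation MathML (no namespace, for brevity).
EXAMPLE = (
    "<math><mrow><mi>a</mi><mo>+</mo>"
    "<msup><mi>b</mi><mn>2</mn></msup></mrow></math>"
)

terms = list(subformulae(EXAMPLE))
for depth, term in terms:
    print(depth, term)
```

A single short formula already yields several index terms here, which gives some intuition for why 158 million formulae can expand to almost three billion indexed subformulae. The depth value is kept because a retrieval system could plausibly use it to weight matches on deeply nested fragments differently from matches on whole formulae; that weighting is an assumption of this sketch.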


Published In

DocEng '11: Proceedings of the 11th ACM symposium on Document engineering
September 2011
296 pages
ISBN: 9781450308632
DOI: 10.1145/2034691
Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. digital mathematics libraries
  2. information systems
  3. math indexing and retrieval
  4. mathematical content representation
  5. mias
  6. webmias

Qualifiers

  • Demonstration

Conference

DocEng '11: ACM Symposium on Document Engineering
September 19-22, 2011
Mountain View, California, USA

Acceptance Rates

Overall Acceptance Rate 194 of 564 submissions, 34%

Cited By

  • (2024) Mathematical Information Retrieval: A Review. ACM Computing Surveys 57:3 (1-34). DOI: 10.1145/3699953. Online publication date: 9-Oct-2024
  • (2024) The Effectiveness of Graph Contrastive Learning on Mathematical Information Retrieval. Advances on Graph-Based Approaches in Information Retrieval (60-72). DOI: 10.1007/978-3-031-71382-8_5. Online publication date: 10-Oct-2024
  • (2023) Recognising formula entailment using long short-term memory network. Journal of Information Science. DOI: 10.1177/01655515231184826. Online publication date: 20-Jul-2023
  • (2023) Math Information Retrieval with Contrastive Learning of Formula Embeddings. Web Information Systems Engineering – WISE 2023 (97-107). DOI: 10.1007/978-981-99-7254-8_8. Online publication date: 21-Oct-2023
  • (2022) Scientific document retrieval using structure encoded string with trie indexing. Information Services and Use 42:2 (241-259). DOI: 10.3233/ISU-220155. Online publication date: 1-Jan-2022
  • (2022) Embedding and generalization of formula with context in the retrieval of mathematical information. Journal of King Saud University - Computer and Information Sciences 34:9 (6624-6634). DOI: 10.1016/j.jksuci.2021.05.014. Online publication date: Oct-2022
  • (2022) A Fine-tuning Retrieval System for Mathematical Information. Proceedings of the Seventh International Conference on Mathematics and Computing (1085-1100). DOI: 10.1007/978-981-16-6890-6_81. Online publication date: 6-Mar-2022
  • (2021) Mathematical Information Retrieval Trends and Techniques. Deep Natural Language Processing and AI Applications for Industry 5.0 (74-92). DOI: 10.4018/978-1-7998-7728-8.ch005. Online publication date: 2021
  • (2021) ARQMath. ACM SIGIR Forum 54:2 (1-9). DOI: 10.1145/3483382.3483388. Online publication date: 20-Aug-2021
  • (2021) Learning to Rank for Mathematical Formula Retrieval. Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval (952-961). DOI: 10.1145/3404835.3462956. Online publication date: 11-Jul-2021
