Pencils Down! Automatic Rubric-based Evaluation of Retrieve/Generate Systems
Recommendations
A Workbench for Autograding Retrieve/Generate Systems
SIGIR '24: Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval
This resource paper addresses the challenge of evaluating Information Retrieval (IR) systems in the era of autoregressive Large Language Models (LLMs). Traditional methods relying on passage-level judgments are no longer effective due to the diversity of ...
Pooling-based continuous evaluation of information retrieval systems
Abstract: The dominant approach to evaluate the effectiveness of information retrieval (IR) systems is by means of reusable test collections built following the Cranfield paradigm. In this paper, we propose a new IR evaluation methodology based on pooled ...
Effort-based information retrieval evaluation with varied evaluation depth and topic sizes
ICBIM '19: Proceedings of the 3rd International Conference on Business and Information Management
The information retrieval accessed globally is a vital productivity boost for most organization. However, the outcome of information retrieval system evaluation does not agree with the real user's satisfaction. Information retrieval systems retrieving ...
Published In
- General Chair: Harrie Oosterhuis
- Program Chairs: Hannah Bast, Chenyan Xiong
Publisher
Association for Computing Machinery
New York, NY, United States
Qualifiers
- Research-article
Article Metrics
- Total Citations: 0
- Total Downloads: 450
- Downloads (Last 12 months): 450
- Downloads (Last 6 weeks): 89