Retrieval-Augmented Generation (RAG) has become prevalent in question-answering (QA) tasks because it can use search engines to improve the quality of long-form question answering (LFQA). Despite the emergence of various open-source methods and web-enhanced commercial systems such as Bing Chat, two critical problems remain unsolved: the generated long-form answers lack factuality and clear logic. In this paper, we address these issues through a systematic study of answer generation in web-enhanced LFQA. Specifically, we first propose a novel outline-enhanced generator to achieve clear logic in the generation of multifaceted answers, and construct two datasets accordingly. We then propose a factuality optimization method based on a carefully designed doubly fine-grained RLHF framework, which contains automatic evaluation and reward modeling at different levels of granularity. Our generic framework includes conventional fine-grained RLHF methods as special cases. Extensive experiments verify the superiority of our proposed Factuality-optimized RAG (FoRAG) method on both English and Chinese benchmarks. In particular, when our method is applied to Llama2-7B-chat, the derived model FoRAG-L-7B outperforms WebGPT-175B on three commonly used metrics (coherence, helpfulness, and factuality), while having far fewer parameters (only 1/24 of those of WebGPT-175B). Our datasets and models are publicly available for better reproducibility at https://huggingface.co/forag.

Supplemental Material

MOV File - promotional video

The video introduces our research, "FoRAG," an advancement in Retrieval-Augmented Generation (RAG) for web-enhanced long-form question answering, in which search engines supply relevant information to large language models. The work addresses the lack of logical coherence and factual accuracy in existing systems such as Bing Chat. It proposes an outline-enhanced generator and constructs two large datasets to improve answer quality. It also develops a novel factuality optimization method based on a doubly fine-grained RLHF framework. Our method achieves leading results on both Chinese and English benchmarks, addressing critical challenges in current question-answering models.
