Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
skip to main content
10.1145/3583780.3615312acmconferencesArticle/Chapter ViewAbstractPublication PagescikmConference Proceedingsconference-collections
short-paper

Workshop on Document Intelligence Understanding

Published: 21 October 2023 Publication History
  • Get Citation Alerts
  • Abstract

    Document understanding and information extraction include different tasks to understand a document and extract valuable information automatically. Recently, there has been a rising demand for developing document understanding among different domains, including business, law, and medicine, to boost the efficiency of work that is associated with a large number of documents. This workshop aims to bring together researchers and industry developers in the field of document intelligence and understanding diverse document types to boost automatic document processing and understanding techniques. We also release a data challenge on the recently introduced document-level VQA dataset, PDFVQA. The PDFVQA challenge examines the model's structural and contextual understandings on the natural full document level of multiple consecutive document pages by including questions with a sequence of answers extracted from multi-pages of the full document. This task helps to boost the document understanding step from the single-page level to the full document level understanding.

    References

    [1]
    Yihao Ding, Zhe Huang, Runlin Wang, YanHang Zhang, Xianru Chen, Yuzhong Ma, Hyunsuk Chung, and Soyeon Caren Han. 2022. V-Doc: Visual questions answers with Documents. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 21492--21498.
    [2]
    Yihao Ding, Siqu Long, Jiabin Huang, Kaixuan Ren, Xingxiang Luo, Hyunsuk Chung, and Soyeon Caren Han. 2023 a. Form-NLU: Dataset for the Form Natural Language Understanding. In Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval. 2807--2816.
    [3]
    Yihao Ding, Siwen Luo, Hyunsuk Chung, and Soyeon Caren Han. 2023 b. PDFVQA: A New Dataset for Real-World VQA on PDF Documents. arXiv preprint arXiv:2304.06447 (2023).
    [4]
    Zheng Huang, Kai Chen, Jianhua He, Xiang Bai, Dimosthenis Karatzas, Shijian Lu, and CV Jawahar. 2019. Icdar2019 competition on scanned receipt ocr and information extraction. In 2019 International Conference on Document Analysis and Recognition (ICDAR). IEEE, 1516--1520.
    [5]
    Minghao Li, Yiheng Xu, Lei Cui, Shaohan Huang, Furu Wei, Zhoujun Li, and Ming Zhou. 2020. DocBank: A Benchmark Dataset for Document Layout Analysis. In Proceedings of the 28th International Conference on Computational Linguistics. 949--960.
    [6]
    Siwen Luo, Yihao Ding, Siqu Long, Josiah Poon, and Soyeon Caren Han. 2022. Doc-GCN: Heterogeneous Graph Convolutional Networks for Document Layout Analysis. In Proceedings of the 29th International Conference on Computational Linguistics. International Committee on Computational Linguistics, 2906--2916.
    [7]
    Minesh Mathew, Dimosthenis Karatzas, and CV Jawahar. 2021. Docvqa: A dataset for vqa on document images. In Proceedings of the IEEE/CVF winter conference on applications of computer vision. 2200--2209.
    [8]
    Seunghyun Park, Seung Shin, Bado Lee, Junyeop Lee, Jaeheung Surh, Minjoon Seo, and Hwalsuk Lee. 2019. CORD: a consolidated receipt dataset for post-OCR parsing. In Workshop on Document Intelligence at NeurIPS 2019.
    [9]
    Ryota Tanaka, Kyosuke Nishida, and Sen Yoshida. 2021. Visualmrc: Machine reading comprehension on document images. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 35. 13878--13888.
    [10]
    Zhihao Zhang, Siwen Luo, Junyi Chen, Sijia Lai, Siqu Long, Hyunsuk Chung, and Soyeon Caren Han. 2023. PiggyBack: Pretrained Visual Question Answering Environment for Backing up Non-deep Learning Professionals. In Proceedings of the Sixteenth ACM International Conference on Web Search and Data Mining. 1152--1155.
    [11]
    Xu Zhong, Jianbin Tang, and Antonio Jimeno Yepes. 2019. Publaynet: largest dataset ever for document layout analysis. In 2019 International Conference on Document Analysis and Recognition (ICDAR). IEEE, 1015--1022.

    Recommendations

    Comments

    Information & Contributors

    Information

    Published In

    cover image ACM Conferences
    CIKM '23: Proceedings of the 32nd ACM International Conference on Information and Knowledge Management
    October 2023
    5508 pages
    ISBN:9798400701245
    DOI:10.1145/3583780
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

    Sponsors

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 21 October 2023

    Permissions

    Request permissions for this article.

    Check for updates

    Author Tags

    1. document content understanding
    2. document intelligence
    3. document structure understanding

    Qualifiers

    • Short-paper

    Funding Sources

    • Google Research

    Conference

    CIKM '23
    Sponsor:

    Acceptance Rates

    Overall Acceptance Rate 1,861 of 8,427 submissions, 22%

    Upcoming Conference

    Contributors

    Other Metrics

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • 0
      Total Citations
    • 74
      Total Downloads
    • Downloads (Last 12 months)74
    • Downloads (Last 6 weeks)4
    Reflects downloads up to 26 Jul 2024

    Other Metrics

    Citations

    View Options

    Get Access

    Login options

    View options

    PDF

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader

    Media

    Figures

    Other

    Tables

    Share

    Share

    Share this Publication link

    Share on social media