
One-shot Text Field labeling using Attention and Belief Propagation for Structure Information Extraction

Authors:

Wei Lin

MM '20: Proceedings of the 28th ACM International Conference on Multimedia

Pages 340 - 348

https://doi.org/10.1145/3394171.3413511

Published: 12 October 2020


Abstract

Structured information extraction from document images usually consists of three steps: text detection, text recognition, and text field labeling. While text detection and text recognition have been studied extensively and have improved considerably in the literature, text field labeling is less explored and still faces many challenges. Existing learning-based methods for the text field labeling task usually require a large number of labeled examples to train a specific model for each type of document. However, collecting and labeling large numbers of document images is difficult, and sometimes impossible, because of privacy issues. Deploying a separate model for each type of document also consumes considerable resources. To address these challenges, we explore one-shot learning for the text field labeling task. Existing one-shot learning methods for the task are mostly rule-based and have difficulty labeling fields in crowded regions with few landmarks, as well as fields that consist of multiple separate text regions. To alleviate these problems, we propose a novel deep end-to-end trainable approach for one-shot text field labeling, which uses an attention mechanism to transfer layout information between document images. We further apply a conditional random field to the transferred layout information to refine the field labeling. We collected and annotated a real-world one-shot field labeling dataset covering a large variety of document types, and conducted extensive experiments to examine the effectiveness of the proposed model. To stimulate research in this direction, the collected dataset and the one-shot model will be released (https://github.com/AlibabaPAI/one_shot_text_labeling).
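The idea of transferring layout information through attention can be illustrated with a toy sketch. This is not the authors' model: it assumes each text region is represented only by a hypothetical normalized box center, scores similarity as negative squared distance (a simple stand-in for learned attention), and omits the conditional random field refinement entirely. The names `transfer_labels` and `temperature` are illustrative assumptions.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def transfer_labels(support_feats, support_labels, query_feats,
                    num_labels, temperature=0.05):
    """Attend from each query region over the labeled support regions.

    Returns one label distribution per query region. Features here are
    hypothetical (x, y) box centers; a real model would learn them.
    """
    distributions = []
    for q in query_feats:
        # Score: negative squared distance, scaled by a temperature.
        scores = [
            -sum((a - b) ** 2 for a, b in zip(q, s)) / temperature
            for s in support_feats
        ]
        weights = softmax(scores)
        # Accumulate attention mass onto each support region's label.
        dist = [0.0] * num_labels
        for w, label in zip(weights, support_labels):
            dist[label] += w
        distributions.append(dist)
    return distributions


# One labeled support document (the single "shot") and one query document.
support_feats = [(0.1, 0.1), (0.8, 0.1), (0.1, 0.9)]
support_labels = [0, 1, 2]  # hypothetical field ids
query_feats = [(0.12, 0.11), (0.75, 0.15)]

dists = transfer_labels(support_feats, support_labels, query_feats, num_labels=3)
predicted = [max(range(3), key=lambda k: d[k]) for d in dists]
print(predicted)
```

Each query region receives a soft label distribution rather than a hard assignment, which is the kind of per-region evidence that a refinement step such as a conditional random field could then make mutually consistent.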

Supplementary Material

MP4 File (3394171.3413511.mp4)

Talk for the paper "One-shot Text Field labeling using Attention and Belief Propagation for Structure Information Extraction".


