research-article

Open access

Learning to Extract Attribute Value from Product via Question Answering: A Multi-task Approach

Authors:

Li Yang,

Bin Shu,

Zac Yu,

Jon ElsasAuthors Info & Claims

KDD '20: Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining

Pages 47 - 55

https://doi.org/10.1145/3394486.3403047

Published: 20 August 2020 Publication History

PDF eReader

Abstract

Attribute value extraction refers to the task of identifying values of an attribute of interest from product information. It is an important research topic which has been widely studied in e-Commerce and relation learning. There are two main limitations in existing attribute value extraction methods: scalability and generalizability. Most existing methods treat each attribute independently and build separate models for each of them, which are not suitable for large scale attribute systems in real-world applications. Moreover, very limited research has focused on generalizing extraction to new attributes.

In this work, we propose a novel approach for Attribute Value Extraction via Question Answering (AVEQA) using a multi-task framework. In particular, we build a question answering model which treats each attribute as a question and identifies the answer span corresponding to the attribute value in the product context. A unique BERT contextual encoder is adopted and shared across all attributes to encode both the context and the question, which makes the model scalable. A distilled masked language model with knowledge distillation loss is introduced to improve the model generalization ability. In addition, we employ a no-answer classifier to explicitly handle the cases where there are no values for a given attribute in the product context. The question answering, distilled masked language model and the no answer classification are then combined into a unified multi-task framework. We conduct extensive experiments on a public dataset. The results demonstrate that the proposed approach outperforms several state-of-the-art methods with large margin.

Supplementary Material

MP4 File (3394486.3403047.mp4)

We propose a novel approach for Attribute Value Extraction via Question Answering (AVEQA) using a multi-task framework. In particular, we build a question answering model which treats each attribute as a question and identifies the answer span corresponding to the attribute value in the product context. A unique BERT contextual encoder is adopted and shared across all attributes to encode both the context and the question, which makes the model scalable. A distilled masked language model with knowledge distillation loss is introduced to improve the model generalization ability. In addition, we employ a no-answer classifier to explicitly handle the cases where there are no values for a given attribute in the product context. The question answering, distilled masked language model and the no answer classification are then combined into a unified multi-task framework.

Download
47.76 MB

References

[1]

R. Baradaran, R. Ghiasi, and H. Amirkhani. A survey on machine reading comprehension systems. CoRR, abs/2001.01582, 2020.

Abstract

Supplementary Material

References

Cited By

Index Terms

Recommendations

CAVE: Correcting Attribute Values in E-commerce Profiles

Fusing Attribute Type Features for Attribute Value Extraction from Product via Question Answering

QPAVE: A Multi-task Question Answering Approach for Fine-Grained Product Attribute Value Extraction

Comments

Information

Published In

Sponsors

Publisher

Publication History

Check for updates

Author Tags

Qualifiers

Conference

Acceptance Rates

Contributors

Other Metrics

Bibliometrics

Article Metrics

Other Metrics

Citations

Cited By

View options

PDF

eReader

Get Access

Login options

Full Access

Figures

Other

Share

Share this Publication link

Share on social media

Affiliations