Article

Hie-Transformer: A Hierarchical Hybrid Transformer for Abstractive Article Summarization

Published: 12 December 2019

Abstract

Abstractive summarization methods based on neural network models can generate summaries that read more like human-written text and are of higher quality than those produced by extractive methods. However, these abstractive models face three main problems: an inability to handle long article inputs, out-of-vocabulary (OOV) words, and repeated words in the generated summaries. To tackle these problems, we propose a hierarchical hybrid Transformer model for abstractive article summarization in this work. First, the proposed model is based on a hierarchical Transformer with a selective mechanism. The Transformer has outperformed traditional sequence-to-sequence models in many natural language processing (NLP) tasks, and the hierarchical structure can handle very long article inputs. Second, the pointer-generator mechanism is applied to combine generating novel words with copying words from the article input, which reduces the probability of producing OOV words. Additionally, we use the coverage mechanism to reduce repetition in summaries. The proposed model is applied to the CNN-Daily Mail summarization task. The evaluation results and analyses demonstrate that our proposed model performs competitively with the baselines.
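The pointer-generator and coverage mechanisms named in the abstract follow well-known formulations from the summarization literature (See et al., 2017, "Get to the point"). As a rough illustration only, the plain-Python sketch below shows a single decoding step of that standard formulation; the function names, argument shapes, and the toy example are assumptions made for illustration, not the authors' hierarchical-Transformer implementation.

import numpy as np

def pointer_generator_step(p_vocab, attention, src_ids_ext, p_gen, vocab_size, n_oov):
    """One decoding step of a pointer-generator mixture (after See et al., 2017).

    p_vocab     : (V,)  softmax over the fixed vocabulary
    attention   : (L,)  attention weights over the source tokens
    src_ids_ext : (L,)  source token ids in an extended vocabulary, where
                        in-article OOV words get temporary ids >= vocab_size
    p_gen       : float in [0, 1], probability of generating vs. copying
    n_oov       : number of distinct OOV words in this article
    """
    final = np.zeros(vocab_size + n_oov)
    final[:vocab_size] = p_gen * p_vocab      # generate from the fixed vocabulary
    np.add.at(final, src_ids_ext,             # copy from the source; repeated
              (1.0 - p_gen) * attention)      # source tokens accumulate mass
    return final                              # distribution over vocab + article OOVs

def coverage_penalty(attention, coverage):
    """Per-step coverage loss, sum_i min(a_i, c_i), where c is the running sum of
    past attention; it penalizes re-attending to already covered source positions."""
    return float(np.minimum(attention, coverage).sum())

# Toy usage: vocabulary of 5 words, a 4-token source containing one OOV word.
p_vocab = np.array([0.1, 0.4, 0.2, 0.2, 0.1])
attention = np.array([0.5, 0.2, 0.2, 0.1])
src_ids_ext = np.array([2, 5, 2, 0])          # id 5 is the temporary OOV id
dist = pointer_generator_step(p_vocab, attention, src_ids_ext,
                              p_gen=0.7, vocab_size=5, n_oov=1)
assert abs(dist.sum() - 1.0) < 1e-9           # still a valid probability distribution

In the standard formulation, the coverage vector passed to coverage_penalty is the running sum of the attention distributions from previous decoding steps, and the penalty is added to the training loss; this is what discourages the decoder from repeatedly attending to, and thus repeating, the same source content.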


Cited By

  • Single-Document Abstractive Text Summarization: A Systematic Literature Review. ACM Computing Surveys 57(3), 1–37 (2024). DOI: 10.1145/3700639. Online publication date: 11 Nov 2024.



          Published In

          Neural Information Processing: 26th International Conference, ICONIP 2019, Sydney, NSW, Australia, December 12–15, 2019, Proceedings, Part III
          Dec 2019
          661 pages
          ISBN: 978-3-030-36717-6
          DOI: 10.1007/978-3-030-36718-3
          Editors: Tom Gedeon, Kok Wai Wong, Minho Lee

          Publisher

          Springer-Verlag

          Berlin, Heidelberg

          Publication History

          Published: 12 December 2019

          Author Tags

          1. Abstractive summarization
          2. Hierarchical transformer
          3. Selective mechanism
          4. Pointer-generator
          5. Coverage mechanism

          Qualifiers

          • Article

          Article Metrics

          • Downloads (Last 12 months): 0
          • Downloads (Last 6 weeks): 0
          Reflects downloads up to 08 Mar 2025

