Abstract
The attention mechanism plays an important role in human perception and cognition. Many machine learning models have been developed to memorize sequential data, such as the Long Short-Term Memory (LSTM) network and its extensions. However, lacking an attention mechanism, they cannot pay special attention to the important parts of a sequence. In this paper, we present a novel machine learning method called attention-augmented machine memory (AAMM), which seamlessly integrates the attention mechanism into the memory cell of LSTM. As a result, the network can focus on valuable information in a sequence and ignore irrelevant information during learning. We have conducted experiments on two sequence classification tasks, for pattern classification and sentiment analysis, respectively. The experimental results demonstrate the advantages of AAMM over LSTM and several other related approaches. Hence, AAMM can serve as a substitute for LSTM in sequence learning applications.
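To make the idea concrete, the following is a minimal NumPy sketch of one way attention could be integrated into an LSTM-style recurrence: at each step, dot-product attention over past hidden states produces a context vector that is injected into the gate computation. This is an illustrative assumption of the general design, not the exact AAMM formulation from the paper; all names (`AttentionLSTMCell`, `step`, etc.) are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class AttentionLSTMCell:
    """Toy LSTM cell whose recurrence is augmented with dot-product
    attention over its own past hidden states (illustrative only)."""

    def __init__(self, input_size, hidden_size, seed=0):
        rng = np.random.default_rng(seed)
        z = input_size + hidden_size
        # Stacked weights for the four gates: input, forget, output, candidate.
        self.W = rng.normal(0.0, 0.1, (4 * hidden_size, z))
        self.b = np.zeros(4 * hidden_size)
        self.hidden_size = hidden_size

    def step(self, x, h, c, history):
        # history: list of past hidden states; attention summarizes them.
        if history:
            H = np.stack(history)                 # (t, hidden)
            scores = H @ h                        # dot-product scores
            alpha = np.exp(scores - scores.max()) # stable softmax
            alpha /= alpha.sum()
            context = alpha @ H                   # attention-weighted summary
        else:
            context = np.zeros_like(h)
        # Inject the attention context into the recurrent input.
        z = np.concatenate([x, h + context])
        g = self.W @ z + self.b
        hs = self.hidden_size
        i = sigmoid(g[:hs])                       # input gate
        f = sigmoid(g[hs:2 * hs])                 # forget gate
        o = sigmoid(g[2 * hs:3 * hs])             # output gate
        cand = np.tanh(g[3 * hs:])                # candidate memory
        c_new = f * c + i * cand
        h_new = o * np.tanh(c_new)
        return h_new, c_new
```

In this sketch, attention re-weights the cell's own history before each update, so earlier steps judged relevant contribute more to the new memory state, which is the intuition behind letting the memory cell attend to valuable parts of the sequence.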
Acknowledgements
This work was partially supported by the Major Project for New Generation of AI under Grant No. 2018AAA0100400, the Joint Fund of the Equipments Pre-Research and Ministry of Education of China under Grant No. 6141A020337, the National Natural Science Foundation of China under Grant No. 61876155, the Natural Science Foundation of Shandong Province, China, under Grant No. ZR201911080230, the Jiangsu Science and Technology Programme (Natural Science Foundation of Jiangsu Province) under Grant Nos. BE2020006-4 and BK20181189, the Project for Graduate Student Education Reformation and Research of Ocean University of China under Grant No. HDJG19001, and the Key Program Special Fund in XJTLU under Grant Nos. KSF-T-06 and KSF-E-26. The authors would like to thank Zhaoyang Niu for his help in the revision of this paper.
Ethics declarations
Ethical Approval
This article does not contain any studies with human participants or animals performed by any of the authors.
Conflicts of Interest
The authors declare that they have no conflict of interest.
Cite this article
Lin, X., Zhong, G., Chen, K. et al. Attention-Augmented Machine Memory. Cogn Comput 13, 751–760 (2021). https://doi.org/10.1007/s12559-021-09854-5