Abstract
In text-classification scenarios where the number of classes is large (on the order of tens of thousands) and the training samples per class are few and often verbose, nearest-neighbor methods are effective but very slow, since they must compute a similarity score against the training samples of every class. Parametric machine learning models, on the other hand, are fast at runtime, but they cannot be trained adequately from the few samples available per class. In this paper, we propose a hybrid approach that cascades (1) a fast but less-accurate recurrent neural network (RNN) model and (2) a slow but more-accurate nearest-neighbor model that uses a bag of syntactic features.
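To make the cascade concrete, the following is a minimal sketch of such a two-stage classifier. The function and parameter names, and the generic scorer interfaces standing in for the RNN and the nearest-neighbor model, are our assumptions for illustration, not the paper's implementation:

from typing import Callable, Dict, List, Sequence, Tuple


def cascade_classify(
    query: str,
    classes: Sequence[str],
    fast_score: Callable[[str], Dict[str, float]],  # stage 1: RNN-style scorer over all classes
    slow_score: Callable[[str, str], float],        # stage 2: per-class nearest-neighbor similarity
    k: int,                                         # cutoff: candidates passed to stage 2
    top_n: int,                                     # number of predictions to return
) -> List[Tuple[str, float]]:
    """Score all classes cheaply, then re-rank only the top-k with the slow model."""
    # Stage 1: one cheap pass assigns every class a coarse score.
    coarse = fast_score(query)
    candidates = sorted(classes, key=lambda c: coarse.get(c, 0.0), reverse=True)[:k]

    # Stage 2: the expensive similarity (e.g., bag of syntactic features matched
    # against each class's training samples) is computed for only k classes.
    reranked = [(c, slow_score(query, c)) for c in candidates]
    reranked.sort(key=lambda pair: pair[1], reverse=True)
    return reranked[:top_n]

With tens of thousands of classes and a cutoff \(k\) that is orders of magnitude smaller, the slow similarity is evaluated for only a small fraction of the classes, which is where the query-time reduction of a cascade of this kind comes from.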
Our experiments on a data set from IT support services, where customer complaint text must be classified to return the top-N most likely error codes, show that the cascaded approach reduces the query time of the slow system to \(1/6\) of its original value while also improving its accuracy. Our approach outperforms an LSH-based baseline in query-time reduction. We also derive a lower bound on the accuracy of the cascaded model in terms of the accuracies of the individual models. In any two-stage approach, choosing the right number of candidates to pass to the second stage is crucial; we prove a result that aids in choosing this cutoff for the cascaded system.
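The paper's exact bound is not reproduced here; the following is a simple bound of the same flavor, derived from first principles (our illustration, not the authors' result). Let \(A\) be the event that the true class appears among the RNN's top-\(k\) candidates, with \(\Pr[A] = r_k\), and let \(B\) be the event that the full nearest-neighbor model ranks the true class above every other class, with \(\Pr[B] = a\) (its standalone accuracy). If both events occur, the true class also wins within the candidate set, so the cascade is correct, and the union bound gives

\[
  \Pr[\text{cascade correct}] \;\ge\; \Pr[A \cap B] \;\ge\; \Pr[A] + \Pr[B] - 1 \;=\; r_k + a - 1 .
\]

This also illustrates why the cutoff matters: increasing \(k\) raises \(r_k\) and tightens the bound, but stage-2 query time grows linearly in \(k\), which is precisely the trade-off a cutoff-selection result must balance.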