Learning to Navigate for Fine-Grained Classification

Authors:

Liwei Wang

Computer Vision – ECCV 2018: 15th European Conference, Munich, Germany, September 8–14, 2018, Proceedings, Part XIV

Pages 438 - 454

https://doi.org/10.1007/978-3-030-01264-9_26

Published: 08 September 2018

Abstract

Fine-grained classification is challenging because discriminative features are difficult to find: the subtle traits that fully characterize an object are not easy to locate. To address this, we propose a novel self-supervision mechanism that effectively localizes informative regions without the need for bounding-box or part annotations. Our model, termed NTS-Net (Navigator-Teacher-Scrutinizer Network), consists of a Navigator agent, a Teacher agent and a Scrutinizer agent. Exploiting the intrinsic consistency between the informativeness of a region and its probability of being the ground-truth class, we design a novel training paradigm that enables the Navigator to detect the most informative regions under the guidance of the Teacher. The Scrutinizer then scrutinizes the regions proposed by the Navigator and makes predictions. Our model can be viewed as a multi-agent cooperation in which the agents benefit from each other and make progress together. NTS-Net can be trained end-to-end, while providing accurate fine-grained classification predictions as well as highly informative regions during inference. We achieve state-of-the-art performance on extensive benchmark datasets.
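
The consistency described in the abstract, that regions the Navigator rates as more informative should also be the regions the Teacher is more confident belong to the ground-truth class, can be illustrated with a pairwise ranking loss. The sketch below is only an illustration under stated assumptions, not the authors' implementation: the function name `ranking_consistency_loss`, the hinge form, the margin of 1.0 and the toy scores are all hypothetical choices made here.

```python
from typing import Sequence


def ranking_consistency_loss(
    informativeness: Sequence[float],
    confidence: Sequence[float],
    margin: float = 1.0,
) -> float:
    """Pairwise hinge loss (illustrative assumption, not taken from the paper).

    informativeness: Navigator-style scores, one per candidate region.
    confidence: Teacher-style probabilities of the ground-truth class,
        one per candidate region.

    For every pair in which region j has higher confidence than region i,
    the loss is zero only if region j's informativeness exceeds region i's
    by at least `margin`.
    """
    if len(informativeness) != len(confidence):
        raise ValueError("one score of each kind is required per region")

    loss = 0.0
    for i, c_i in enumerate(confidence):
        for j, c_j in enumerate(confidence):
            if c_i < c_j:
                gap = informativeness[j] - informativeness[i]
                loss += max(0.0, margin - gap)
    return loss


# Hand-picked toy values for three candidate regions (hypothetical).
confidence = [0.9, 0.2, 0.6]
agreeing = [3.0, 0.5, 1.8]         # same ordering as the confidences
reversed_scores = [0.5, 3.0, 1.8]  # ordering disagrees with the confidences

print(ranking_consistency_loss(agreeing, confidence))
print(ranking_consistency_loss(reversed_scores, confidence))
```

With these toy values the loss vanishes when the two orderings agree with sufficient margin and becomes positive when they disagree. In a trained model the informativeness scores would come from the Navigator and the confidences from the Teacher; here they are simply fixed numbers.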

Index Terms

Learning to Navigate for Fine-Grained Classification
1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
  2. Machine learning
    1. Learning paradigms

Index terms have been assigned to the content through auto-classification.

Published In

Computer Vision – ECCV 2018: 15th European Conference, Munich, Germany, September 8–14, 2018, Proceedings, Part XIV

Sep 2018

844 pages

ISBN: 978-3-030-01263-2

DOI: 10.1007/978-3-030-01264-9

Editors:
Vittorio Ferrari (Google Research, Zurich, Switzerland)
Martial Hebert (Carnegie Mellon University, Pittsburgh, PA, USA)
Cristian Sminchisescu (Google Research, Zurich, Switzerland)
Yair Weiss (Hebrew University of Jerusalem, Jerusalem, Israel)

© Springer Nature Switzerland AG 2018.

Publisher

Springer-Verlag

Berlin, Heidelberg
