research-article

Introducing shape priors in Siamese networks for image classification

Published: 14 March 2024
Abstract

    The efficiency of deep neural networks is increasing, and so is the amount of annotated data required to train them. We propose a solution that improves the learning of a classification network when less labeled data is available. Our approach informs the classifier of the elements it should focus on to make its decision by supplying it with shape priors. These shape priors are expressed as binary masks, giving a rough idea of the shape of the relevant elements for a given class. We resort to a Siamese architecture and feed it with image/mask pairs. By inserting shape priors, only the relevant features are retained. This provides the network with significant generalization power without requiring a specific domain adaptation step. The solution is tested on standard cross-domain digit classification tasks and on a real-world video surveillance application. Extensive tests show that our approach outperforms a classical classifier, producing a good latent space with less training data. Code is available at https://github.com/halqasir/MG-Siamese.

    Highlights

    We propose a solution for learning a classification network with less labeled data.
    The principle is to provide the classifier with a binary mask as a simple shape prior.
    We resort to a Siamese architecture and feed it with images and shape priors.
    This gives the network significant generalization power without any adaptation step.
    We run cross-domain tests on digit recognition and a real-life chairlift safety case.
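    The abstract and highlights describe a Siamese setup in which an image and a binary shape-prior mask are embedded with shared weights and fused. As a rough illustration only — the paper's actual architecture, encoder, and fusion strategy are not given here, so the single linear layer and concatenation below are placeholder assumptions — the idea can be sketched as:

    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    # Shared encoder weights: both branches of a Siamese network use the
    # SAME parameters (here one linear layer + ReLU, purely for illustration).
    W = rng.standard_normal((16, 28 * 28)) * 0.01

    def encode(x):
        """Map a flattened 28x28 input to a 16-d embedding (shared weights)."""
        return np.maximum(W @ np.asarray(x, dtype=float).ravel(), 0.0)

    def siamese_forward(image, mask):
        """Embed the image and its binary shape prior with the same encoder,
        then fuse the two embeddings (concatenation is one simple choice)."""
        z_img = encode(image)
        z_mask = encode(mask)          # binary mask: rough class shape prior
        return np.concatenate([z_img, z_mask])  # 32-d joint representation

    image = rng.random((28, 28))       # stand-in for a digit image
    mask = (image > 0.5)               # stand-in for the class shape mask
    z = siamese_forward(image, mask)
    print(z.shape)                     # (32,)
    ```

    In training, pairs embedded this way would feed a classification (or contrastive) loss, so that only features consistent with the shape prior survive in the latent space.
    
    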




          Published In

          Neurocomputing, Volume 568, Issue C, February 2024 (249 pages)

          Publisher

          Elsevier Science Publishers B.V., Netherlands


          Author Tags

          1. Siamese networks
          2. Image classification
          3. Generalization
          4. Shape priors
