research-article

Introducing shape priors in Siamese networks for image classification

Published: 14 March 2024
Abstract

    The efficiency of deep neural networks is increasing, and so is the amount of annotated data required to train them. We propose a solution that improves the learning of a classification network when less labeled data is available. Our approach informs the classifier of the elements it should focus on to make its decision by supplying it with shape priors. These shape priors are expressed as binary masks, giving a rough idea of the shape of the relevant elements for a given class. We resort to a Siamese architecture and feed it with image/mask pairs. By inserting shape priors, only the relevant features are retained. This provides the network with significant generalization power without requiring a specific domain adaptation step. The solution is tested on standard cross-domain digit classification tasks and on a real-world video surveillance application. Extensive tests show that our approach outperforms a classical classifier, producing a good latent space with less training data. Code is available at https://github.com/halqasir/MG-Siamese.

    Highlights

    We propose a solution for learning a classification network with less labeled data.
    The principle is to provide the classifier with a binary mask as a simple shape prior.
    We resort to a Siamese architecture and feed it with images and shape priors.
    This gives the network significant generalization power without any adaptation step.
    We run cross-domain tests on digit recognition and a real-life chairlift safety case.
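    The abstract and highlights describe a Siamese setup in which an image and a binary shape-prior mask are embedded with shared weights and fused. As a rough illustration only — the paper's actual architecture, encoder, and fusion strategy are not given here, so the single linear layer and concatenation below are placeholder assumptions — the idea can be sketched as:

    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    # Shared encoder weights: both branches of a Siamese network use the
    # SAME parameters (here one linear layer + ReLU, purely for illustration).
    W = rng.standard_normal((16, 28 * 28)) * 0.01

    def encode(x):
        """Map a flattened 28x28 input to a 16-d embedding (shared weights)."""
        return np.maximum(W @ np.asarray(x, dtype=float).ravel(), 0.0)

    def siamese_forward(image, mask):
        """Embed the image and its binary shape prior with the same encoder,
        then fuse the two embeddings (concatenation is one simple choice)."""
        z_img = encode(image)
        z_mask = encode(mask)          # binary mask: rough class shape prior
        return np.concatenate([z_img, z_mask])  # 32-d joint representation

    image = rng.random((28, 28))       # stand-in for a digit image
    mask = (image > 0.5)               # stand-in for the class shape mask
    z = siamese_forward(image, mask)
    print(z.shape)                     # (32,)
    ```

    In training, pairs embedded this way would feed a classification (or contrastive) loss, so that only features consistent with the shape prior survive in the latent space.
    
    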




          Published In

          Neurocomputing, Volume 568, Issue C, February 2024 (249 pages)

          Publisher

          Elsevier Science Publishers B.V., Netherlands


          Author Tags

          1. Siamese networks
          2. Image classification
          3. Generalization
          4. Shape priors
