
NIR/RGB image fusion for scene classification using deep neural networks

Original article · The Visual Computer

Abstract

Near-infrared (NIR) imaging can provide valuable complementary data for many visible-range image processing applications. Scene recognition and classification is an important and challenging branch of computer vision, and image fusion is a well-established technique in image processing. In this paper, new fusion techniques that exploit the data of both NIR and RGB sensors are proposed to improve scene classification. The proposed fusion technique is based on modified visual salient points (MVSP): each RGB channel is fused with the NIR channel according to their visual saliency maps. The fusion parameter space is then searched for the best fusion using an error function that measures how well the fused image conforms to the input channels. This error function combines the distance between the pixel intensities of the NIR channel and the fused image with the gradient difference between the RGB channels and the fused image. The best-conforming fused image is then fed to deep convolutional neural networks (DCNNs), trained via transfer learning, to perform scene category classification. Experimental results show that the proposed method outperforms prior work in scene classification, illustrating how an optimized NIR/RGB fusion can yield better classification outcomes.
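
A minimal sketch of the kind of conformity error and parameter search the abstract describes is given below, assuming NumPy arrays for the NIR channel and one RGB channel. The weighted-average blend, the squared-error form of both terms, and the lam weighting are illustrative assumptions only; the paper's actual fusion is guided by modified visual salient points (MVSP) and saliency maps, which are not reproduced here.

import numpy as np

def gradient_mag(img):
    # Finite-difference gradient magnitude of a 2D image.
    gy, gx = np.gradient(np.asarray(img, dtype=np.float64))
    return np.hypot(gx, gy)

def fusion_error(fused, rgb_channel, nir, lam=1.0):
    # Intensity distance between the fused image and the NIR channel,
    # plus a gradient-difference distance to the RGB channel,
    # mirroring the two terms described in the abstract.
    fused = np.asarray(fused, dtype=np.float64)
    nir = np.asarray(nir, dtype=np.float64)
    intensity_term = np.mean((fused - nir) ** 2)
    gradient_term = np.mean((gradient_mag(fused) - gradient_mag(rgb_channel)) ** 2)
    return intensity_term + lam * gradient_term

def search_best_fusion(rgb_channel, nir, weights=np.linspace(0.0, 1.0, 21)):
    # Grid search over a single blending weight (a stand-in for the paper's
    # richer fusion parameter space); returns the weight and fused image
    # with the lowest conformity error.
    rgb_channel = np.asarray(rgb_channel, dtype=np.float64)
    nir = np.asarray(nir, dtype=np.float64)
    best_w, best_fused, best_err = None, None, np.inf
    for w in weights:
        fused = w * rgb_channel + (1.0 - w) * nir
        err = fusion_error(fused, rgb_channel, nir)
        if err < best_err:
            best_w, best_fused, best_err = w, fused, err
    return best_w, best_fused

In practice, the best-conforming fused image produced by such a search would then be passed to a pretrained DCNN fine-tuned on scene categories, as the abstract outlines.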


Notes

  1. See http://ivrg.epfl.ch/supplementarymaterial/cvor11/index.html.


Acknowledgements

The authors acknowledge the funding support of Babol Noshirvani University of Technology through Grant program No. BNUT/370123/99.

Author information


Corresponding author

Correspondence to Yasser Baleghi.

Ethics declarations

Conflict of interest

No potential conflict of interest was reported by the authors.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Soroush, R., Baleghi, Y. NIR/RGB image fusion for scene classification using deep neural networks. Vis Comput 39, 2725–2739 (2023). https://doi.org/10.1007/s00371-022-02488-0

