Abstract
Machine vision has become a crucial method for drones to perceive their surroundings, and image matching, as a fundamental task in machine vision, has also gained widespread attention. However, due to the complexity of aerial images, traditional matching methods based on handcrafted features lack the ability to extract high-level semantics and inevitably suffer from low robustness. Although deep learning has the potential to improve matching accuracy, it comes at the high cost of requiring specific training samples and computing resources, making it infeasible in many scenarios. To fully leverage the strengths of both approaches, we introduce DeFusion, a novel image matching scheme with a fine-grained decision-level fusion algorithm that effectively combines handcrafted and deep features. We train generic features on public datasets, enabling us to handle unseen scenarios. We use RootSIFT as prior knowledge to guide the extraction of deep features, significantly reducing computational overhead. We also carefully design preprocessing steps by incorporating drone attitude information. As evidenced by our experimental results, the proposed scheme achieves 2.5–6x more correct matches overall, with improved robustness, compared to existing methods.
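To make the handcrafted branch concrete, the sketch below shows how RootSIFT descriptors (Arandjelović and Zisserman's Hellinger-normalised variant of SIFT) can be computed and matched with OpenCV. This is only an illustrative, minimal re-implementation of the RootSIFT step mentioned in the abstract; the function names, file names, and ratio-test threshold are our own assumptions, and it does not reproduce the DeFusion fusion algorithm itself.

```python
# Minimal RootSIFT sketch: L1-normalise SIFT descriptors, then take the
# element-wise square root. Illustrative only; not the authors' DeFusion code.
import cv2
import numpy as np

def rootsift_descriptors(image_gray, eps=1e-7):
    """Detect SIFT keypoints and convert their descriptors to RootSIFT."""
    sift = cv2.SIFT_create()
    keypoints, descriptors = sift.detectAndCompute(image_gray, None)
    if descriptors is None:
        return keypoints, None
    descriptors /= (descriptors.sum(axis=1, keepdims=True) + eps)  # L1 normalisation
    descriptors = np.sqrt(descriptors)                             # Hellinger mapping
    return keypoints, descriptors

# Example: match RootSIFT descriptors between two aerial frames (hypothetical files).
img1 = cv2.imread("frame1.png", cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread("frame2.png", cv2.IMREAD_GRAYSCALE)
kp1, des1 = rootsift_descriptors(img1)
kp2, des2 = rootsift_descriptors(img2)
matcher = cv2.BFMatcher(cv2.NORM_L2)
good = [m for m, n in matcher.knnMatch(des1, des2, k=2) if m.distance < 0.8 * n.distance]
print(f"{len(good)} putative RootSIFT matches")
```

Because the Hellinger mapping only rescales standard SIFT descriptors, such a branch can feed the same matching and geometric-verification machinery as plain SIFT while providing the prior keypoint locations that, as described above, guide the deep-feature extraction.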
Notes
1. Our code is publicly available on GitHub: https://github.com/songxf1024/DeFusion.
Acknowledgment
This research was supported in part by the South China University of Technology Research Start-up Fund No. X2WD/K3200890, as well as partly by the Guangzhou Huangpu District International Research Collaboration Fund No. 2022GH13. Any opinions, findings, and conclusions or recommendations expressed in this publication are those of the authors and do not necessarily reflect the views of the sponsoring agencies.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Song, X., Zou, Y., Shi, Z., Yang, Y., Li, D. (2024). DeFusion: Aerial Image Matching Based on Fusion of Handcrafted and Deep Features. In: Luo, B., Cheng, L., Wu, ZG., Li, H., Li, C. (eds) Neural Information Processing. ICONIP 2023. Communications in Computer and Information Science, vol 1968. Springer, Singapore. https://doi.org/10.1007/978-981-99-8181-6_25
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-8180-9
Online ISBN: 978-981-99-8181-6
eBook Packages: Computer Science, Computer Science (R0)