Abstract
As the volume of high-resolution remote sensing images grows, retrieving images efficiently from large archives becomes increasingly urgent. Existing methods rely mainly on shallow features, which are easily affected by human intervention. Convolutional neural networks (CNNs), by contrast, learn feature representations automatically, and the representations of CNNs pre-trained on large-scale datasets are generic. This paper exploits representations from pre-trained CNNs for high-resolution remote sensing image retrieval. CNN representations from AlexNet, VGGM, VGG16, and GoogLeNet are first transferred to high-resolution remote sensing images, and CNN features are then extracted via two approaches. The first takes the outputs of high-level layers directly; the second aggregates the outputs of mid-level layers by average pooling with different pooling regions. Given the generality and high dimensionality of the CNN features, feature combination and feature compression are also adopted to improve the feature representation. Experimental results demonstrate that aggregated features with a pooling region smaller than the feature map size perform very well, especially for VGG16 and GoogLeNet. Combining shallow features with CNN features contributes substantially to retrieval precision, and compressed features effectively reduce redundancy. Compared with state-of-the-art methods, the proposed feature extraction methods are simple, and the resulting features significantly improve retrieval performance.
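The second extraction approach, average pooling over a mid-level feature map with a chosen pooling region, can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the function names, the stride parameter, the toy feature map, and the choice to concatenate the pooled cells into one vector are all assumptions made for illustration. A region equal to the feature map size reduces to global average pooling, while a smaller region keeps some spatial layout.

```python
from typing import List

# A feature map indexed as [row][col][channel] (hypothetical layout).
FeatureMap = List[List[List[float]]]


def average_pool(fmap: FeatureMap, region: int, stride: int) -> FeatureMap:
    """Average every channel over region x region windows, without padding."""
    height, width, channels = len(fmap), len(fmap[0]), len(fmap[0][0])
    if not 1 <= region <= min(height, width):
        raise ValueError("region must fit inside the feature map")
    if stride < 1:
        raise ValueError("stride must be positive")
    pooled = []
    for top in range(0, height - region + 1, stride):
        row = []
        for left in range(0, width - region + 1, stride):
            # Mean of each channel inside the current window.
            cell = [
                sum(
                    fmap[top + dy][left + dx][c]
                    for dy in range(region)
                    for dx in range(region)
                ) / (region * region)
                for c in range(channels)
            ]
            row.append(cell)
        pooled.append(row)
    return pooled


def aggregate(fmap: FeatureMap, region: int, stride: int) -> List[float]:
    """Flatten the pooled map into one descriptor (assumed aggregation)."""
    pooled = average_pool(fmap, region, stride)
    return [value for row in pooled for cell in row for value in cell]


# Toy 4 x 4 feature map with 2 channels standing in for a real layer output.
toy = [[[float(r * 4 + c), 1.0] for c in range(4)] for r in range(4)]

global_vec = aggregate(toy, region=4, stride=1)  # region == map size
local_vec = aggregate(toy, region=2, stride=2)   # region < map size

print(global_vec)  # [7.5, 1.0]
print(local_vec)   # [2.5, 1.0, 4.5, 1.0, 10.5, 1.0, 12.5, 1.0]
```

In this toy case the smaller region yields a four-times longer descriptor, which is consistent with the abstract's motivation for feature compression.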
Acknowledgements
This work was supported by the National Natural Science Foundation of China [grant numbers 41261091, 61662044, 61663031, and 61762067].
Cite this article
Ge, Y., Jiang, S., Xu, Q. et al. Exploiting representations from pre-trained convolutional neural networks for high-resolution remote sensing image retrieval. Multimed Tools Appl 77, 17489–17515 (2018). https://doi.org/10.1007/s11042-017-5314-5
DOI: https://doi.org/10.1007/s11042-017-5314-5