Zhang 2020
Identification
Ning Zhang, Zuochang Ye, Yan Wang
Institute of Microelectronics, Tsinghua University
zn17@mails.tsinghua.edu.cn, zuochang@mail.tsinghua.edu.cn, wangyan@mail.tsinghua.edu.cn
4.3 Visualizations
Using the remote sensing dataset, we visualize the model's predictions on the remote sensing imagery. Randomly sampled predictions on a remote sensing image are shown in Figure 6; all red points belong to the test set.
Figure 6. Random sampling on remote sensing image visualization.

5. CONCLUSION
In this paper, we propose an end-to-end, deep-learning-based system for identifying pests and diseases in massive high-resolution remote sensing data. To achieve good performance, the hierarchical model ClusterNet jointly learns the parameters of a neural network and the cluster assignments of the resulting features. Extensive experiments demonstrate the effectiveness of the system, which is far more accurate and convenient than traditional manual detection. Moreover, our ClusterNet and cluster loss can also be used to learn deep representations and to perform image classification.
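The joint learning of network parameters and cluster assignments can be sketched as an alternating procedure: cluster the current features, then fit the feature extractor to the resulting pseudo-labels. The NumPy sketch below is a minimal, hypothetical illustration of that idea, not the paper's ClusterNet implementation: the linear map `W`, the `kmeans` helper, and the least-squares refit are all our own illustrative choices.

```python
# Minimal sketch (illustrative only) of jointly learning features and
# cluster assignments by alternating clustering and pseudo-label fitting.
import numpy as np

rng = np.random.default_rng(0)

def kmeans(X, k, iters=10):
    """Plain k-means: returns hard cluster assignments for the rows of X."""
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        # assign each row to its nearest centroid
        dist = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
        labels = dist.argmin(axis=1)
        # recompute centroids from the current assignments
        for j in range(k):
            if (labels == j).any():
                centroids[j] = X[labels == j].mean(axis=0)
    return labels

# Toy "images": two well-separated blobs of 4-D points.
X = np.vstack([rng.normal(0.0, 0.1, (20, 4)),
               rng.normal(3.0, 0.1, (20, 4))])

W = rng.normal(0.0, 0.1, (4, 2))  # stand-in for the network's feature extractor
for _ in range(5):
    feats = X @ W                  # extract features
    pseudo = kmeans(feats, k=2)    # cluster them -> pseudo-labels
    targets = np.eye(2)[pseudo]    # one-hot targets from the assignments
    # refit the extractor toward the cluster structure (a least-squares
    # step here, where a real system would take gradient steps on a loss)
    W, *_ = np.linalg.lstsq(X, targets, rcond=None)

labels = kmeans(X @ W, k=2)
print(len(labels), sorted(set(labels.tolist())))
```

In a real system the least-squares refit would be replaced by gradient descent on a cluster loss over a deep network, but the alternation between clustering features and fitting the network to the pseudo-labels is the same.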
6. ACKNOWLEDGMENTS
This work is supported by the National Natural Science Foundation of China.

7. REFERENCES
[1] He, K., Zhang, X., Ren, S., & Sun, J. (2015). Delving deep into rectifiers: Surpassing human-level performance on imagenet classification. In Proceedings of the IEEE international conference on computer vision (pp. 1026-1034).
[2] Huang, G., Liu, Z., Van Der Maaten, L., & Weinberger, K. Q. (2017). Densely connected convolutional networks. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 4700-4708).
[3] Masci, J., Meier, U., Cireşan, D., & Schmidhuber, J. (2011, June). Stacked convolutional auto-encoders for hierarchical feature extraction. In International Conference on Artificial Neural Networks (pp. 52-59). Springer, Berlin, Heidelberg.
[4] Erhan, D., Bengio, Y., Courville, A., & Vincent, P. (2009). Visualizing higher-layer features of a deep network. University of Montreal, 1341(3), 1.
[5] Dumoulin, V., Belghazi, I., Poole, B., Mastropietro, O., Lamb, A., Arjovsky, M., & Courville, A. (2016). Adversarially learned inference. arXiv preprint arXiv:1606.00704.
[6] Friedman, J., Hastie, T., & Tibshirani, R. (2001). The elements of statistical learning (Vol. 1, No. 10). New York: Springer series in statistics.
[7] Xu, L., Neufeld, J., Larson, B., & Schuurmans, D. (2005). Maximum margin clustering. In Advances in neural information processing systems (pp. 1537-1544).
[8] Yang, J., Parikh, D., & Batra, D. (2016). Joint unsupervised learning of deep representations and image clusters. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 5147-5156).
[9] Lin, F., & Cohen, W. W. (2010). Power iteration clustering.
[10] Agrawal, P., Carreira, J., & Malik, J. (2015). Learning to see by moving. In Proceedings of the IEEE International Conference on Computer Vision (pp. 37-45).
[11] Malisiewicz, T., Gupta, A., & Efros, A. (2011). Ensemble of exemplar-svms for object detection and beyond.
[12] Turk, M. A., & Pentland, A. P. (1991, June). Face recognition using eigenfaces. In Proceedings. 1991 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (pp. 586-591). IEEE.
[13] Larsson, G., Maire, M., & Shakhnarovich, G. (2016, October). Learning representations for automatic colorization. In European Conference on Computer Vision (pp. 577-593). Springer, Cham.
[14] Noroozi, M., Pirsiavash, H., & Favaro, P. (2017). Representation learning by learning to count. In Proceedings of the IEEE International Conference on Computer Vision (pp. 5898-5906).
[15] Van De Sande, K., Gevers, T., & Snoek, C. (2009). Evaluating color descriptors for object and scene recognition. IEEE transactions on pattern analysis and machine intelligence, 32(9), 1582-1596.
[16] Philbin, J., Chum, O., Isard, M., Sivic, J., & Zisserman, A. (2007, June). Object retrieval with large vocabularies and fast spatial matching. In 2007 IEEE Conference on Computer Vision and Pattern Recognition (pp. 1-8). IEEE.
[17] de Sa, V. R. (1994). Learning classification with unlabeled data. In Advances in neural information processing systems (pp. 112-119).
[18] Pathak, D., Krahenbuhl, P., Donahue, J., Darrell, T., & Efros, A. A. (2016). Context encoders: Feature learning by inpainting. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 2536-2544).
[19] Noroozi, M., & Favaro, P. (2016, October). Unsupervised learning of visual representations by solving jigsaw puzzles. In European Conference on Computer Vision (pp. 69-84). Springer, Cham.
[20] Doersch, C., Gupta, A., & Efros, A. A. (2015). Unsupervised visual representation learning by context prediction. In Proceedings of the IEEE International Conference on Computer Vision (pp. 1422-1430).
[21] Zhang, R., Isola, P., & Efros, A. A. (2017). Split-brain autoencoders: Unsupervised learning by cross-channel prediction. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 1058-1067).
[22] Zhang, R., Isola, P., & Efros, A. A. (2016, October). Colorful image colorization. In European conference on computer vision (pp. 649-666). Springer, Cham.
[23] Owens, A., Wu, J., McDermott, J. H., Freeman, W. T., & Torralba, A. (2016, October). Ambient sound provides supervision for visual learning. In European conference on computer vision (pp. 801-816). Springer, Cham.
[24] Wang, X., He, K., & Gupta, A. (2017). Transitive invariance for self-supervised visual representation learning. In Proceedings of the IEEE international conference on computer vision (pp. 1329-1338).
[25] Doersch, C., & Zisserman, A. (2017). Multi-task self-supervised visual learning. In Proceedings of the IEEE International Conference on Computer Vision (pp. 2051-2060).
[26] Tao, C., Tan, Y., Cai, H. J., Du, B., & Tian, J. W. (2010). Object-oriented method of hierarchical urban building extraction from high-resolution remote-sensing imagery. Acta Geodaetica et Cartographica Sinica, 39(1), 39-45.
[27] LeCun, Y., Bottou, L., Bengio, Y., & Haffner, P. (1998). Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11), 2278-2324.
[28] Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). Imagenet classification with deep convolutional neural networks. In Advances in neural information processing systems (pp. 1097-1105).
[29] Simonyan, K., & Zisserman, A. (2014). Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556.
[30] Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., ... & Rabinovich, A. (2015). Going deeper with convolutions. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 1-9).
[31] He, K., Zhang, X., Ren, S., & Sun, J. (2016). Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 770-778).
[32] Zhang, X., Zhou, X., Lin, M., & Sun, J. (2018). Shufflenet: An extremely efficient convolutional neural network for mobile devices. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 6848-6856).
[33] Xie, S., Girshick, R., Dollár, P., Tu, Z., & He, K. (2017). Aggregated residual transformations for deep neural networks. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 1492-1500).
[34] Hu, J., Shen, L., & Sun, G. (2018). Squeeze-and-excitation networks. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 7132-7141).
[35] Zoph, B., & Le, Q. V. (2016). Neural architecture search with reinforcement learning. arXiv preprint arXiv:1611.01578.
[36] Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I., & Salakhutdinov, R. (2014). Dropout: a simple way to prevent neural networks from overfitting. The journal of machine learning research, 15(1), 1929-1958.
[37] Lowe, D. G. (2004). Distinctive image features from scale-invariant keypoints. International journal of computer vision, 60(2), 91-110.
[38] Bay, H., Tuytelaars, T., & Van Gool, L. (2006, May). Surf: Speeded up robust features. In European conference on computer vision (pp. 404-417). Springer, Berlin, Heidelberg.
[39] Wang, X., He, K., & Gupta, A. (2017). Transitive invariance for self-supervised visual representation learning. In Proceedings of the IEEE international conference on computer vision (pp. 1329-1338).