DeepSearch: A Fast Image Search Framework for Mobile Devices

Published: 13 December 2017

Abstract

    Content-based image retrieval (CBIR) is one of the most important applications of computer vision. Recent years have brought major advances in CBIR systems, driven especially by Convolutional Neural Networks (CNNs) and other deep-learning techniques. However, current CNN-based CBIR systems suffer from the high computational complexity of CNNs, a problem that grows more severe as mobile applications become more popular. The current practice is to deploy the entire CBIR system on the server side, while the client side serves only as an image provider. This architecture increases the computational burden on the server, which must process thousands of requests per second. Moreover, sending images to a server risks leaking personal information, and privacy concerns grow as the demand for mobile search expands. In this article, we propose a fast image search framework, named DeepSearch, which makes complex CNN-based image search feasible on mobile phones. To cope with the heavy computation of CNN models, we present a tensor Block Term Decomposition (BTD) approach, together with a nonlinear response reconstruction method, to accelerate the CNNs involved in object detection and feature extraction. Extensive experiments on the ImageNet dataset and the Alibaba Large-scale Image Search Challenge dataset show that the proposed BTD acceleration approach significantly speeds up the CNN models and makes CNN-based image search practical on common smartphones.
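
    The abstract does not spell out how BTD reduces computation, so the following is only a rough sketch under stated assumptions, not the article's implementation. It assumes that a block term decomposition of a k x k convolution kernel can be evaluated as three cheaper stages: a 1 x 1 convolution that reduces channels, a grouped k x k convolution with one group per block term, and a 1 x 1 convolution that restores channels. The function names, the layer shape, and the block and rank values are all hypothetical and chosen for illustration.

```python
def conv_macs(c_in, c_out, k, h_out, w_out, groups=1):
    """Multiply-accumulate count of a k x k convolution (optionally grouped)."""
    assert c_in % groups == 0 and c_out % groups == 0
    return (c_in // groups) * c_out * k * k * h_out * w_out


def btd_conv_macs(c_in, c_out, k, h_out, w_out, blocks, rank_in, rank_out):
    """Cost of the assumed three-stage replacement for one convolution.

    Assumes stride 1, so every stage runs at the output resolution.
    """
    # Stage 1: 1 x 1 convolution, c_in -> blocks * rank_in channels.
    reduce = conv_macs(c_in, blocks * rank_in, 1, h_out, w_out)
    # Stage 2: grouped k x k convolution, one group per block term.
    core = conv_macs(blocks * rank_in, blocks * rank_out, k, h_out, w_out,
                     groups=blocks)
    # Stage 3: 1 x 1 convolution, blocks * rank_out -> c_out channels.
    expand = conv_macs(blocks * rank_out, c_out, 1, h_out, w_out)
    return reduce + core + expand


# Hypothetical layer: 256 -> 256 channels, 3 x 3 kernel, 14 x 14 output map.
full = conv_macs(256, 256, 3, 14, 14)
fast = btd_conv_macs(256, 256, 3, 14, 14, blocks=4, rank_in=32, rank_out=32)
speedup = full / fast
```

    With these illustrative settings the three-stage form needs roughly 5.8 times fewer multiply-accumulates than the original layer. This is a theoretical operation count only; it says nothing about the accuracy after decomposition or the measured speedups reported in the article.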

    Published In

    ACM Transactions on Multimedia Computing, Communications, and Applications, Volume 14, Issue 1
    February 2018
    287 pages
    ISSN:1551-6857
    EISSN:1551-6865
    DOI:10.1145/3173554
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 13 December 2017
    Accepted: 01 September 2017
    Revised: 01 July 2017
    Received: 01 March 2017
    Published in TOMM Volume 14, Issue 1

    Author Tags

    1. Convolutional neural networks
    2. acceleration
    3. image retrieval
    4. tensor decomposition

    Qualifiers

    • Research-article
    • Research
    • Refereed

    Funding Sources

    • Jiangsu Key Laboratory of Big Data Analysis Technology
    • National Natural Science Foundation of China
    • 863 program

    Cited By

    • (2023) Efficient Acceleration of Deep Learning Inference on Resource-Constrained Edge Devices: A Review. Proceedings of the IEEE 111:1 (42-91). DOI: 10.1109/JPROC.2022.3226481. Online publication date: Jan-2023
    • (2023) IRNet: information restriction and information recovery for accurate binary neural networks. Neural Computing and Applications 35:19 (14449-14464). DOI: 10.1007/s00521-023-08495-z. Online publication date: 1-Jul-2023
    • (2022) Custom Hardware Architectures for Deep Learning on Portable Devices: A Review. IEEE Transactions on Neural Networks and Learning Systems 33:11 (6068-6088). DOI: 10.1109/TNNLS.2021.3082304. Online publication date: Nov-2022
    • (2022) Fi-Vi: Large-Area Indoor Localization Scheme Combining ML/DL-Based Wireless Fingerprinting and Visual Positioning. IEEE Access 10 (127094-127116). DOI: 10.1109/ACCESS.2022.3226816. Online publication date: 2022
    • (2021) Edge Intelligence: Empowering Intelligence to the Edge of Network. Proceedings of the IEEE 109:11 (1778-1837). DOI: 10.1109/JPROC.2021.3119950. Online publication date: Nov-2021
    • (2019) Unsupervised Learning of Human Action Categories in Still Images with Deep Representations. ACM Transactions on Multimedia Computing, Communications, and Applications 15:4 (1-20). DOI: 10.1145/3362161. Online publication date: 16-Dec-2019
    • (2019) Multi-source Multi-level Attention Networks for Visual Question Answering. ACM Transactions on Multimedia Computing, Communications, and Applications 15:2s (1-20). DOI: 10.1145/3316767. Online publication date: 19-Jul-2019
    • (2019) JPEG image tampering localization based on normalized gray level co-occurrence matrix. Multimedia Tools and Applications 78:8 (9895-9918). DOI: 10.1007/s11042-018-6611-3. Online publication date: 25-May-2019
    • (2018) LAWN. Proceedings of the 55th Annual Design Automation Conference (1-6). DOI: 10.1145/3195970.3196066. Online publication date: 24-Jun-2018
    • (2018) Two-Step Quantization for Low-bit Neural Networks. 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (4376-4384). DOI: 10.1109/CVPR.2018.00460. Online publication date: Jun-2018
