
Visual Arts Search on Mobile Devices

Published: 03 July 2019

Abstract

Visual arts, especially paintings, appear everywhere in our daily lives. They are appreciated not only by art lovers but also by ordinary people, who are curious about the stories behind these artworks and interested in exploring related ones. Among various methods, mobile visual search offers an alternative to text and voice search, which are not always applicable. Mobile visual search for visual arts is far more challenging than general image search. Conventional visual search, such as searching for products or plants, focuses on locating images that contain similar objects; hence, existing approaches are designed to locate objects and extract scale-invariant features from distorted photos captured by the mobile camera. However, the objects are only part of a visual art piece: the background and the painting style are also important factors that conventional approaches do not consider. In this article, an empirical investigation is conducted to study issues in photos taken by mobile cameras, such as orientation variance and motion blur, and how they influence the results of mobile visual arts search. Based on the findings, a photo-rectification pipeline is designed to rectify the photos into clean images for feature extraction, and a new method is proposed to learn highly discriminative features for visual arts that considers both content information and style information. In addition to extensive experiments, a real-world system is built to demonstrate the effectiveness of the proposed methods. To the best of our knowledge, this is the first article to address visual arts search on mobile devices.
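The abstract's core idea, a descriptor that captures both content and style, can be illustrated with a minimal NumPy sketch. This is not the authors' actual method: it assumes a CNN feature map is already available, uses global average pooling as a stand-in content vector, and uses the Gram matrix of channel activations (a common style representation in the neural style-transfer literature) as the style component. The names `gram_matrix` and `joint_descriptor` are hypothetical.

```python
import numpy as np

def gram_matrix(feature_map):
    """Style descriptor: channel-wise correlations of a CNN feature map.

    feature_map: array of shape (C, H, W) holding activations.
    Returns a (C, C) matrix of normalized channel inner products.
    """
    c, h, w = feature_map.shape
    f = feature_map.reshape(c, h * w)
    return (f @ f.T) / (h * w)

def joint_descriptor(feature_map):
    """Concatenate a content vector with a flattened style vector.

    Content: global average pooling over spatial positions -> (C,).
    Style: upper triangle of the Gram matrix (it is symmetric,
    so the lower triangle is redundant) -> (C*(C+1)/2,).
    """
    content = feature_map.mean(axis=(1, 2))
    g = gram_matrix(feature_map)
    style = g[np.triu_indices_from(g)]
    return np.concatenate([content, style])

# Example: a random 4-channel, 8x8 feature map yields a
# 4 + 4*5/2 = 14-dimensional joint descriptor.
fm = np.random.rand(4, 8, 8)
descriptor = joint_descriptor(fm)
```

Nearest-neighbor search over such joint descriptors would retrieve artworks that match in both depicted content and painting style, which is the retrieval behavior the article argues conventional object-centric search misses.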


Cited By

  • (2023) An art painting style explainable classifier grounded on logical and commonsense reasoning. Soft Computing. DOI: 10.1007/s00500-023-08258-x. Online publication date: 17-May-2023.
  • (2022) Big Transfer Learning for Fine Art Classification. Computational Intelligence and Neuroscience, vol. 2022. DOI: 10.1155/2022/1764606. Online publication date: 1-Jan-2022.
  • (2021) Compare the performance of the models in art classification. PLOS ONE 16(3): e0248414. DOI: 10.1371/journal.pone.0248414. Online publication date: 12-Mar-2021.


Published In

ACM Transactions on Multimedia Computing, Communications, and Applications  Volume 15, Issue 2s
Special Section on Cross-Media Analysis for Visual Question Answering, Special Section on Big Data, Machine Learning and AI Technologies for Art and Design and Special Section on MMSys/NOSSDAV 2018
April 2019
381 pages
ISSN:1551-6857
EISSN:1551-6865
DOI:10.1145/3343360

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 03 July 2019
Accepted: 01 April 2019
Revised: 01 March 2019
Received: 01 August 2018
Published in TOMM Volume 15, Issue 2s


Author Tags

  1. Visual arts search
  2. feature learning
  3. mobile devices

Qualifiers

  • Research-article
  • Research
  • Refereed

Funding Sources

  • HKUST-NIE Social Media Lab., HKUST

