StyleMeUp: Towards Style-Agnostic Sketch-Based Image Retrieval

Sain, Aneeshan; Bhunia, Ayan Kumar; Yang, Yongxin; Xiang, Tao; Song, Yi-Zhe


arXiv:2103.15706 (cs)

[Submitted on 29 Mar 2021 (v1), last revised 31 Mar 2021 (this version, v2)]


Abstract: Sketch-based image retrieval (SBIR) is a cross-modal matching problem which is typically solved by learning a joint embedding space where the semantic content shared between the photo and sketch modalities is preserved. However, a fundamental challenge in SBIR has been largely ignored so far: sketches are drawn by humans, and considerable style variations exist among different users. An effective SBIR model needs to explicitly account for this style diversity and, crucially, to generalise to unseen user styles. To this end, a novel style-agnostic SBIR model is proposed. Unlike existing models, a cross-modal variational autoencoder (VAE) is employed to explicitly disentangle each sketch into a semantic content part shared with the corresponding photo, and a style part unique to the sketcher. Importantly, to make our model dynamically adaptable to any unseen user style, we propose to meta-train our cross-modal VAE by adding two style-adaptive components: a set of feature transformation layers in its encoder and a regulariser on the disentangled semantic content latent code. With this meta-learning framework, our model can not only disentangle the cross-modal shared semantic content for SBIR, but can also adapt the disentanglement to any unseen user style, making the SBIR model truly style-agnostic. Extensive experiments show that our style-agnostic model yields state-of-the-art performance for both category-level and instance-level SBIR.
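
The idea of retrieving with the content part of a disentangled latent code can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names (`split_latent`, `reparameterize`, `rank_gallery`), the fixed content/style split by index, the Euclidean distance, and the hand-written vectors standing in for learned encoder outputs are all assumptions made for illustration. The meta-training procedure, the feature transformation layers, and the regulariser are not shown.

```python
import math
import random


def split_latent(z, content_dim):
    """Split a latent vector into (content, style) parts.

    Assumption: the first `content_dim` entries hold semantic content and
    the remainder holds sketcher-specific style.
    """
    return z[:content_dim], z[content_dim:]


def reparameterize(mu, log_var, rng):
    """Standard VAE sampling: z = mu + sigma * eps, with eps ~ N(0, 1)."""
    return [
        m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
        for m, lv in zip(mu, log_var)
    ]


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def rank_gallery(sketch_latent, photo_embeddings, content_dim):
    """Rank photo ids by distance to the sketch's content code only.

    The style part of the sketch latent is discarded, so two sketches that
    differ only in style produce the same ranking.
    """
    content, _style = split_latent(sketch_latent, content_dim)
    return sorted(
        photo_embeddings,
        key=lambda pid: euclidean(content, photo_embeddings[pid]),
    )


# Hypothetical gallery of photo content embeddings (2-D for readability).
gallery = {"shoe": [1.0, 0.0], "chair": [0.0, 1.0]}

# Two sketches of the same object by different users: identical content
# code, very different style code.
sketch_user_a = [0.9, 0.1, 5.0, -3.0]
sketch_user_b = [0.9, 0.1, -7.0, 2.0]

ranking_a = rank_gallery(sketch_user_a, gallery, content_dim=2)
ranking_b = rank_gallery(sketch_user_b, gallery, content_dim=2)
```

In this toy setting both sketches retrieve the same ordering because only the content code enters the distance, which is the behaviour a style-agnostic retrieval model aims for; in the paper the split is learned rather than fixed by index.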

Comments:	IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), 2021
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2103.15706 [cs.CV]
	(or arXiv:2103.15706v2 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2103.15706

Submission history

From: Ayan Kumar Bhunia
[v1] Mon, 29 Mar 2021 15:44:19 UTC (4,371 KB)
[v2] Wed, 31 Mar 2021 10:31:24 UTC (4,371 KB)
