Hide-and-Tell: Learning to Bridge Photo Streams for Visual Storytelling

Jung, Yunjae; Kim, Dahun; Woo, Sanghyun; Kim, Kyungsu; Kim, Sungjin; Kweon, In So

arXiv:2002.00774 (cs)

[Submitted on 3 Feb 2020]

Abstract: Visual storytelling is the task of creating a short story from a photo stream. Unlike existing visual captioning, storytelling aims to include not only factual descriptions but also human-like narration and semantics. However, the VIST dataset provides only a small, fixed number of photos per story. Therefore, the main challenge of visual storytelling is to fill the visual gaps between photos with a narrative, imaginative story. In this paper, we propose to explicitly learn to imagine a storyline that bridges the visual gap. During training, one or more photos are randomly omitted from the input stack, and we train the network to produce a full, plausible story even with the missing photo(s). Furthermore, we propose a hide-and-tell model for visual storytelling, which is designed to learn non-local relations across photo streams and to refine and improve conventional RNN-based models. In experiments, we show that both our hide-and-tell scheme and the network design are effective at storytelling, and that our model outperforms previous state-of-the-art methods on automatic metrics. Finally, we qualitatively show the learned ability to interpolate a storyline over visual gaps.
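
The hiding step described in the abstract (randomly omitting one or more photos from the input stack during training) could be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the function name `hide_photos`, the `num_hidden` parameter, and the choice to replace a hidden photo's feature vector with zeros are all assumptions, since the abstract does not say how an omitted photo is represented.

```python
import random


def hide_photos(features, num_hidden=1, rng=None):
    """Zero out `num_hidden` randomly chosen photos in a story's feature stack.

    `features` is a list of per-photo feature vectors (lists of floats).
    Zero-filling the hidden slots is an assumption made for this sketch;
    the abstract only states that photos are randomly omitted.
    Returns (masked_features, hidden_indices) and leaves the input untouched.
    """
    if not 0 <= num_hidden < len(features):
        raise ValueError("num_hidden must leave at least one photo visible")
    rng = rng or random.Random()
    # Pick distinct photo positions to hide.
    hidden = sorted(rng.sample(range(len(features)), num_hidden))
    masked = [
        [0.0] * len(vec) if i in hidden else list(vec)
        for i, vec in enumerate(features)
    ]
    return masked, hidden


# Hypothetical usage: a story of five photos, each with a 4-dimensional feature.
story = [[float(i + 1)] * 4 for i in range(5)]
masked_story, hidden_idx = hide_photos(story, num_hidden=2, rng=random.Random(0))
```

Under this sketch, the network would still be trained against the full ground-truth story while seeing only `masked_story`, which is what would push it to fill in the hidden photos' part of the narrative.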

Comments:	AAAI 2020 paper
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2002.00774 [cs.CV]
	(or arXiv:2002.00774v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2002.00774

Submission history

From: Yunjae Jung
[v1] Mon, 3 Feb 2020 14:22:18 UTC (1,789 KB)
