Hierarchical Recurrent Neural Network for Video Summarization

Zhao, Bin; Li, Xuelong; Lu, Xiaoqiang


arXiv:1904.12251 (cs)

[Submitted on 28 Apr 2019]

Abstract: Exploiting the temporal dependency among video frames or subshots is very important for video summarization. RNNs are well suited to modeling temporal dependency and have achieved strong performance in many video-based tasks, such as video captioning and classification. However, they are not capable enough to handle video summarization directly: traditional RNNs, including LSTM, can only deal with short videos, while the videos in the summarization task are usually much longer. To address this problem, we propose a hierarchical recurrent neural network for video summarization, called H-RNN. It has two layers: the first layer encodes short video subshots cut from the original video, and the final hidden state of each subshot is fed to the second layer, which computes the confidence that the subshot is a key subshot. Compared to traditional RNNs, H-RNN is better suited to video summarization because it can exploit long temporal dependency among frames while requiring significantly fewer computation operations. Results on two popular datasets, the Combined dataset and the VTW dataset, demonstrate that the proposed H-RNN outperforms state-of-the-art methods.
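The two-layer structure described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the plain tanh recurrent cell, the random untrained weights, the fixed subshot length, the sigmoid scoring vector `w_out`, and all function names below are assumptions made only to show the data flow (frames, then subshot codes, then per-subshot confidence).

```python
import math
import random


def make_rnn(input_dim, hidden_dim, rng):
    """Random weights for a plain tanh RNN cell (an assumed stand-in cell)."""
    scale = 1.0 / math.sqrt(hidden_dim)
    w_in = [[rng.uniform(-scale, scale) for _ in range(input_dim)]
            for _ in range(hidden_dim)]
    w_h = [[rng.uniform(-scale, scale) for _ in range(hidden_dim)]
           for _ in range(hidden_dim)]
    return w_in, w_h


def rnn_states(params, sequence):
    """Run the cell over a sequence; return the hidden state after each step."""
    w_in, w_h = params
    h = [0.0] * len(w_h)
    states = []
    for x in sequence:
        h = [math.tanh(sum(a * b for a, b in zip(row_in, x)) +
                       sum(a * b for a, b in zip(row_h, h)))
             for row_in, row_h in zip(w_in, w_h)]
        states.append(h)
    return states


def split_into_subshots(frames, subshot_len):
    """Cut the frame sequence into consecutive fixed-length subshots (assumed)."""
    return [frames[i:i + subshot_len]
            for i in range(0, len(frames), subshot_len)]


def hierarchical_scores(frames, subshot_len, layer1, layer2, w_out):
    """Layer 1 encodes each subshot; layer 2 scores the subshot sequence."""
    subshots = split_into_subshots(frames, subshot_len)
    # Layer 1: each subshot is encoded on its own; keep only its final state.
    codes = [rnn_states(layer1, shot)[-1] for shot in subshots]
    # Layer 2: recur over the subshot codes, one step per subshot.
    states = rnn_states(layer2, codes)
    # Map each layer-2 state to a confidence in (0, 1) with a sigmoid.
    return [1.0 / (1.0 + math.exp(-sum(w * s for w, s in zip(w_out, st))))
            for st in states]


rng = random.Random(0)
feat_dim, hidden_dim = 8, 4
# 100 frames of random features stand in for real per-frame descriptors.
frames = [[rng.gauss(0.0, 1.0) for _ in range(feat_dim)] for _ in range(100)]
layer1 = make_rnn(feat_dim, hidden_dim, rng)
layer2 = make_rnn(hidden_dim, hidden_dim, rng)
w_out = [rng.uniform(-1.0, 1.0) for _ in range(hidden_dim)]

scores = hierarchical_scores(frames, 20, layer1, layer2, w_out)
print(len(scores))  # one confidence per subshot: 5
```

In this sketch, 100 frames with a subshot length of 20 give 5 subshots, so no recurrent chain is longer than 20 steps in the first layer or 5 steps in the second, compared with 100 steps for a single RNN run over every frame. That shorter chain is one way to picture how a hierarchy can cover a long video.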

Comments:	published by ACM Conference on MultiMedia
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:1904.12251 [cs.CV]
	(or arXiv:1904.12251v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.1904.12251

Submission history

From: Bin Zhao
[v1] Sun, 28 Apr 2019 03:32:21 UTC (1,430 KB)
