Improving Zero-Shot Action Recognition using Human Instruction with Text Description

Wu, Nan; Kera, Hiroshi; Kawamoto, Kazuhiko

arXiv:2301.08874 (cs)

[Submitted on 21 Jan 2023 (v1), last revised 12 Jun 2023 (this version, v2)]

Abstract: Zero-shot action recognition, which recognizes actions in videos without any training examples, is attracting wide attention because it can save labor costs and training time. Nevertheless, the performance of zero-shot learning remains unsatisfactory, which limits its practical application. To address this problem, this study proposes a framework that improves zero-shot action recognition using human instructions with text descriptions. The framework requires manually describing video contents, which incurs some labor cost; in many situations, this cost is worthwhile. We manually annotate text features for each action, each of which can be a word, phrase, or sentence. By computing the matching degree between a video and every text feature, we predict the class of the video. The proposed model can also be combined with other models to improve accuracy. In addition, our model can be continuously optimized by repeating human instructions. Results on UCF101 and HMDB51 show that our model achieved the best accuracy and improved the accuracies of other models.
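
The matching step described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the matching degree is computed, how several descriptions of one action are aggregated, or how scores from other models are fused, so the cosine similarity, the per-class maximum, and the weighted average below are all assumptions made for illustration. The function names `cosine_similarity` and `predict_class`, the `weight` parameter, and the toy vectors are hypothetical and do not come from the paper.

```python
import numpy as np


def cosine_similarity(video_vec, text_mat):
    """Cosine similarity between one video vector and each row of text_mat."""
    v = video_vec / np.linalg.norm(video_vec)
    t = text_mat / np.linalg.norm(text_mat, axis=1, keepdims=True)
    return t @ v


def predict_class(video_vec, class_texts, other_scores=None, weight=0.5):
    """Pick the class whose text descriptions best match the video.

    class_texts maps a class name to an array of shape (n_descriptions, dim).
    Assumption: a class score is the best match over its descriptions.
    Assumption: scores from another model are fused by a weighted average.
    """
    names = list(class_texts)
    scores = np.array(
        [cosine_similarity(video_vec, class_texts[name]).max() for name in names]
    )
    if other_scores is not None:
        scores = weight * scores + (1.0 - weight) * np.asarray(other_scores)
    return names[int(np.argmax(scores))], scores


# Toy embeddings standing in for real video and text encoder outputs.
video = np.array([1.0, 0.1, 0.0])
class_texts = {
    "archery": np.array([[0.9, 0.2, 0.0], [0.5, 0.5, 0.1]]),
    "bowling": np.array([[0.0, 0.1, 1.0]]),
}
label, scores = predict_class(video, class_texts)
print(label, scores)
```

Under this sketch, repeating human instructions would amount to adding rows to a class's array in `class_texts`, and passing `other_scores` shows one possible way to combine the text-matching scores with another model's per-class scores.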

Comments:	18 pages, 9 figures
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2301.08874 [cs.CV]
	(or arXiv:2301.08874v2 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2301.08874

Submission history

From: Kazuhiko Kawamoto
[v1] Sat, 21 Jan 2023 03:41:07 UTC (649 KB)
[v2] Mon, 12 Jun 2023 08:33:45 UTC (759 KB)
